dbvcs
note
This page is obsolete. I need to think about decentralization (making the design distributed).
stupid schema
lineid - line
fileid - filename - rev - date
fileid - filelineno - lineid
fileid - line count (wc -l)
fileid - checksums
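A minimal sketch of the first three tables, assuming SQLite as the backing database and assuming that fileid identifies one revision of one file. The table names (lines, files, file_lines) and the helpers open_store, commit and checkout are made up for illustration; the line-count and checksum tables are left out for brevity. Identical lines are stored once and shared between revisions.

```python
import sqlite3


def open_store():
    """Create an in-memory database with the three core tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE lines (
            lineid INTEGER PRIMARY KEY,
            line   TEXT UNIQUE NOT NULL
        );
        CREATE TABLE files (
            fileid   INTEGER PRIMARY KEY,
            filename TEXT NOT NULL,
            rev      INTEGER NOT NULL,
            date     TEXT,
            UNIQUE (filename, rev)
        );
        CREATE TABLE file_lines (
            fileid     INTEGER NOT NULL,
            filelineno INTEGER NOT NULL,
            lineid     INTEGER NOT NULL,
            PRIMARY KEY (fileid, filelineno)
        );
    """)
    return db


def commit(db, filename, rev, text, date=None):
    """Store one revision of a file as an ordered list of line references."""
    cur = db.execute(
        "INSERT INTO files (filename, rev, date) VALUES (?, ?, ?)",
        (filename, rev, date),
    )
    fileid = cur.lastrowid
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Reuse an existing lineid if this exact line was seen before.
        db.execute("INSERT OR IGNORE INTO lines (line) VALUES (?)", (line,))
        (lineid,) = db.execute(
            "SELECT lineid FROM lines WHERE line = ?", (line,)
        ).fetchone()
        db.execute(
            "INSERT INTO file_lines VALUES (?, ?, ?)", (fileid, lineno, lineid)
        )
    return fileid


def checkout(db, filename, rev):
    """Rebuild a revision directly, without replaying earlier revisions."""
    rows = db.execute(
        """
        SELECT l.line
        FROM files f
        JOIN file_lines fl ON fl.fileid = f.fileid
        JOIN lines l ON l.lineid = fl.lineid
        WHERE f.filename = ? AND f.rev = ?
        ORDER BY fl.filelineno
        """,
        (filename, rev),
    )
    return "\n".join(row[0] for row in rows)
```

Committing "x\ny" as rev 1 and "x\nz\ny" as rev 2 of the same file leaves three rows in lines, because x and y are shared. The price is one file_lines row per line per revision, which is the overhead noted in the cons.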
linked lists and overlays
They might help, but that's rocket science.
We can introduce the notion of a history unit: something like a file with a tree of revisions, but with more powerful cherry-picking functionality. In principle, a history unit can even be split into multiple units, each representing a separate file history, with cherries traversing between them and other units. Imagine dragging #include's out of a .c file into a header, along with all of their history. That's possible thanks to the line-oriented design.
The question is how to implement the whole thing: history units, revisions, cherries, changesets, etc.
any types of objects
lines, text chunks, files, paragraphs, all kinds of blobs, all kinds of diffs (binary, text, ...)
stored and derived: a diff can be derived from two stored versions of a file, and a file's versions can be derived from stored diffs
unlimited flexibility
typed referencing - a file can consist of lines and paragraphs, interspersed
better space efficiency - inserting a line in the middle of a long file built from small paragraphs creates only one new paragraph and replaces one reference in the file
paragraphs can be language-specific syntactic elements, like functions - this could lead to better cherry-picking and code reuse in general
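A rough sketch of typed referencing and the space-efficiency point above. Everything here is hypothetical: the Store class, the (kind, id) reference tuples and the insert_line helper are illustration only, not a settled design. A file holds references to paragraphs and loose lines, interspersed; inserting a line creates one new line, one new paragraph and one new file object, and every other object is shared with the old revision.

```python
class Store:
    """Append-only object store; references are (kind, id) tuples."""

    def __init__(self):
        self.objects = {}
        self._next_id = 0

    def put(self, kind, payload):
        """Store a payload under a fresh typed reference and return it."""
        self._next_id += 1
        ref = (kind, self._next_id)
        self.objects[ref] = payload
        return ref

    def expand(self, ref):
        """Flatten a reference into the list of text lines it stands for."""
        kind, _ = ref
        payload = self.objects[ref]
        if kind == "line":
            return [payload]
        # "para" and "file" payloads are lists of child references.
        lines = []
        for child in payload:
            lines.extend(self.expand(child))
        return lines


def insert_line(store, file_ref, para_ref, index, text):
    """Return a new file revision with one line inserted into one paragraph."""
    new_line = store.put("line", text)
    children = list(store.objects[para_ref])
    children.insert(index, new_line)
    new_para = store.put("para", children)
    # Only the reference to the changed paragraph is replaced.
    new_children = [
        new_para if child == para_ref else child
        for child in store.objects[file_ref]
    ]
    return store.put("file", new_children)


store = Store()
p1 = store.put("para", [store.put("line", "a"), store.put("line", "b")])
loose = store.put("line", "--")
p2 = store.put("para", [store.put("line", "c")])
rev1 = store.put("file", [p1, loose, p2])
rev2 = insert_line(store, rev1, p1, 1, "new")
```

Both revisions stay retrievable: store.expand(rev1) still yields the old lines, while store.expand(rev2) includes the inserted one.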
tagging
What about tagging every lineid with fileids?
footnotes
IDs are probably 32- or 64-bit. lineids are probably at least 64-bit.
pros and cons
major con: line-based
con: considerable overhead
pro: O(1) retrieval of any rev (the cost does not depend on how many revisions exist)
Topic revision: r2 - 26 May 2009 - 09:43:39 - Main.AndrewPantyukhin