Musings on a perfect file system
Note: after I started writing this, I noticed that Hans Reiser was smoking pretty much the same weed, but he actually did a lot of real work and implemented some ideas in Reiser4.
- a universal data store, processor and transport agent
- distributed
- peer-to-peer
- like IP, ubiquitous
- DBMS-based
- embedded full-featured versioning
- copy-on-write with on-demand real-time garbage collection
- after a COW, different versions of one piece of data may well end up in different parts of the globe
- disk drives and remote storage are handled similarly
- either reordering is independent of the FS, or
- reordering is embedded in the FS, so that it can be done efficiently, given that servers will have many clients
- like DNS, recursive and non-recursive operation modes for nodes
- enterprise-wide proxies are possible
- compression
- encryption
- any kind of programmable transformations via hooks at many levels
- should make many apps obsolete immediately
- most p2p apps
- most VCSs, at least as far as their versioning functionality goes
- many RDBMS
- many other DBMS
- most DB-based apps can be rewritten with simple unix-style tools
- ubiquity means almost every piece of storage in every device is part of a global internet-based file system
- keeping data local just means marking it as such (to be stored locally)
- backups come with versioning and COW snapshots
- making remote backups just means marking data (or parts of it, e.g. spans of version trees and/or datetime spans of snapshots) to be replicated remotely
- p2p-enabled free, secure, ultra-reliable distributed backups
- e.g. you dedicate 50% of your hard drive for others' data
- your data is replicated on others' hard drives
- privacy can be ensured via strong encryption, like gpg
- smart caching is vital
- writing/locking/collaboration management is quite complicated
- trust management
- data transfer can be done via existing protocols, like ftp, http/webdav, ssh/scp/sftp
- efficient connection caching is vital
- btw, centralization of connection caching/multiplexing on the OS level sounds like a good idea
- if implemented, most apps can benefit from things like ssh connection multiplexing and smtp postfix/anvil management
- successful transition requires high-level fuse-like implementations
- transfer speeds cover a wide range
- from dead-slow background backups
- to real-time high-traffic interactive medical imaging
- external search engines (if relevant) become find(1) backends
- adaptation to different load patterns
- built-in fully-featured scheduling capabilities
- between all kinds of objects
- mirroring can be slow by default, but link-speed if prioritized
- different profiles for different devices
- diskless operation would not require the logic to deal with local hard drives
- the simplest profiles should be as simple as FAT16
- on-disk format should be compatible across all profiles
- at the very least it should be upwards compatible - from simplest to fullest
- meta-data is a first-class citizen
- as flexible as data itself
- can be aggregated/mirrored separately from data
- index aggregators =~ search engines
- can include derivatives of several degrees
- indexes of indexes to optimize and accelerate search
- very verbose logs can be kept in metadata as space permits
- logging everyone who accessed a piece of data
- can be used later to redirect clients who want the piece
- flexible, pluggable auth methods
- built-in support for things like payment
- variable block (chunk) size, may be fixed in simpler profiles
- optional per-file-tunable per-file/per-block checksums
- optionally delayed
- fast non-checksummed writes under load
- fast non-verified reads under load
- background checksumming when load is low
- foreground checksumming on request
- background verification (scrubbing) under low load
- on-demand tunable mandatory read verification
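The delayed-checksum options above could look roughly like this in Python. This is only a sketch under loud assumptions: `BlockStore`, `checksum_pending` and `scrub` are made-up names, SHA-256 is an arbitrary choice of checksum, and everything lives in memory rather than on disk.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Block:
    data: bytes
    checksum: Optional[str] = None  # None means "not checksummed yet"


def digest(data: bytes) -> str:
    # SHA-256 is an assumption; any checksum function would do here.
    return hashlib.sha256(data).hexdigest()


class BlockStore:
    """Hypothetical in-memory store illustrating delayed checksumming."""

    def __init__(self) -> None:
        self.blocks: Dict[str, Block] = {}

    def write(self, block_id: str, data: bytes, under_load: bool = False) -> None:
        # Under load, skip the checksum and leave it for a background pass.
        checksum = None if under_load else digest(data)
        self.blocks[block_id] = Block(data, checksum)

    def checksum_pending(self) -> int:
        # Background pass: checksum every block that was written without one.
        done = 0
        for block in self.blocks.values():
            if block.checksum is None:
                block.checksum = digest(block.data)
                done += 1
        return done

    def read(self, block_id: str, verify: bool = True) -> bytes:
        block = self.blocks[block_id]
        # Verification is only possible once a checksum exists.
        if verify and block.checksum is not None:
            if digest(block.data) != block.checksum:
                raise IOError(f"checksum mismatch in block {block_id}")
        return block.data

    def scrub(self) -> List[str]:
        # Background verification: list blocks whose data no longer matches.
        return [
            block_id
            for block_id, block in self.blocks.items()
            if block.checksum is not None and digest(block.data) != block.checksum
        ]
```

A caller would pass `under_load=True` on the hot path, run `checksum_pending()` and `scrub()` when the load drops, and pass `verify=False` to `read()` when speed matters more than integrity.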
Introducing structure
- thanks to DB features, what used to be lots of small files can be converted to structured DB entities
- data multiplexing (e.g. multimedia) can be done at file system level
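A rough illustration of the small-files point, assuming SQLite as a stand-in for the DB layer. The `entries` table, the function names and the sample paths are all invented for the example; nothing here is a proposed on-disk format.

```python
import sqlite3
from typing import Dict, List


def import_small_files(files: Dict[str, str]) -> sqlite3.Connection:
    # One row per former small file; the table layout is an assumption.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE entries (path TEXT PRIMARY KEY, body TEXT NOT NULL)")
    conn.executemany("INSERT INTO entries (path, body) VALUES (?, ?)", files.items())
    conn.commit()
    return conn


def list_under(conn: sqlite3.Connection, prefix: str) -> List[str]:
    # A directory listing becomes a plain query on the path column.
    rows = conn.execute(
        "SELECT path FROM entries WHERE path LIKE ? ORDER BY path",
        (prefix.rstrip("/") + "/%",),
    )
    return [path for (path,) in rows]


conn = import_small_files(
    {"mail/inbox/1": "hi", "mail/inbox/2": "yo", "mail/sent/1": "ok"}
)
print(list_under(conn, "mail/inbox"))
```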
Imagine a perfect workflow
- you don't work with files and folders
- you don't download stuff
- just play /Music/SomeBand/ASong - and the global file system gets the data to you
- caching it locally
- it may be cached by your ISP
- when your neighbors request the same song, they will probably get most of the data from you
- you don't send stuff
- just work on /MyCompany/MyDept/CurrentProject
- edit /Wikipedia/Article
- cache (=mirror=clone) any open resources for offline read-write access
- branch, edit, merge any data
- manage processing resources
- indexing everywhere
- instead of going to a website to rate a movie, you create a new attribute with your identity, score, and optional comment/essay
- one or more imdb-like central entities can then collect these attributes from users and calculate averages, and mirror comments
- *everything* is one file system
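The movie-rating idea above can be sketched in a few lines of Python. The `RatingAttribute` fields follow the identity/score/comment triple mentioned in the list; the class name, the `aggregate` function and the rule "a later rating by the same identity replaces the earlier one" are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class RatingAttribute:
    path: str       # the rated object, e.g. a movie
    identity: str   # who rated it
    score: int
    comment: str = ""


def aggregate(attrs: Iterable[RatingAttribute]) -> Dict[str, float]:
    # Keep one rating per (path, identity); later entries replace earlier ones.
    latest: Dict[Tuple[str, str], RatingAttribute] = {}
    for attr in attrs:
        latest[(attr.path, attr.identity)] = attr

    # Group the surviving scores by path and average them.
    scores: Dict[str, List[int]] = defaultdict(list)
    for attr in latest.values():
        scores[attr.path].append(attr.score)
    return {path: mean(values) for path, values in scores.items()}
```

An imdb-like aggregator would then be little more than a crawler that collects such attributes and runs `aggregate` over them.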
The problem is
- making everything dead-simple
Is a file system limited to data?
- of course not
- processes
- RAM
- devices
- any possible objects
- all globally shareable
- plan9-style
- mv /MyLaptop/Processes/Rendering /MyCompany/ServerFarm/Processes/
- cp /Music/One/Album /MyDesktop/Devices/CDWriter
- in fact the real syntax would imply mirroring the Album to the CDWriter
- cat /MyKitchen/Devices/CoffeePot/Status
- vim /People/JohnDoe/PoliceRecord
- cat /Orbit/Hubble/InfraCam/Raw | /SomeSuperComputer/Filters/InterpolateInfra | /AnotherCluster/GenericDataMiner | /BlackMesa/Visualize | /Astro/Data/GreenMen51
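One way to picture the "everything is a path" commands above is a mount table that routes each path to whatever object serves it. This is a toy sketch, not how Plan 9 does it: the `Namespace` class, the handler signature and the `Status` entry under the CD writer are assumptions, and longest-prefix matching is just one possible resolution rule.

```python
from typing import Callable, Dict

# A handler receives the path remainder below its mount point.
Handler = Callable[[str], str]


class Namespace:
    """Hypothetical mount table: processes and devices are just paths."""

    def __init__(self) -> None:
        self.mounts: Dict[str, Handler] = {}

    def mount(self, prefix: str, handler: Handler) -> None:
        self.mounts[prefix.rstrip("/")] = handler

    def read(self, path: str) -> str:
        # Collect every mount point that contains the path.
        matches = [p for p in self.mounts if path == p or path.startswith(p + "/")]
        if not matches:
            raise FileNotFoundError(path)
        # The longest (most specific) mount point wins.
        prefix = max(matches, key=len)
        return self.mounts[prefix](path[len(prefix):].lstrip("/"))


ns = Namespace()
ns.mount("/MyLaptop/Processes", lambda rest: f"process {rest}: running")
ns.mount("/MyDesktop/Devices/CDWriter", lambda rest: "idle" if rest == "Status" else "")
print(ns.read("/MyLaptop/Processes/Rendering"))
print(ns.read("/MyDesktop/Devices/CDWriter/Status"))
```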
Topic revision: r8 - 09 Nov 2007 - 13:29:07 - Main.AndrewPantyukhin