sibylfs.io

Tom Ridge

2015-09-21
(with Thomas Tuerk, David Sheets, Andrea Giugliano, Anil Madhavapeddy, Peter Sewell;
talk at departmental Industrial Advisory Board; 20 mins talk+? mins qus)

The problem

The problem: specifications

The problem: implementations and specifications

Another problem: how do implementations differ?

Do specifications really matter? Do file systems matter?

Who said this (c. 2009)?

Speaking with my 'git' hat on, I can tell that
  • git was designed to have almost minimal requirements from the filesystem, and to not do anything even half-way clever.
  • despite that, we've hit an absolute metric sh*tload of filesystem bugs and misfeatures...

...we found a real data-loss CIFS bug thanks to that. Afaik, the bug has been there for a year and half. Don't tell me nobody uses cifs...

Before that, we had cross-directory rename bugs. Or the inexplicable "pread() doesn't work correctly on HP-UX". Or the "readdir() returns the same entry multiple times" bug. And all of this without ever doing anything even remotely odd. No file locking, no rewriting of old files, no lseek()ing in directories, no nothing...

...But I also think that the "we write meta-data synchronously, but then the actual data shows up at some random later time" is just crazy talk. That's simply insane. It guarantees that there will be huge windows of times where data simply will be lost if something bad happens...

...The point here? Sometimes those filesystem people who say "you must use fsync() to get well-defined semantics" are the same people who SCREWED IT UP SO DAMN BADLY THAT FSYNC ISN'T ACTUALLY REALISTICALLY USEABLE!

And more recently http://www.itworld.com/article/2868393/linus-torvalds-apples-hfs-is-probably-the-worst-file-system-ever.html

OK, so what should a specification (in general) look like?

sibylfs.io

What have we done?

How does everything fit together?

System structure

Example test script

A test script is essentially a list of libc calls to execute on a real-world system

 @type script
 #####################################
 # Test link___link_nonempty_dir1__d2__sl_dotdot_d2___nonempty_dir1__nonexist_4
 #####################################
 mkdir "empty_dir1" 0o777
 mkdir "empty_dir2" 0o777
 mkdir "nonempty_dir1" 0o777
 mkdir "nonempty_dir1/d2" 0o777
 open "nonempty_dir1/d2/f3.txt" [O_CREAT;O_WRONLY] 0o666
 write! (FD 3) "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor inc" 83
 close (FD 3)
 ... // further setup commands
 link "nonempty_dir1/d2/sl_dotdot_d2" "nonempty_dir1/nonexist_4"
 ...

Example trace (behaviour of real-world system)

A trace records the libc calls from the test script, and the responses received from the real-world system

# processing file 'link___link_nonempty_dir1__d2__sl_dotdot_d2___nonempty_dir1__nonexist_4-int.trace' ...
        @type trace
        #####################################
        # Test link___link_nonempty_dir1__d2__sl_dotdot_d2___nonempty_dir1__nonexist_4
        #####################################
     5: mkdir "empty_dir1" 0o777
        Tau
        RV_none
     6: mkdir "empty_dir2" 0o777
        Tau
        RV_none
     7: mkdir "nonempty_dir1" 0o777
        Tau
        RV_none
     8: mkdir "nonempty_dir1/d2" 0o777
        Tau
        RV_none
     9: open "nonempty_dir1/d2/f3.txt" [O_CREAT;O_WRONLY] 0o666
        Tau
        RV_num(3)
    10: write! (FD 3) "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor inc" 83
        Tau
        RV_num(83)

...

    28: link "nonempty_dir1/d2/sl_dotdot_d2" "nonempty_dir1/nonexist_4"
        Tau
        RV_none

...

Example checked trace

Testing

Test results

We found the following sorts of bugs

Test results, stats

Test results, strange behaviours

A FreeBSD bug

Testing summary

A virtuous circle

A virtuous circle

Conclusion

Possible uses of the spec

Future work

Lots of possibilities. In the short term, we hope for take-up from industry.

We are currently working on a verified file system implementation. Compared to existing file systems, we hope for:

Thanks for listening. Any questions?