Software applications run on a wide variety of platforms (filesystems, virtual slices, mobile hardware, etc.) that do not provide 100% uptime. As such, these applications may crash at any unfortunate moment losing volatile data and, when re-launched, they must be able to correctly recover from persistent state. From a verification perspective, these kinds of bugs can be particularly frustrating because, even when it has been formally proved for a program $P$ that $P |= phi$, the proof is foiled by these external events that crash and restart the program.
In this paper, we introduce a novel technique capable of automatically proving that a program correctly recovers from a crash via a reduction to reachability. Our technique takes an input control-flow automaton and transforms it into a novel encoding that blends the capture of snapshots of pre-crash states into a symbolic search for a proof that recovery terminates and every recovered execution simulates some crash-free execution. Our encoding is designed to then enable us to apply existing abstraction techniques (interpolation, CEGAR, termination refinement, etc.) in order to do the work that is necessary to prove recoverability.
We have realized our technique in a tool called Eleven82, capable of analyzing C/C++ programs to detect recoverability bugs or prove their absence. We have applied our tool to examples drawn from industrial file systems and databases, including GDBM, LevelDB, LMDB, PostgreSQL, SQLite, VMware and ZooKeeper. Within minutes, our tool is able to discover bugs or prove that these fragments—which use sophisticated recovery algorithms such as shadow paging and write-ahead logging—are crash recoverable.
Wed 20 Jan
|10:30 - 10:55|
|10:55 - 11:20|
|11:20 - 11:45|
|11:45 - 12:10|