Reducing Crash Recoverability to Reachability (POPL 2016 - Research Papers)

Who

Eric Koskinen, Junfeng Yang

Track

POPL 2016 Research Papers

Time Zone

Times are shown in the conference time zone: (GMT-05:00) Guadalajara, Mexico City, Monterrey.

When

Wed 20 Jan 2016 11:45 - 12:10 at Grand Bay North - Track 1: Algorithmic Verification. Chair(s): Arie Gurfinkel

Abstract

Software applications run on a wide variety of platforms (filesystems, virtual slices, mobile hardware, etc.) that do not provide 100% uptime. As such, these applications may crash at any unfortunate moment, losing volatile data, and when re-launched they must be able to correctly recover from persistent state. From a verification perspective, these kinds of bugs can be particularly frustrating because, even when it has been formally proved for a program $P$ that $P \models \varphi$, the proof is foiled by these external events that crash and restart the program.

In this paper, we introduce a novel technique capable of automatically proving that a program correctly recovers from a crash via a reduction to reachability. Our technique takes an input control-flow automaton and transforms it into an encoding that blends the capture of snapshots of pre-crash states into a symbolic search for a proof that recovery terminates and every recovered execution simulates some crash-free execution. The encoding is designed to let us apply existing abstraction techniques (interpolation, CEGAR, termination refinement, etc.) to do the work necessary to prove recoverability.
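
As a rough illustration of the property itself, and not of the paper's symbolic encoding or of the tool, the sketch below checks a toy program by brute force: it crashes the program at every step, runs a recovery routine, and checks that the recovered state matches a state that a crash-free run could exhibit. Everything in it is an assumption made for illustration: the two-field persistent state, the names `run`, `recover`, `is_crash_recoverable`, `WAL_STEPS`, and `NAIVE_STEPS`, and the simplification of "simulates some crash-free execution" to "equals the state before or after the update".

```python
def set_key(key, value):
    """Build one program step that writes `value` to `key` in persistent state."""
    def step(state):
        state[key] = value
    return step


def run(steps, state, crash_at=None):
    """Run `steps` on a copy of `state`, crashing just before step index `crash_at`."""
    state = dict(state)
    for i, step in enumerate(steps):
        if i == crash_at:
            break  # crash: the remaining steps never execute
        step(state)
    return state


def recover(state):
    """Hypothetical recovery routine: redo a logged update, then clear the log."""
    state = dict(state)
    if state["log"] is not None:
        state["a"], state["b"] = state["log"]
        state["log"] = None
    return state


def observable(state):
    """The part of the state a client can see (the log is internal)."""
    return (state["a"], state["b"])


def is_crash_recoverable(steps, initial):
    """Return (True, None), or (False, i) for the first crash point i that fails."""
    # Simplified stand-in for simulation: the recovered state must equal the
    # state before or after a crash-free run of the update.
    allowed = {observable(initial), observable(run(steps, initial))}
    for crash_at in range(len(steps) + 1):  # len(steps) means "crash after the last step"
        recovered = recover(run(steps, initial, crash_at))
        if observable(recovered) not in allowed:
            return False, crash_at
    return True, None


INITIAL = {"a": 0, "b": 0, "log": None}

# Write-ahead logging: record the intended update before touching the data.
WAL_STEPS = [set_key("log", (1, 1)), set_key("a", 1), set_key("b", 1), set_key("log", None)]

# Naive version: update the data in place with no log.
NAIVE_STEPS = [set_key("a", 1), set_key("b", 1)]

print(is_crash_recoverable(WAL_STEPS, INITIAL))    # (True, None)
print(is_crash_recoverable(NAIVE_STEPS, INITIAL))  # (False, 1)
```

The naive variant fails at crash point 1 because a crash between the two writes leaves `a` updated and `b` not, a state no crash-free run produces. Explicit enumeration like this only works for tiny finite programs; the abstract's point is that a reduction to reachability lets existing symbolic techniques handle the general case.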

We have realized our technique in a tool called Eleven82, capable of analyzing C/C++ programs to detect recoverability bugs or prove their absence. We have applied our tool to examples drawn from industrial file systems and databases, including GDBM, LevelDB, LMDB, PostgreSQL, SQLite, VMware and ZooKeeper. Within minutes, our tool is able to discover bugs or prove that these fragments—which use sophisticated recovery algorithms such as shadow paging and write-ahead logging—are crash recoverable.

Eric Koskinen

Yale University

Junfeng Yang

Columbia University