Parser combinators are a popular approach to parsing where context-free grammars are represented as executable code. However, conventional parser combinators do not support left recursion, and can have worst-case exponential runtime. These limitations hinder the expressivity and performance predictability of parser combinators when constructing parsers for programming languages. In this paper we present general parser combinators that support all context-free grammars and construct a parse forest in cubic time and space in the worst case, while behaving nearly linearly on grammars of real programming languages. Our general parser combinators are based on earlier work on memoized Continuation-Passing Style (CPS) recognizers. First, we extend this work to achieve recognition in cubic time. Second, we extend the resulting cubic CPS recognizers to parsers that construct a binarized Shared Packed Parse Forest (SPPF). Our general parser combinators bring the best of both worlds: the flexibility and extensibility of conventional parser combinators and the expressivity and performance guarantees of general parsing algorithms. We used the approach presented in this paper as the basis for Meerkat, a general parser combinator library for Scala.
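To make the idea of a memoized CPS recognizer concrete, the following is a minimal Python sketch. It is not the paper's cubic algorithm and not Meerkat's API; the names `terminal`, `seq`, `alt`, `memo`, and `recognize` are hypothetical helpers chosen for illustration. It only shows how memoizing both results and continuations per input position lets a left-recursive rule terminate.

```python
# Illustrative sketch only: a basic memoized CPS recognizer.
# All names here are assumptions, not taken from the paper or from Meerkat.

def terminal(t):
    """Recognize the literal string t at position i."""
    def rec(s, i, k):
        if s.startswith(t, i):
            k(i + len(t))          # pass the end position to the continuation
    return rec

def seq(*ps):
    """Recognize ps one after another, threading the position through."""
    def rec(s, i, k):
        def step(idx, pos):
            if idx == len(ps):
                k(pos)
            else:
                ps[idx](s, pos, lambda j: step(idx + 1, j))
        step(0, i)
    return rec

def alt(*ps):
    """Try every alternative; each success calls the same continuation."""
    def rec(s, i, k):
        for p in ps:
            p(s, i, k)
    return rec

def memo(thunk):
    """Memoize a (possibly recursive) recognizer.

    thunk is called lazily so that a rule can refer to itself.
    For every (input, position) the table keeps the end positions found
    so far and the continuations waiting for them.
    """
    table = {}
    def rec(s, i, k):
        entry = table.get((s, i))
        if entry is None:
            results, conts = set(), [k]
            table[(s, i)] = (results, conts)
            def on_result(j):
                if j not in results:       # report each end position once
                    results.add(j)
                    for c in list(conts):
                        c(j)
            thunk()(s, i, on_result)
        else:
            results, conts = entry
            conts.append(k)                # a later result will reach k too
            for j in list(results):
                k(j)                       # replay results already known
    return rec

def recognize(p, s):
    """Return True if p matches the whole of s."""
    ok = []
    p(s, 0, lambda j: ok.append(True) if j == len(s) else None)
    return bool(ok)

# Left-recursive rule: E ::= E "+" "a" | "a"
E = memo(lambda: alt(seq(E, terminal("+"), terminal("a")), terminal("a")))

print(recognize(E, "a+a+a"))
print(recognize(E, "a+"))
```

In this sketch the recursive call to `E` at the same position does not re-enter the rule body; it only registers a continuation, which is then fed every end position that `E` later produces at that position. A plain (unmemoized) combinator would instead recurse forever on the same rule.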
Tue 19 Jan (displayed time zone: Guadalajara, Mexico City, Monterrey)
10:30 - 12:00 | Parsing & Domain-Specific Languages I | PEPM | Room Harbor View | Chair(s): Kenichi Asai (Ochanomizu University)
10:30 | 30m | Talk | Practical, General Parser Combinators | PEPM | Anastasia Izmaylova (Centrum Wiskunde & Informatica), Ali Afroozeh (Centrum Wiskunde & Informatica), Tijs van der Storm (CWI)
11:00 | 30m | Talk | Operator Precedence for Data-Dependent Grammars | PEPM
11:30 | 30m | Talk | Everything Old Is New Again: Quoted Domain-Specific Languages | PEPM | Shayan Najd, Sam Lindley (University of Edinburgh), Josef Svenningsson (Chalmers University of Technology, Sweden), Philip Wadler (University of Edinburgh)