Basics of context-free grammars

Simple arithmetic expressions:

E -> E "+" E 
E -> ?num?

E -> E "+" E | ?num?

Terminology: terminal, nonterminal, symbol, rule, grammar

N.B. RHS of rules cannot be the empty list

Basics of parse trees

Each terminal is associated with a set of strings

Parse trees are trees which are well-formed according to the grammar, and where the strings associated with terminal nodes are those associated to the terminal at that node

Parsing

Given a grammar, and an input string, parsing is the process of determining (somehow, and in some form) what parse trees correspond to the input.

There may be exponentially many parse trees:

E -> E E | "1"

There may be infinitely many parse trees:

E -> E | ""

Combinator parsing

Represent rules for a nonterminal:

E -> "(" E "+" E ")" 
  | "1"

By a recursive function:

(* following code ignores results and returns a dummy value *)
let rec parse_E i = (
  ((a "(") **> parse_E **> (a "+") **> parse_E **> (a ")"))
  ||| (a "1"))
  i

Given an input string s, a parser returns a list of things of the form (v,s'), where v is the result of parsing a prefix of s, and s' is the remainder of the input that wasn't parsed.

This is prefix parsing.

Combinator parsing, terminal parsers

e.g. following used for (a "1") or more generally (a s)

let a s = (fun i -> 
  if starts_with i s then
    let (u,v) = split i (String.length s) in
    [(u,v)]
  else 
    [])

let _ = (a "1") "123"
(* [("1", "23")] *)

let _ = (a "1") "abc"
(* [] *)

Combinator parsing, combinators

The combinators ||| and **> are defined in the obvious way:

let (|||) p1 p2 = fun s -> List.append (p1 s) (p2 s)

let ( **> ) p1 p2 = (fun s0 ->
  (* apply p2 to results (e1,s1) from p1 *)
  let f (e1,s1) =
    p2 s1
    |> List.map (fun (e2,s2) -> ((e1,e2),s2))
  in
  s0
  |> p1
  |> List.map f
  |> List.concat)

Here, x |> f is just f x

Note these combinators are written infix e.g. p1 **> p2

let _ = ((a "1") **> (a "2")) "123"
(* [(("1", "2"), "3")] *)

Combinator parsing, combinators

There is another function which we need, the "action" function:

let (>>) p f = (fun s ->
  s
  |> p
  |> List.map (fun (e,s) -> (f e, s)))

Note this is written infix e.g. p >> f

let _ = ((a "1") >> (fun _ -> 1)) 
(* "123" = [(1, "23")] *)

With the following result

(* E -> "(" E "+" E ")" *)
let rec parse_E i = 
  let f1 = 
    ((a "(") **> parse_E **> (a "+") **> parse_E **> (a ")") 
      >> (fun (_,(e1,(_,(e2,_)))) -> e1+e2))
    ||| ((a "1") >> (fun _ -> 1))
  in
  f1 i

let _ = parse_E "(1+(1+1))"
(* # - : (int * string) list = [(3, "")] *)

The glory of combinator parsing

Arguably the best programmer interface to parsing: full power of host functional language available (parameterization, modules, types, interactive use and debugging etc)

Some problems with combinator parsing

Cannot handle all grammars; e.g. no left recursion: the following leads to non-termination (or, more realistically, stack overflow)

E -> E "+" E | ...

Inefficient; essentially because there is no attempt to reuse the results of parses of the same part of the input (memoization doesn't really help)

E -> F G | F H

and also because prefix parsing can lead to lots of work which is eventually thrown away

E -> F "X"
F -> A B

In this example, assuming there is no "X" in the input, there may be substantial work involved in combining the results of parsing an A followed by parsing a B, before the parser attempts (and fails) to parse an "X".

Summary of problems

Combinator parsing has a great programmer interface, but lousy performance, and can't handle all context-free grammars.

The goal

Consider a combinator parsing library with the following features:

Handles all context-free grammars
Sound
Complete
Conceptually simple (also, helps if implementation is simple)
Efficient
Real-world performance that matches or improves on the current-best parsers for any given subclass of context-free grammar

For context-free grammar parsing, this set of features, I would argue, is the best you could hope for.

Motivating combinator parsing

In several recent publications academics have been trying to construct modular parsers, which can be easily combined in different ways.

Example: given a parser for lambda calculus, and a parser for arithmetic expressions, how to construct a parser for the combined language? e.g. parser accepts:

((\ y. y + 1) 2) + 3

In the following few slides: a modular presentation of parsers for 3 different languages, and their combination.

Modular parsing: arithmetic

let rec parse_A h = (fun i -> mkntparser "arithexp" (

  (* plus expressions *)
  (((parse_A h) ***> w ***> (a "+") ***> w ***> (parse_A h))
   >>>> (fun (e1,(_,(_,(_,e2)))) -> `Plus(e1,e2)))

  |||| ...

  (* helper parser *)
  |||| h)
  i)

Note boilerplate.

The parser w parses whitespace.

The parser h is a "helper parser".

Note that this example includes left recursion: you can't do this with traditional combinator parsing.

Modular parsing: arithmetic

And we can add in subtraction, numbers and brackets in the obvious way.

let rec parse_A h = (fun i -> mkntparser "arithexp" (
  ...
  (* subtraction *)
  |||| (((parse_A h) ***> w ***> (a "-") ***> w ***> (parse_A h))
        >>>> (fun (e1,(_,(_,(_,e2)))) -> `Minus(e1,e2)))

  (* numbers *)
  |||| (parse_num 
        >>>> (fun s -> `Num(int_of_string (content s))))

  (* brackets *)
  |||| (((a "(") ***> w ***> (parse_A h) ***> w ***> (a ")")) 
        >>>> (fun (_,(_,(e,(_,_)))) -> e))

  |||| ...)
  i)

Parsing arithmetic expressions

  let txt = "1+2-3"

  let _ = (p3_run_parser (parse_A never) txt)
  (* [`Minus (`Plus (`Num 1, `Num 2), `Num 3);
  `Plus (`Num 1, `Minus (`Num 2, `Num 3))] *)

never, the helper parser in this example, is a parser that never parses.

Note multiple results - the grammar is ambiguous as well as left-recursive.

Interesting because it goes beyond traditional combinator parsing, but there is more!

Parsing "boolean" expressions

let rec parse_B h = (fun i -> mkntparser "boolexp" (

  (* true and false *)
  ((a "true") >>>> (fun _ -> `Bool true))
  |||| ((a "false") >>>> (fun _ -> `Bool false))

  (* if-then-else expressions *)
  |||| (((a "if") ***> w ***> (parse_B h) ***> w ***> 
            (a "then") ***> w ***> (parse_B h) ***> w ***> 
            (a "else") ***> w ***> (parse_B h))
        >>>> (fun (_,(_,(b,(_,(_,(_,(e1,...))))))) -> `Ite(b,e1,e2)))

  (* numeric comparison < *)
  |||| (((parse_B h) ***> w ***> (a "<") ***> w ***> (parse_B h)) 
        >>>> (fun (e1,(_,(_,(_,e2)))) -> `Lt(e1,e2)))

  |||| h)
  i)

Again, note left recursion.

This is just an example; you might choose e.g. to include < in the arithmetic parser

Parsing lambda expressions

let rec parse_L h = (fun i -> mkntparser "lambdaexp" (

  (* lambda: \ x body *)
  (((a "\\") ***> w ***> parse_var ***> w ***> (parse_L h))  
   >>>> (fun (_,(_,(x,(_,body)))) -> `Lam(x,body)))

  (* application: u v *)
  |||| (((parse_L h) ***> w ***> (parse_L h)) 
        >>>> (fun (e1,(_,e2)) -> `App(e1,e2)))

  (* variables: x *)
  |||| (parse_var >>>> (fun s -> `Var s))  (* var *)

  (* brackets: (e) *)
  |||| (((a "(") ***> w ***> (parse_L h) ***> w ***> (a ")")) 
        >>>> (fun (_,(_,(e,(_,_)))) -> e))

  |||| h)
  i)

this is a very succinct and direct parser for lambda expressions
there are no tricks: the grammar exactly matches the datatype

Magic

We want a parser that can handle the language which includes all features of the previous 3 languages.

Magic

We want a parser that can handle the language which includes all features of the previous 3 languages.

  let parse_U = (
    let rec l i = parse_L h i 
    and a i = parse_A h i 
    and b i = parse_B h i
    and h i = (??? what goes here???) i
    in
    l)

What is the missing definition of h?

Hint: really only one possible answer for h

Magic

We want a parser that can handle the language which includes all features of the previous 3 languages.

  let parse_U = (
    let rec l i = parse_L h i 
    and a i = parse_A h i 
    and b i = parse_B h i
    and h i = (l |||| a |||| b) i
    in
    l)

Note the power of combinator parsing.

Note the power of "completeness" - all results are returned; so in particular all parsers return; so we don't need to worry about nontermination.

Note that the parsers l, a, b and h all parse the same language, indeed, they are the same parsers.

Memoization

The version of parse_U on the previous slide is fine, but it can be made faster using memoization.

Memoization: memo f i is like f i, but saves the result and reuses it to avoid recomputation.

In the following, the memoization function memo_p3 takes an explicit table argument, in which to store the results of computations.

  let parse_memo_U () = (
    let tbl = MyHashtbl.create 100 in
    let rec l i = memo_p3 tbl (parse_L h) i 
    and a i = memo_p3 tbl (parse_A h) i 
    and b i = memo_p3 tbl (parse_B h) i
    and h i = memo_p3 tbl (l |||| a |||| b) i
    in
    l)

Note the power of combinator parsing.

Example

let txt = "\\ x if (1+((\\ y y) 2)) < 4 then true else false"

let _ = (p3_run_parser parse_U txt)

(* 
[`Lam
   ("x",
    `Ite
      (`Lt (`Plus (`Num 1, `App (`Lam ("y", `Var "y"), `Num 2)), `Num 4),
       `Bool true, `Bool false))]
*)

Note nested recursion between grammars (lambda includes arith includes lambda etc)

Modular evaluation

let rec eval_A h x = (match x with

| `Plus(e1,e2) -> (match (eval_A h e1,eval_A h e2) with
    | (`Num x, `Num y) -> `Num(x+y)
    | (x,y) -> `Err("eval_A 1",`Plus(x,y)))

  | `Minus(e1,e2) -> (match (eval_A h e1,eval_A h e2) with
    | (`Num x, `Num y) -> `Num(x-y)
    | (x,y) -> `Err("eval_A 2",`Minus(x,y)))

  | `Num x -> (`Num x)

  | _ -> h x)

Note we plan to evaluate e.g. "x+1", but this evaluator doesn't (and shouldn't) know anything about x, envs etc

Modular evaluation, lambda expressions

let rec eval_L h env x = (match x with

  | `App(e1,e2) -> (
    let v1 = eval_L h env e1 in
    match v1 with
    | `Clos(env1,`Lam((x:string),e1)) -> (
      let env' = fupdate env1 (x,`Clos(env,e2)) in
      eval_L h env' e1)
    | _ -> `Err("eval_L 2",`App(v1,e2)))

  | `Lam(x,e) -> (`Clos(env,`Lam(x,e)))

  | `Var(x) -> (eval_L h env (env x))

  | `Clos(env,e) -> (eval_L h env e) 

  | _ -> h env x)

Note need for an env.

Note how env is passed to the helper function: the evaluator for arithmetic expressions e.g. x + 2 needs to know the value of x (but eval_A doesn't know anything about an env!)

The point is: the helper for eval_A needs to know the env, not eval_A itself

Combining the evaluators

  let eval = 
    let rec l env x = eval_L a env x
    and a env x = eval_A (b env) x
    and b env x = eval_B (eval_err env) x
    and eval_err env x = (match x with
      | `Err x -> `Err x
      | _ -> (l env x))
    in
    l

Putting it all together

  let parse_and_eval txt = (
    p3_run_parser 
      ((parse_memo_U ()) >>>> eval empty_env)
      txt)
      
  (* y is a lambda calculus fixpoint combinator *)
  let y = "(\\ f ((\\ x (f (x x))) (\\ x (f (x x)))))"
  (* sigma n = n + n-1 + ... + 1 *)
  let sigma = "(\\ g (\\ x (if (x < 2) then 1 else (x+(g (x-1))))))"
  let txt = "("^y^" "^sigma^") 5"
  let [r] = parse_and_eval txt
  (* `Num 15 *)

Motivating soundness and completeness

It's about understanding what the behaviour is...

let rec parse_MANY p = (fun i -> (mkntparser ... (
  (((a "")) >>>> (fun _ -> []))
  |||| ((p ***> (parse_MANY p)) >>>> (fun (x,xs) -> x::xs))))
  i)

let parse_LIST bra elt sep ket = (
  let sep_elt = (sep ***> elt) >>>> (fun (_,e) -> e) in
  ((bra ***> ket) >>>> (fun _ -> []))
  |||| ((bra ***> elt ***> parse_MANY(sep_elt) ***> ket) 
    >>>> (fun (_,(e,(es,_))) -> e::es)))

let _ = p3_run_parser (parse_LIST (a "[") (a "x") (a ";") (a "]")) "[x;x;x]"

let _ = p3_run_parser (parse_LIST (a "") (a "x") (a "") (a "")) "xxx"
(* i.e. ... = parse_MANY (a "x") *)

Support for equational reasoning about parsers: (a "") ***> p =~ p etc

What do I mean by efficient?

Distinguish between parsing time, and time to execute the actions
Efficient, practical parsers execute in time $O(n^3)$ (where n is the length of the input)
Real-world performance also needs to be acceptable

For every polynomial-time algorithm you have, there is an exponential algorithm that I would rather run. (Perlis)

Outline

Semantics of parsing
Key ideas, and a bit of implementation code
Some measurements
The state of play
Conclusion

Semantics of parsing

What does sound and complete mean for a parser?

Soundness

Soundness means: every parse tree returned is well-formed according to the grammar.

Completeness

Completeness (almost) means: every parse tree that is well-formed according to the grammar is returned.
Problem: there may be an infinite number of parse trees

e.g.

E -> E E E | "1" | ""

 E         E              E           
 |       / | \          / | \  
"1"     E  E  E       ""  E  ""
        |  |  |         / | \            
       "" "1" ""       E  E  E           
                       |  |  | 
                      "" "1" ""

A reasonable definition of completeness

What causes the number of parse trees to be infinite?

         E
       / | \
      .  .  .
     .   .   .
    .    .    .
   .     .     .
  .      .      .
 /       |       \
 ""..."" E  ""...""

In general you need: a nonterminal (E say) and a sequence of productions

E ->+ "" ... "" E "" ... ""

Bad and good parse trees

Bad parse trees are such as those in the previous slide
Good parse trees are those that do not contain a bad subtree
There are only a finite (but typically very large) number of good parse trees
They embody all parsing information
It is possible to motivate good parse trees in several other ways, but that is beyond the scope of this talk

Completeness

Completeness means: every good parse tree is returned.

Fixing combinator parsing to handle all context-free grammars

If we restrict to good parse trees, it is possible to construct combinator parsers that are sound and complete (CPP'11).
At the implementation level we use a parsing context. Parsing contexts eliminate bad trees.
A parsing context is a set of pairs (X,s), where X is a nonterminal and s is a string
The meaning of the context is something like "A parse of nonterminal X on string s is already in process"
If you attempt to parse the same nonterminal on the same string, you abandon the parse.
Somewhat more complicated than this because combinator parsing is prefix parsing
For details, see CPP'11 paper

Semantics of parsing, summary

Sound, complete, mechanized proof in HOL4
Executable, usable as a reference parser
Mechanization interesting for other reasons
- wellformedness conditions on terminal parsers (lexing is a form of disambiguation)
- what is the maximum depth of a good parse tree? $|NT|*(n+1)$ ☜ link
My current view on "semantics of parsing" ☜
Efficiency: parsing worst case $O(n^5)$, but it often achieves this worst case ☹ ☜

Time to think again

We can handle context-free grammars ☺
Takes a long time ☹
Thinks... thinks...
Conclusion: difficult to preserve simplicity of combinator parsing and also achieve good performance (but sophisticated implementation techniques such as trampolining may get there - see GLL)

Time to think again

Revisit goals, weaken...
Abandon combinator parsing as a parsing technique, but retain as the programmer interface
- the user's conceptual model must remain simple
Use any other technique to do the actual parsing
- our implementation is based on Earley parsing; the implementation can be horribly complicated providing no details intrude on the user's mental model
The problem now is to reconcile an interface based on combinator parsing with an existing technique such as Earley parsing
Another problem is to implement Earley parsing correctly and efficiently (I won't discuss this further in this talk)

Two different worlds

Combinator parsing is well integrated with the host language

let rec parse_E = (fun i -> (mkntparser "E" (
  ((parse_E ***> parse_E ***> parse_E) >>>> (fun (x,(y,z)) -> x+y+z))
  |||| ((a "1") >>>> (fun _ -> 1))
  |||| ((a "") >>>> (fun _ -> 0))))
  i)

Earley parsing uses concrete datastructures

let g = [
  ("E",[NT "E";NT "E";NT "E"]);
  ("E",[TM "1"]);
  ("E",[TM "eps"])]

Picture (SD) of proposed approach ☜
If the programmer writes code such as parse_E there is no obvious way to get a grammar such as g

Idea, forced

Somehow interpret the combinators in different ways
In one way, we can write the function grammar_of_parser such that

grammar_of_parser parse_E = [
  ("E",[NT "E";NT "E";NT "E"]);
  ("E",[TM "1"]);
  ("E",[TM "eps"])]

In another way, parse_E appears to behave as a standard combinator parser

Solution

The input type is no longer a string, but a (3-way) sum type

type ('a,'b,'c) sum3 = Inl of 'a | Inm of 'b | Inr of 'c

Think of 3 components: left, middle and right
A parser is a function with the following type

type input = (inl,inm,inr) sum3
type 'a output = (outl,outm,'a outr) sum3

Given an argument of type inl, it gives a result of type outl, etc

Left component

type inl = unit
type outl = symbol

type input = (inl,inm,inr) sum3
type 'a output = (outl,outm,'a outr) sum3
type 'a parser3 = (input -> 'a output)

Given an argument Inl(), a parser returns a value of form Inl(sym)

parse_E (Inl()) = Inl(NT("E"))

Define the following function

let sym_of_parser p = (dest_inl (p (Inl ())))

Middle component

type mid = { 
  rules7: parse_rule list; 
  tmparsers7: (term,raw_parser) fmap 
}
type inm = mid
type outm = mid

type input = (inl,inm,inr) sum3
type 'a output = (outl,outm,'a outr) sum3
type 'a parser3 = (input -> 'a output)

Given an argument of type Inm mid, a parser (such as parse_E) returns a value of form Inm mid', where mid' is mid extended with all grammar rules "reachable" from the relevant symbol (such as E). Must be careful about termination.

Also collects all the terminal parsers. Example...

Middle component

let grammar_of_parser p = (dest_inm (p (Inm empty_mid)))

Then

grammar_of_parser parse_E = {
  rules7=[("(E*E)", Seq (NT "E", NT "E")); 
    ("(E*(E*E))", Seq (NT "E", NT "(E*E)"));
    ("((E*(E*E))+1)", Alt (NT "(E*(E*E))", TM "1"));
    ("(((E*(E*E))+1)+eps)",Alt(NT "((E*(E*E))+1)",TM "eps"));
    ("E", Atom (NT "(((E*(E*E))+1)+eps)"))];
  ... 
}

Earley parsing

We can extract the grammar from a parser.

We feed this to an Earley parser, and we get back information about the parse.

This information needs to be in a compact form (e.g. not parse trees!). I use an oracle.

Oracle

A top-down parser:

The oracle answers the following problem:

Given a rule $S \rightarrow A B$ in $\Gamma$ and a substring $s_{i,j}$, what are the indexes $k$ such that $\Gamma \vdash A \rightarrow^* s_{i,k}$ and $\Gamma \vdash B \rightarrow^* S_{k,j}$ ?

In code:

type ty_oracle = (symbol * symbol) -> substring -> int list

A substring is $s_{i,j}$, or SS(s,i,j) in code

Right component

type inr = {
  ss4: substring;
  lc4: local_context;
  oracle4: ty_oracle
}
type 'a outr = 'a list

type input = (inl,inm,inr) sum3
type 'a output = (outl,outm,'a outr) sum3
type 'a parser3 = (input -> 'a output)

The right component corresponds to the usual combinator parser. The ss4 field is the input string, lc4 is the parsing context (to eliminate bad trees) and oracle4 is the parsing information returned by the Earley parser.

Definition of combinators

What happens when executing (p1 ***> p2) (Inr i0)?

let seqr p1 p2 = (fun i0 -> 
  (* get symbol for each parser *)
  let sym1 = sym_of_parser p1 in
  let sym2 = sym_of_parser p2 in
  let SS(s,i,j) = i0.ss4 in
  (* call oracle on substring SS(s,i,j), get back list of k s *)
  let ks = i0.oracle4 (sym1,sym2) i0.ss4 in
  (* for each k we do the following *)
  let f1 k = (
    (* call p1 on SS(s,i,k) ... *)
    let rs1 = dest_inr (p1 (Inr { i0 with ss4=(SS(s,i,k)) })) in
    (* ... and p2 on SS(s,k,j) .. *)
    let rs2 = dest_inr (p2 (Inr { i0 with ss4=(SS(s,k,j)) })) in
    (* and combine the results *)
    list_product rs1 rs2)
  in
  List.concat (List.map f1 ks))

What's missing from these slides

Complete definitions of combinators
Code for the Earley parser (surprisingly difficult to get right)

Experiments (on version 2013-05-15)

Various grammars, various inputs

We want to compare the performance of my approach (P3) with existing approaches; none of the existing approaches have a combinator interface
First problem: most parsers cannot actually parse these grammars; if they can, they typically bug out with very small inputs
Haskell Happy parser is the only one I found that allowed me to run a respectable number of tests

Performance comparison with Happy

On grammars aho_s and E_EEE, P3 clearly outperforms Happy, e.g. for aho_s

There is (probably) a performance bug in Happy
On brackets, Happy bugged out. This is (probably) a functional correctness bug in Happy.

Performance comparison with Happy

On S_xSx and aho_sml, P3 is worse than Happy by a small (2-4) factor. Following is S_xSx:

However, Earley appears to be asymptotically better than Happy on these grammars, and so with big enough inputs, Earley would win; unfortunately Happy ran out of memory before this could be evidenced (probably another memory usage/gc bug)

Performance

There is also additional overhead after parsing: constructing the oracle and executing the actions
With the released version, constructing the oracle can take about as long as parsing
Executing the actions appears to behave as it should (i.e. best possible performance that could be expected)
The released version does not attain $O(n^3)$ bound (but Happy is even worse)
A new implementation of Earley combines parsing with oracle construction, cuts execution time by a significant factor, and appears to fit the expected bound (actually $O(n^3.ln(n))$).

State of play

The initial release is usable. No bugs found so far. Lacks features that would make it really usable (e.g. debugging features)
The new version can work with arbitrary input types, not just strings. So you can use a lexer.
The new version fixes a minor annoyance with memoization.
The new version has significant performance improvements.
The new version has some support for debugging.
I use this technology all the time e.g. to port Netsem spec to Lem

Conclusion

Combinator parsing library; fancy examples
Good performance, competitive with best parsers currently around; oracle architecture means we can swap in faster parsers when they become available; we can even analyse the grammar and choose the fastest parser for the given grammar class
New version (to be released before Christmas hopefully) has even better performance
- I also plan to start using other backends for restricted grammar classes
Needs a bit more engineering e.g. to support debugging
Main challenges: good trees; oracle architecture; 3 interpretations of combinators; Earley implementation; correctness and efficiency; tying everything together
I should verify the implementation of Earley
I should test performance on large grammars from restricted classes, against existing parsers for these subclasses.

Simple, efficient, sound-and-complete combinator parsing for all context-free grammars, using an oracle.

Prelude: quick intro to parsing and combinator parsing

Quick question

Basics of context-free grammars

Basics of parse trees

Parsing

Combinator parsing

Combinator parsing, terminal parsers

Combinator parsing, combinators

Combinator parsing, combinators

With the following result

The glory of combinator parsing

Some problems with combinator parsing

Summary of problems

The goal

Sneak peak

Motivating combinator parsing

Modular parsing: arithmetic

Modular parsing: arithmetic

Parsing arithmetic expressions

Parsing "boolean" expressions

Parsing lambda expressions

Magic

Magic

Magic

Memoization

Example

Modular evaluation

Modular evaluation, lambda expressions

Combining the evaluators

Putting it all together

Motivating soundness and completeness

What do I mean by efficient?

The talk proper

Outline

Semantics of parsing

Soundness

Completeness

A reasonable definition of completeness

Bad and good parse trees

Completeness

Fixing combinator parsing to handle all context-free grammars

Semantics of parsing, summary

A new approach

Time to think again

Time to think again

Two different worlds

Idea, forced

Solution

Left component

Middle component

Middle component

Earley parsing

Oracle

Right component

Definition of combinators

What's missing from these slides

Experiments (on version 2013-05-15)

Performance comparison with Happy

Performance comparison with Happy

Performance

State of play

Conclusion