Categories: parsing



I have been interested in various aspects of parsing since 2009. My research is particularly focused on parsing general context-free grammars, using combinator parsers. I have published several papers, formalized various proofs about parsers in the HOL4 theorem prover, and produced quite a few OCaml implementations to try out various ideas.

Verified parsing

This work produced a verified, sound and complete approach to combinator parsing. This introduced the idea of "good parse trees". The problem with this approach is that it is too inefficient with left-recursive grammars. There is a post describing verified parsing here.


These are projects which I have largely abandoned:

The downside is that the interface to P4 is more complicated than that for P3.

Older links are:

The best documentation for P3 internals is the extended version of the following paper: Tom Ridge. Simple, efficient, sound and complete combinator parsing for all context-free grammars, using an oracle. In Proceedings of SLE 2014. ridge14sle.pdf extended version.

(* _G 1 parses "1", or "1" followed by _G 2; in general, _G n parses
   the number n, optionally followed by _G (n+1); there are an infinite
   number of nonterminals (_G n) generated lazily on demand *)

let _G = ref (fun n -> failwith "")

let _ = _G := fun n -> 
  let an = a (string_of_int n) in
  let alts = lazy(alts[
    (rhs an) >> (fun s -> c s);
    (an >-- (!_G (n+1))) >> (fun (x,y) -> (c x)^y)])
  mkntparser_lazy (mk_pre_parser ()) alts

Related posts: