P3 is an efficient, sound and complete combinator parsing library. The approach can be implemented in a small pure fragment of OCaml, and should be simple to port to other functional languages.
New versions of P3 are now available from github here. At the moment the documentation is on the github wiki for P3.
The best documentation for P3 internals is the extended version of the following paper: Tom Ridge. Simple, efficient, sound and complete combinator parsing for all context-free grammars, using an oracle. In Proceedings of SLE 2014. ridge14sle.pdf (extended version).
The most recent paper on P3 "Simple, efficient, sound and complete combinator parsing for all context-free grammars, using an oracle" was accepted to SLE 2014. A preprint is accessible via publications.
Just pushed a new release of P3 to github. Main changes are dramatic performance improvements for large grammars. Check out the P3 wiki for further information.
Tomorrow I will push the latest changes to P3 to github. See the news section on the P3 github wiki for changes.
The following draft paper (under submission) describes the design of the P3 combinator parsing library.
Tom Ridge. Simple, efficient, sound and complete combinator parsing for all context-free grammars, using an oracle (unpublished draft). ridge14p3_design.pdf. Old slides are here (best viewable in firefox, due to latexmathml).
The draft includes various performance measurements, particularly against the Haskell Happy parser generator. For reference, the testing infrastructure was as follows:
- The machine was
model: 58 model name: Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz stepping: 9 microcode: 0x12 cpu MHz: 1600.000 cache size: 6144 KB
- Only one core was used during testing.
- The machine has 16 GB of memory. As far as I am aware, no swapping occurred during tests.
- The machine is running xubuntu-13.10-desktop-amd64.iso.
- uname says: Linux pc1177 3.11.0-13-generic #20-Ubuntu SMP Wed Oct 23 07:38:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
- The OCaml version is 4.01.0 installed from opam.
- The Haskell ghc version is 7.6.3. The Happy version is 1.19.1
O(n^3)bound, and a functional version that hits the
O(n^3. ln n)bound.
I reimplemented the core Earley parser used by P3. The new version
of the core Earley parser,
earley3, now appears to meet the
desired asymptotic worst-case time complexity of
O(n^3. ln n). The following are some observations on implementing algorithms so that they achieve a given complexity bound.