Date: 2018-05-22
Categories: research; programming; parsing

First Python program: an Earley parser!

Next semester we are introducing Python as the first language we teach to our students. To learn Python I read a few books, then started to write some code. My first piece of code involved translating a piece of OCaml (https://github.com/tomjridge/tjr_simple_earley) to Python.

The result is https://github.com/tomjridge/tjr_python_earley_parser

From the README.md:

This code is based on the OCaml implementation https://github.com/tomjridge/tjr_simple_earley but adapted to Python. Documentation for the algorithm itself (not this implementation in Python) can be found at that link.

The OCaml code is much faster (as should be expected).

Even so, the Python code is not slow. For example, for an input "1"*100, and a grammar E -> E E E | "1" | eps, parsing takes about .75 of a second. This is notable because most other Earley parsers tend to fall over when given a grammar like this.

This is my first piece of Python code, and I was pleasantly surprised:

TODO:


UPDATE

Inspired by imperative programming (!) I attempted to replicate the Python experience in OCaml, using a monad. The result is:

This actually is pretty readable, and close to the concision that Python allows.


Related posts: