Date: 2014-12-04
Categories: parsing
Parsing examples
My work on P1, E3, P3 and P4 was motivated in part by a desire to have simple tools that I could use in other areas of research. These tools are mostly source-to-source translators. For example, in the past I have had various ad-hoc tools to translate OCaml to LaTeX, or OCaml to HOL, HOL to Lem, Lem to OCaml etc.
The idea is to start making these tools maintainable and not one-off efforts to be used for a particular task then thrown away.
Previous versions of P1 etc have not been engineered well enough to make the resulting tools maintainable or usable. I am hoping to change that by investing more engineering effort into P1 etc.
So I have started a new repository on GitHub, https://github.com/tomjridge/example_grammars/, which contains grammars and grammar-based tools that I use.
To reduce grammar maintenance, I have added additional functionality to P1 etc:
- Actions now use positional variables (x1,x2,...) rather than requiring a fun (x,(y,...)) -> nested tuple pattern. This makes maintenance easier and also allows p1 and p4 to use the same grammar files (p1 has a right-associative sequential combinator, while p4 has a left-associative one). For example, the file https://github.com/tomjridge/example_grammars/blob/master/src/eee.p1 uses fun ... actions, whereas the file https://github.com/tomjridge/example_grammars/blob/master/src/eee2.p1x uses positional variables.
- Explicitly-named symbols in actions are supported. In the following excerpt, the variables in the second alternative are named w1, t and w2 rather than x1, x2 and so on.

    OCAMLTYPEXPR -> "'" ?ident? {{ Tex.tyvar ([x1;x2]|>ss_concat|>c) }}
      | "(" w1=?w? t=TYPEXPR w2=?w? ")" {{ "("^(Tex.w w1)^t^(Tex.w w2)^")" }}

- Multiple actions are now supported. For example, to map to tex you might have an action {tex{...}}, whilst to map to markdown, you might have an action {md{...}}. There is an example using multiple actions here: https://github.com/tomjridge/example_grammars/blob/master/src/eee3.p1x.acts
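
To illustrate why the associativity of the sequential combinator matters for the first point above, here is a minimal sketch using toy combinators. The names tok, seq and map are hypothetical and are not the P1 or P4 API; the sketch only shows that the tuple pattern in an action depends on how sequencing nests, whereas an action body written with positional variables does not.

```ocaml
(* Hypothetical toy combinators, NOT the P1/P4 API. A parser takes a list of
   tokens and returns an optional (value, remaining tokens) pair. *)
type 'a parser = string list -> ('a * string list) option

(* Parse a single token, whatever it is. *)
let tok : string parser = function
  | [] -> None
  | t :: rest -> Some (t, rest)

(* Sequencing: run p, then q on the remaining input, and pair the results. *)
let seq (p : 'a parser) (q : 'b parser) : ('a * 'b) parser = fun input ->
  match p input with
  | None -> None
  | Some (a, rest) ->
    (match q rest with
     | None -> None
     | Some (b, rest') -> Some ((a, b), rest'))

(* Apply an action to the parsed value. *)
let map (f : 'a -> 'b) (p : 'a parser) : 'b parser = fun input ->
  match p input with
  | None -> None
  | Some (a, rest) -> Some (f a, rest)

(* Right-nested sequencing: the action must match (x1,(x2,x3)). *)
let right_nested : string parser =
  map (fun (x1, (x2, x3)) -> x1 ^ x2 ^ x3) (seq tok (seq tok tok))

(* Left-nested sequencing: the same action must now match ((x1,x2),x3). *)
let left_nested : string parser =
  map (fun ((x1, x2), x3) -> x1 ^ x2 ^ x3) (seq (seq tok tok) tok)

(* With positional variables the action body is written once; only the
   wrapper that unpacks the tuple differs between the two nestings. *)
let body x1 x2 x3 = x1 ^ x2 ^ x3

let right_nested' : string parser =
  map (fun (x1, (x2, x3)) -> body x1 x2 x3) (seq tok (seq tok tok))

let left_nested' : string parser =
  map (fun ((x1, x2), x3) -> body x1 x2 x3) (seq (seq tok tok) tok)
```

In this sketch the unpacking wrapper is the only part that depends on the nesting, which is the kind of code a grammar tool could plausibly generate separately for each library while leaving the action body untouched.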
As a (somewhat largish) example, a grammar for a Lem/OCaml language is at https://github.com/tomjridge/example_grammars/blob/master/src/ocaml.cppo. Note that this can be used with p1 (for correctness/simplicity) and p4 (for speed). This is possible because p1 and p4 have essentially the same interface. At the moment there is only a single set of actions (to map to LaTeX). I am currently using this to produce typeset Lem code for inclusion in a paper.
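
Adding a second output format to such a grammar would mean attaching another tagged action to each production. The following is a rough sketch of that idea, not a description of how the grammar files are actually processed: the actions type, tyvar_actions and run_action are all hypothetical names, and the tex and md bodies are made up for illustration.

```ocaml
(* Hypothetical sketch: the names and action bodies below are illustrative
   and are NOT taken from the P1 grammar-file implementation. *)

(* The actions of one production, keyed by tag (as in {tex{...}}, {md{...}}).
   Each action receives the positional values x1, x2, ... as a list. *)
type actions = (string * (string list -> string)) list

(* Made-up actions for a production such as  TYVAR -> "'" ident  *)
let tyvar_actions : actions = [
  ("tex", fun xs -> "\\texttt{" ^ String.concat "" xs ^ "}");
  ("md",  fun xs -> "`" ^ String.concat "" xs ^ "`");
]

(* Select an action by tag and apply it to the parsed values.
   Returns None if the production has no action for that tag. *)
let run_action (tag : string) (acts : actions) (xs : string list)
  : string option =
  match List.assoc_opt tag acts with
  | Some f -> Some (f xs)
  | None -> None
```

Returning an option makes a missing tag visible to the caller, which may be useful while a second set of actions is only partly written.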