Date: 2014-12-19
Categories: parsing

Parsing the IMAP protocol

There was a recent discussion on the OCaml mailing list about parsing IMAP http://article.gmane.org/gmane.comp.lang.caml.inria/61731. The apparent need is for a scannerless, incremental parser.

I'm not quite clear what people mean by incremental parsing. There are probably several things that should be distinguished:

Anyway, Earley parsing can probably support all these use cases quite nicely, and E3 is already scannerless, so I thought I would try my hand at producing an IMAP protocol parser.

The IMAP protocol is defined in RFC 3501, using a variant of BNF. The variant is defined in RFC 2234. I first defined a parser for ABNF https://github.com/tomjridge/example_grammars/blob/master/src/abnf_rfc2234/abnf_grammar.p1x. With this I was able to parse the IMAP protocol directly from RFC 3501. The grammar is here, and the result of parsing this grammar is an s-expression here.

The next step is to generate parsing code that can parse this grammar and produce abstract syntax trees corresponding to the expressions in the IMAP protocol.



Related articles:

  • 2017-11-14 New OCaml parsing algorithm: tjr_simple_earley
  • 2017-09-17 Two new OCaml libraries: P0 and tjr-csv
  • 2016-02-19 Tree-structured text
  • 2016-02-09 Simple implementation of an Earley-like parsing algorithm
  • 2015-06-26 P5 scala parsing library
  • 2014-12-19 Parsing the IMAP protocol
  • 2014-12-04 Parsing examples
  • 2014-11-21 Talk on parsing at the University of Sussex
  • 2014-09-26 P1 combinator parsing library for OCaml
  • 2014-09-26 E3 earley parser library for OCaml
  • 2014-09-18 SLE 2014 conference, and Parsing at SLE workshop, slides
  • 2014-09-07 ICFP 2014, OCaml workshop, slides and video
  • 2014-07-11 P3 paper accepted for SLE 2014
  • 2014-04-15 New release of P3 code on github
  • 2014-03-02 New release of P3 code on github
  • 2013-12-16 New release of P3 code on github
  • 2013-12-03 Implementing algorithms efficiently
  • 2013-11-08 Talk on parsing and P3 given at Cambridge
  • 2011-12-01 Verified parsing