The Definitive ANTLR 4 Reference – You, too, can be a parsing guru

The Definitive ANTLR 4 Reference
Terence Parr
(Pragmatic Bookshelf – paperback)

The self-described “maniac” behind ANTLR — “ANother Tool for Language Recognition” — is at it again. Terence Parr has rewritten ANTLR “from scratch” and celebrated by bringing out a new edition of his book, The Definitive ANTLR 4 Reference.

Parr, a professor of computer science and graduate program director at the University of San Francisco, says his book is “specifically targeted at any programmer interested in learning how to build data readers, language interpreters, and translators. This book is about how to build things with ANTLR specifically, of course, but you’ll learn a lot about lexers and parsers in general. Beginners and experts alike will need this book to use ANTLR 4 effectively. To get your head around the advanced topics in Part III, you’ll need some experience with ANTLR by working through the earlier chapters.”

Also: “Readers should know Java to get the most out of the book.” (Java 1.6 or later is required.)

According to Parr: “ANTLR v4 is a powerful parser generator that you can use to read, process, execute, or translate structured text or binary files. It’s widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with more than 2 billion queries a day. The languages for Hive and Pig and the data warehouse and analysis systems for Hadoop all use ANTLR. Lex Machina uses ANTLR for information extraction from legal documents. Oracle uses ANTLR within the SQL Developer IDE and its migration tools. The NetBeans IDE parses C++ with ANTLR. The HQL language in the Hibernate object-relational mapping framework is built with ANTLR.”

So ANTLR is already at work in many large, high-profile systems. But it can also be used for smaller projects.

Notes Parr: “…you can build all sorts of useful tools such as configuration file readers, legacy code converters, wiki markup renderers, and JSON parsers. I’ve built little tools for creating object-relational database mappings, describing 3D visualizations, and injecting profiling code into Java source code, and I’ve even done a simple DNA pattern matching example for a lecture.”

Parr’s 305-page, 15-chapter book is divided into four major parts:

  1. Introducing ANTLR and Computer Languages
  2. Developing Language Applications with ANTLR Grammars
  3. Advanced Topics
  4. ANTLR Reference

This latest version of ANTLR “has some important new capabilities that reduce the learning curve and make developing grammars and language applications much easier. The most important new feature,” Parr adds, “is that ANTLR v4 gladly accepts every grammar you give it (with one exception regarding indirect left recursion….)”

To properly understand that exception and how it must be dealt with, you will need to read “Dealing with Precedence, Left Recursion, and Associativity” in Chapter 5.

This is not a book for programming beginners. But Terence Parr is a good writer who injects both clarity and occasional humor into his descriptions. And he provides numerous code examples and illustrations to help guide you along the way to becoming a parsing guru and mastering ANTLR v4.

Si Dunn