I learned to use lex and yacc long ago, painfully and perhaps poorly, partly from snippets which may or may not have been good examples and partly from reading the generated scanning and parsing code. I imagine that my experience was typical. The old metaphor about how sausage is made is apt. And although I have since used other parser generators much more than I ever used bison, bison was the right tool for some recent projects. So I decided to take a look at John Levine's updated book, which is written in the helpfully assuring voice you'd expect, demonstrating experience through well-chosen examples and seasoned with interesting historical notes.
* Levine makes no apologies for the tools. Where there be dragons, he offers practical advice on slaying them or -- better -- walking around them. (It seems like there was a Monty Python sketch about this, or should have been.)
* Generally well written, but it could have used more editing to smooth out some convoluted bits.
* More than the usual number of errors, mostly minor and all readily fixable, in the examples.
* There is apparently no code available for download. This may not be a bad thing. I found myself thinking more than usual as I typed, which means I needed to go through the exercise. You're not going to learn how to parse and scan in the real world just by reading about it.
* The exercises are few but highly relevant. You should at least read them even if you have no time to do them.
* The index is good, but you probably won't need it.
Suggestions for Reading Order:
In general, each chapter is expected to be read as a whole, from beginning to end. Internal back-references within chapters are frequent and effectively demand linear reading and comprehension. This seems to be a deliberate design choice, to avoid the need for more repetition than is absolutely essential. [Insert awful joke about left recursion here.]
The chapters, read as wholes, need not be read in ordinal order. This order worked well for me:
[1,] 2, 3, 7, 8
What You Get
Chapter 1 is the usual get-our-feet-wet chapter. If you don't like what you see, you're done.
Chapters 2 and 3 demonstrate how to apply flex and bison, respectively. Chapter 2 includes simple actions in the scanner, so that the examples *do* something. Old hands will recognize this; new students will see their patience rewarded in Chapter 3 where all but the simplest actions are placed in the parser, relieving the scanner of ill-suited tasks.
Chapter 4 presents a workable parser for SQL. It's an interesting choice and can be skipped if you're pressed for time.
Chapters 5 and 6 are useful references for flex and bison. These are not the typically disappointing rehash of man pages or online documentation. The information is clear, concise and full of essential observations and advice. You can skip these chapters but you will return to them. These two chapters are pretty much the manuals I wish I had when I started using lex and yacc.
Chapter 7 is one of the most coherent *practical* discussions of ambiguity and conflicts I've seen. It's about as good as possible without building the state machines, and Levine leverages the reports generated by bison to avoid discussion of first and follow sets. Instead Levine provides an informal and pragmatic pattern for reasoning about shift/reduce conflicts reported by bison, including the classic if/then/else anti-pattern. This is concise coaching which will help you anticipate and resolve common problems. Levine gently encourages the reader to think carefully about whether ambiguities may be a sign of bad language design and not just conflicts to be resolved by ad hoc rules.
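For readers who have not met the if/then/else problem, here is a minimal sketch of my own (not taken from the book) showing what is at stake. The input `if a then if b then x else y` has two possible readings: the `else` can belong to the inner `if` or to the outer one. Bison reports this as a shift/reduce conflict and, by default, resolves it by shifting, which attaches the `else` to the nearest `if`. The hand-written parser below makes the same choice. The function name `parse_stmt` and the tuple-shaped tree are illustrative assumptions, not anything bison generates.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement at tokens[pos]; return (tree, next_pos).

    Hypothetical toy grammar (an assumption for this sketch):
        stmt : 'if' NAME 'then' stmt
             | 'if' NAME 'then' stmt 'else' stmt
             | NAME
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] == "if":
        if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
            raise SyntaxError("expected 'then' after condition")
        cond = tokens[pos + 1]
        then_branch, pos = parse_stmt(tokens, pos + 3)
        else_branch = None
        # Greedy choice: if an 'else' comes next, the innermost open 'if'
        # takes it. This mirrors preferring shift over reduce.
        if pos < len(tokens) and tokens[pos] == "else":
            else_branch, pos = parse_stmt(tokens, pos + 1)
        return ("if", cond, then_branch, else_branch), pos
    # Anything else is treated as an atomic statement.
    return tokens[pos], pos + 1


tree, _ = parse_stmt("if a then if b then x else y".split())
print(tree)
```

The resulting tree puts `y` inside the inner `if`, leaving the outer `if` with no `else` branch. Whether that is what the language's users expect is exactly the design question Levine asks you to think about.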
Chapter 8 provides a brief but informative discussion of reporting and recovering from errors. If your language is ever going to have users, you'll have to deal with errors. Error-reporting is easy. Useful error-reporting is a little harder. And error-recovery is non-trivial for all but trivial languages. Levine provides some practical examples which serve to underscore the virtue of failing fast over flailing hard.
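To make the reporting-versus-recovery distinction concrete, here is a small sketch of my own, not from the book, of the simplest recovery strategy: on a bad statement, report once, discard tokens up to the next `;`, and carry on. The toy statement form `NAME = NUMBER ;` and the function name `check_statements` are assumptions made up for illustration.

```python
def check_statements(tokens):
    """Check statements of the assumed form  NAME = NUMBER ;  and collect errors.

    On a malformed statement, record one error and skip ahead to the
    next ';' so that later statements are still checked.
    """
    errors = []
    pos = 0
    while pos < len(tokens):
        stmt = tokens[pos:pos + 4]
        ok = (
            len(stmt) == 4
            and stmt[0].isidentifier()
            and stmt[1] == "="
            and stmt[2].isdigit()
            and stmt[3] == ";"
        )
        if ok:
            pos += 4
            continue
        errors.append(f"bad statement starting at token {pos}: {tokens[pos]!r}")
        # Resynchronize: discard tokens up to and including the next ';'.
        while pos < len(tokens) and tokens[pos] != ";":
            pos += 1
        pos += 1
    return errors


for message in check_statements("x = 1 ; y = = 2 ; z = 3 ; 4 = w ;".split()):
    print(message)
```

Even in this toy, the trade-off is visible: resynchronizing lets one run surface both bad statements, but a poorly chosen synchronization point can just as easily produce a cascade of misleading follow-on errors, which is why failing fast is often the saner choice.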
Chapter 9 collects some recent developments in flex and bison, some of which may be suitable for production. If you just need to get a parser going quickly and reliably in C for a simple language, or you can call C libraries, then flex and bison are your buddies. Levine provides a frank and fair assessment of the current (ca. 2009) support for implementing C++ parsers and scanners. Along with pure parsers and GLR parsing, these are advanced topics in the sense that their implementations in bison/flex are less mature than the traditional LALR parsers implemented in C. Levine offers much practical advice which might be summed up as, "I could, but should I?" If the answer for your application is 'Yes, I need to cut against the historical grain of bison/flex,' then you'll do well to become familiar with some of the newer parser-generating kits which were created for that purpose. But if the significance of the question is not obvious, then this book will help you break it down into smaller parts and begin to answer it.
Overall, this is a must-read for anyone trying to make effective use of flex and bison. You could do it the hard way, but your job is going to be hard enough anyway. So you might as well get a good start from someone who knows the stuff inside and out. This book does that.