Fog Creek Software
Discussion Board


I am interested in learning the theory necessary to make the task of writing language processors (e.g. XML processing) easier. I don't want a book on SAX, I want a book that will teach me timeless stuff. What if I wanted to write a program that had its own scripting language (such as CityDesk's scripting language or a word processing program that lets the user define macros), or something of that nature?

I was thinking of getting a book on compilers but I am not interested in the code generation part of compilation, which a significant amount of any text on compilation will be devoted to. I was thinking of getting a book on programming languages (such as ) but I fear it won't be relevant to what I am doing.

Where should I look?

I have been looking around (both in my bookshelf and on the Internet). Any assistance is much appreciated.

Warren Henning
Friday, April 5, 2002


You might want to look at "Constructing Language Processors for Little Languages", by Randy M. Kaplan.

The book is about 8 years old, and doesn't go into the crazy depths of compiler construction like the dragon book (no expression folding, optimizations, etc). But it should give you a good introduction.

Be warned, though: all of the examples are written in C! The book also relies on lex and yacc, which can be overkill for something like Joel's CityScript, which is closer to a preprocessor than a full-blown compiler.


James Wann
Friday, April 5, 2002

I concur.  I had to write a processor for a proprietary interpreted language a few years ago, and the Kaplan book was very helpful.

Chris Dunford
Friday, April 5, 2002

Take a look at YABasic, which is an open source Basic Interpreter.

It uses Flex and Bison for the parsing, which saves a lot of the grunt work.

Ged Byrne
Friday, April 5, 2002

It might be worth looking at "Compiler Construction" by Niklaus Wirth, which gives a good, although sometimes very terse, introduction to recursive descent parsers, and is nice and short (< 200 pages). The book uses Oberon, a successor to Pascal, and develops a compiler for a mini-language called Oberon 0.
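
For anyone who hasn't seen the technique, here is a minimal sketch of a recursive descent parser. It is written in Python rather than Oberon, and the toy arithmetic grammar and every name in it (tokenize, Parser, evaluate) are my own assumptions for illustration, not anything taken from Wirth's book. The idea is simply one method per grammar rule, with each method calling the methods for the rules it refers to.

```python
import re

# Hypothetical toy grammar (not Oberon-0), in EBNF:
#   expr   = term   { ("+" | "-") term } .
#   term   = factor { ("*" | "/") factor } .
#   factor = NUMBER | "(" expr ")" .


def tokenize(text):
    """Split the input into integer literals and single-character symbols."""
    return re.findall(r"\d+|\S", text)


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        """Return the current token, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        """Consume and return the current token."""
        tok = self.peek()
        self.pos += 1
        return tok

    def expect(self, tok):
        """Consume the given token or report a syntax error."""
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {self.peek()!r}")
        self.advance()

    def expr(self):
        # expr = term { ("+" | "-") term }
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term = factor { ("*" | "/") factor }
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        # factor = NUMBER | "(" expr ")"
        tok = self.peek()
        if tok is not None and tok.isdecimal():
            self.advance()
            return int(tok)
        if tok == "(":
            self.advance()
            value = self.expr()
            self.expect(")")
            return value
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing input."""
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at {parser.peek()!r}")
    return value
```

Calling evaluate("(1 + 2) * 3") walks expr, term and factor in turn. Operator precedence falls out of the way the rules nest, which is why this style is easy to write by hand for a small scripting language.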

Andrew Simmons
Saturday, April 6, 2002

Only partially on topic - have a look at Lemon [ ]. It is very similar to Yacc, except that it's done (IMHO) better. Whereas Yacc is a "framework" (along the lines of MFC) that bends your code to the Yacc way, Lemon tries to be a "library" that doesn't interfere with how your code is organized (like FLTK). It doesn't fully succeed in doing that, but it's easier (IMHO) to write a robust Lemon parser than a robust Yacc parser.

Plus, it has only slightly different, yet more complete, syntax and semantics, which make it both more readable and better checked at compile time.

(I'm not affiliated with Lemon in any way - I just like it and its SQLite brother a lot)

Ori Berger
Saturday, April 6, 2002

Have a look at the programmar tool. It has useful examples and documentation.

The guys there are _very_ helpful.

James Ladd
Sunday, April 7, 2002

We recently used Clarion to develop a contained logic system. It's not a parser, nor a scripting language, but it produced what looks like a programmable scripting logic system. Around this programmable logic system we created an application framework, then packaged the system as a DLL, and now it can be called from any Windows app.

The point is this: what is a language? If a system allows data of some kind to be declared, and information to be processed with logic statements, is it a language?

Well, most users of this logic system call it a language.

If we add syntax parsing next, it will look like a real language. Object emulation today provides a lot of power on ever more powerful systems.

Stephen Ryan
Tuesday, March 25, 2003
