Fog Creek Software
Discussion Board




Type of parser used for ASP->PHP compiler

I'm wondering what kind of parsing technique was used for the Fogcreek ASP->PHP compiler. Was it a recursive decent parser or a simple code filter?

Matthew Lock
Saturday, December 27, 2003

One of these days I'll write about it. It's a real lexer, parser, ast, code generator, which is why I keep calling it a compiler. It is NOT just a simple filter (what someone called "put $ in front of every variable name and cross your fingers).

Joel Spolsky
Saturday, December 27, 2003

Excellent. Many so called compilers and code converters around are really just a fancy search and replace programs.

Matthew Lock
Sunday, December 28, 2003

Ahh... parsing :-)

Lex (Flex) and Yacc (Bison) rule. Creating ASTs from them for well-behaving languages is fun and simple. Dumping code from ASTs when the target language needn't be optimized too much is also fun and simple.

Joel, do write about it. It would definitely be a most interesting read.

Eli Bendersky
Sunday, December 28, 2003

Yes, I think this article will be very welcome here. The Joel on Software web-log seems to touch a lot on software management issues (which is good), so it could use some more Computer Science-oriented articles. (like Back to Basics, for example)

In any case, one should be aware that Lex and Yacc, while commonly used, are not the cutting edge of compiler design. Lex is simply a tokenizer that extract tokens based on regular expressions _without any state_.  A simple Perl script can do better than that.

As for Yacc - it is nice and all, and very useful. However, it too can be improved upon. One such implementation is the Scannerless Generalized LR parser:

http://www.cwi.nl/htbin/sen1/twiki/bin/view/SEN1/SGLR

Another is Damian Conway's Parse::RecDescent module:

http://search.cpan.org/~dconway/Parse-RecDescent-1.94/

which is very powerful and capable, but kind of slow.

One thing that should be remembered is that formal languages that are commonly used for parsing languages are not Turing complete. So, a custom parser is needed for some patterns.

Shlomi Fish
Sunday, December 28, 2003

*  Recent Topics

*  Fog Creek Home