Fog Creek Software
Discussion Board

Lexer/Parser Tools for Visual Studio 6

Does anyone know of lexer/parser generation tools like JLex & JCUP that output C++ programs for Windows?

Thursday, April 10, 2003

Here's a commercial tool:

Visual Parse

Here's an excellent freeware one:


ANTLR is written in Java, and generates Java, C++, or C# code.

Chris Tavares
Thursday, April 10, 2003

Or you can go to Boost and try the Spirit parser library.

Although it is kinda limited on VS6.  It will most likely rule on VS7.1.

flamebait sr.
Thursday, April 10, 2003

What's wrong with good old Lex/Yacc (or Flex/Bison, which is the same thing)? They do work on Windows, you know; VC6 is a perfectly good C implementation.

(There's Lemon, which I think is better designed and easier to write robust parsers with, but I can argue the reasons only with someone who actually has experience writing parsers -- it's not self-evident or easy to see.)

The best places to look for _any_ development tool are [ ] and [ ]. You'll find Flex, Bison, Lemon and many others there. Both sites are Unix oriented, but ~95% of the useful stuff also works relatively well on Windows.

Ori Berger
Friday, April 11, 2003

Thanks for the help.

Friday, April 11, 2003

The original poster specifically asked for C++ parser generators; Lex & Yacc spit out C code.

I have issues with the code that these tools (lex & yacc) generate; it's very difficult to write a yacc parser that doesn't have memory leaks, for example. And the scads of global variables etc. used by a yacc parser mean that it's difficult to integrate into other apps, or to put two parsers into one application.

I happen to like ANTLR as a free tool because:

1) ANTLR lets you do the lexer and parser in the same grammar file. You don't need two tools anymore.

2) ANTLR generates a recursive-descent parser, which means you at least have a chance of reading the generated code. Yacc parsers are... less than clear.

3) ANTLR handles arbitrary lookahead in grammars; Yacc is limited to one token of lookahead. So it's easier to define your grammar.

4) ANTLR generates real OO code in the language I use (C++ and C# in my case). So it's much easier for me to use these parsers.
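To make point 2 concrete, here is a minimal hand-written sketch of what recursive-descent code looks like for a small arithmetic grammar. It is not ANTLR output; the class name `CalcParser` and the grammar are made up for illustration, and it uses modern C++ rather than anything VC6-specific. The thing to notice is that each grammar rule becomes one member function, so the code reads like the grammar.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical hand-written parser (not generated by any tool) for:
//   expr   : term (('+' | '-') term)* ;
//   term   : factor (('*' | '/') factor)* ;
//   factor : NUMBER | '(' expr ')' ;
class CalcParser {
public:
    explicit CalcParser(const std::string& text) : text_(text), pos_(0) {}

    // Parse the whole input and return its value; throws on a syntax error.
    double parse() {
        double value = expr();
        skipSpaces();
        if (pos_ != text_.size()) throw std::runtime_error("trailing input");
        return value;
    }

private:
    // One function per grammar rule.
    double expr() {
        double value = term();
        for (;;) {
            if (match('+')) value += term();
            else if (match('-')) value -= term();
            else return value;
        }
    }

    double term() {
        double value = factor();
        for (;;) {
            if (match('*')) value *= factor();
            else if (match('/')) value /= factor();
            else return value;
        }
    }

    double factor() {
        if (match('(')) {
            double value = expr();
            if (!match(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        return number();
    }

    // The "lexer" part: an unsigned integer literal.
    double number() {
        skipSpaces();
        std::size_t start = pos_;
        while (pos_ < text_.size() &&
               std::isdigit(static_cast<unsigned char>(text_[pos_]))) ++pos_;
        if (start == pos_) throw std::runtime_error("expected number");
        return std::stod(text_.substr(start, pos_ - start));
    }

    // Consume c if it is the next non-space character.
    bool match(char c) {
        skipSpaces();
        if (pos_ < text_.size() && text_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skipSpaces() {
        while (pos_ < text_.size() &&
               std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
    }

    std::string text_;   // all parser state lives in the object
    std::size_t pos_;
};
```

Calling `CalcParser("1 + 2 * 3").parse()` walks expr, then term, then factor, in the same order you would read the grammar, which is why this style is comparatively easy to step through in a debugger.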

I guess it's just a matter of taste in the end. Lemon looks like a good suggestion if you need a C parser; it explicitly handles some of the complaints I have about yacc.

Chris Tavares
Friday, April 11, 2003

I have used both Visual Parse and ANTLR.  Here are my impressions:

Visual Parse is a slick tool with a fancy GUI.  The GUI lets you debug grammar ambiguities (it uses an LALR(1) engine, just like Yacc).  It would be nice if the GUI was accurate; as it is, I prefer the textual debug output of Bison.  Greek to some, but it told me where the problems were.  There are other annoyances too: you cannot debug a parser without stepping through the accompanying lexer, even though lexers are usually much easier to get right.  But the tool is worth it if you need something NOW.

In contrast, ANTLR is an LL(k)-based recursive-descent generator.  For those situations where LL(k) is not sufficient, ANTLR features syntactic lookaheads and semantic predicates.  It is much easier to get a grammar working with these features as compared to plain LL(1) or LALR(1).  On the downside, ANTLR suffers from under-documentation, especially in C++ output mode.  The tool is written in Java, and is primarily meant for Java. This may be an issue for cross-platform builds if you cannot get a JVM for all of the platforms you desire.  ANTLR uses C++ exceptions extensively; if your compiler does not support exceptions efficiently in the no-throw case then performance will suck rocks.  Overall, if you can work around these problems, then ANTLR is definitely worth the money: it's free!

David Jones
Saturday, April 12, 2003

Global variables in lex/yacc shouldn't be a problem - they have a command line switch to prefix every symbol with a given name, to avoid collisions.

However, I must admit that I'm mystified by yacc's control flow. I once wrote a simple calculator with yacc; it worked, but I really had no idea exactly how and when each of the production handlers was being executed.
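For what it's worth, a yacc parser is a table-driven shift-reduce parser, and the action attached to a rule runs at the moment that rule is reduced, that is, once everything on its right-hand side has been recognized. The sketch below is a hand-rolled approximation of that behavior using simple operator precedence, not the LALR tables yacc actually generates. All names (`evaluate`, `ShiftReduceResult`) are hypothetical, and it assumes well-formed input made of single digits, `+` and `*`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Result of the hypothetical evaluator: the value plus a log showing the
// order in which the reductions (the "actions") fired.
struct ShiftReduceResult {
    int value;
    std::vector<std::string> log;
};

static int precedence(char op) { return op == '*' ? 2 : 1; }

// Pop "expr op expr" off the stacks and push the combined result.
// This is the point where a yacc action such as { $$ = $1 + $3; } would run.
static void reduce(std::vector<int>& values, std::vector<char>& ops,
                   std::vector<std::string>& log) {
    char op = ops.back(); ops.pop_back();
    int rhs = values.back(); values.pop_back();
    int lhs = values.back(); values.pop_back();
    values.push_back(op == '*' ? lhs * rhs : lhs + rhs);
    log.push_back(std::string("reduce expr ") + op + " expr");
}

// Assumes well-formed input: single digits separated by '+' or '*'.
ShiftReduceResult evaluate(const std::string& input) {
    std::vector<int> values;
    std::vector<char> ops;
    ShiftReduceResult result;
    for (std::size_t i = 0; i < input.size(); ++i) {
        char c = input[i];
        if (std::isdigit(static_cast<unsigned char>(c))) {
            values.push_back(c - '0');   // shift an operand
        } else if (c == '+' || c == '*') {
            // Reduce while the operator on the stack binds at least as tightly.
            while (!ops.empty() && precedence(ops.back()) >= precedence(c))
                reduce(values, ops, result.log);
            ops.push_back(c);            // shift the operator
        }
    }
    while (!ops.empty()) reduce(values, ops, result.log);   // end of input
    result.value = values.empty() ? 0 : values.back();
    return result;
}
```

For the input `1+2*3`, the multiplication is reduced before the addition even though the `+` was seen first. That bottom-up ordering is the part that tends to feel backwards compared with a recursive-descent parser, where the function for the outer rule is entered first.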

Dan Maas
Tuesday, April 15, 2003

Global variables can still be a problem, even without name collisions. Consider a multithreaded server that tries to use a yacc parser in each thread to parse message headers, for example. That would fail, because the threads would stomp on each other's parser state.
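The usual fix is to keep all parser state inside an object, so that each thread (or each message) owns its own instance. Here is a minimal sketch under that assumption; `HeaderParser` and `parseInterleaved` are hypothetical names, and the interleaving is simulated on a single thread to keep the example deterministic.

```cpp
#include <cstddef>
#include <string>

// Hypothetical push-style parser for "Name: value" header lines.
// Every piece of working state is a member, so instances never interfere.
class HeaderParser {
public:
    HeaderParser() : inValue_(false) {}

    // Consume one character of input.
    void feed(char c) {
        if (!inValue_) {
            if (c == ':') inValue_ = true;   // switch from name to value
            else name_ += c;
        } else if (c != ' ' || !value_.empty()) {
            value_ += c;                     // skip spaces right after the colon
        }
    }

    const std::string& name() const { return name_; }
    const std::string& value() const { return value_; }

private:
    bool inValue_;
    std::string name_;
    std::string value_;
};

// Feed two inputs alternately, one character at a time, roughly the way
// two threads might interleave their calls into the parser.
void parseInterleaved(const std::string& a, const std::string& b,
                      HeaderParser& pa, HeaderParser& pb) {
    std::size_t n = a.size() > b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (i < a.size()) pa.feed(a[i]);
        if (i < b.size()) pb.feed(b[i]);
    }
}
```

With two separate `HeaderParser` objects, feeding "Host: example.org" and "Content-Length: 42" in interleaved fashion still yields the right name and value for each. If `inValue_`, `name_` and `value_` were file-level globals shared by both parsers, the same interleaving would mix the two headers together, which is the failure described above.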

Chris Tavares
Tuesday, April 15, 2003
