Fog Creek Software
Discussion Board




XML Languages as political compromise

I've been thinking about this off and on since I read Ken Arnold's critique of Ant:

  http://www.artima.com/weblogs/viewpost.jsp?thread=7435

(I haven't looked at NAnt, but I'd bet that his comments apply to it as well).

Ken asserts that XML is a poor choice for Ant's configuration language, and that a scripting language would be better. I think he makes a good case for it.

Ah, but *which* language? Should we:

1. Create yet another language and with it the burdens of getting it implemented correctly and having everyone learn it

2. Use the major host language (Java for Java projects, C# for .NET projects, etc).  Python, Perl and Ruby developers may say yes, but it seems impractical for Java, C# and C++.

3. Use an established scripting language. But then we've added another language and toolset on to the project -- and what about all the people who dislike *that* language?

It seems to me that while 3 may be the best choice technically, XML was chosen exactly to avoid bringing in another toolset.

XML: When you can't decide on another language. (*)

* Please don't tell me how great XML is (or isn't!) for data. I'm specifically addressing the situations when it's shoehorned into serving as a pseudo-language.

Xtremely Mucked-up Lad
Friday, October 17, 2003

Xtremely Mucked-up Lad, are you some kind of super-hero? :-P

Jimmy Olsen
Friday, October 17, 2003


Xtremely Mucked-up Lad was kicked out of the Justice League of America for exactly this sort of thing.

The Green Lantern
Friday, October 17, 2003

Xtremely Mucked-up Lad is a registered trademark of Marvel Comics.

Stan Lee
Friday, October 17, 2003

If the configuration subsystem just relies on receiving a set of key-value pairs, then XML is a solid language to choose.  If you jump over to XML.com, Mark Pilgrim has a nice article covering the Atom API. I'm not drawing parallels between config files and APIs, but rather between using structs versus using XML.

A well constructed XML file, with a solid DTD, can allow one to generate a lot of necessary configuration information in an orderly place, while allowing for vendor extensibility (through namespaces of course, don't muddy the base namespace).  It seems like XML is a handy config file, a bit verbose for some uses, but that verbosity is what allows one to know that they didn't accidentally delete an important key-value pair when they were editing.

And I'd love to have a config file I could run against a validator to make sure it will actually run.  I've hosed enough config files to see the utility in that.
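
As a rough illustration of that kind of check, here is a minimal Python sketch. It is only a stand-in for real DTD validation: the standard library's ElementTree checks well-formedness but does not validate against a DTD, so the required keys are a hand-written list. The `<config>`/`<property>` layout and the key names `src` and `build` are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical keys that this imaginary config format requires.
REQUIRED_KEYS = {"src", "build"}


def check_config(xml_text):
    """Return a list of problems; an empty list means the config passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The file is not even well-formed XML.
        return ["not well-formed: %s" % err]
    # Collect the name attribute of every <property> element.
    found = {p.get("name") for p in root.iter("property")}
    missing = sorted(REQUIRED_KEYS - found)
    return ["missing property: %s" % name for name in missing]


GOOD = ('<config><property name="src" value="src"/>'
        '<property name="build" value="out"/></config>')
# Someone deleted the "build" pair while editing.
MISSING = '<config><property name="src" value="src"/></config>'
# Someone dropped the "/" that closes the <property> element.
BROKEN = '<config><property name="src" value="src"></config>'

for label, text in (("good", GOOD), ("missing", MISSING), ("broken", BROKEN)):
    print(label, check_config(text))
```

The point is only that both kinds of editing accident (a deleted pair, a mangled tag) are caught before anything tries to run with the file.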

Lou
Friday, October 17, 2003

It strikes me the problem with Ant isn't the use of XML; rather, it's the lack of a decent editor for the config.

Tony Edgecombe
Friday, October 17, 2003

The value in using XML over some other scripting language is two-fold:

1. It's more likely that people from a variety of language backgrounds will all either know, or be able to get, XML.

2. XML parsers are written, tested, beaten the hell out of. Language parsers are a lot more complex, and certainly one step removed from the Ant coders' goal of getting a nice portable build system.

I've seen XML abused pretty badly. Ant build files aren't it.

Brad Wilson (dotnetguy.techieswithcats.com)
Friday, October 17, 2003

This one is easy, it should just support the Bean Scripting Framework[1], providing a choice of scripting language.

[1] http://www-124.ibm.com/developerworks/projects/bsf

Ged Byrne
Friday, October 17, 2003

"If the configuration subsystem just relies on receiving a set of key-value pairs, then XML is a solid language to choose."

I would say that a Windows INI file or a Java properties file would be an even more solid choice for this situation. The advantage XML offers over these formats is that it supports nesting.

John Topley (www.johntopley.com)
Friday, October 17, 2003

XML is a syntax. A language includes semantics.

So Ant actually does define a new, albeit limited, language for building software. Building software often requires real code in the build process, so Ant will probably be Turing complete in the future (if it isn't there yet; I haven't followed development for the last year or so).

SCons uses Python as the host language. A simple build script borrows Python's syntax but uses SCons semantics. Unless you really need code within your build script, then, compared to Ant, you're just replacing one syntax with another.

e.g., instead of
<target name="compile" depends="init">
    <javac srcdir="${src}" destdir="${build}"/>
</target>

You write:
Java(source='$src', target='$target')

I don't think anyone can argue that one syntax is superior to the other.

However, because SCons really is embedded in Python, you can write:

# prime() is assumed to be defined elsewhere in the build script
for number in range(1000):
    if prime(number):
        Java(source='$src',
             target='$target' + str(number),
             JAVACFLAGS='-DPRIME=' + str(number))

Whereas in Ant you'd have to write a tasklet for that (assuming Ant is not yet Turing complete).

Try Scons; You'll probably be pleasantly surprised.
[ http://scons.sourceforge.net ]

Ori Berger
Friday, October 17, 2003

XML is great and I won't hear a bad word against it. One of my favourite uses of it (excluding interfaces) is as an internal data structure for programs that do complicated state manipulation.

Tim H
Friday, October 17, 2003

> Building software often requires real code in the build process

Thank you! That is *exactly* the insight that's needed for the problem.

Many XML files do not need loops or conditionals; instead whatever code is generating the XML has the loops.

Build scripts are different, since they are generally the *first* code that is executed; if some other code generates the build script, then *it* becomes the build script/file.

You could do this with Ant -- break out your barf bags now! -- by having an Ant template and using XSLT to generate the "real" build script.
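
To make the "generator has the loops" idea concrete, here is a minimal sketch that uses Python instead of XSLT to emit an Ant-style build file. The module names, directory layout, and the `make_build_file` helper are all hypothetical; only the general shape of `<project>`, `<target>` and `<javac>` is taken from Ant.

```python
import xml.etree.ElementTree as ET


def make_build_file(modules):
    """Generate Ant-style build XML with one compile target per module."""
    project = ET.Element("project", name="generated", default="all")
    for module in modules:
        # The loop lives here, in the generator, not in the XML itself.
        target = ET.SubElement(project, "target", name="compile-" + module)
        ET.SubElement(target, "javac",
                      srcdir="src/" + module,
                      destdir="build/" + module)
    # An umbrella target that depends on every generated compile target.
    ET.SubElement(project, "target", name="all",
                  depends=",".join("compile-" + m for m in modules))
    return ET.tostring(project, encoding="unicode")


# Hypothetical module names, purely for illustration.
xml_text = make_build_file(["core", "ui"])
print(xml_text)
```

Of course, once you do this, the Python script is the thing you actually maintain, which is exactly the point made above: the generator becomes the build script.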

Builds may not need the whole OO panoply of inheritance, polymorphism, etc. but they *do* generally need loops and conditions and what amount to variables and functions -- in short a (possibly limited) language.

> XML is great and I won't hear a bad word against it.

If your mind is already made up, why bother posting?

> an internal data structure for programs that do complicated state manipulation

Why not use domain objects in your programming language instead? Seems a lot cleaner to me.

> The value in using XML over some other scripting language is two-fold

Most of your objection here just seems to echo the title of the thread.

And it seems to me that your arguments only apply to new or niche languages. If you use Python or Ruby, you're getting a fairly well known language (admittedly less well known than XML), that has been thoroughly documented (books, Web sites, newsgroups, sample code, etc) and rigorously tested.

XML is definitely simpler, Python and Ruby are considerably more powerful, and I think the needs are somewhere in the middle.

> Xtremely Mucked-up Lad was kicked out of the Justice League of America for exactly this sort of thing.

No, I've always been a Dark Horse!

You say any more and I'm gonna start telling folks where you *keep* that Green Lantern when it's not in use.

Xtremely Mucked-up Lad
Friday, October 17, 2003

I still think the biggest advantage is getting things out of structs that don't belong in structs.  Next is the advantage of having a parsable config file that's assured of well-formedness and has a DTD to conform to.  Lastly is the advantage of namespaces for vendors or anyone else who wants to extend the config file.

From a user perspective it might not be the easiest format, but it's not overly difficult either (if done well).

Lou
Friday, October 17, 2003

> I still think the biggest advantage is getting things out of structs that don't belong in structs.

I thoroughly agree with you for config files, but I maintain that Ant scripts are not merely configs.

Xtremely Mucked-up Lad
Friday, October 17, 2003

"> XML is great and I won't hear a bad word against it.

If your mind is already made up, why bother posting?"

Oh no Xtremely Mucked-up Lad!  You've just fallen prey to your mortal arch-enemy, the Troll!

Jim Rankin
Friday, October 17, 2003

I thought the whole point of a build file (or a Make file for that matter) was that it was declarative, saying what to build, not necessarily how.

I would think that if you want a full featured language for build scripts, you might be better served looking at something like Prolog instead of regular imperative languages.

Chris Tavares
Friday, October 17, 2003

> that it was declarative, saying what to build, not necessarily how.

Not sure if I agree.... you typically *do* say how (compile these files, run these XSLT transformations, etc).

But even if so, "saying what" generally involves conditionals, variables and loops at the very least.

>you might be better served looking at something like Prolog instead of regular imperative languages.

Maybe from a problem domain standpoint you're right.

It does seem like mainstream programmers understand imperative languages well, and non-standard approaches (XSLT, functional languages, etc) tend to confuse them.

Xtremely Mucked-up Lad
Friday, October 17, 2003

"e.g., instead of
<target name="compile" depends="init">
    <javac srcdir="${src}" destdir="${build}"/>
</target>

You write:
Java(source='$src', target='$target')"

This example is bogus. The Ant lines declare a target and its dependencies. The Java call does nothing of that sort, it just replaces the middle XML element of the Ant example.

How do you declare a target and dependencies in scons? How much longer would that make your example?

Chris Nahr
Friday, October 17, 2003

NAnt supports inline C# (or presumably other .NET languages) if you need to do some short programmatic task. I would assume Ant does the same.

And yes, make files should be mostly declarations, configurations, etc. The 'how' part gets segregated off somewhere else, though it might still be largely written in the make file language.

mb
Friday, October 17, 2003

Ged,

"It should support BSF".

It does - at least via the optional.jar package + Rhino. You can use the <script> task to write in (at least) JavaScript.

Using JavaScript seems a lot nicer than XML and I have used the <script> task in places where XML usage didn't seem obvious. The major problem with <script> is that it seems to evaluate things at the end of the <target> it is contained within.

Generally, XML is a poor choice for a language I think. Arnold sums it up well. XML is nice for declaring structure, but doesn't feel right as a programming language: too verbose, things like if/then/else seem clumsy, reusing objects (e.g. via id and refid attributes) is similarly clumsy.

For a "human readable" format it's less readable than a scripting or programming language. In the case of Ant (for Java) I think JavaScript and/or Java should be the "scripting" language, as opposed to Python/Jython, because they should be familiar to those using Ant (well, I'd hope Java is familiar to Java developers, although I have seen some weird things in my time :)).

Walter Rumsby
Friday, October 17, 2003

Chris Nahr, the Java() call does _exactly_ that. It's declarative, not imperative: it says "There's a target, which is actually a Java class, and its sources come from here".

I forgot the "depends=init", which would be declared by
Depends('$target', 'init')

Or alternatively, in the same line

Depends(Java('$target', '$src'), 'init')

Even though SCons borrows Python's syntax, and Python is imperative, the semantics are actually declarative - the "Java" or "Depends" function call establishes a target or dependency in SCons' internal database, much like semantically parsing the Ant script would.

Scons doesn't actually try to construct anything until the target and dependency database has been built and processed.
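
A toy model of that "declare first, build later" behaviour, in plain Python. This is not how SCons is actually implemented; the `BuildGraph` class and its `build_order` method are invented for illustration, and only the names `Java` and `Depends` echo the calls above.

```python
class BuildGraph:
    """Toy registry: calls only record facts, nothing is built."""

    def __init__(self):
        self.actions = {}  # target name -> description of how to build it
        self.deps = {}     # target name -> set of prerequisite names

    def Java(self, target, source):
        # Record the target and its action; do not compile anything yet.
        self.actions[target] = "javac %s -> %s" % (source, target)
        self.deps.setdefault(target, set())
        return target

    def Depends(self, target, dependency):
        # Record an extra edge in the dependency database.
        self.deps.setdefault(target, set()).add(dependency)
        self.deps.setdefault(dependency, set())
        return target

    def build_order(self, goal):
        """Walk the recorded graph depth-first, prerequisites first.

        Cycles are not reported; a node already seen is simply skipped.
        """
        order, seen = [], set()

        def visit(name):
            if name in seen:
                return
            seen.add(name)
            for dep in sorted(self.deps.get(name, ())):
                visit(dep)
            order.append(name)

        visit(goal)
        return order


graph = BuildGraph()
# Same shape as Depends(Java(...), 'init') above; names are made up.
graph.Depends(graph.Java("classes", "src"), "init")
order = graph.build_order("classes")
print(order)
```

The script runs top to bottom like any imperative program, but all it does is fill in a table; the order of work is decided afterwards from that table.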

Ori Berger
Friday, October 17, 2003

Thanks for the explanation, but I think after adding the Depends call you're still missing another feature of the Ant example: the target was a symbolic name which was different from the actual target directory of the build. In your example, they are both the same. I guess that distinction would require another statement in SCons?

Chris Nahr
Saturday, October 18, 2003

You can do it several ways in SCons.

You can use the Alias declaration to give another name to an existing target; that would require another statement. Or you can just assign the output of a declarative function call to a variable and use that value any way you like. The following three examples are essentially equivalent in how they affect the database:

Java(x,y)
Depends(x,z)

Depends(Java(x,y),z)

c = Java(x,y)
Depends(c, z)

The third example, though, lets you refer to the Java target through the variable name 'c' from that point on. Calling "Alias('c', x)" will introduce a global alias which is also usable from the command line, e.g., "scons c" will create just this one (the variable assignment is "local" to your script).

There isn't a perfect one to one mapping between Ant tasks and projects and SCons targets. You do things a bit differently. I'm neither an Ant/NAnt expert, nor a SCons expert, but the impression I get is that SCons is a little easier for basic use, and a LOT easier to extend.

This may be skewed by my 10-year intensive Python background versus my 3-year sparse Java background - but following FAQs on both Ant and SCons, I think it's generally true.

Ori Berger
Saturday, October 18, 2003
