Fog Creek Software
Discussion Board

I Still Don't "Get" What's Useful About XML

I read an XML book this weekend.  I figure if MS is going to marry its future to XML that means I have to come along for the ride, like it or not.  So I read up on the proper structure of an XML doc, and the byzantine rules of building DTDs and XSDs, and the rules for namespaces and such.

But I still don't know what you can *do* with XML that is useful.  I still don't get it.  What is XML good for?

I'd like to have my "aha!" moment sooner rather than later.

Monday, November 03, 2003

XML is useful for persisting data on a hard drive and transporting data over the Internet in a standard format.

Dave B.
Monday, November 03, 2003

One way of looking at it:

* ASCII, etc - standard way of exchanging text.
* CSV: commonly accepted way of exchanging flat files.
* XML: standard way for exchanging structured (trees/hierarchies) data.

It's really quite dull -  which might be why you *think* you're not getting it.

Duncan Smart
Monday, November 03, 2003

Have a look at:

Maybe you're not missing much:

"In fact, XML is just a notation for trees, little more than a verbose variant of Lisp S-expressions; and a way to define tree grammars, a poor-man's BNF."


"The essence of XML is this: the problem it solves is not hard, and it does not solve the problem well.'"

Monday, November 03, 2003

XML is a hyped technology.

Fat books are more expensive than skinny ones.

Publishers would rather make more money by selling fat books.

XML isn't that complicated.

Hence, almost every single book on XML is filled with at least 80% worthless fluff.

XML is quite simple.  It's a well-defined text format for storing tree structures.  It's widely supported, so you can trust that anyone you interact with will be able to read and parse it well.

It is not perfect, but it's good enough.  Sometimes it's more important to be well understood than to be perfect.

Richard Ponton
Monday, November 03, 2003

XML means mega-giga-hype.

So, you can profit from it. Not from XML, but by offering tools and libraries which profit from the giga-hype.

Good luck!

Monday, November 03, 2003

You're not missing a whole lot; XML is ridiculously overhyped and overused. 

A relational DBMS is far, far better at storing data than XML, and even the “inventor” (Tim Bray) of XML states that it is not for storing data (I wish I could find the link, but it was in a recent article about how he was using XML – “at the edge” whereas the data was stored in a DBMS).

There was a topic recently discussing what was “wrong” with IT today.  It talked a lot about programmer mentality but I didn’t see “Fad-Driven Architecture” up there at all.  The latest “cool” thing (which really is a bad “solution” to a problem no one even encountered) is now the OneAndOnlyThing you can use to accomplish a given task.  How quickly people forget the days when we didn’t have DBMS products and had to write our own database management tools, then the network and finally hierarchical models that produced cumbersome, error-prone, and inflexible tools.  XML is indeed a throwback to the hierarchical DBMS we had thrown at us in the 70s.

People say “Use the right tool for the job” when they only know tools and how to mechanically apply them.  This is why we have XML *everywhere* from the source of this page to someone’s SOAP iToaster API.

Monday, November 03, 2003

Processing XML is CPU/memory expensive. So, consider it for communication between very different systems with different data structures.

Evgeny Gesin /Javadesk/
Monday, November 03, 2003

Wow, some of you guys are *really* cynical.  ;)

"Processing XML is CPU/memory expensive. So, consider it for communication between very different systems with different data structures. "

I read all about the features of XML; I'm trying to determine what the business benefits of XML are; what business problems does it solve?

So it's a handy way of accomplishing systems integration tasks?  Like if I had a client who had an ancient order-processing system and a relatively modern accounting system that needed to be able to share data?

Monday, November 03, 2003

How can XML be so overhyped and seen as a "Silver Bullet" for every IT problem?  Every IT trade rag seems to be obsessed with XML?  I mean there are even conferences and fat books dedicated to XML.  Can you imagine an "ASCII Flat File" or "Query String" conference or a 1000 page treatise on all the uses for ascii flat files? 

It seems like as long as you have a big-name and/or big company promoting a technology, IT follows like mice following the pied piper.  I'm sure most of us in IT can spot this in popular culture & teen fads and would see how silly it is, but we do no different when it comes to over-hyped technology.  I consider IT/Developer folks to be highly intelligent but when it comes to over-hyped technology, the discernment part of the brain sure gets turned off. 

Monday, November 03, 2003

It was just suggested to me by a colleague that I may not "get it" about XML because I don't have the kind of problems that XML solves.  I suppose that's possible.

" I consider IT/Developer folks to be highly intelligent but when it comes to over-hyped technology, the discernment part of the brain sure gets turned off.  "

I have to agree with you that sometimes the community gets caught up on the "engineering cool factor", but in general, I trust the developer community to be pretty good at telling the difference between that which is useful and that which is not.

Monday, November 03, 2003

>> " I'm trying to determine what the business benefits of XML are; what business problems does it solve?"

You can persist relational database recordsets in XML format onto various media (using ADO).  This allows the user to work with the data on a device that is not connected to the database (i.e. laptop).  You can then resync that data back to the database at a later point in time.  (Albeit, whole database replication may be a better approach).

You can also, like you say, use the hierarchical XML representations of relational database data to communicate between systems (legacy or not).  Although it would seem to me that this is just another layer to go through, unless of course the system was designed from the start to transport/read data in XML and not some custom format.

You can communicate data to various other companies in a standard format so they (or you) only have to write one program to read the XML.  You don't have to convert the data to one company's format and then another's and another's (or vice versa).  This may be easier said than done, as convincing some companies to switch technologies is like pulling teeth. (I think some people are too comfortable with what they know and/or they sometimes don't have IT folks to do things so they don't want to switch to something new.)

Graphical User Interfaces.  XML is used to define GUIs or various types of metadata, configuration files etc.  There are games designed around this concept and of course web pages.

As stated previously almost anything hierarchical can be represented using XML.  It's like reinventing the wheel, and everyone wants one, but doesn't know why.

Monday, November 03, 2003

I'll disagree with my namesake, " ", here.

XML isn't really anything useful. It's a flattening out of the database concept, to make structured data storage concepts and data accessible to more people.

With XML, you don't need to understand data storage and file formats in order to access data. This occurs at the level of just inspecting the data, outside of any automated access, but also at the production level, since there are so many parsers that now do this for you.

At a business level, the concept of enforcing data structure in a text format overcomes incompetence in programmatically accessing structured data, even though it's unnecessary for capable programmers, adds 10 times more bulk and more complexity.

It's a bit like Excel meets SQL.

Monday, November 03, 2003

"but in general, I trust the developer community to be pretty good at telling the difference between that which is useful and that which is not."

Don't.  We're rarely reporting on anything other than vague ideas, gut-feelings, non-repeatable best-cases, and other less-than-scientific sources.  Looking at what the IT industry has created proves that the most popular product/idea is quite often wrong.

For examples, see: hierarchical databases.  MS Bob.  XML.  Various development models/methodologies.  This quote: “Client Servers were a tremendous mistake. And we are sorry that we sold it to you.” – Larry Ellison, Oracle

Why should you care?  Because our collective ignorance costs us dearly in terms of money, time, and effort.  Think of the effort of people selling us junk, and we eat it up not knowing that we could be eating filet mignon instead of dog food.  But we get what we deserve by not understanding and demanding better.

Some of it is that the computer science field is impossibly young.  Take a look at the field of medicine.  Germ theory isn’t all that old yet only 100 years ago people were bottling up snake oil and claiming it cured all that ails you.  (One could point at many products on the market and say the same thing, too, only the products are not as easily distinguishable from the real deal.)  One would suspect that back in the earliest days of construction man fooled around with a lot of crappy products before the field of engineering was finally nailed down.  The barbarians that sacked Rome didn’t much care for the engineering knowledge and promptly forgot about aqueducts and hygiene.

However, the other part is personal ignorance.  The ONLY thing you can do is try to be as educated and informed as you can and demand better (which unfortunately takes work, which is why many people choose not to).

As T. Schick Jr. stated:
"The educated person is not the person who can answer the questions, but the person who can question the answers.”


Monday, November 03, 2003

and [the Western World] promptly forgot about aqueducts and hygiene.

Monday, November 03, 2003

> You can communicate data to various other companies in a standard format

But programmers have always been able to do this. Dolts would get it wrong. XML does not solve a new problem here.

> XML is used to define GUIs or various types of metadata, configuration files etc.  There are games designed around this concept and of course web pages.

Again, there is no real need for this, and it doesn't work as smoothly as hucksters would have you believe.

Monday, November 03, 2003

Dear Norrick,
                    XML is great for transferring data between different systems.

                    At work we have an Oracle database. If I want to take out the data and work with it I export a report to an XML file and then import the XML file into Access 2002. Works a dream.

                    If I want to send somebody the data in print form then I export to .pdf. Perhaps we can think of xml as a kind of .pdf, a Portable data format.

                    Let's look at another use. Word 2003 saves natively in XML, it appears. This means that I can write a test for students and put the answers in, and because the questions and answers are tagged I now have two (or more) separate versions. I can just set up a template for student and teacher version and maybe another, and any tests I write in that template are automatically generated as two or three separate documents.

                    I suspect that XML is something with a lot of little uses, but somehow it's got mislabelled; to consider it as a data storage format is perverse.

Stephen Jones
Monday, November 03, 2003

"The educated person is not the person who can answer the questions, but the person who can question the answers.”

I think the only question IT is asking is : "Thank you sir, may I have another?" 

Monday, November 03, 2003

I remember when I first used XML and XSLT I thought they were wonderful.  It was just great having data in a format that I could throw around any which way.

Then I discovered Perl (and now Ruby) and discovered I could have that freedom with any text data, and even some binary.  Not so great after all.

Ged Byrne
Monday, November 03, 2003

Well, I used perl long before XML.  I was joining, splitting, and regexing all over the place.

I got things done MUCH faster than if I had tried to use XML.  But there were downsides:

1) Anyone trying to read my file formats outside of perl (or another super-text-processing language), say in C or Java, would have a hard time of it.

2) My files had absolutely zilch in the way of Unicode and non-ASCII character set support in general.

3) If I wasn't really careful, I'd get the occasional "yes, a customer actually did put '~!~' in their data, which breaks your program because you used that as a field separator" bug.

So XML isn't the be-all, end-all technology that's going to do the best job with least usage of resources.  However, it's well understood.  A lot of the inconveniences of dealing with XML come from the fact that XML has already been designed with escape sequences and character set issues in mind.
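To make the separator bug in point 3 concrete, here is a minimal Python sketch. The `~!~` separator comes from the post; the function names and the sample customer record are made up for illustration, and the XML side uses only the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

SEPARATOR = "~!~"  # the home-grown field separator from point 3


def naive_write(fields):
    # No escaping at all: the separator is trusted never to appear in the data.
    return SEPARATOR.join(fields)


def naive_read(line):
    return line.split(SEPARATOR)


def xml_write(fields):
    # ElementTree escapes '<', '>' and '&' in text when serializing.
    record = ET.Element("record")
    for value in fields:
        ET.SubElement(record, "field").text = value
    return ET.tostring(record, encoding="unicode")


def xml_read(text):
    return [field.text for field in ET.fromstring(text)]


# Hypothetical record: the first field contains the separator itself,
# the second contains characters that are special in XML.
customer = ["Acme ~!~ Sons", "<b>5 & 10</b>"]

naive_result = naive_read(naive_write(customer))  # comes back as three fields
xml_result = xml_read(xml_write(customer))        # comes back unchanged
```

The home-grown format silently splits two fields into three, while the XML round trip returns the original values because the escaping rules are part of the format rather than something each author has to remember.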

So you can
a) quickly implement a fast, simple solution that is not as robust as XML and is hard for other people to use.

b) slowly implement a fast, robust solution that is not easy for other people to use.

c) quickly implement a slow, robust XML solution that is easy for other people to use.

(where "slow" = "CPU and/or memory intensive")

Your custom solution may be fast and robust, but no matter how simple you make it, it's going to be hard for other people to use simply because it's yet another format for them to learn.  Now, you can make their work easier by providing parsers for your grammar in any language they might need, but then you've just added a whole lotta lines of code to your project, haven't you?

Richard Ponton
Monday, November 03, 2003

XML is person-readable EDI.

"> You can communicate data to various other companies in a standard format

But programmers have always been able to do this. Dolts would get it wrong. XML does not solve a new problem here."

No, *programmers* always get it wrong. Work on EDI for a while and you'll soon discover that one thing EDI isn't is a "standard" - it's a guideline, or a starting point. Like "here's something to look at while you come up with your own way of handling text documents"

We've been moving people to XML instead of EDI - for the most part they are MUCH more receptive to XML. I can glance at an XML invoice and identify errors. For larger documents, I can buy a $100 XML/XSD validator and verify it. For EDI you're spending $6k for a validator.

Machine parsing XML is 1000x easier than EDI as well. Trust me. :-)


Monday, November 03, 2003

I feel that a lot of the technologies surrounding XML are needlessly complex, despite XML itself being a simple concept.

XML is basically just matching tags. A nice simple idea that you should be able to run with.

But then people decided they needed 50 million other things to go with it, and so we got all the other technologies. Instead of making them as simple as XML is, they made them complex.

So now you can't just work with XML, because someone is going to want DTD, or XSLT, or XSD, or XPath, or DOM, or any number of other things that no two vendors quite implement the same. Each has their own little quirks.

The cynic in me says that it was intentionally made complex so that a cottage industry could spring up around it in providing tools and experts. That always seems to be the way with things in IT.

Sum Dum Gai
Monday, November 03, 2003

great for config files.
great for standardized data exchange
Also, my resume is in xml. It's easy to create different file types and formats.

Tom Vu
Monday, November 03, 2003

I didn't mean EDI. EDI is junk. My point was that XML is not and never has been necessary to define data formats and to establish communicating systems between different organisations.

Monday, November 03, 2003

Maybe this is a bit too language-specific, but XML is the bee's knees for serializing objects in VB.Net/C#.

In VB6, I had to create lengthy, tedious code to transform a CSV file into a hierarchical graph of objects.  With VB.Net, it's just two lines of code, and it's done.

Robert Jacobson
Monday, November 03, 2003

Somebody said that XML is great for config files.

In my opinion, they are cryptic and hard to edit unless you have a special editor.

.INI files are much, much better.

Monday, November 03, 2003

XML is just a textual way to represent the information structure called the InfoSet.

Another C# developer
Monday, November 03, 2003

"I didn't mean EDI. EDI is junk."

What's wrong with EDI?  In Australia, there's an actual real standard, AS????, I've forgotten the number.  It defines the layout, (but not the actual data of course), the same as XML provides.  I didn't have any problems with it.  YMMV.

My main gripe with XML is someone takes a flat file and converts it to XML.  Great, now it's twice the size as well.  Real good.

Resume padding occurs, (usually why the above happens) so what used to be a simple transfer gets complicated so someone can add "XML" to the skills part of their resume.

To find a record, you need to load and process the whole damn file.  Gah.

I always thought the idea of XML was that you could lob a complicated lump of data at a random person, and they could figure out its structure / content for themselves.  No support calls, yay!

Monday, November 03, 2003

"I didn't mean EDI. EDI is junk. My point was that XML is not and never has been necessary to define data formats and to establish communicating systems between different organisations."

You just answered your own question.
EDI is an ANSI standard - heavily documented and decades of history in installed systems. IMO there isn't an established base more established than X12.

Yet it's still royally screwed up. Part of the reason (IMHO) is that it's exceptionally hard to read and visually parse.

That's where XML comes in.


Monday, November 03, 2003

I don't think xml solves any hard problems in itself, since it's a common language in which you solve them, but it really helps out the html world.  People who must mechanically deal with html must love xml.  Xml really seemed to use html's popularity to catapult it into the mainstream, just like Javascript used Java's name despite having nothing to do with Java.  There have been arguably better attempts, such as this one from 1975:

Maybe xml really demonstrates how to serve and ride a killer app.

On the theme of html bias... xml has these weird "attributes" (you know, in <foo bar="bob">, bar="bob" is the attribute) that seem to have no point except for making html pretty.  (In html, 'data' is printed to screen while 'metadata' isn't.  But for abstract data, metadata IS data.)  The problem here is that attributes should be deep and interesting for being a Syntactical Feature, but they aren't.  BTW, a couple days ago I noticed Erik Naggum supporting this position.

Tayssir John Gabbour
Monday, November 03, 2003

BTW, I think the industry considers "market" to be one meaning of technology.  Many companies have found ways to exploit the xml market, such as Electric Minds which released a free high-quality xml parser to gain word-of-mouth.  In some ways, these things are resources that a nimble small company can ride into a market.  Big companies can churn money from customers too.  O'Reilly loves xml.  What do tech companies produce but tech, right?  So by definition, xml is a tech.

But xml is nice so people don't have to remember to include escape character systems in their homebrewed file formats... and portable tools can give locking, redundancy, etc...

Tayssir John Gabbour
Monday, November 03, 2003

The *value* of XML is the hype.

That is, it has become a lingua franca because everyone knows about it.

It has many, many problems because of the politics involved with those who created it, and their belief in perfection through complexity (or something like that), but the problems are easy enough to find out. And they are fairly common across all XML systems, because everyone wants to use it, because of the hype.

Tuesday, November 04, 2003

You can print it out and make a paper hat out of it.

Tuesday, November 04, 2003

Stephen Jones said it best (as usual): XML is a communication mechanism. That means that it should be created on the fly, sent down the wire and interpreted at the other end.

Obviously, formatting the message in this way is going to eat processor cycles, as will digesting it at the destination, so you only go to the trouble if you can't find a simpler solution. At this point, I must state that I've never had cause to use XML so I'm already on shaky ground. I'd only consider XML if I needed to communicate with a "foreign" system (different product, OS, language, schema, etc.) *and* my vendors hadn't provided me with prebuilt tools.

I see XML as a messaging format, not a file format. It's definitely *not* appropriate for config files. Imagine if you stored config files as TCP/IP packet dumps and used a socket to read the file. That would be stupid, but it's not that far removed from the current use of XML...

Paul Sharples
Tuesday, November 04, 2003

Err... Not a huge amount. As one of my fellow standards committee members put it:
"XML-It's just a .CSV file with knobs on"
There are advantages in that you don't have to write your own parsers, it's well standardised, and if you use XSLT stuff you can transform it around.
However, the "aha" moment I got was when I started to use some of the query tools on it. The whole thing hung together fairly well. Need a little lookup table? Stick it in XML and query it. End-user editable and no database required. In this case I was transforming from our local terminology in and out of standards terminology. Since our local terminology is user configurable, I needed a simple lookup table and it met that need rather well.
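A rough sketch of that kind of lookup table, in Python with the standard library's `xml.etree.ElementTree`. The element and attribute names (`terms`, `term`, `local`, `standard`) and the sample values are invented for illustration and are not the actual format described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical user-editable lookup table: local terminology <-> standard terminology.
LOOKUP_XML = """
<terms>
  <term local="widget" standard="STD-001"/>
  <term local="gadget" standard="STD-002"/>
</terms>
"""

table = ET.fromstring(LOOKUP_XML)


def to_standard(local_name):
    # ElementTree's limited XPath supports [@attribute='value'] predicates.
    match = table.find(f"term[@local='{local_name}']")
    return None if match is None else match.get("standard")


def to_local(standard_name):
    match = table.find(f"term[@standard='{standard_name}']")
    return None if match is None else match.get("local")
```

Calling `to_standard("widget")` looks the term up in one direction and `to_local("STD-002")` in the other, with no database involved. Note that this sketch builds the query by string formatting, so names containing quote characters would need extra care.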

Peter Ibbotson
Tuesday, November 04, 2003

Two valuable uses I've seen:

1. In pharmaceutical research (e.g. clinical trials of drug compounds) in the US, all trials data must be archived in a human readable format. This is just in case 50 years from now, someone needs to review the data, at the very least, a person can get to it, even if the original technology in which it was generated is long gone.  XML allows you to put data descriptors in with the data, too.

2. XML is transformable. If you store your content in XML, you can write an engine to format it to another mark up tag set or to an output format. In theory, you have one source for content.  (In practice it's pretty tricky, but doable.)
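As a small illustration of point 2, here is a Python sketch of one XML source rendered into two outputs. The `article`/`title`/`para` tag set is made up for this example, and the "engine" is just a pair of hand-written functions using the standard library rather than XSLT.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-source content.
ARTICLE_XML = "<article><title>XML</title><para>It stores trees.</para></article>"


def to_html(article):
    # Map the content tags onto a different markup tag set.
    html = ET.Element("div")
    ET.SubElement(html, "h1").text = article.findtext("title")
    for para in article.findall("para"):
        ET.SubElement(html, "p").text = para.text
    return ET.tostring(html, encoding="unicode")


def to_plain_text(article):
    # Same source, different output format.
    lines = [article.findtext("title").upper()]
    lines += [para.text for para in article.findall("para")]
    return "\n".join(lines)


article = ET.fromstring(ARTICLE_XML)
html_output = to_html(article)
text_output = to_plain_text(article)
```

Real content is far messier than this (which is the "pretty tricky" part), but the shape is the same: the source stays in one place and each output gets its own small transformation.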

Lauren B.
Tuesday, November 04, 2003

Yeah, but, archiving pharmaceutical research and changing the presentation of data is not rocket science. Really, it's not. You don't need XML to do this.

The research just needs the format to be defined and strict.

Tuesday, November 04, 2003

Thanks all for talking so clearly about this subject! I'd say XML is a versatile way of storing tree-structured data for exchange between different applications (or for internal storage if no great amount of data is involved, e.g. user documents, config information, anything).

· wasted space: sure an internal own format would be more efficient.
· wasted time: sure a .ini would be less cpu-consuming to read
· raw data is not so human readable as a txt/ini/cfg/dat file (it's strongly suggested to use a "xml notepad" application)

The *really good* thing is not about XML itself; it is that everybody out there is supporting it. So translating/importing/exporting data between applications/systems was never so *easy*. I see XML for data as being like the Esperanto language... (ok, not a good example 'cause few people speak it, but you know what I mean)

Tuesday, November 04, 2003

Good summary. I used XML in an app back in 1999. I used it to retrieve remote "articles" as defined by the DTD I created and gave to remote publishers. My spider would round up these XML documents, import them into an RDBMS, and then I'd manage the data in Perl structures.  Communications.

Also, if any of you have had to work with so-called CSV files that YOU DID NOT write then you might have a better appreciation for XML for communicating data between IT bodies. 

Tuesday, November 04, 2003

One thing I really like about XML is that the data format itself can be validated by applications that don't understand the content of the XML.

I can pass any XML document into a validator and get a line number for where the structure fails. That kind of generic validation is difficult to accomplish with a home-brewed document format (because sometimes a grammar can have subtle ambiguities).
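A minimal Python sketch of the idea, using only the standard library. It checks well-formedness only (matching tags and so on); validating against a DTD or XSD needs a separate validating tool. The function name and the sample documents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def first_structural_error(text):
    """Return (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return err.position  # (line, column) reported by the parser
    return None


# Hypothetical documents: the parser knows nothing about invoices,
# yet it can still say where the structure breaks.
good = "<invoice><line>1</line></invoice>"
bad = "<invoice>\n  <line>1</line>\n  <line>2</oops>\n</invoice>"

good_result = first_structural_error(good)  # no error
bad_result = first_structural_error(bad)    # error located on the third line
```

The checker has no idea what an invoice is, which is the point: the structure can be verified without understanding the content.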

Benji Smith
Tuesday, November 04, 2003

Agreeing with Benji, I have to admit I can't believe we're having this debate.

Given a structure, write the code to fetch all the children of a man named "Smith"

XML: Use a parser, write the XPath expression, using a cheap tool to test and validate

Not-XML: Find the developer, get the spec, which:
a) Isn't written down
b) Has mutated since its creation
Write the code to do it, test the living heck out of it with old data. Find out the first document you have to run has a new twist that nobody had provided for.

I would have thought most developers with a few years experience have had occasion to do both options above and would recognize the value of a defined data exchange framework with attendant tools.
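For what it's worth, the XML half of that comparison might look like the following Python sketch. The document layout (`people`, `man`, `child`, a `name` attribute) is invented for illustration, and it uses the XPath subset built into the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical structure; the real one would come from whoever defined the schema.
FAMILY_XML = """
<people>
  <man name="Smith">
    <child>Alice</child>
    <child>Bob</child>
  </man>
  <man name="Jones">
    <child>Carol</child>
  </man>
  <woman name="Smith">
    <child>Dave</child>
  </woman>
</people>
"""

root = ET.fromstring(FAMILY_XML)

# One path expression: every <child> directly under a <man> whose name is "Smith".
smith_children = [child.text for child in root.findall("./man[@name='Smith']/child")]
```

Here `smith_children` ends up as Alice and Bob; the Jones child and the child of the woman named Smith are left out by the path expression itself, with no hand-written parsing code.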


Tuesday, November 04, 2003

Philo, that could easily be done with other encoding techniques provided someone wrote the tools to do so.  There's nothing 'special' about XML that does this; in fact, XML happens to do this in a bloated, extra-verbose way that is wholly unnecessary for machine-to-machine transmission.

Tuesday, November 04, 2003

"wholly unnecessary for machine-to-machine transmission."

Problem is that that's a unicorn. I don't think that *any* data stream will ever be completely isolated from human eyes; most notably in the development, testing, and bug-fixing areas.

Self-describing data formats can save hours of troubleshooting, testing, and debugging. If you're not using a self-describing format, then you're going to be spending time writing tools to handle your format.

Maybe it's just me, but every time I open an EDI document, and the spec, and the validator, I long for my XML style instead. I'd be interested to watch one of you who eschew XML work with comma-delimited, fixed-length, or EDI data for prolonged periods of time. Maybe I'd learn something.

BTW, two other thoughts:
1) Nobody is making you use XML. If you think it's bloated, don't use it.
2) If you MUST use XML, then why not go with key tags? <A>data</A><B E="Attribute">More Data</B> etc...


Tuesday, November 04, 2003

The only thing special about xml is that it is standardized-- that's really it, not that that's a bad thing.

Tuesday, November 04, 2003

As we all know, computers can exchange data in any existing format and XML doesn't have any advantage over them.

The "self-description" has little to do with data exchange -- but the semantics underlying the data -- its meaning. Tags and a tree structure are neither sufficient, nor the best choices for modeling data.

To quote Fabian Pascal:
"the meaning of any logical model is conveyed -- and made accessible -- by the data model underlying it. XML's data structure is hierarchic and unless data types, integrity and manipulation are added to it (to produce a full-fledged data model), just a bunch of tags are not sufficient for self-description. But if the three components are added--which is what is happening in W3C--the result will be the same nightmares that prompted us to rid ourselves of hierarchic DBMSs years ago."

Tuesday, November 04, 2003

XML is just text with structure that makes it readable by computers ("readable" here means that you can obtain a useful structure from it, not just a bunch of bytes...)

I think people saying that XML isn't useful are just being cynical, or have never been faced with a problem where a good answer is to use XML. Obviously, XML got its good amount of hype, but hey, it also happened to Java ;-)

Leonardo Herrera
Tuesday, November 04, 2003

"I would have thought most developers with a few years experience have had occasion to do both options above and would recognize the value of a defined data exchange framework with attendant tools."

But you see, Philo, all these developers understand that reducing developer productivity is a small sacrifice to pay for saving some space on disk or in memory, or some CPU cycles parsing a file.  They know that disk, RAM, and CPU cycles are expensive, but a developer's time is cheap.

Jim Rankin
Tuesday, November 04, 2003

more truth is written here about XML than probably was in the fat book.

Tuesday, November 04, 2003

MR, time to ante up - what kinds of work have you done interchanging data with other applications at other organizations, and over what length of time?

My current work is with 200+ distributors and 60+ suppliers. Even with half of them integrated, bringing on new suppliers (even the most agreeable ones) generally requires changes to EDI document maps. We bring on about a supplier a week.

But I haven't touched the XML schema in four months.


Tuesday, November 04, 2003

Ahh... Philo sounds like you've spent too long reading those EDI documents (I understand the pain).
My only objection to XML is its verboseness, along with the hype. I kept on seeing press releases written using the word "XML" like it was a programming language (ok, XSL kinda is). Basically a lot of folks got confused by PR.

Peter Ibbotson
Wednesday, November 05, 2003

Peter - I'll be the first to agree that
XML is overhyped
Many magazine managers think their projects will fail if they don't use XML
XML may seem overly verbose

But I've been refuting the topic and several posts which seem to imply that XML has *no* use whatsoever.


Wednesday, November 05, 2003

...and yes, I've spent too much time in EDI. But having been up to my eyeballs in "standard text exchange format" that is hard to read and where the standard is observed only in the breach, I've gotta tell you that XML is a godsend.


Wednesday, November 05, 2003

XML is like SQL: it's already out there and fully developed. You can immediately do Real Work with it. There's a zillion books on it, so no need to explain a new system. It may not be the best system, but it generally beats most home-grown ones.

And like SQL, it's not going away any time soon.

Thursday, November 06, 2003
