Fog Creek Software
Discussion Board




XML

Unfortunately I'm no expert in this field. However, I sent an article about XML to an expert, and this is the result of that discussion. Could anyone enlighten me on this?

Dear Nigel,

    Thank you for sharing the CityDesk article with me. I have a couple of issues with what this gentleman says which I hope should clarify my vision.

    i) He's right in that you cannot write a good UI for a browser - when that UI is for feature rich text editing. But you can write a good UI for forms and other controlled data capture. Since we do not wish to encourage 'War and Peace' in comment fields we shouldn't have a problem here. Nevertheless, you can deploy Word and similar applications via a browser. These 'web enabled' applications share much the same UI as that of the original application. You are familiar with this concept from the use of such tools as Citrix.

    iii) He's right in that XML is not a good mechanism for storing data. Anybody who suggests it should be put to that use is clearly deranged. At the moment an RDB is the only way. However, XML is an excellent mechanism for framing the transport of data, facilitating the messaging concept within the object-oriented paradigm. It is this feature, which helps decouple the client-side software from the underlying RDB, that needs to be exploited.

    If we say that our objective is to deploy all applications over the browser, this does not preclude us from using existing applications (we web-enable them), or the Relational Database.

    What XML can give us is the ability to seamlessly integrate both front-end and back-end applications from more than one supplier. This we can do if we can create the UI and the necessary messaging from UI to server-side. It is in this scenario that XML comes into its own - it should be used to facilitate data-interchange. One advantage of this approach is the ability to develop tiers of the client-server model with a greater degree of independence from each other - lowering development costs in the long-term.

---------
My thoughts...

To me the above sounds like a whole lot of hot air, even though it all sounds very dandy. The last line gave me a good chuckle, as this is the sort of answer I get all the time: "- lowering development costs in the long-term".

Nigel Soden
Monday, November 19, 2001

I'm not exactly sure what you want us to say about his(?) response.  I'll agree that he certainly felt the need to be long-winded about it.

However, I agree with him on certain points.  XML is an excellent way to transfer data.  What I mean is, there are marshalling and parsing libraries available free for download, and XML compresses very well.  That means you don't have to spend nearly as much time working on transferring data, and your transfer doesn't take as long.

So far as I know, these are the _only_ advantages to XML.  Anyone who tells you differently is selling something.
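
As a rough illustration of those two points, here is a minimal Python sketch using only the standard library. The order records and the function names are made up for the example; the point is only that marshalling, parsing and compression all come "for free".

```python
import xml.etree.ElementTree as ET
import zlib

def to_xml(rows):
    """Marshal a list of dicts into an XML byte string."""
    root = ET.Element("orders")
    for row in rows:
        order = ET.SubElement(root, "order", id=str(row["id"]))
        ET.SubElement(order, "customer").text = row["customer"]
        ET.SubElement(order, "total").text = str(row["total"])
    return ET.tostring(root, encoding="utf-8")

def from_xml(data):
    """Parse the XML byte string back into a list of dicts."""
    rows = []
    for order in ET.fromstring(data):
        rows.append({
            "id": int(order.get("id")),
            "customer": order.findtext("customer"),
            "total": float(order.findtext("total")),
        })
    return rows

# Hypothetical sample data: 200 very similar records.
rows = [{"id": i, "customer": "Customer %d" % i, "total": i * 1.5} for i in range(200)]

raw = to_xml(rows)
packed = zlib.compress(raw)                  # what would go over the wire
restored = from_xml(zlib.decompress(packed))

print(len(raw), len(packed), restored == rows)
```

The repeated tag names are exactly the kind of redundancy a general-purpose compressor removes, which is why the packed size comes out much smaller than the raw XML.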

Bill Dunsford
Monday, November 19, 2001

<quote>
So far as I know, these are the _only_ advantages to XML. Anyone who tells you differently is selling something.
</quote>

Another advantage is that XML can easily be transformed from one dialect into another, either by XSLT or other means.

This means that if you have two applications that use XML, then getting the two to work together should be much easier than it has been in the past.
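
To make that concrete, here is a small sketch of the "other means" route in Python, since its standard library has no XSLT processor. Both dialects (`<contacts>` and `<addressBook>`) are invented for the example; in practice an XSLT stylesheet would express the same mapping declaratively.

```python
import xml.etree.ElementTree as ET

# Hypothetical dialect A, as one application might export it.
SOURCE = """<contacts>
  <person first="Ada" last="Lovelace"><mail>ada@example.org</mail></person>
  <person first="Alan" last="Turing"><mail>alan@example.org</mail></person>
</contacts>"""

def to_address_book(source_xml):
    """Rewrite dialect A into hypothetical dialect B (<addressBook>/<entry>)."""
    src = ET.fromstring(source_xml)
    dst = ET.Element("addressBook")
    for person in src.findall("person"):
        entry = ET.SubElement(dst, "entry")
        # Dialect B wants one <name> element instead of two attributes.
        full_name = "%s %s" % (person.get("first"), person.get("last"))
        ET.SubElement(entry, "name").text = full_name
        ET.SubElement(entry, "email").text = person.findtext("mail")
    return ET.tostring(dst, encoding="unicode")

print(to_address_book(SOURCE))
```

Neither application has to know about the other; only the small mapping in the middle does.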

Ged Byrne
Tuesday, November 20, 2001

XSLT is a fantastic mechanism for data transformation. When developing web sites I now use an XML base with Cocoon (http://xml.apache.org/cocoon/). It separates programming, done in Cocoon's own creation XSP (Java in XML, like JSP), from content management (XML pages that can call XSP via XML tags) and from style (XSL pages that convert to HTML, WAP, PDF, etc).

So if you're creating a web site for both the web and handhelds it's quicker for you to deploy additional versions such as handheld as you should only have to change the stylesheet if you've designed it correctly.

It also makes changes to the site quicker due to modularity.

Right now the server does most of the work but in the future the browser can transform the XML and XSL to HTML.

We can also provide just the XML to content partners for a universal data exchange standard as already discussed. The next step is for us to be able to parse the data via Flash's XML parser to generate dynamic Flash interfaces. Something that has been rather tricky up until now.

Michael Glenn
Tuesday, November 20, 2001

I've just gotta write in response to this one....
"He's right in that XML is not a good mechanism for storing data. Anybody who suggests it should be put to that use is clearly deranged. At the moment an RDB is the only way"

So... what do ya do with data that doesn't fit the relational model??? hmmm.
Let's see now, some case studies with hierarchical, structured, non-relational information.
Anyone want to try sticking an aircraft maintenance manual in an RDB, then try keeping it up to date & delivering the data in multiple formats? So, what do you do when the engines get upgraded & the information for them comes from a different supplier (who's also much bigger than you!)?

Or even better, managing all the intricate details of a tricky piece of legislation as it grows through life, maintaining the appropriate relationships (i.e. links) to not only the Bills that create and shape it, but also the Hansard that debates it, the amending legislation that affects it, subordinate legislation, proceeds of cases, legislation and cases cited, the verdict of the case, appeals, their supporting documentation & memorabilia, and all the versions, including specific versions thereof.

I'm not going to turn this into a 'this vs. that' type of slinging match, but there is great merit in keeping certain types of information in XML/SGML, just as there is for an RDB. Using an RDB can certainly help to facilitate access to that data, as too can an OODB.  Don't close your eyes to the possibilities.

It's all about the DATA baby... it's really the only thing that has value as a persistent item. 

Applications come and go, get upgraded, and stop working with data from older versions. Using XML has certainly mitigated these risks, and has opened up the playing field for using information cleverly.

Once you've got the DATA, it's a trivial & menial task to deliver to yet another app, and a RDB becomes just another app.
Your applications give your Data behaviour.

If that makes me deranged, sorry guys, I don't think so, and neither, it seems, do many others.

p.s. I'm not selling anything either *grin*

Ray
Tuesday, November 20, 2001

Before C.J. Date we already had Hierarchical and Network databases, and they were always more efficient at dealing with record-oriented data than relational tables.

Compare.

for (findfirstmember(set); !endofset(set); findnextmember(set))
    {
      /* process this member, i.e. a transaction line */
    }

with

Select field,....fieldn from table,...tablen
      Where some conditional clause parsed from either UI choices or business logic

wait an unknown interval for an unknown data set size

Then process row by row either in memory if you're lucky enough or in a temporary file if the data set is any size, etc, etc.
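
For anyone who wants to try the two styles side by side, here is a toy Python sketch. The invoice data, table name and function names are invented, and a dict plus an in-memory SQLite table only approximate a network database and a real RDB; it shows the shape of the code, not the performance difference.

```python
import sqlite3

# Navigational style: the owner record holds its members directly,
# so processing is a walk over a known set.
invoice = {"number": 1001, "lines": [("widget", 2), ("gadget", 5)]}

def walk_members(owner):
    for item, qty in owner["lines"]:
        yield item, qty

# Relational style: the same lines live in a table and are found by
# a query whose result size is not known in advance.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoice_line (invoice INTEGER, item TEXT, qty INTEGER)")
db.executemany(
    "INSERT INTO invoice_line VALUES (?, ?, ?)",
    [(invoice["number"], item, qty) for item, qty in invoice["lines"]],
)

def query_members(conn, number):
    cursor = conn.execute(
        "SELECT item, qty FROM invoice_line WHERE invoice = ? ORDER BY rowid",
        (number,),
    )
    for row in cursor:  # processed row by row as the cursor yields them
        yield row

print(list(walk_members(invoice)))
print(list(query_members(db, 1001)))
```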

Unfortunately, the relational model won for a lot of complicated reasons.  That's not to say that there isn't a benefit in using relational calculus as opposed to set theory but once you get into anything but trivial data sets you have to overlay something akin to a network arrangement using domains.  Those domains aren't implemented in the RDB directly but layered in the application code. Which is entirely the wrong place for it.

Simon Lucy
Wednesday, November 21, 2001

I'll try to write my understanding of the problem as clearly as possible ;-)  To think about the importance of XML in some context, I have found it useful to think in terms of "XML is the ASCII of the future" -- as someone (apparently clever) said.

XML is good or bad for storing or passing data in the same way that ASCII was earlier.  (If you know UNIX and its standard utilities (working with stdin and stdout), which can be combined to get quite complex behaviour via the pipe and redirection mechanisms, then you know how important ASCII is here.)

XML adds some very important things to ASCII:
1) There is no problem with character encoding and its interpretation.
2) It is possible to add the context of the information explicitly.
3) It is possible to describe the kind of document explicitly.
4) It is possible to prescribe the rules for a certain kind of document.
5) It is possible to verify whether the document follows the rules.

There is a standard describing general (i.e. application-independent) mechanisms for doing that.

To avoid some confusion:

6) XML is not directly related to web documents, nor to the web-enabled applications.

Add 1) XML is excellent for storing text documents; it may not be that good for storing database content (for performance reasons).  It is no problem to enter any conceivable character from any language, human or otherwise.  It is possible to name that language in some context tag.
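
A minimal Python sketch of point 1, assuming the standard `xml:lang` attribute is used as the "context tag" (the `<greetings>` document itself is invented for the example):

```python
import xml.etree.ElementTree as ET

# ElementTree spells the standard xml:lang attribute with its namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

samples = [
    ("cs", "Příliš žluťoučký kůň"),
    ("el", "Καλημέρα"),
    ("ja", "こんにちは"),
]

doc = ET.Element("greetings")
for lang, text in samples:
    ET.SubElement(doc, "greeting", {XML_LANG: lang}).text = text

data = ET.tostring(doc, encoding="utf-8")   # bytes, as they would be stored or sent
parsed = ET.fromstring(data)                # the reader needs no out-of-band hints

by_lang = {g.get(XML_LANG): g.text for g in parsed}
print(by_lang["el"])
```

The text comes back unchanged, and each piece still carries the name of its language.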

Add 2) In an ASCII file you could decide the meaning of a part of the text only from some "visual" features like line and page breaks, empty lines, or headings underlined using dashes or equal signs.  (Some of the context information can be decided only from the content of the document, i.e. cannot be decided by a computer.)  Or you could introduce some special, application-dependent way of marking or rendering the content.  XML defines a general mechanism to capture the content, the structure, and the meaning.  This is the reason why an XML document can be converted into some other form even using very general tools, as Ged Byrne and Michael Glenn wrote earlier.

Add 3, 4, 5) These things together make it possible to use XML for reliably passing any data that can be encoded in text form.  The output utility can generate an application-independent, widely used type of document which follows a standard or de facto standard.  The input utility can verify whether the input stream is correct with respect to the declared document type.

This way, you can do Unix-like processing of the data.  You can build general utilities (or components, if you like) that are specialized for some kind of processing.  When you have a well-designed set of such utilities, you can combine their functionality to obtain the processed information more easily (probably in more steps) than if you wrote a special utility for doing only one kind of processing (in one step).

In other words, one utility need not care which other utility will process the data next.  Someone else can decide how the utilities will be combined and which utilities will be used.  You can use _existing_ utilities to get new functionality by combining them in new ways.  In my opinion, this is one of the reasons Unix became so successful.  This is exactly what managers mean by fancy wording like "increasing the interoperability" and "lowering the cost of development".
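
The pipe idea can be sketched in a few lines of Python. Everything here (the `<catalog>` format, the stage names, the price rule) is invented for illustration. Note also that `check` is only a hand-written stand-in for real document-type validation, which ElementTree does not perform.

```python
import xml.etree.ElementTree as ET

def parse(text):
    """Read stage: turn XML text into a tree."""
    return ET.fromstring(text)

def check(root):
    """Verify one hand-written rule: every <item> has a numeric price."""
    for item in root.iter("item"):
        float(item.attrib["price"])  # raises KeyError or ValueError on bad input
    return root

def keep_cheap(limit):
    """Build a filter stage that drops items priced above the limit."""
    def stage(root):
        for item in list(root):
            if float(item.get("price")) > limit:
                root.remove(item)
        return root
    return stage

def names(root):
    """Report stage: extract the item names."""
    return [item.get("name") for item in root.iter("item")]

def pipeline(*stages):
    """Compose stages left to right, like cmd1 | cmd2 | cmd3."""
    def run(value):
        for stage in stages:
            value = stage(value)
        return value
    return run

CATALOG = '<catalog><item name="pen" price="1.20"/><item name="desk" price="150"/></catalog>'

cheap_names = pipeline(parse, check, keep_cheap(10), names)
print(cheap_names(CATALOG))
```

Each stage knows only the document format, not its neighbours, so `pipeline(parse, check, names)` is a different tool built from the same parts.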

Add 6) XML should not be confused with web-enabled applications.  Web-enabled applications will probably use XML for communication, for the reasons mentioned above.  Think in terms of "ASCII of the future".

Petr Prikryl
Wednesday, November 21, 2001


Blah.

I actually use XML to store data. So what? sue me.

My scenario is an app (similar to CityDesk) that allows the user to store pieces of information (i.e. written articles). It has some constraints, like the number of articles (well under 200), and publishing rules.

To add another article, the app just dumps the file into a predetermined directory. If the user needs to port that article to another computer, no problem. If the user needs to view the contents using some text editor, no problem. A user who feels brave can even delete articles outside of the application.
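
Roughly, the idea looks like this (sketched in Python rather than with MSXML; the `<article>` layout and the function names are made up, and a temporary directory stands in for the predetermined one):

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def save_article(folder, slug, title, body):
    """Write one article as its own XML file in the folder."""
    article = ET.Element("article")
    ET.SubElement(article, "title").text = title
    ET.SubElement(article, "body").text = body
    path = Path(folder) / (slug + ".xml")
    ET.ElementTree(article).write(str(path), encoding="utf-8", xml_declaration=True)
    return path

def load_titles(folder):
    """List the titles of whatever article files are in the folder right now."""
    return sorted(
        ET.parse(str(p)).getroot().findtext("title")
        for p in Path(folder).glob("*.xml")
    )

with tempfile.TemporaryDirectory() as folder:
    save_article(folder, "hello", "Hello", "First post.")
    draft = save_article(folder, "draft", "Draft", "Not finished.")
    draft.unlink()  # deleting a file outside the app simply removes the article
    titles = load_titles(folder)

print(titles)
```

There is no index to keep in sync: the directory listing is the database.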

Ugly? Perhaps. Old fashioned? Don't think so; Microsoft Word still uses files to store data (a Word document IS data). Appropriate? Sure it is. Developing this feature is just so easy under Windows (you only need to deploy the MSXML parser and call it from your app).

So, my $0.02: XML is useful IF you need it :D

Leonardo Herrera
Thursday, November 22, 2001

Microsoft Word does use files to store data like text, images, graphics, bookmarks, cross-references, indexes etc. But then it uses OLE structured storage to optimize access to these different elements, essentially simulating a file system within a file. This is closer to a database architecture than a sequential file structure.

shailesh kumar
Friday, November 23, 2001

I would like to react to shailesh kumar's notice (Nov. 23).

Even though the OLE structured storage format is used in MS Word, it is still one file that is a pretty sequential stream of bytes ;-)

From another point of view, XML helps you to describe arbitrarily complex structures inside your document. The only difference is that the content must be encoded as text (in comparison with the binary format of Word's DOC).

This is also the reason why XML is not perfect for implementing databases. It's more verbose (everything is text), it gives you too much flexibility, and it must be parsed for correct interpretation of the data. These are the reasons for performance degradation when you use it for implementing databases.

On the other hand, when passing data via the Internet, you probably have to convert the binary stream into plain ASCII first (to make it application and environment independent), and then usually some layer does the compression to minimize the size of the transported data.
Experience shows that XML data after compression is not much bigger than the compressed binary.  This is one of the technical reasons why XML is very acceptable for moving data over the net.

Petr Prikryl
Friday, November 23, 2001

Actually Word is an interesting case -- most of the application doesn't care about the storage format (which is as it should be). You can, in fact, store a Word doc as XML instead of native Word Binary and it keeps full fidelity. And from the UI you can't tell which version you're working with.

Mike Gunderloy
Friday, November 23, 2001


The fact that the application doesn't know what kind of storage it is using is not unusual. Storage is different from the memory representation of the object being treated (where "object" is a document, an image, a blueprint, etc.)

Another simple example of this is any graphics program.
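
In code the separation looks something like this Python sketch. The `Document` class and both storage formats are invented for the example, with JSON merely standing in for "some other native format":

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict

@dataclass
class Document:
    """In-memory representation; it knows nothing about files."""
    title: str
    paragraphs: list

# Storage format 1: XML
def to_xml(doc):
    root = ET.Element("document", title=doc.title)
    for text in doc.paragraphs:
        ET.SubElement(root, "p").text = text
    return ET.tostring(root, encoding="unicode")

def from_xml(data):
    root = ET.fromstring(data)
    return Document(root.get("title"), [p.text for p in root.findall("p")])

# Storage format 2: JSON, standing in for a different native format
def to_json(doc):
    return json.dumps(asdict(doc))

def from_json(data):
    return Document(**json.loads(data))

doc = Document("Notes", ["First paragraph.", "Second paragraph."])
same_from_xml = from_xml(to_xml(doc))
same_from_json = from_json(to_json(doc))

print(same_from_xml == doc, same_from_json == doc)
```

The rest of the application only ever sees `Document`, so it cannot tell which format the object was loaded from.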

Leonardo Herrera
Friday, November 23, 2001

A note on the claim that it is not possible to write a nice GUI within the browser.

One escape from that has been Java applets.
However, their usage has dropped off over time.

I believe because they were plagued either by poor browser VM implementation (Netscape) or by limited availability (Microsoft stopped development at a version which is now grossly outdated), with the result that they ate up transmission time and then didn't work properly.
Another issue was that applets are confined to a rectangular area in the HTML page.

The VM availability got better when the Netscape VM was cut out, the result being the Sun Java plug in.

Recently Sun created a method for breaking the Java application out of the browser.
Java Web Start (which will be part of the 1.4 Java edition):
http://java.sun.com/products/javawebstart/index.html

It lets you deploy Java clients over the web and then run the app locally on the client box. Apps are cached. There is a security model, which allows use of guarded resources (file I/O, printer, clipboard), if the user permits. Updates are easy.

The initial startup is triggered by clicking on a web page link, then the Web Start manager takes over.

While applets benefit from caching and updating as well, the possibility of having real GUI apps (windows, dialogs, ..) will give better client interfaces.

Look at the demos, to get an idea:

http://java.sun.com/products/javawebstart/demos.html 

Marc van Woerkom
Saturday, November 24, 2001
