Joel on Software

Simple XML reading question


I'm trying to do about the simplest thing I can with XML, as I'm just starting out. My document looks like this:

<?xml version="1.0" encoding="utf-8" ?>
    <TagA>Something I Want To Get</TagA>

I use the following code

System.Xml.XmlTextReader reader = new System.Xml.XmlTextReader(Server.MapPath("../XML/XML1.xml"));

(hope the formatting is OK!)

I can't figure out how to get the content inside TagA. I must be a bit blind, because I couldn't find out how from the docs.

thanks in advance


Tuesday, April 29, 2003

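You need to do something like this:

while(reader.Read())  // the reader starts before the first node, so advance to each node in turn
  if(reader.Name=="TagA" && reader.NodeType == XmlNodeType.Element)
    Response.Write("ElementString=" + reader.ReadElementString());
reader.Close();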

Consider using the DOM if the documents are not huge (it loads it all into a tree in memory):

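XmlDocument doc = new XmlDocument();
doc.Load(Server.MapPath("../XML/XML1.xml"));  // load the file before querying it

XmlNode tagA = doc.SelectSingleNode("/TagA");  // TagA is the root element of the document in the question
Console.WriteLine( tagA.InnerText );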

Duncan Smart
Tuesday, April 29, 2003

Duncan is correct.  The big twist between XmlReader and XmlDocument is this: when you read a single "node" with XmlReader, you are NOT reading an entire XML element, just the start tag! Then you need another Read call to read the contents, and then yet another to read the end tag.  Only empty XML elements are read in one go.
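
To make the node-by-node behaviour concrete, here is a minimal sketch that lists every node the reader reports. It assumes the XML is passed in as a string rather than loaded from the file in the question, and the names NodeWalk and Describe are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeWalk
{
    // Returns one "NodeType:Name:Value" entry per node the reader reports.
    // (Hypothetical helper; the XML comes from a string, not from a file.)
    public static List<string> Describe(string xml)
    {
        List<string> nodes = new List<string>();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            // Each Read() call advances by exactly one node.
            while (reader.Read())
            {
                nodes.Add(reader.NodeType + ":" + reader.Name + ":" + reader.Value);
            }
        }
        return nodes;
    }

    public static void Main()
    {
        foreach (string line in Describe("<TagA>Something I Want To Get</TagA>"))
            Console.WriteLine(line);
    }
}
```

For the element from the question this lists three nodes: an Element named TagA, a Text node holding the content, and an EndElement named TagA. An empty element such as <TagA/> shows up as a single Element node.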

Check the various properties (NodeType, IsStartElement, IsEmptyElement, ...) to see what you've got and react accordingly.  XmlReader is fast and generally easy to use, but you do have to put quite a few guard statements into your code if you aren't 100% sure of the data format.
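
As a rough sketch of what such guard statements can look like, the helper below returns an element's text only when the element has the simple shape from the question, and returns null otherwise. GuardedRead and ReadText are hypothetical names, and reading from a string instead of a file is an assumption made to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml;

public static class GuardedRead
{
    // Returns the text of the first element called elementName,
    // "" if that element is empty, or null if it is missing or
    // contains anything other than plain text.
    public static string ReadText(string xml, string elementName)
    {
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Guard 1: only react to the start tag we are looking for.
                if (reader.NodeType != XmlNodeType.Element || reader.Name != elementName)
                    continue;

                // Guard 2: <TagA/> is a single node with no content and no end tag.
                if (reader.IsEmptyElement)
                    return "";

                // Guard 3: look at what follows the start tag.
                if (!reader.Read())
                    return null;
                if (reader.NodeType == XmlNodeType.Text)
                    return reader.Value;
                if (reader.NodeType == XmlNodeType.EndElement)
                    return "";          // <TagA></TagA>
                return null;            // child element, comment, etc.
            }
        }
        return null;                    // element not found
    }

    public static void Main()
    {
        Console.WriteLine(ReadText("<TagA>Something I Want To Get</TagA>", "TagA"));
    }
}
```

Malformed XML still raises an XmlException here. Whether to catch it or let it propagate depends on how much you trust the input.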

Chris Nahr
Tuesday, April 29, 2003

Forgot to mention... Duncan used ReadElementString which is a "trick" method because it swallows an entire XML element, including the end tag.  You can do this only if you're sure that this element will only have text data, though.

Chris Nahr
Tuesday, April 29, 2003

To Chris Nahr:

Sorry, I don't understand the remark: "You can do this only if you're sure that this element will only have text data".

I thought that XML was a text-based data format, where everything, even encoded binary data, was text. Can you explain what you mean?

Stephen Muires
Wednesday, May 7, 2003

Re. "You can do this only if you're sure that this element will only have text data" -- I think he means that the element can only contain text and cannot contain *other elements*

Duncan Smart
Wednesday, May 7, 2003

Yes, "text" in XML lingo means "a character sequence that does not contain XML tags".  An XML element can be empty or contain stuff, where "stuff" can be text, child nodes, or a mix of both (though that's only common with text mark-up formats like XHTML).

If you try to read child nodes with a method that expects nothing but text, the child nodes might be ignored, an exception might be thrown, or the method might concatenate all text found in the subtree. Usually, none of this is desirable.
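
Two of those outcomes can be seen with the methods already mentioned in this thread. The sketch below feeds the same mixed-content element to ReadElementString and to InnerText. The class and method names are invented for the example, and the sample XML is an assumption, not the original poster's document.

```csharp
using System;
using System.IO;
using System.Xml;

public static class TextOnlyDemo
{
    // Returns the element text, or null when ReadElementString
    // rejects the content because it is not text-only.
    public static string ViaReadElementString(string xml)
    {
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            reader.MoveToContent();               // position on the root start tag
            try { return reader.ReadElementString(); }
            catch (XmlException) { return null; } // a child element was in the way
        }
    }

    // InnerText concatenates every text node found in the subtree.
    public static string ViaInnerText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        string mixed = "<TagA>Hello <b>world</b></TagA>";
        Console.WriteLine(ViaReadElementString(mixed) ?? "(XmlException)");
        Console.WriteLine(ViaInnerText(mixed));
    }
}
```

With the mixed-content sample, ReadElementString throws an XmlException, while InnerText quietly returns "Hello world" with the markup flattened away.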

Chris Nahr
Saturday, May 10, 2003
