Fog Creek Software
Discussion Board




XML Validity

I believe the .NET Framework's System.Xml namespace provides an XmlValidatingReader class to validate XML.

Without .NET, using VB6 and MSXML 3.0 or later, how would one validate an XML document against either a DTD or a schema, without relying on the free external validators available on the Internet? How would one code a validator for an XML document using the MSXML parser? I also see an IXMLDOMDocumentType object in MSXML but do not know how to use it. I am not asking for code, just a brief explanation of what goes into validating an XML document. Does IXMLDOMDocumentType help with that, or is the procedure for validating XML on your own very rudimentary?

Also, when my XML document has a DTD in it and I try to load it with the MSXML DOMDocument object, the document comes back as junk that cannot be parsed (actual junk, as opposed to XML that is merely not valid). How do I parse the DTD contained in an XML document?

Sathyaish Chakravarthy
Thursday, February 5, 2004

For instance, I got this code from MSDN; it just reads the name of the DTD in an XML document.

[C++]
#include "tchar.h"
#include "stdio.h"
#import "msxml3.dll"
using namespace MSXML2;

inline void TESTHR( HRESULT _hr )
  { if FAILED(_hr) throw(_hr); }

int main()
{
  try
  {
      IXMLDOMDocumentPtr docPtr;

      //init
      TESTHR(CoInitialize(NULL));
      TESTHR(docPtr.CreateInstance(_T("Msxml2.DOMDocument")));

      // Load a document synchronously so load() returns the final result.
      docPtr->async = VARIANT_FALSE;
      _variant_t varXml(_T("c:\\book.xml"));
      _variant_t varOut((bool)TRUE);
      varOut = docPtr->load(varXml);
      if ((bool)varOut == FALSE)
      {
        // Show error description - IXMLDOMParseError sample.
        IXMLDOMParseErrorPtr errPtr = docPtr->GetparseError();
        // Print error details.
        _tprintf(_T("Parse error: %s\n"), (TCHAR*)errPtr->reason);
      }
      else
      {
        IXMLDOMDocumentTypePtr docTypPtr = docPtr->doctype;
        if (docTypPtr)
        {
            _bstr_t bstrTyp(docTypPtr->name);
            _tprintf(_T("Document type name = %s\n"), (TCHAR*)bstrTyp);
        }
      }
  }
  catch (_com_error &e)
  {
      _tprintf(_T("Error:\n"));
      _tprintf(_T("Code = %08lx\n"), e.Error());
      _tprintf(_T("Code meaning = %s\n"), (TCHAR*) e.ErrorMessage());
      _tprintf(_T("Source = %s\n"), (TCHAR*) e.Source());
      _tprintf(_T("Error Description = %s\n"), (TCHAR*) e.Description());
  }
  catch(...)
  {
      _tprintf(_T("Unknown error!"));
  }
  CoUninitialize();
}

[/C++]


To start with, I tried the same thing in VB, and this time it recognized my document's DTD. I realized that I had not known how to initialize the IXMLDOMDocumentType object.

[VB]
Private Sub Command1_Click()

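Dim xmlDoc As DOMDocument
Set xmlDoc = New DOMDocument
xmlDoc.async = False ' load synchronously so Load returns the final result

If Not xmlDoc.Load(App.Path & "\Collection.xml") Then
    MsgBox "Unable to load XML"
    Set xmlDoc = Nothing
    Exit Sub
End If

' doctype is Nothing when the document has no <!DOCTYPE> declaration
Dim xmlDocType As IXMLDOMDocumentType
Set xmlDocType = xmlDoc.doctype
If Not xmlDocType Is Nothing Then MsgBox xmlDocType.Name
Set xmlDocType = Nothing
Set xmlDoc = Nothing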

End Sub

[/VB]


But I get the feeling that an application trying to validate on the basis of this IXMLDOMDocumentType object will have to parse the DTD just like another XML document, read all the nodes, and make sense of them, which is going to be very painful. What I expect to be the most horrific exercise is verifying the XML against the DTD, i.e., the actual validation, which the application will have to handle itself.

Sathyaish Chakravarthy
Thursday, February 5, 2004
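
As a rough illustration of what hand-rolled validation involves: a DTD content model such as (title, author+, price?) is essentially a regular expression over the names of an element's children, so one piece of the job is matching each element's child sequence against its declared model. The sketch below does only that piece, in standard C++ with no MSXML involved. The function names contentModelToRegex and childrenMatch are hypothetical, and the code assumes a simplified model syntax (element names, ',', '|', '?', '*', '+', and parentheses; no #PCDATA, EMPTY, or ANY). Attribute checks, entities, and parsing the DTD itself are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: true for characters this sketch accepts in element names.
static bool isNameChar(char ch) {
    unsigned char c = static_cast<unsigned char>(ch);
    return std::isalnum(c) || c == '_' || c == '-' || c == ':';
}

// Hypothetical helper: turn a simplified DTD content model into a regex
// pattern. Each element name becomes "(?:name )", commas and whitespace are
// dropped (sequence is plain concatenation), and ( ) | ? * + pass through
// unchanged because they mean the same thing in a regex.
std::string contentModelToRegex(const std::string& model) {
    std::string pattern;
    std::size_t i = 0;
    while (i < model.size()) {
        if (isNameChar(model[i])) {
            std::string name;
            while (i < model.size() && isNameChar(model[i])) {
                name += model[i++];
            }
            pattern += "(?:" + name + " )";
        } else {
            unsigned char c = static_cast<unsigned char>(model[i]);
            if (model[i] != ',' && !std::isspace(c)) {
                pattern += model[i];
            }
            ++i;
        }
    }
    return pattern;
}

// Check one element's children (in document order) against its content model.
bool childrenMatch(const std::string& model,
                   const std::vector<std::string>& children) {
    std::string sequence;
    for (const std::string& name : children) {
        sequence += name + " ";  // the trailing space marks the end of a name
    }
    return std::regex_match(sequence, std::regex(contentModelToRegex(model)));
}
```

With this, childrenMatch("(title, author+, price?)", {"title", "author", "author"}) accepts the children, while the same model rejects {"title"} alone because author+ requires at least one author. A real validator repeats this check for every element in the tree and also verifies attributes against their ATTLIST declarations, which is the work MSXML does internally when validation is switched on.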

Er, if you're using MSXML, why aren't you just using its own methods for validation? For instance, you might want to look in the help file for the topic "Validate an XML Document Against an XML Schema".

jburka
Thursday, February 5, 2004

Hi,

This has not been tested, but we use something along these lines. Hope this helps!

  Dim oXML As MSXML2.FreeThreadedDOMDocument40 
  Dim oxmlSchema As MSXML2.XMLSchemaCache40
  Dim blnValidXML As Boolean   
  Dim sSchemaFile As String 
   
  sSchemaFile = "C:\SomeXmlSchemaFile.xsd"
   
  'Load the Schema
  Set oxmlSchema = New MSXML2.XMLSchemaCache40
   
  oxmlSchema.Add "", sSchemaFile
   
  Set oXML = New MSXML2.FreeThreadedDOMDocument40
  Set oXML.schemas = oxmlSchema
   
  blnValidXML = oxmlMessage.loadXML(sxmlMessage)
  If Not blnValidXML Then
      'Raise Error - string passed in is not valid XML or does not conform to the XML Schema
      Err.Raise vbObjectError + 1000, "", oXML.parseError.reason
  End If
 
  Set oxmlSchema = Nothing   
  Set oxmlMessage = Nothing

Garett
Thursday, February 5, 2004

Ooops!

s/oxmlMessage/oXML/g

Garett
Thursday, February 5, 2004

Thanks a lot. But for the few typos, it worked like a charm.

What about DTDs now? To test your sample, I generated the XML schema rather than writing it myself, as I have not yet learnt XSD. But I am familiar with DTDs and have written quite a few of them. If I have to validate against a DTD, how do I do it with MSXML?

Sathyaish Chakravarthy
Thursday, February 5, 2004

I guess I could try the same path for a DTD: load an external DTD into an IXMLDOMDocumentType first, then create a blank DOMDocument, assign the DTD object to the doctype property of the document object, and then try to load the XML document into the document object.

Am I on the right track?

Sathyaish Chakravarthy
Thursday, February 5, 2004

You could use the <!DOCTYPE> declaration within your XML to point to the DTD, then do something like this. Again, not tested, and I apologize in advance for any typos.

Dim oXML As New Msxml2.DOMDocument40
oXML.async = False
oXML.validateOnParse = True
oXML.Load "C:\SomeFile.xml"

Here is an article that provides an example, along with a more thorough explanation.

Garett
Thursday, February 5, 2004

Where's the article, hey?

Sathyaish Chakravarthy
Thursday, February 5, 2004

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/xmlsdk/htm/dtd_dev_1fqs.asp

Garett
Thursday, February 5, 2004
