C/C++ code fragments, performance, threads

Dennis Lang
lang.dennis @ comcast.net


Measure Various Ways to Read (parse) XML Documents

The following table shows the time taken to open a large XML document and read every node and attribute in it. Each test was run multiple times and the results were averaged.

Two key factors I noticed when reviewing these products are:

  • Do you need to validate the XML?
  • Do you need to support wide characters?

    The following tests measure only the time to read and extract the name and value of every XML node, along with all of its attribute names and values. In my case, I did not need to validate the XML document.

    The pseudocode is:

    
    void load(XmlNode& node);

    void test()
    {
        XmlObject xml;
        xml.open(filename);
        load(xml.top());
    }
    
    void load(XmlNode& node)
    {
        // Read the name and value of every attribute on this node
        while (node.GetAttribute(name, value))
        {
            node.NextAttribute();
        }
    
        // Recurse into each child node
        XmlNode cNode;
        while (node.GetChild(cNode))
        {
            load(cNode);
        }
    }
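    For readers who want something compilable, here is a minimal sketch of the same traversal in C++. It does not use any of the products tested here: the XmlNode struct and the Totals counters are hypothetical stand-ins for an already parsed document, chosen only to show the "read all attributes, then recurse into children" pattern.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory node type, used only for this illustration.
// It is not the API of any product listed on this page.
struct XmlNode {
    std::string name;
    std::string value;
    std::vector<std::pair<std::string, std::string>> attributes;
    std::vector<XmlNode> children;
};

// Running totals, so that every name and value is actually touched.
struct Totals {
    std::size_t nodes = 0;
    std::size_t attributes = 0;
    std::size_t bytes = 0;  // combined length of all names and values read
};

// Visit one node: read its name and value, read every attribute,
// then recurse into each child.
void load(const XmlNode& node, Totals& totals)
{
    ++totals.nodes;
    totals.bytes += node.name.size() + node.value.size();

    for (const auto& attr : node.attributes) {
        ++totals.attributes;
        totals.bytes += attr.first.size() + attr.second.size();
    }

    for (const XmlNode& child : node.children) {
        load(child, totals);
    }
}
```

    Accumulating the lengths into Totals is a design choice for benchmarking: it gives the traversal an observable result, so an optimizing compiler is less likely to discard the reads being timed.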
    
    Timing results:

    Product            License                        Platform/Source  Seconds   Data (char or wchar_t)
    -----------------  -----------------------------  ---------------  --------  ----------------------
    Chilkat            free (I think)                 Windows package  Too slow  wchar_t
    MSXML3.0           free                           Windows dll      9.00      wchar_t
    MSXML6.0           free                           Windows dll      8.40      wchar_t
    TinyXML            opensource/free                Source code      5.90      char
    XmlParser          opensource/free                Source code      5.75      either, timed wchar_t
    CMarkup            Commercial/$250 unlimited use  Windows package  2.77      either, timed char
    My own xml reader  Private                                         2.00      char
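
    The figures above are averages over several runs. The exact timing code is not shown here, so the following is only a sketch of how such an average could be taken, assuming std::chrono::steady_clock as the timer. The function name averageSeconds and its parameters are hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Run `work` `runs` times and return the mean wall-clock time in seconds.
// `work` stands in for "open the document and read every node and attribute".
template <typename Work>
double averageSeconds(Work work, int runs)
{
    assert(runs > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, unaffected by clock changes

    double total = 0.0;
    for (int i = 0; i < runs; ++i) {
        const auto start = Clock::now();
        work();
        const auto stop = Clock::now();
        total += std::chrono::duration<double>(stop - start).count();
    }
    return total / runs;
}
```

    A call such as averageSeconds([&] { test(); }, 10) would time ten complete open-and-read passes. Note that the first pass may be slower than the rest if the file is not yet in the operating system's cache.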

    Based on my very simple test and narrow testing criteria, and setting aside my own custom reader, I would recommend CMarkup as the best solution. CMarkup has a clean, easy-to-use interface and is very fast.

    Links to product websites:

  • CMarkup
  • TinyXML
  • Chilkat
  • XmlParser

    XML solutions I did not test:

  • libXml++
  • XML C parser (GNOME)
  • MS XML Lite, More on MS XML Lite
  • Apache Xerces XML parser
  • Assembler: presumably fast, free BSD license, requires a schema to access data