
XML Hierarchical Structure Creator

Version 1.5
This is the introduction page for Xhsc, a specialized XML utility written in Java.  Xhsc is free software, distributed under the Clarified Artistic License.  Xhsc is useful when you need to convert flat-structured XML into deeply nested XML.


What is Xhsc?

Xhsc is a program and a Java class library designed to generate hierarchically structured XML documents from XML, XHTML, and text data that is not hierarchically structured.  In other words, Xhsc uses information drawn from the content of a document to impose a tree structure on the document.  Xhsc can be used in two ways: as a standalone Java application, or as a Java package incorporated into a larger Java-based system.

Xhsc requires you to describe the structure you want to create by supplying a specification, written in XML.  The specification tells Xhsc which tags (or text matches) denote hierarchy levels in the input document; it will then create those levels in the output document.  The diagram below shows how this works.
[Diagram: Xhsc operation: input, transformation, output]
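To make the idea of tags denoting hierarchy levels concrete, here is a minimal Java sketch of the general technique, written against the standard DOM API.  It is not Xhsc's code and it does not use Xhsc's specification format: the class name FlatToNested, the LEVELS map (standing in for a specification), and the generated section element with its level attribute are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustration only: not Xhsc's code and not its specification format.
class FlatToNested {
    // Hypothetical stand-in for a specification: marker tag -> hierarchy level.
    static final Map<String, Integer> LEVELS = Map.of("h1", 1, "h2", 2, "h3", 3);

    static int levelOf(Element section) {
        return Integer.parseInt(section.getAttribute("level"));
    }

    static String nest(String flatXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(flatXml)));
            Element root = doc.getDocumentElement();

            // Snapshot the flat children first, because the loop below moves them.
            List<Node> flat = new ArrayList<>();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                flat.add(n);
            }

            Deque<Element> open = new ArrayDeque<>(); // open sections, innermost on top
            for (Node n : flat) {
                Integer level = LEVELS.get(n.getNodeName());
                if (level != null) {
                    // A marker at level N closes every open section at level N or deeper.
                    while (!open.isEmpty() && levelOf(open.peek()) >= level) {
                        open.pop();
                    }
                    Element section = doc.createElement("section");
                    section.setAttribute("level", String.valueOf(level));
                    (open.isEmpty() ? root : open.peek()).appendChild(section);
                    open.push(section);
                }
                // appendChild moves the node out of its old position in the tree.
                (open.isEmpty() ? root : open.peek()).appendChild(n);
            }

            // Serialize the restructured tree without an XML declaration.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not restructure document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nest(
            "<doc><h1>Intro</h1><p>a</p><h2>Detail</h2><p>b</p><h1>Next</h1><p>c</p></doc>"));
    }
}
```

In this sketch, each h1 opens a level-1 section, the h2 opens a level-2 section nested inside the first one, and the second h1 closes both before opening a new level-1 section.  A real specification would presumably express the same marker-to-level mapping declaratively rather than in a hard-coded map.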
Xhsc's hierarchy creation is a fairly specific kind of transformation to apply; to help you fine-tune the output format, Xhsc can also apply XSLT transforms.  If the specification you give to Xhsc is a valid XSLT stylesheet, then the stylesheet will be applied to the hierarchical XML data before it is written to the output file.
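The optional stylesheet step can be sketched with the standard JAXP transform API.  This is an assumption about how such an output stage might look, not a description of Xhsc's internals; the class OutputStage and the method render are hypothetical names.  When no stylesheet is supplied, the sketch falls back to the no-argument newTransformer() call, which JAXP defines as an identity transform that copies the source to the result unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Illustration only: a hypothetical output stage, not Xhsc's implementation.
class OutputStage {
    /**
     * Serializes the given XML. If a stylesheet is supplied it is applied first;
     * if it is null, an identity transform is used instead.
     */
    static String render(String xml, String stylesheetOrNull) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer t = (stylesheetOrNull == null)
                    ? factory.newTransformer() // no-arg form: identity transform
                    : factory.newTransformer(
                            new StreamSource(new StringReader(stylesheetOrNull)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        // With no stylesheet, the document passes through unchanged.
        System.out.println(render("<a><b>x</b></a>", null));
    }
}
```

Routing both cases through a Transformer keeps a single serialization path, so output settings such as the XML declaration are handled the same way whether or not a stylesheet is present.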

Downloads and Requirements

To use Xhsc you need the following:
Xhsc will run anywhere that Java 2 is supported, including Linux, Windows, and Solaris.
If you just want to run Xhsc or use it in a program, download the executable Jar:
If you want to modify or extend Xhsc, or just see how it was written, download the source Jar.  This also includes the source code for the user guide (in Docbook format).
You can read the user guide on-line, or download the HTML version.

Frequently Asked Questions

This section answers a few questions about Xhsc.
Q. How mature is Xhsc?  Is it ready for real use?
A. The core of Xhsc is quite stable and mature (hence the version designation of 1.5).  It has been used successfully to convert multi-megabyte flat-structured XHTML documents into deeply nested SGML.  The documentation for Xhsc is mediocre at best, and badly needs more examples and explanation.
Q. Why do I need Xhsc?  Can't I use XSLT instead?
A. Yes, you could use XSLT for most of the XML processing that Xhsc can do.  However, I wrote Xhsc because there were certain kinds of changes I wanted to make to some XHTML documents that turned out to be very difficult in XSLT.  For creating nested structure from flat structure, an Xhsc specification can be far shorter, simpler, and more maintainable than an equivalent XSLT specification.  For tasks that need the more general capabilities of XSLT, any Xhsc specification can also be an XSLT stylesheet; Xhsc will apply the stylesheet transform to the results of its hierarchy-building process.
Q. What data formats can Xhsc read in and process?  Can it read HTML?
A. XML and plain text are the only formats that Xhsc can read.  If you need to process HTML, you must first convert it to XHTML using HTML Tidy.
Q. How deep can I build nested structures using Xhsc?
A. The depth of the nested structures is limited only by your patience in writing the Xhsc specifications.  The deepest I've ever tried was nine layers deep.
Q. How does Xhsc handle XML comments and processing instructions?
A. It ignores them completely.
Q. What kinds of output formats does Xhsc support?
A. The output of Xhsc is normally legal XML.  Internally, Xhsc always employs XSLT to generate its output.  If the Xhsc specification is a valid XSLT stylesheet, then it is used to generate the output; otherwise, Xhsc creates an internal XSLT 'identity' transform and uses that.
Q. Can I use Xhsc in my Java-based system or product?
A. Yes.  Xhsc is open software; you may use it in any way you like that is compatible with the terms of the Clarified Artistic License.
Q. How did you write the Xhsc documentation?  Why does it look so weird?
A. The Xhsc user guide was written in Docbook, a very versatile text-based document preparation system designed for writing technical books and manuals.  One of the original motivations for writing Xhsc was to convert legacy HTML documents into Docbook SGML, so it seemed natural at the time to use Docbook.
If you have more questions about Xhsc, write to me using the feedback form, and I'll put the question up here.

Other XML Tools

Here are some other sites that offer specialized XML tools.



This page written by Neal Ziring, last modified 11/22/05.