XML Hierarchical Structure Creator
Version 1.5
This is the introduction page for Xhsc, a specialized XML utility
written in Java. Xhsc is free software, distributed under the
Clarified Artistic License. Use Xhsc when you need to convert
flat-structured XML into deeply nested XML.
What is Xhsc?
Xhsc is a program and a Java class
library designed to generate hierarchically structured XML documents from
XML, XHTML, and text data that is not hierarchically structured.
In other words, Xhsc uses information drawn from the content of a
document to impose a tree structure on the document. Xhsc can be
used in two ways: as a standalone Java application, or as a Java package
incorporated into a larger Java-based system.
Xhsc requires you to describe the structure you want to create by
supplying a specification, written in XML. The specification tells
Xhsc which tags (or text matches) denote hierarchy levels in the input
document; it will then create those levels in the output document.
The diagram below shows how this works.
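As a rough illustration of the idea (this is not Xhsc's own code, and it does not use Xhsc's specification format), the sketch below nests a flat run of elements using only the standard Java DOM API. The class name `FlatToNested`, the `h1`/`h2`/`h3` level markers, and the `section` wrapper element are all assumptions made for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class FlatToNested {
    // Hypothetical level markers: the tag at index i opens a level of depth i + 1.
    static final List<String> LEVEL_TAGS = Arrays.asList("h1", "h2", "h3");

    static String nest(String flatXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document in = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(flatXml)));
        Document out = dbf.newDocumentBuilder().newDocument();
        Element root = out.createElement(in.getDocumentElement().getTagName());
        out.appendChild(root);

        // Parallel stacks: the currently open elements and their depths.
        // The root sits at depth 0 and is never popped.
        Deque<Element> open = new ArrayDeque<>();
        Deque<Integer> depths = new ArrayDeque<>();
        open.push(root);
        depths.push(0);

        for (Node n = in.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
            // This sketch keeps only element children of the root.
            if (n.getNodeType() != Node.ELEMENT_NODE) continue;

            int depth = LEVEL_TAGS.indexOf(n.getNodeName()) + 1; // 0 = not a level marker
            if (depth > 0) {
                // Close every open level at the same or a deeper depth.
                while (depths.peek() >= depth) {
                    open.pop();
                    depths.pop();
                }
                Element section = out.createElement("section");
                section.setAttribute("level", String.valueOf(depth));
                open.peek().appendChild(section);
                open.push(section);
                depths.push(depth);
            }
            // Copy the element into whichever level is innermost right now.
            open.peek().appendChild(out.importNode(n, true));
        }

        // Serialize with the JAXP identity transformer.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(out), new StreamResult(w));
        return w.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(nest(
            "<body><h1>A</h1><p>x</p><h2>B</h2><p>y</p><h1>C</h1></body>"));
    }
}
```

In this sketch each marker element opens a new `section`, and the elements that follow it are placed inside that section until a marker of the same or a shallower level appears.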
Xhsc's hierarchy creation is a fairly specific kind of transformation to
apply; to help you fine-tune the output format, Xhsc can also apply
XSLT transforms. If the
specification you give to Xhsc is a valid XSLT stylesheet, then the
stylesheet will be applied to the hierarchical XML data before it is
written to the output file.
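For readers unfamiliar with applying XSLT from Java, here is a minimal sketch of that kind of post-processing step using the standard JAXP API (`javax.xml.transform`). It is not taken from the Xhsc source; the class name `ApplyStylesheet`, the sample stylesheet, and the `section` and `chapter` element names are assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ApplyStylesheet {
    // Hypothetical stylesheet: copies everything, but renames <section> to <chapter>.
    static final String RENAME_SECTIONS =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='section'>"
      + "<chapter><xsl:apply-templates select='@*|node()'/></chapter>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet and run it over the already-nested XML.
    static String transform(String xml, String xslt) throws Exception {
        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String nested = "<doc><section><title>Intro</title></section></doc>";
        System.out.println(transform(nested, RENAME_SECTIONS));
    }
}
```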
Downloads and Requirements
To use Xhsc you need the following:
- A good understanding of XML.
- Java 2 SDK version 1.4 or later [download]
- The Xhsc Jar file (download below).
Xhsc will run anywhere that Java 2 is supported, including Linux,
Windows, and Solaris.
If you just want to run Xhsc or use it in a program, download the
executable Jar:
If you want to modify or extend Xhsc, or just see how it was written,
download the source Jar. This also includes the source code for
the user guide (in Docbook format).
You can read the
user guide on-line, or
download the HTML version.
Frequently Asked Questions
This section answers a few questions
about Xhsc.
- Q. How
mature is Xhsc? Is it ready for real use?
- A. The core of Xhsc is quite stable and mature (hence the version
designation of 1.5). It has been used successfully to convert
multi-megabyte flat-structured XHTML documents into deeply nested SGML.
The documentation for Xhsc is mediocre at best, and badly needs
more examples and explanation.
- Q. Why do I
need Xhsc? Can't I use XSLT instead?
- A. Yes, you could use XSLT for most of the XML processing that
Xhsc can do. However, I wrote Xhsc because there were certain
kinds of changes I wanted to make to some XHTML documents that turned
out to be very difficult in XSLT. For creating nested structure
from flat structure, an Xhsc specification can be far shorter, simpler,
and more maintainable than an equivalent XSLT specification. For
tasks that need the more general capabilities of XSLT, any Xhsc
specification can also be an XSLT stylesheet; Xhsc will apply the
stylesheet transform to the results of its hierarchy-building process.
- Q. What data
formats can Xhsc read in and process? Can it read HTML?
- A. XML and plain text are the only formats that Xhsc can read.
If you need to process HTML, you must first convert it to XHTML
using HTML Tidy.
- Q. How deep
can I build nested structures using Xhsc?
- A. The depth of the nested structures is limited only by your
patience in writing the Xhsc specifications. The deepest I've
ever tried was nine layers deep.
- Q. How does
Xhsc handle XML comments and processing instructions?
- A. It ignores them completely.
- Q. What
kinds of output formats does Xhsc support?
- A. The output of Xhsc is normally legal XML. Internally,
Xhsc always employs XSLT to generate its output. If the Xhsc
specification is a valid XSLT stylesheet, then it is used to generate
the output; otherwise, Xhsc creates an internal XSLT 'identity'
transform and uses that.
- Q. Can I
use Xhsc in my Java-based system or product?
- A. Yes. Xhsc is open software; you may use it in any way
you like that is compatible with the terms of the Clarified Artistic
License.
- Q. How did
you write the Xhsc documentation? Why does it look so weird?
- A. The Xhsc user guide was written in Docbook. It is a very
versatile text-based document preparation system designed for writing
technical books and manuals. One of the original motivations for
writing Xhsc was to convert legacy HTML documents into Docbook SGML, so
it seemed natural at the time to use Docbook.
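The answer above about output formats describes a choice between a user-supplied stylesheet and an 'identity' transform. A minimal sketch of such a choice with the standard JAXP API might look like the following. It is not Xhsc's actual implementation: the class name `OutputStage`, the `looksLikeStylesheet` check, and the placeholder specification in `main` are all assumptions. The one piece taken from JAXP itself is that `TransformerFactory.newTransformer()` called with no arguments returns a transformer that copies its source to its result unchanged.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class OutputStage {
    static final String XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

    // Hypothetical, simplified check: the root element must be xsl:stylesheet
    // or xsl:transform. Literal-result-element stylesheets are not covered.
    static boolean looksLikeStylesheet(String specXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Element root = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(specXml)))
                .getDocumentElement();
        String name = root.getLocalName();
        return XSLT_NS.equals(root.getNamespaceURI())
                && ("stylesheet".equals(name) || "transform".equals(name));
    }

    static String write(String xml, String specXml) throws Exception {
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer t = looksLikeStylesheet(specXml)
                ? factory.newTransformer(new StreamSource(new StringReader(specXml)))
                : factory.newTransformer(); // no-argument form: identity transform
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Placeholder specification that is not a stylesheet, so the identity path runs.
        System.out.println(write("<doc><a>1</a></doc>", "<not-a-stylesheet/>"));
    }
}
```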
If you have more questions about Xhsc, write to me using the
feedback form,
and I'll put the question up here.
Other XML Tools
Here are some other sites that offer
specialized XML tools.
This page written by Neal Ziring,
last modified 11/22/05.