README.jaxp 8.49 KB
Tom Tromey committed
This file describes the JAXP (XML processing) implementation of GNU Classpath.
GNU Classpath includes interfaces and implementations for basic XML processing
in the Java programming language, some general-purpose SAX2 utilities, and
XSL transformation.

These classes used to be maintained as part of an external project, GNU JAXP,
but are now integrated with the rest of the core class library provided by
GNU Classpath.

PACKAGES
    
. javax.xml.* ... JAXP 1.3 interfaces

. gnu.xml.aelfred2.* ... SAX2 parser + validator
. gnu.xml.dom.* ... DOM Level 3 Core, Traversal, XPath implementation
. gnu.xml.dom.ls.* ... DOM Level 3 Load & Save implementation
. gnu.xml.xpath.* ... JAXP XPath implementation
. gnu.xml.transform.* ... JAXP XSL transformer implementation
. gnu.xml.pipeline.* ... SAX2 event pipeline support
. gnu.xml.stream.* ... StAX pull parser and SAX-over-StAX driver
. gnu.xml.util.* ... various XML utility classes
. gnu.xml.libxmlj.dom.* ... libxmlj DOM Level 3 Core and XPath
. gnu.xml.libxmlj.sax.* ... libxmlj SAX parser
. gnu.xml.libxmlj.transform.* ... libxmlj XSL transformer
. gnu.xml.libxmlj.util.* ... libxmlj utility classes

In the external directory you can find the following packages.
They are not maintained as part of GNU Classpath, but are used by the
classes in the above packages.

. org.xml.sax.* ... SAX2 interfaces
. org.w3c.dom.* ... DOM Level 3 interfaces
. org.relaxng.datatype.* ... RELAX NG pluggable datatypes API

CONFORMANCE

    The primary test resources are at http://xmlconf.sourceforge.net
    and include:

    SAX2/XML conformance tests
	The "xml.testing.Driver" addresses the core XML 1.0
	specification requirements, which closely correspond to the
	functionality SAX1 provides.  The driver uses SAX2 APIs to
	test that functionality.  It is used with a bugfixed version of
	the NIST/OASIS XML conformance test cases.
	
	The AElfred2 parser is highly conformant, though it still takes
	a few implementation shortcuts.  See its package documentation
	for information about known XML conformance issues in AElfred2.

	The primary issue is using Unicode character tables, rather than
	those in the XML specification, for determining what names are
	valid.  Most applications won't notice the difference, and this
	solution is smaller and faster than the alternative.

	For validation, a secondary issue is that constraints relating to
	entity modularity are not validated; they can't all be cleanly
	layered.  For example, validity constraints related to standalone
	declarations and PE nesting are not checked.

        The current implementation has also been tested against Elliotte
        Rusty Harold's SAXTest test suite (http://www.cafeconleche.org/SAXTest)
        and achieves approximately 93% conformance to the SAX specification
        according to these tests, higher than any other current Java parser.

    SAX2
	SAX2 API conformance currently has a minimal JUnit (0.2) test suite,
	which can be accessed at the xmlconf site listed above.  It does
	not cover namespaces or the LexicalHandler and DeclHandler extensions
	anywhere near as exhaustively as the SAX1 level functionality is
	tested by the "xml.testing.Driver".  However:

	    - Applying the DOM unit tests to this implementation gives
	      the LexicalHandler (comments, and boundaries of DTDs,
	      CDATA sections, and general entities) a workout, and
	      does the same for DeclHandler entity declarations.
	    
	    - The pipeline package's layered validator demands that
	      element and attribute declarations are reported correctly.
	
	By those metrics, SAX2 conformance for AElfred2 is also strong. 
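
The LexicalHandler and DeclHandler events mentioned above can be observed
with a small logging handler. This is only a sketch, not part of the test
suites described here: the class name ExtensionEventLog is hypothetical, and
the code uses whichever SAX parser JAXP resolves rather than AElfred2
specifically. The two property URIs are the standard SAX2 extension handler
properties.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records the SAX2 extension events it receives.
class ExtensionEventLog extends DefaultHandler2 {
    final List<String> events = new ArrayList<>();

    // LexicalHandler callbacks
    @Override public void comment(char[] ch, int start, int length) {
        events.add("comment:" + new String(ch, start, length));
    }
    @Override public void startDTD(String name, String publicId, String systemId) {
        events.add("startDTD:" + name);
    }
    @Override public void endDTD() { events.add("endDTD"); }
    @Override public void startCDATA() { events.add("startCDATA"); }
    @Override public void endCDATA() { events.add("endCDATA"); }

    // DeclHandler callback
    @Override public void internalEntityDecl(String name, String value) {
        events.add("entityDecl:" + name + "=" + value);
    }

    // Parses the document and returns the extension events that were reported.
    static List<String> parse(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        ExtensionEventLog log = new ExtensionEventLog();
        reader.setContentHandler(log);
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", log);
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", log);
        reader.parse(new InputSource(new StringReader(xml)));
        return log.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<!DOCTYPE doc [<!ENTITY who \"world\">]>"
                   + "<doc><!--hi--><![CDATA[raw]]></doc>";
        parse(xml).forEach(System.out::println);
    }
}
```

A parser that reports these callbacks correctly should log the DTD
boundaries, the entity declaration, the comment, and the CDATA section
boundaries for the sample document.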
    
    DOM Level 3 Core Tests
        The DOM implementation has been tested against the W3C DOM Level 3
        Core conformance test suite (http://www.w3.org/DOM/Test/). Current
        conformance according to these tests is 72.3%. Many of the test
        failures are due to the fact that GNU JAXP does not currently
        provide any W3C XML Schema support.

    XSL transformation
        The transformer and XPath implementation have been tested against
        the OASIS XSLT and XPath TC test suite. Conformance against the
        Xalan tests is currently 77%.


libxmlj
========================================================================

libxmlj is an effort to create a 100% JAXP-compatible Java wrapper for
libxml2 and libxslt. JAXP is the Java API for XML processing, libxml2
is the XML C library for Gnome, and libxslt is the XSLT C library for
Gnome.

libxmlj currently supports most of the DOM Level 3 Core, Traversal, and
XPath APIs, SAX2, and XSLT transformations. There is no W3C XML Schema
support yet.

libxmlj can parse and transform XML documents extremely quickly in
comparison to Java-based JAXP implementations. DOM manipulations, however,
involve JNI overhead, so the speed of DOM tree construction and traversal
can be slower than the Java implementation.

libxmlj is highly experimental, doesn't always conform to the DOM
specification correctly, and may leak memory. Production use is not advised.

The implementation can be found in gnu/xml/libxmlj and native/jni/xmlj.
See the INSTALL file for the required versions of libxml2 and libxslt.
Running configure with --enable-xmlj will build it.

Usage
------------------------------------------------------------------------

To enable the various GNU JAXP factories, set the following system properties
(command-line version shown, but they can equally be set programmatically):

  AElfred2:
   -Djavax.xml.parsers.SAXParserFactory=gnu.xml.aelfred2.JAXPFactory

  GNU DOM (using DOM Level 3 Load & Save):
   -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.DomDocumentBuilderFactory

  GNU DOM (using AElfred-only pipeline classes):
   -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.dom.JAXPFactory

  GNU XSL transformer:
   -Djavax.xml.transform.TransformerFactory=gnu.xml.transform.TransformerFactoryImpl

  GNU StAX:
   -Djavax.xml.stream.XMLEventFactory=gnu.xml.stream.XMLEventFactoryImpl
   -Djavax.xml.stream.XMLInputFactory=gnu.xml.stream.XMLInputFactoryImpl
   -Djavax.xml.stream.XMLOutputFactory=gnu.xml.stream.XMLOutputFactoryImpl

  GNU SAX-over-StAX:
   -Djavax.xml.parsers.SAXParserFactory=gnu.xml.stream.SAXParserFactory

  libxmlj SAX:
   -Djavax.xml.parsers.SAXParserFactory=gnu.xml.libxmlj.sax.GnomeSAXParserFactory

  libxmlj DOM:
   -Djavax.xml.parsers.DocumentBuilderFactory=gnu.xml.libxmlj.dom.GnomeDocumentBuilderFactory

  libxmlj XSL transformer:
   -Djavax.xml.transform.TransformerFactory=gnu.xml.libxmlj.transform.GnomeTransformerFactory
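
As a minimal sketch of the programmatic route, the following sets the SAX
factory property before the first JAXP lookup. The class FactorySelection and
its methods are hypothetical names used for illustration; the property and
factory class names are the ones listed above. The property is only set when
the factory class can be loaded, so the sketch also runs on a runtime without
the GNU classes and then falls back to the platform default parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class FactorySelection {
    // Factory class name taken from the AElfred2 entry in the list above.
    static final String AELFRED2 = "gnu.xml.aelfred2.JAXPFactory";

    // Sets the JAXP system property only if the factory class is loadable.
    // Returns true when the property was set.
    static boolean selectIfAvailable(String property, String factoryClass) {
        try {
            Class.forName(factoryClass);
        } catch (ClassNotFoundException e) {
            return false;
        }
        System.setProperty(property, factoryClass);
        return true;
    }

    // Counts start-element events using whichever SAX factory JAXP resolves.
    static int countElements(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        final int[] count = {0};
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        boolean gnu = selectIfAvailable("javax.xml.parsers.SAXParserFactory", AELFRED2);
        System.out.println("AElfred2 selected: " + gnu);
        System.out.println("elements: " + countElements("<a><b/><c/></a>"));
    }
}
```

The same pattern applies to the DocumentBuilderFactory, TransformerFactory,
and StAX properties. The property has to be set before the corresponding
newInstance() call for it to take effect.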

When using libxmlj, the libxmlj shared library must be available.
In general it is picked up by the runtime using GNU Classpath. If not, try
adding the directory where libxmlj.so is installed (by default
${prefix}/lib/classpath/) to the linker search path with ldconfig, or
specifying it in the LD_LIBRARY_PATH environment variable. Additionally, you
may need to specify the location of your shared libraries to the runtime
environment using the java.library.path system property.
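
When the library is not found, it can help to check which directories on
java.library.path actually contain it. The sketch below is a diagnostic aid
only: the class name LibraryPathCheck is hypothetical, and it assumes the
library is loaded under the base name "xmlj" (which maps to libxmlj.so on
Linux).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class LibraryPathCheck {
    // Returns every file named fileName found in the directories of searchPath.
    static List<File> findLibrary(String searchPath, String fileName) {
        List<File> hits = new ArrayList<>();
        if (searchPath == null || searchPath.isEmpty()) {
            return hits;
        }
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: the native library's base name is "xmlj".
        String fileName = System.mapLibraryName("xmlj");
        String path = System.getProperty("java.library.path");
        List<File> hits = findLibrary(path, fileName);
        if (hits.isEmpty()) {
            System.out.println(fileName + " not found on java.library.path: " + path);
        } else {
            hits.forEach(f -> System.out.println("found " + f));
        }
    }
}
```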

Missing (libxmlj) Features
------------------------------------------------------------------------ 

See BUGS in native/jni/xmlj for known bugs in the libxmlj native bindings.

This implementation should be thread-safe, but currently all
transformation requests are queued via Java synchronization, which
means that it effectively runs single-threaded. Long story short,
neither libxml2 nor libxslt is fully reentrant.

Update: it may be possible to make libxmlj thread-safe nonetheless
using thread context variables.

Update: thread context variables have been introduced. This is largely
untested, though, so libxmlj still has the single-thread bottleneck.
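
To illustrate what queuing through Java synchronization means in practice,
here is a minimal sketch. It is not libxmlj's actual code: the class
SerializedTransforms is hypothetical, and it uses the platform's default JAXP
identity transformer. Every request takes the same lock, so no two
transformations ever overlap, however many threads submit them.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class SerializedTransforms {
    // One global lock: callers are queued and served one at a time.
    private static final Object LOCK = new Object();
    private static int active = 0;     // transformations in progress (guarded by LOCK)
    private static int maxActive = 0;  // highest value of active seen (guarded by LOCK)

    // Runs an identity transformation while holding the global lock.
    static String identity(String xml) throws Exception {
        synchronized (LOCK) {
            active++;
            maxActive = Math.max(maxActive, active);
            try {
                Transformer t = TransformerFactory.newInstance().newTransformer();
                t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
                StringWriter out = new StringWriter();
                t.transform(new StreamSource(new StringReader(xml)),
                            new StreamResult(out));
                return out.toString();
            } finally {
                active--;
            }
        }
    }

    // Submits the same request from several threads and returns the highest
    // number of transformations that were ever in progress at once.
    static int runConcurrently(int threads, String xml) throws InterruptedException {
        Thread[] pool = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            pool[i] = new Thread(() -> {
                try {
                    identity(xml);
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
            });
            pool[i].start();
        }
        for (Thread t : pool) {
            t.join();
        }
        synchronized (LOCK) {
            return maxActive;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("max concurrent transformations: "
            + runConcurrently(4, "<a>1</a>"));
    }
}
```

The observed concurrency never exceeds one, which is the bottleneck
described above: the work is safe but gains nothing from additional threads.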


Validation
===================================================

Pluggable datatypes
---------------------------------------------------
Validators should use the RELAX NG pluggable datatypes API to retrieve
datatype (XML Schema simple type) implementations in a schema-neutral
fashion. The following code demonstrates looking up a W3C XML Schema
nonNegativeInteger datatype:

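  import javax.xml.XMLConstants;
  import org.relaxng.datatype.*;
  import org.relaxng.datatype.helpers.DatatypeLibraryLoader;

  DatatypeLibrary xsd = new DatatypeLibraryLoader()
    .createDatatypeLibrary(XMLConstants.W3C_XML_SCHEMA_NS_URI);
  Datatype nonNegativeInteger = xsd.createDatatype("nonNegativeInteger");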

It is also possible to create new types by derivation. For instance,
to create a datatype that will match a US ZIP code:

  DatatypeBuilder b = xsd.createDatatypeBuilder("string");
  // third argument is the ValidationContext; assumed unnecessary for a pattern
  b.addParameter("pattern", "(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)", null);
  Datatype zipCode = b.createDatatype();

A datatype library implementation for XML Schema is provided; other
library implementations may be added.
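
To see which strings the ZIP code pattern above is meant to accept, the
expression can be tried with plain java.util.regex. This is a stand-in for
illustration only: it does not use the datatype API, and a datatype library
may apply its own pattern semantics. The class name ZipPatternCheck is
hypothetical.

```java
import java.util.regex.Pattern;

class ZipPatternCheck {
    // Same expression as the "pattern" parameter in the text above.
    static final Pattern ZIP =
        Pattern.compile("(^[0-9]{5}$)|(^[0-9]{5}-[0-9]{4}$)");

    // True when the whole string is a five-digit or ZIP+4 code.
    static boolean looksLikeZip(String s) {
        return ZIP.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"12345", "12345-6789", "1234", "12345-678", "abcde"};
        for (String s : samples) {
            System.out.println(s + " -> " + looksLikeZip(s));
        }
    }
}
```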