org.apache.crimson.tree
Class XmlDocumentBuilder

java.lang.Object
  extended by org.apache.crimson.tree.XmlDocumentBuilder
All Implemented Interfaces:
org.xml.sax.ContentHandler, org.xml.sax.ext.DeclHandler, org.xml.sax.DTDHandler, org.xml.sax.ext.LexicalHandler
Direct Known Subclasses:
XmlDocumentBuilderNS

public class XmlDocumentBuilder
extends Object
implements org.xml.sax.ContentHandler, org.xml.sax.ext.LexicalHandler, org.xml.sax.ext.DeclHandler, org.xml.sax.DTDHandler

This class is a SAX2 ContentHandler which converts a stream of parse events into an in-memory DOM document. After each Parser.parse() invocation returns, a resulting DOM Document may be accessed via the getDocument method. The parser and its builder should be used together; the builder may be used with only one parser at a time.
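
The event-to-tree conversion described above can be illustrated with a minimal sketch. It is not the Crimson implementation: the class name SaxToDomSketch is hypothetical, it uses only the standard JAXP and SAX classes shipped with the JDK, and it handles elements, attributes, and character data only (no namespaces, comments, or DTD events).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for a SAX-to-DOM builder; not the Crimson code. */
class SaxToDomSketch extends DefaultHandler {
    private Document document;
    // Stack of open parent nodes; the head is the node currently being filled.
    private final Deque<Node> stack = new ArrayDeque<>();

    @Override
    public void startDocument() throws SAXException {
        try {
            document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new SAXException(e);
        }
        stack.clear();
        stack.push(document);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Element element = document.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        stack.peek().appendChild(element);
        stack.push(element);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // A parser may deliver one run of text in several calls; each becomes its own Text node here.
        stack.peek().appendChild(document.createTextNode(new String(buf, offset, len)));
    }

    /** Returns the tree built by the most recent parse. */
    public Document getDocument() {
        return document;
    }

    static Document parse(String xml) throws Exception {
        SaxToDomSketch builder = new SaxToDomSketch();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(builder);
        reader.parse(new InputSource(new StringReader(xml)));
        return builder.getDocument();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<order id=\"7\"><item>tea</item></order>");
        System.out.println(doc.getDocumentElement().getTagName());
    }
}
```

As with the builder documented here, the document is fetched from the handler only after the parse call has returned.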

This builder optionally does XML namespace processing, reporting conformance problems as recoverable errors using the parser's error handler.

Note: element factories are deprecated because they are non-standard; they are provided here only for backwards compatibility. A powerful technique for customizing the document is to use an element factory that specifies which element tags (from a given XML namespace) correspond to which implementation classes. Parse trees produced by such a builder can have nodes which add behaviors to achieve application-specific functionality, such as modifying the tree as it is parsed.

The object model here is that XML elements are polymorphic, with semantic intelligence embedded through customized internal nodes. Those nodes are created as the parse tree is built. Such trees now build on the W3C Document Object Model (DOM), and other models may be supported by the customized nodes. This allows both generic tools (understanding generic interfaces such as the DOM core) and specialized tools (supporting specialized behaviors, such as the HTML extensions to the DOM core; or for XSL elements) to share data structures.

Normally only "model" semantics are in document data structures, but "view" or "controller" semantics can be supported if desired.

Elements may choose to intercept certain parsing events directly. They do this by overriding the default implementations of methods in the XmlReadable interface. This is normally done to make the DOM tree represent application level modeling requirements, rather than matching an XML structure that may not be optimized appropriately.

Author:
David Brownell

Field Summary
private  Vector attrTmp
           
private  boolean disableNamespaces
           
private  Doctype doctype
           
protected  XmlDocument document
           
protected  ParentNode[] elementStack
           
private  boolean expandEntityRefs
           
private  ElementFactory factory
           
private  boolean ignoreComments
           
private  boolean ignoreWhitespace
           
private  boolean inCDataSection
           
private  boolean inDTD
           
private  Locale locale
           
protected  org.xml.sax.Locator locator
           
private  boolean putCDATAIntoText
           
protected  int topOfStack
           
 
Constructor Summary
XmlDocumentBuilder()
          Default constructor is for use in conjunction with a SAX2 parser.
 
Method Summary
 void attributeDecl(String eName, String aName, String type, String valueDefault, String value)
          Report an attribute type declaration.
 void characters(char[] buf, int offset, int len)
          Receive notification of character data.
 Locale chooseLocale(String[] languages)
          Chooses a client locale to use for diagnostics, using the first language specified in the list that is supported by this builder.
 void comment(char[] ch, int start, int length)
          Report an XML comment anywhere in the document.
 XmlDocument createDocument()
          This is a factory method, used to create an XmlDocument.
 void elementDecl(String name, String model)
          Report an element type declaration.
 void endCDATA()
          Report the end of a CDATA section.
 void endDocument()
          Receive notification of the end of a document.
 void endDTD()
          Report the end of DTD declarations.
 void endElement(String namespaceURI, String localName, String qName)
          Receive notification of the end of an element.
 void endEntity(String name)
          Report the end of an entity.
 void endPrefixMapping(String prefix)
          End the scope of a prefix-URI mapping.
 void externalEntityDecl(String name, String publicId, String systemId)
          Report a parsed external entity declaration.
 boolean getDisableNamespaces()
          Returns true if namespace conformance is not checked as the DOM tree is built.
 XmlDocument getDocument()
          Return the result of parsing, after a SAX parser has used this as a content handler during parsing.
 ElementFactory getElementFactory()
          Deprecated.  
 Locale getLocale()
          Returns the locale to be used for diagnostic messages by this builder, and by documents it produces.
(package private)  String getMessage(String messageId)
           
(package private)  String getMessage(String messageId, Object[] parameters)
           
 void ignorableWhitespace(char[] buf, int offset, int len)
          Receive notification of ignorable whitespace in element content.
 void internalEntityDecl(String name, String value)
          Report an internal entity declaration.
 boolean isIgnoringLexicalInfo()
          Returns true if certain lexical information is automatically discarded when a DOM tree is built, producing smaller parse trees that are easier to use.
 void notationDecl(String n, String p, String s)
          Receive notification of a notation declaration event.
 void processingInstruction(String name, String instruction)
          Receive notification of a processing instruction.
 void setDisableNamespaces(boolean value)
          Controls whether namespace conformance is checked during DOM tree construction, or (the default) not.
 void setDocumentLocator(org.xml.sax.Locator locator)
          Receive an object for locating the origin of SAX document events.
 void setElementFactory(ElementFactory factory)
          Deprecated.  
 void setExpandEntityReferences(boolean value)
          Internal API used by JAXP implementation.
 void setIgnoreComments(boolean value)
          Internal API used by JAXP implementation.
 void setIgnoreWhitespace(boolean value)
          Internal API used by JAXP implementation.
 void setIgnoringLexicalInfo(boolean value)
          Controls whether certain lexical information is discarded.
 void setLocale(Locale locale)
          Assigns the locale to be used for diagnostic messages.
 void setPutCDATAIntoText(boolean value)
          Internal API used by JAXP implementation.
 void skippedEntity(String name)
          Receive notification of a skipped entity.
 void startCDATA()
          Report the start of a CDATA section.
 void startDocument()
          Receive notification of the beginning of a document.
 void startDTD(String name, String publicId, String systemId)
          Report the start of DTD declarations, if any.
 void startElement(String namespaceURI, String localName, String qName, org.xml.sax.Attributes attributes)
          Receive notification of the beginning of an element.
 void startEntity(String name)
          Report the beginning of an entity in content.
 void startPrefixMapping(String prefix, String uri)
          Begin the scope of a prefix-URI Namespace mapping.
 void unparsedEntityDecl(String name, String publicId, String systemId, String notation)
          Receive notification of an unparsed entity declaration event.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

document

protected XmlDocument document

locator

protected org.xml.sax.Locator locator

locale

private Locale locale

factory

private ElementFactory factory

attrTmp

private Vector attrTmp

elementStack

protected ParentNode[] elementStack

topOfStack

protected int topOfStack

inDTD

private boolean inDTD

inCDataSection

private boolean inCDataSection

doctype

private Doctype doctype

disableNamespaces

private boolean disableNamespaces

ignoreWhitespace

private boolean ignoreWhitespace

expandEntityRefs

private boolean expandEntityRefs

ignoreComments

private boolean ignoreComments

putCDATAIntoText

private boolean putCDATAIntoText

Constructor Detail

XmlDocumentBuilder

public XmlDocumentBuilder()
Default constructor is for use in conjunction with a SAX2 parser.

Method Detail

isIgnoringLexicalInfo

public boolean isIgnoringLexicalInfo()
Returns true if certain lexical information is automatically discarded when a DOM tree is built, producing smaller parse trees that are easier to use. Obsolete: retained for backwards compatibility.


setIgnoringLexicalInfo

public void setIgnoringLexicalInfo(boolean value)
Controls whether certain lexical information is discarded.

That information includes whitespace in element content which is ignorable (note that some nonvalidating XML parsers will not report that information); all comments; which text is found in CDATA sections; and boundaries of entity references.

"Ignorable whitespace" as reported by parsers is whitespace used to format XML markup. That is, all whitespace except that in "mixed" or ANY content models is ignorable. When it is discarded, pretty-printing may be necessary to make the document readable by humans again.

Whitespace inside "mixed" and ANY content models needs different treatment, since it could be part of the document content. In such cases XML defines an xml:space attribute which applications should use to determine whether whitespace must be preserved (value of the attribute is preserve) or whether default behavior (such as eliminating leading and trailing space, and normalizing consecutive internal whitespace to a single space) is allowed.

Parameters:
value - true indicates that such lexical information should be discarded during parsing. Obsolete: retained for backwards compatibility.
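
The xml:space handling described above is left to the application. A minimal sketch of one way to apply it to a DOM tree follows; the class XmlSpaceSketch and its methods are hypothetical, and the "default" treatment shown (trim, then collapse whitespace runs) is only the example behavior mentioned in the text, not something mandated by XML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper showing application-level use of xml:space. */
class XmlSpaceSketch {
    /** True if the nearest xml:space declaration on the element or an ancestor is "preserve". */
    static boolean preservesWhitespace(Element element) {
        for (Node n = element; n instanceof Element; n = n.getParentNode()) {
            String value = ((Element) n).getAttribute("xml:space");
            if ("preserve".equals(value)) {
                return true;
            }
            if ("default".equals(value)) {
                return false;
            }
        }
        return false; // no declaration in scope: default handling is allowed
    }

    /** One possible default treatment: drop leading/trailing space, collapse internal runs. */
    static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    /** Returns the element's text, normalized unless xml:space="preserve" is in scope. */
    static String textOf(Element element) {
        String text = element.getTextContent();
        return preservesWhitespace(element) ? text : normalize(text);
    }

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
                "<doc><p>  a   b  </p><pre xml:space=\"preserve\">  a   b  </pre></doc>");
        Element p = (Element) doc.getElementsByTagName("p").item(0);
        Element pre = (Element) doc.getElementsByTagName("pre").item(0);
        System.out.println("[" + textOf(p) + "]");
        System.out.println("[" + textOf(pre) + "]");
    }
}
```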

setIgnoreWhitespace

public void setIgnoreWhitespace(boolean value)
Internal API used by JAXP implementation. Access is set to "public" to enable inter-package access. Use JAXP DocumentBuilderFactory class to access this functionality.


setExpandEntityReferences

public void setExpandEntityReferences(boolean value)
Internal API used by JAXP implementation. Access is set to "public" to enable inter-package access. Use JAXP DocumentBuilderFactory class to access this functionality.


setIgnoreComments

public void setIgnoreComments(boolean value)
Internal API used by JAXP implementation. Access is set to "public" to enable inter-package access. Use JAXP DocumentBuilderFactory class to access this functionality.


setPutCDATAIntoText

public void setPutCDATAIntoText(boolean value)
Internal API used by JAXP implementation. Access is set to "public" to enable inter-package access. Use JAXP DocumentBuilderFactory class to access this functionality.
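
The four setters above point to the JAXP DocumentBuilderFactory as the supported entry point. The sketch below shows the standard factory options that appear to correspond to them; that mapping (setIgnoringElementContentWhitespace, setExpandEntityReferences, setIgnoringComments, setCoalescing) is an assumption drawn from the method names, and the class JaxpOptionsSketch is hypothetical. It runs against whichever JAXP parser the JDK provides, not necessarily Crimson.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical example of configuring a DOM parse through JAXP only. */
class JaxpOptionsSketch {
    static Document parseCompact(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setIgnoringComments(true);        // drop comment nodes
        factory.setCoalescing(true);              // fold CDATA sections into adjacent text
        factory.setExpandEntityReferences(true);  // replace entity references with their content
        // Depends on a DTD content model, so it has no effect on the DTD-less input used below.
        factory.setIgnoringElementContentWhitespace(true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** True if any direct child of parent has the given DOM node type. */
    static boolean hasChildOfType(Node parent, short type) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == type) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parseCompact("<a><!-- note -->x<![CDATA[y]]></a>");
        Node root = doc.getDocumentElement();
        System.out.println("text: " + root.getTextContent());
        System.out.println("has comment: " + hasChildOfType(root, Node.COMMENT_NODE));
        System.out.println("has CDATA: " + hasChildOfType(root, Node.CDATA_SECTION_NODE));
    }
}
```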


getDisableNamespaces

public boolean getDisableNamespaces()
Returns true if namespace conformance is not checked as the DOM tree is built.


setDisableNamespaces

public void setDisableNamespaces(boolean value)
Controls whether namespace conformance is checked during DOM tree construction, or (the default) not. In this framework, the DOM Builder is responsible for enforcing all namespace constraints. When enabled, this makes constructing a DOM tree slightly slower. (However, at this time it can't enforce the requirement that parameter entity names not contain colons.)


getDocument

public XmlDocument getDocument()
Return the result of parsing, after a SAX parser has used this as a content handler during parsing.


getLocale

public Locale getLocale()
Returns the locale to be used for diagnostic messages by this builder, and by documents it produces. This uses the locale of any associated parser.


setLocale

public void setLocale(Locale locale)
               throws org.xml.sax.SAXException
Assigns the locale to be used for diagnostic messages. Multi-language applications, such as web servers dealing with clients from different locales, need the ability to interact with clients in languages other than the server's default.

When an XmlDocument is created, its locale is the default locale for the virtual machine. If a parser was recorded, the locale will be associated with that parser.

Throws:
org.xml.sax.SAXException
See Also:
chooseLocale(java.lang.String[])

chooseLocale

public Locale chooseLocale(String[] languages)
                    throws org.xml.sax.SAXException
Chooses a client locale to use for diagnostics, using the first language specified in the list that is supported by this builder. That locale is then automatically assigned using setLocale(). Such a list could be provided by a variety of user preference mechanisms, including the HTTP Accept-Language header field.

Parameters:
languages - Array of language specifiers, ordered with the most preferable one at the front. For example, "en-ca" then "fr-ca", followed by "zh_CN". Both RFC 1766 and Java styles are supported.
Returns:
The chosen locale, or null.
Throws:
org.xml.sax.SAXException
See Also:
MessageCatalog
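
The selection rule described for chooseLocale can be sketched as follows. This is an illustration only: the class LocaleChoiceSketch and its set of supported languages are hypothetical, whereas the real builder presumably consults its message catalog to decide what is supported. Both "en-ca" (RFC 1766 style) and "zh_CN" (Java style) specifiers are accepted by normalizing the separator first.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration of first-supported-language selection. */
class LocaleChoiceSketch {
    // Assumed stand-in for "languages this builder has diagnostics for".
    private final Set<String> supportedLanguages;

    LocaleChoiceSketch(String... supportedLanguages) {
        this.supportedLanguages = new HashSet<>(Arrays.asList(supportedLanguages));
    }

    /** Converts "en-ca" or "zh_CN" style specifiers to a Locale. */
    static Locale toLocale(String specifier) {
        return Locale.forLanguageTag(specifier.replace('_', '-'));
    }

    /** Returns the first locale, in preference order, whose language is supported, or null. */
    Locale chooseLocale(String[] languages) {
        for (String specifier : languages) {
            Locale candidate = toLocale(specifier);
            if (supportedLanguages.contains(candidate.getLanguage())) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LocaleChoiceSketch chooser = new LocaleChoiceSketch("fr", "zh");
        Locale chosen = chooser.chooseLocale(new String[] {"en-ca", "fr-ca", "zh_CN"});
        System.out.println(chosen);
    }
}
```

A list taken from an HTTP Accept-Language header would first need its quality values stripped and its entries sorted by preference before being passed in.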

getMessage

String getMessage(String messageId)

getMessage

String getMessage(String messageId,
                  Object[] parameters)

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
Receive an object for locating the origin of SAX document events.

Specified by:
setDocumentLocator in interface org.xml.sax.ContentHandler

createDocument

public XmlDocument createDocument()
This is a factory method, used to create an XmlDocument. Subclasses may override this method, for example to provide document classes with particular behaviors, or to provide particular factory behaviors (such as returning elements that support the HTML DOM methods, if they have the right name and are in the right namespace).


setElementFactory

public final void setElementFactory(ElementFactory factory)
Deprecated.  

Assigns the factory to be associated with documents produced by this builder.


getElementFactory

public final ElementFactory getElementFactory()
Deprecated.  

Returns the factory to be associated with documents produced by this builder.


startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
Receive notification of the beginning of a document.

Specified by:
startDocument in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
Receive notification of the end of a document.

Specified by:
endDocument in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

startPrefixMapping

public void startPrefixMapping(String prefix,
                               String uri)
                        throws org.xml.sax.SAXException
Begin the scope of a prefix-URI Namespace mapping.

Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

endPrefixMapping

public void endPrefixMapping(String prefix)
                      throws org.xml.sax.SAXException
End the scope of a prefix-URI mapping.

Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

startElement

public void startElement(String namespaceURI,
                         String localName,
                         String qName,
                         org.xml.sax.Attributes attributes)
                  throws org.xml.sax.SAXException
Receive notification of the beginning of an element.

Specified by:
startElement in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

endElement

public void endElement(String namespaceURI,
                       String localName,
                       String qName)
                throws org.xml.sax.SAXException
Receive notification of the end of an element.

Specified by:
endElement in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

characters

public void characters(char[] buf,
                       int offset,
                       int len)
                throws org.xml.sax.SAXException
Receive notification of character data.

Specified by:
characters in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

ignorableWhitespace

public void ignorableWhitespace(char[] buf,
                                int offset,
                                int len)
                         throws org.xml.sax.SAXException
Receive notification of ignorable whitespace in element content. If lexical information is not ignored, the whitespace reported here is recorded in a DOM text (or CDATA, as appropriate) node.

Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Parameters:
buf - holds text characters
offset - initial index of characters in buf
len - how many characters are being passed
Throws:
org.xml.sax.SAXException - as appropriate

processingInstruction

public void processingInstruction(String name,
                                  String instruction)
                           throws org.xml.sax.SAXException
Receive notification of a processing instruction.

Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

skippedEntity

public void skippedEntity(String name)
                   throws org.xml.sax.SAXException
Receive notification of a skipped entity.

Specified by:
skippedEntity in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

startDTD

public void startDTD(String name,
                     String publicId,
                     String systemId)
              throws org.xml.sax.SAXException
Report the start of DTD declarations, if any.

Specified by:
startDTD in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

endDTD

public void endDTD()
            throws org.xml.sax.SAXException
Report the end of DTD declarations.

Specified by:
endDTD in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

startEntity

public void startEntity(String name)
                 throws org.xml.sax.SAXException
Report the beginning of an entity in content.

Specified by:
startEntity in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

endEntity

public void endEntity(String name)
               throws org.xml.sax.SAXException
Report the end of an entity.

Specified by:
endEntity in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

startCDATA

public void startCDATA()
                throws org.xml.sax.SAXException
Report the start of a CDATA section.

If this builder is set to record lexical information then this callback arranges that character data (and ignorable whitespace) be recorded as part of a CDATA section, until the matching endCDATA method is called.

Specified by:
startCDATA in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

endCDATA

public void endCDATA()
              throws org.xml.sax.SAXException
Report the end of a CDATA section.

Specified by:
endCDATA in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException
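
The startCDATA/endCDATA bracketing described above can be observed with any SAX2 parser that supports the standard lexical-handler property. The sketch below is a hypothetical handler (CdataTrackingSketch), not the builder's own code; it only separates character data by whether it arrived inside a CDATA section, which is the state a builder would need in order to choose between a text node and a CDATA node.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

/** Hypothetical handler that tracks CDATA boundaries via LexicalHandler callbacks. */
class CdataTrackingSketch extends DefaultHandler2 {
    private boolean inCDataSection;
    final StringBuilder plainText = new StringBuilder();
    final StringBuilder cdataText = new StringBuilder();

    @Override
    public void startCDATA() {
        inCDataSection = true;
    }

    @Override
    public void endCDATA() {
        inCDataSection = false;
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // Route the characters according to the state set by the lexical callbacks.
        (inCDataSection ? cdataText : plainText).append(buf, offset, len);
    }

    static CdataTrackingSketch parse(String xml) throws Exception {
        CdataTrackingSketch handler = new CdataTrackingSketch();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        // Lexical events are delivered only if the handler is registered through this property.
        reader.setProperty("http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler;
    }

    public static void main(String[] args) throws Exception {
        CdataTrackingSketch result = parse("<a>x<![CDATA[<y>]]>z</a>");
        System.out.println("plain: " + result.plainText);
        System.out.println("cdata: " + result.cdataText);
    }
}
```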

comment

public void comment(char[] ch,
                    int start,
                    int length)
             throws org.xml.sax.SAXException
Report an XML comment anywhere in the document.

Specified by:
comment in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

elementDecl

public void elementDecl(String name,
                        String model)
                 throws org.xml.sax.SAXException
Report an element type declaration.

Specified by:
elementDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

attributeDecl

public void attributeDecl(String eName,
                          String aName,
                          String type,
                          String valueDefault,
                          String value)
                   throws org.xml.sax.SAXException
Report an attribute type declaration.

Specified by:
attributeDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

internalEntityDecl

public void internalEntityDecl(String name,
                               String value)
                        throws org.xml.sax.SAXException
Report an internal entity declaration.

Specified by:
internalEntityDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

externalEntityDecl

public void externalEntityDecl(String name,
                               String publicId,
                               String systemId)
                        throws org.xml.sax.SAXException
Report a parsed external entity declaration.

Specified by:
externalEntityDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

notationDecl

public void notationDecl(String n,
                         String p,
                         String s)
                  throws org.xml.sax.SAXException
Receive notification of a notation declaration event.

Specified by:
notationDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException

unparsedEntityDecl

public void unparsedEntityDecl(String name,
                               String publicId,
                               String systemId,
                               String notation)
                        throws org.xml.sax.SAXException
Receive notification of an unparsed entity declaration event.

Specified by:
unparsedEntityDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException