java.lang.Object
  org.apache.xml.dtm.ref.DTMDefaultBase
    org.apache.xml.dtm.ref.DTMDefaultBaseTraversers
      org.apache.xml.dtm.ref.DTMDefaultBaseIterators
        org.apache.xml.dtm.ref.dom2dtm.DOM2DTM
The DOM2DTM class serves up a DOM's contents via the DTM API.
Note that it doesn't necessarily represent a full Document tree. You can wrap a DOM2DTM around a specific node and its subtree, and the DTM will then cover just that subtree. (I don't _think_ we currently support DocumentFragment nodes as roots, though that might be worth considering.)
Note too that we do not currently attempt to track document mutation. If you alter the DOM after wrapping DOM2DTM around it, the DTM's behavior is undefined.
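To illustrate why later DOM edits can go unseen, here is a minimal sketch that uses only the JDK DOM API. It is not DOM2DTM's code: the class SubtreeSnapshot and its methods are hypothetical, and it records the whole subtree eagerly, whereas DOM2DTM builds its tables incrementally (see m_nodesAreProcessed), so the real effect of a mutation depends on how far that build has progressed.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical illustration; not part of Xalan or the DTM API.
final class SubtreeSnapshot {
    private final List<Node> nodes = new ArrayList<>();

    // Record root and all of its descendants once, in document order.
    SubtreeSnapshot(Node root) {
        collect(root);
    }

    private void collect(Node n) {
        nodes.add(n);
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c);
        }
    }

    // Number of nodes seen when the snapshot was taken.
    int size() {
        return nodes.size();
    }

    // Build a document holding a single <root/> element.
    static Document newDocument() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            doc.appendChild(doc.createElement("root"));
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A node appended to the DOM after the snapshot is built does not change size(); only a fresh snapshot sees it.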
Nested Class Summary | |
static interface | DOM2DTM.CharacterNodeHandler |
Nested classes inherited from class org.apache.xml.dtm.ref.DTMDefaultBaseIterators |
|
Nested classes inherited from class org.apache.xml.dtm.ref.DTMDefaultBaseTraversers |
|
Field Summary | |
(package private) static boolean | JJK_DEBUG |
(package private) static boolean | JJK_NEWCODE |
private int | m_last_kid | The current position in the DTM tree.
private int | m_last_parent | The current position in the DTM tree.
protected Vector | m_nodes | The node objects.
private boolean | m_nodesAreProcessed | true if ALL the nodes in the m_root subtree have been processed; false if our incremental build has not yet finished scanning the DOM tree.
private org.w3c.dom.Node | m_pos | The current position in the DOM tree.
(package private) boolean | m_processedFirstElement | True iff the first element has been processed.
private org.w3c.dom.Node | m_root | The top of the subtree.
(package private) TreeWalker | m_walker |
(package private) static String | NAMESPACE_DECL_NS | Manifest constant
Fields inherited from class org.apache.xml.dtm.ref.DTMDefaultBase |
m_blocksize, m_documentBaseURI, m_dtmIdent, m_elemIndexes, m_expandedNameTable, m_exptype, m_firstch, m_indexing, m_initialblocksize, m_mgr, m_mgrDefault, m_namespaceDeclSetElements, m_namespaceDeclSets, m_nextsib, m_parent, m_prevsib, m_shouldStripWhitespaceStack, m_shouldStripWS, m_size, m_traversers, m_wsfilter, m_xstrf, NOTPROCESSED |
Fields inherited from interface org.apache.xml.dtm.DTM |
ATTRIBUTE_NODE, CDATA_SECTION_NODE, COMMENT_NODE, DOCUMENT_FRAGMENT_NODE, DOCUMENT_NODE, DOCUMENT_TYPE_NODE, ELEMENT_NODE, ENTITY_NODE, ENTITY_REFERENCE_NODE, NAMESPACE_NODE, NOTATION_NODE, NTYPES, NULL, PROCESSING_INSTRUCTION_NODE, TEXT_NODE |
Constructor Summary | |
DOM2DTM(DTMManager mgr, javax.xml.transform.dom.DOMSource domSource, int dtmIdentity, DTMWSFilter whiteSpaceFilter, XMLStringFactory xstringfactory, boolean doIndexing) | Construct a DOM2DTM object from a DOM node.
Method Summary | |
protected int | addNode(org.w3c.dom.Node node, int parentIndex, int previousSibling, int forceNodeType) | Construct the node map from the node.
void | dispatchCharactersEvents(int nodeHandle, org.xml.sax.ContentHandler ch, boolean normalize) | Directly call the characters method on the passed ContentHandler for the string-value of the given node (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
protected static void | dispatchNodeData(org.w3c.dom.Node node, org.xml.sax.ContentHandler ch, int depth) | Retrieve the text content of a DOM subtree, dispatching it to the supplied ContentHandler as characters events.
void | dispatchToEvents(int nodeHandle, org.xml.sax.ContentHandler ch) | Directly create SAX parser events from a subtree.
int | getAttributeNode(int nodeHandle, String namespaceURI, String name) | Retrieves an attribute node by qualified name and namespace URI.
org.xml.sax.ContentHandler | getContentHandler() | getContentHandler returns "our SAX builder" -- the thing that someone else should send SAX events to in order to extend this DTM model.
org.xml.sax.ext.DeclHandler | getDeclHandler() | Return this DTM's DeclHandler.
String | getDocumentTypeDeclarationPublicIdentifier() | Return the public identifier of the external subset, normalized as described in 4.2.2 External Entities [XML].
String | getDocumentTypeDeclarationSystemIdentifier() | Return the system identifier of the external subset, if it exists.
org.xml.sax.DTDHandler | getDTDHandler() | Return this DTM's DTDHandler.
int | getElementById(String elementId) | Returns the Element whose ID is given by elementId.
org.xml.sax.EntityResolver | getEntityResolver() | Return this DTM's EntityResolver.
org.xml.sax.ErrorHandler | getErrorHandler() | Return this DTM's ErrorHandler.
private int | getHandleFromNode(org.w3c.dom.Node node) | Get the handle from a Node.
int | getHandleOfNode(org.w3c.dom.Node node) | Get the handle from a Node.
org.xml.sax.ext.LexicalHandler | getLexicalHandler() | Return this DTM's lexical handler.
String | getLocalName(int nodeHandle) | Given a node handle, return its XPath-style localname.
String | getNamespaceURI(int nodeHandle) | Given a node handle, return its DOM-style namespace URI (as defined in Namespaces, this is the declared URI which this node's prefix -- or default in lieu thereof -- was mapped to).
protected int | getNextNodeIdentity(int identity) | Get the next node identity value in the list, and call the iterator if it hasn't been added yet.
org.w3c.dom.Node | getNode(int nodeHandle) | Return a DOM node for the given node handle.
protected static void | getNodeData(org.w3c.dom.Node node, FastStringBuffer buf) | Retrieve the text content of a DOM subtree, appending it into a user-supplied FastStringBuffer object.
String | getNodeName(int nodeHandle) | Given a node handle, return its DOM-style node name.
String | getNodeNameX(int nodeHandle) | Given a node handle, return the XPath node name.
String | getNodeValue(int nodeHandle) | Given a node handle, return its node value.
protected int | getNumberOfNodes() | Get the number of nodes that have been added.
String | getPrefix(int nodeHandle) | Given a namespace handle, return the prefix that the namespace decl is mapping.
javax.xml.transform.SourceLocator | getSourceLocatorFor(int node) | No source information is available for DOM2DTM, so return null here.
XMLString | getStringValue(int nodeHandle) | Get the string-value of a node as a String object (see http://www.w3.org/TR/xpath#data-model for the definition of a node's string-value).
String | getUnparsedEntityURI(String name) | The getUnparsedEntityURI function returns the URI of the unparsed entity with the specified name in the same document as the context node (see [3.3 Unparsed Entities]).
boolean | isAttributeSpecified(int attributeHandle) | Return true if the attribute was specified, or false if it was defaulted.
private static boolean | isSpace(char ch) | Returns whether the specified ch conforms to the XML 1.0 definition of whitespace.
private org.w3c.dom.Node | logicalNextDOMTextNode(org.w3c.dom.Node n) | Utility function: Given a DOM Text node, determine whether it is logically followed by another Text or CDATASection node.
protected org.w3c.dom.Node | lookupNode(int nodeIdentity) | Get a Node from an identity index.
boolean | needsTwoThreads() |
protected boolean | nextNode() | This method iterates to the next node that will be added to the table.
void | setIncrementalSAXSource(IncrementalSAXSource source) | Bind an IncrementalSAXSource to this DTM.
void | setProperty(String property, Object value) | For the moment all the run time properties are ignored by this class.
Methods inherited from class org.apache.xml.dtm.ref.DTMDefaultBaseIterators |
getAxisIterator, getTypedAxisIterator |
Methods inherited from class org.apache.xml.dtm.ref.DTMDefaultBaseTraversers |
getAxisTraverser |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
static final boolean JJK_DEBUG
static final boolean JJK_NEWCODE
static final String NAMESPACE_DECL_NS
private transient org.w3c.dom.Node m_pos
private int m_last_parent
private int m_last_kid
private transient org.w3c.dom.Node m_root
boolean m_processedFirstElement
private transient boolean m_nodesAreProcessed
protected Vector m_nodes
TreeWalker m_walker
Constructor Detail |
public DOM2DTM(DTMManager mgr, javax.xml.transform.dom.DOMSource domSource, int dtmIdentity, DTMWSFilter whiteSpaceFilter, XMLStringFactory xstringfactory, boolean doIndexing)
Parameters:
mgr - The DTMManager who owns this DTM.
domSource - The DOM source that this DTM will wrap.
dtmIdentity - The DTM identity ID for this DTM.
whiteSpaceFilter - The white space filter for this DTM, which may be null.
xstringfactory - XMLString factory for creating character content.
doIndexing - true if the caller considers it worth it to use indexing schemes.
Method Detail |
protected int addNode(org.w3c.dom.Node node, int parentIndex, int previousSibling, int forceNodeType)
Parameters:
node - The node that is to be added to the DTM.
parentIndex - The current parent index.
previousSibling - The previous sibling index.
forceNodeType - If not DTM.NULL, overrides the DOM node type. Used to force nodes to Text rather than CDATASection when their coalesced value includes ordinary Text nodes (current DTM behavior).
protected int getNumberOfNodes()
getNumberOfNodes in class DTMDefaultBase
protected boolean nextNode()
nextNode in class DTMDefaultBase
public org.w3c.dom.Node getNode(int nodeHandle)
getNode in interface DTM
getNode in class DTMDefaultBase
Parameters:
nodeHandle - The node ID.
protected org.w3c.dom.Node lookupNode(int nodeIdentity)
protected int getNextNodeIdentity(int identity)
getNextNodeIdentity in class DTMDefaultBase
Parameters:
identity - The node identity (index).
private int getHandleFromNode(org.w3c.dom.Node node)
%OPT% This will be pretty slow.
%OPT% An XPath-like search (walk up DOM to root, tracking path; walk down DTM reconstructing path) might be considerably faster on later nodes in large documents. That might also imply improving this call to handle nodes which would be in this DTM but have not yet been built, which might or might not be a Good Thing.
%REVIEW% This relies on being able to test node-identity via object-identity. DTM2DOM proxying is a great example of a case where that doesn't work. DOM Level 3 will provide the isSameNode() method to fix that, but until then this is going to be flaky.
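The XPath-like search suggested in the %OPT% note can be sketched with the JDK DOM API alone. This only illustrates the idea and is not DTM code: NodePath, pathFromRoot, and follow are hypothetical names, and the sketch replays the path on a second DOM tree rather than on DTM tables.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.w3c.dom.Node;

// Hypothetical illustration of "walk up tracking the path, walk down replaying it".
final class NodePath {
    // Child indexes leading from root down to target; null if target is not under root.
    static Deque<Integer> pathFromRoot(Node root, Node target) {
        Deque<Integer> path = new ArrayDeque<>();
        Node n = target;
        while (n != null && n != root) {
            int index = 0;
            for (Node s = n.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
                index++;
            }
            path.addFirst(index);   // prepend, so the path reads top-down
            n = n.getParentNode();
        }
        return n == root ? path : null;
    }

    // Replay a path of child indexes starting at root; null if the path does not fit.
    static Node follow(Node root, Iterable<Integer> path) {
        Node n = root;
        for (int index : path) {
            n = n.getFirstChild();
            for (int i = 0; i < index && n != null; i++) {
                n = n.getNextSibling();
            }
            if (n == null) {
                return null;
            }
        }
        return n;
    }
}
```

The cost is proportional to the node's depth plus the number of preceding siblings at each level, rather than to the number of nodes already built.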
Parameters:
node - A node, which may be null.
Returns:
The node handle, or DTM.NULL.
public int getHandleOfNode(org.w3c.dom.Node node)
%OPT% This will be pretty slow.
%REVIEW% This relies on being able to test node-identity via object-identity. DTM2DOM proxying is a great example of a case where that doesn't work. DOM Level 3 will provide the isSameNode() method to fix that, but until then this is going to be flaky.
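The two concerns in these notes, a linear scan and reliance on object identity, can be made concrete with a small sketch. It is a hypothetical illustration, not the DOM2DTM implementation: HandleLookup and its methods are invented names, the list stands in for m_nodes, and the returned value is a plain list index rather than a real DTM handle.

```java
import java.util.List;
import org.w3c.dom.Node;

// Hypothetical illustration; NOT_FOUND stands in for DTM.NULL.
final class HandleLookup {
    static final int NOT_FOUND = -1;

    // Linear scan comparing by object identity (==).
    static int indexByIdentity(List<Node> nodes, Node target) {
        for (int i = 0; target != null && i < nodes.size(); i++) {
            if (nodes.get(i) == target) {
                return i;
            }
        }
        return NOT_FOUND;
    }

    // The same scan using DOM Level 3 isSameNode(), which lets the DOM
    // implementation decide whether two objects denote the same node.
    static int indexBySameNode(List<Node> nodes, Node target) {
        for (int i = 0; target != null && i < nodes.size(); i++) {
            if (nodes.get(i).isSameNode(target)) {
                return i;
            }
        }
        return NOT_FOUND;
    }
}
```

Both variants are O(n) in the number of recorded nodes; only the second can match when a DOM hands out proxy objects, and only if that DOM implements isSameNode() accordingly.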
Parameters:
node - A node, which may be null.
Returns:
The node handle, or DTM.NULL.
public int getAttributeNode(int nodeHandle, String namespaceURI, String name)
getAttributeNode in interface DTM
getAttributeNode in class DTMDefaultBase
Parameters:
nodeHandle - Handle of the node upon which to look up this attribute.
namespaceURI - The namespace URI of the attribute to retrieve, or null.
name - The local name of the attribute to retrieve.
Returns:
The attribute node handle with the specified name (nodeName), or DTM.NULL if there is no such attribute.
public XMLString getStringValue(int nodeHandle)
getStringValue in interface DTM
getStringValue in class DTMDefaultBase
Parameters:
nodeHandle - The node ID.
protected static void getNodeData(org.w3c.dom.Node node, FastStringBuffer buf)
There are open questions regarding whitespace stripping. Currently we make no special effort in that regard, since the standard DOM doesn't yet provide DTD-based information to distinguish whitespace-in-element-context from genuine #PCDATA. Note that we should probably also consider xml:space if/when we address this. DOM Level 3 may solve the problem for us.
%REVIEW% Actually, since this method operates on the DOM side of the fence rather than the DTM side, it SHOULDN'T do any special handling. The DOM does what the DOM does; if you want DTM-level abstractions, use DTM-level methods.
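A minimal sketch of this kind of DOM-side text gathering follows, assuming a plain StringBuilder in place of Xalan's FastStringBuffer. TextGatherer and appendText are hypothetical names, and the handling of node types other than those shown is an assumption rather than a statement about what getNodeData does.

```java
import org.w3c.dom.Node;

// Hypothetical illustration; uses StringBuilder where Xalan uses FastStringBuffer.
final class TextGatherer {
    // Append the contents of every Text and CDATASection node under node,
    // in document order, with no whitespace stripping.
    static void appendText(Node node, StringBuilder buf) {
        switch (node.getNodeType()) {
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                buf.append(node.getNodeValue());
                break;
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
            case Node.ELEMENT_NODE:
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                    appendText(c, buf);
                }
                break;
            default:
                break; // comments, processing instructions, etc. add nothing in this sketch
        }
    }
}
```

Because the walk reads the DOM directly, whitespace-only text nodes are appended exactly as the DOM reports them.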
Parameters:
node - Node whose subtree is to be walked, gathering the contents of all Text or CDATASection nodes.
buf - FastStringBuffer into which the contents of the text nodes are to be concatenated.
public String getNodeName(int nodeHandle)
getNodeName in interface DTM
getNodeName in class DTMDefaultBase
Parameters:
nodeHandle - the id of the node.
public String getNodeNameX(int nodeHandle)
getNodeNameX in interface DTM
getNodeNameX in class DTMDefaultBase
Parameters:
nodeHandle - the id of the node.
public String getLocalName(int nodeHandle)
getLocalName in interface DTM
getLocalName in class DTMDefaultBase
Parameters:
nodeHandle - the id of the node.
public String getPrefix(int nodeHandle)
%REVIEW% Are you sure you want "" for no prefix?
%REVIEW-COMMENT% I think so... not totally sure. -sb
getPrefix in interface DTM
getPrefix in class DTMDefaultBase
Parameters:
nodeHandle - the id of the node.
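The %REVIEW% exchange above is about returning "" rather than null when a node has no prefix (the DOM's own Node.getPrefix() returns null in that case). One way to get the empty-string convention is to split the qualified name, as in the sketch below. QNames and its methods are hypothetical helpers, and the sketch makes no claim about how getPrefix is actually implemented.

```java
// Hypothetical helpers; not part of the DTM API.
final class QNames {
    // Prefix part of a qualified name such as "xsl:template"; "" if there is no prefix.
    static String prefixOf(String qname) {
        int colon = qname.indexOf(':');
        return colon < 0 ? "" : qname.substring(0, colon);
    }

    // Local part of a qualified name; the whole name if there is no prefix.
    static String localNameOf(String qname) {
        int colon = qname.indexOf(':');
        return colon < 0 ? qname : qname.substring(colon + 1);
    }
}
```

With this convention callers can compare prefixes with equals() without a null check, at the cost of not distinguishing "no prefix" from an empty one.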
public String getNamespaceURI(int nodeHandle)
%REVIEW% Null or ""? -sb
getNamespaceURI in interface DTM
getNamespaceURI in class DTMDefaultBase
Parameters:
nodeHandle - the id of the node.
private org.w3c.dom.Node logicalNextDOMTextNode(org.w3c.dom.Node n)
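The method summary describes this as deciding whether a DOM Text node is logically followed by another Text or CDATASection node. A simplified sketch of that idea follows. AdjacentText and its methods are hypothetical, and the sketch only looks at immediate siblings, ignoring complications such as entity reference nodes.

```java
import org.w3c.dom.Node;

// Hypothetical, simplified illustration; not the DOM2DTM implementation.
final class AdjacentText {
    private static boolean isTextual(Node n) {
        return n != null && (n.getNodeType() == Node.TEXT_NODE
                || n.getNodeType() == Node.CDATA_SECTION_NODE);
    }

    // The Text or CDATASection node immediately following n, or null if there is none.
    static Node nextTextSibling(Node n) {
        Node next = n.getNextSibling();
        return isTextual(next) ? next : null;
    }

    // Concatenate the run of adjacent Text/CDATASection nodes that starts at first.
    static String coalesce(Node first) {
        StringBuilder sb = new StringBuilder();
        for (Node n = first; n != null; n = nextTextSibling(n)) {
            sb.append(n.getNodeValue());
        }
        return sb.toString();
    }
}
```

Coalescing a run like this is what lets several adjacent DOM text nodes be presented as one text node, as the XPath data model expects.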
public String getNodeValue(int nodeHandle)
getNodeValue in interface DTM
getNodeValue in class DTMDefaultBase
Parameters:
nodeHandle - The node id.
public String getDocumentTypeDeclarationSystemIdentifier()
getDocumentTypeDeclarationSystemIdentifier in interface DTM
getDocumentTypeDeclarationSystemIdentifier in class DTMDefaultBase
public String getDocumentTypeDeclarationPublicIdentifier()
getDocumentTypeDeclarationPublicIdentifier in interface DTM
getDocumentTypeDeclarationPublicIdentifier in class DTMDefaultBase
public int getElementById(String elementId)
Returns the Element whose ID is given by elementId. If no such element exists, returns DTM.NULL. Behavior is not defined if more than one element has this ID. Attributes (including those with the name "ID") are not of type ID unless so defined by DTD/Schema information available to the DTM implementation. Implementations that do not know whether attributes are of type ID or not are expected to return DTM.NULL.
%REVIEW% Presumably IDs are still scoped to a single document, and this operation searches only within a single document, right? Wouldn't want collisions between DTMs in the same process.
getElementById in interface DTM
getElementById in class DTMDefaultBase
Parameters:
elementId - The unique id value for an element.
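The rule that an attribute named "id" is not automatically of type ID can be seen directly in the JDK's own DOM, which follows the same convention. The sketch below shows the DOM side only, not DOM2DTM: IdLookupDemo and parse are hypothetical names, and setIdAttribute is the DOM Level 3 way for an application to declare an ID without a DTD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; shows plain DOM behavior, not DOM2DTM itself.
final class IdLookupDemo {
    // Parse an XML string, wrapping checked exceptions for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<r><item id=\"x\"/></r>");
        Element item = (Element) doc.getDocumentElement().getFirstChild();
        // No DTD or schema declares "id" to be of type ID, so nothing is found.
        System.out.println(doc.getElementById("x") == null);
        // Once the application declares the attribute to be an ID, the lookup succeeds.
        item.setIdAttribute("id", true);
        System.out.println(doc.getElementById("x") == item);
    }
}
```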
public String getUnparsedEntityURI(String name)
XML processors may choose to use the System Identifier (if one is provided) to resolve the entity, rather than the URI in the Public Identifier. The details are dependent on the processor, and we would have to support some form of plug-in resolver to handle this properly. Currently, we simply return the System Identifier if present, and hope that it is a usable URI or that our caller can map it to one. TODO: Resolve Public Identifiers... or consider changing function name.
If we find a relative URI reference, XML expects it to be resolved in terms of the base URI of the document. The DOM doesn't do that for us, and it isn't entirely clear whether that should be done here; currently that's pushed up to a higher level of our application. (Note that DOM Level 1 didn't store the document's base URI.) TODO: Consider resolving Relative URIs.
(The DOM's statement that "An XML processor may choose to completely expand entities before the structure model is passed to the DOM" refers only to parsed entities, not unparsed, and hence doesn't affect this function.)
getUnparsedEntityURI in interface DTM
getUnparsedEntityURI in class DTMDefaultBase
Parameters:
name - A string containing the Entity Name of the unparsed entity.
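The "return the System Identifier if present" behavior described above can be approximated on a plain DOM as follows. This is a sketch under assumptions, not the DOM2DTM source: UnparsedEntities and systemIdOf are hypothetical names, returning "" for a missing entity is borrowed from XSLT's unparsed-entity-uri() function, and no relative-URI resolution is attempted.

```java
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.w3c.dom.Entity;
import org.w3c.dom.Node;

// Hypothetical illustration; not the DOM2DTM implementation.
final class UnparsedEntities {
    // System identifier of the named unparsed entity, or "" if there is none.
    static String systemIdOf(Document doc, String name) {
        DocumentType doctype = doc.getDoctype();
        if (doctype == null) {
            return "";
        }
        Node n = doctype.getEntities().getNamedItem(name);
        if (!(n instanceof Entity)) {
            return "";
        }
        Entity entity = (Entity) n;
        // Only unparsed entities carry a notation name; parsed ones are skipped.
        if (entity.getNotationName() == null) {
            return "";
        }
        String systemId = entity.getSystemId();
        return systemId == null ? "" : systemId;
    }
}
```

The value comes back as the DOM reports it, so a relative reference may still need resolving against the document's base URI by the caller, as the notes above point out.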
public boolean isAttributeSpecified(int attributeHandle)
isAttributeSpecified in interface DTM
isAttributeSpecified in class DTMDefaultBase
Parameters:
attributeHandle - The attribute handle in question.
Returns:
true if the attribute was specified; false if it was defaulted.
public void setIncrementalSAXSource(IncrementalSAXSource source)
Parameters:
source - The IncrementalSAXSource that we want to receive events from on demand.
public org.xml.sax.ContentHandler getContentHandler()
public org.xml.sax.ext.LexicalHandler getLexicalHandler()
public org.xml.sax.EntityResolver getEntityResolver()
public org.xml.sax.DTDHandler getDTDHandler()
public org.xml.sax.ErrorHandler getErrorHandler()
public org.xml.sax.ext.DeclHandler getDeclHandler()
public boolean needsTwoThreads()
private static boolean isSpace(char ch)
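Returns whether the specified ch conforms to the XML 1.0 definition of whitespace. Refer to the definition of S for details.
Parameters:
ch - Character to check as XML whitespace.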
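XML 1.0 defines S as one or more of the characters space (#x20), tab (#x9), carriage return (#xD), and line feed (#xA). A check against that definition could look like the sketch below. XmlChars is a hypothetical class name, and this is one straightforward way to write the test rather than a copy of the private method.

```java
// Hypothetical illustration of a check against the XML 1.0 production S.
final class XmlChars {
    // True if ch is space, tab, carriage return, or line feed.
    static boolean isSpace(char ch) {
        return ch == 0x20 || ch == 0x09 || ch == 0x0D || ch == 0x0A;
    }
}
```

This is narrower than Character.isWhitespace, which also accepts characters such as form feed that XML does not treat as whitespace.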
public void dispatchCharactersEvents(int nodeHandle, org.xml.sax.ContentHandler ch, boolean normalize) throws org.xml.sax.SAXException
dispatchCharactersEvents in interface DTM
dispatchCharactersEvents in class DTMDefaultBase
Parameters:
nodeHandle - The node ID.
ch - A non-null reference to a ContentHandler.
normalize - true if the content should be normalized according to the rules for the XPath normalize-space function.
Throws:
org.xml.sax.SAXException
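For reference, the XPath normalize-space function strips leading and trailing whitespace and replaces each internal run of whitespace characters with a single space. The sketch below shows that rule applied to a string. SpaceNormalizer is a hypothetical class, and it says nothing about how dispatchCharactersEvents performs the normalization internally.

```java
// Hypothetical illustration of the XPath normalize-space rule.
final class SpaceNormalizer {
    static String normalize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        boolean pendingSpace = false;
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            if (ch == ' ' || ch == '\t' || ch == '\r' || ch == '\n') {
                // Remember a separator only once some text has been emitted,
                // which drops leading whitespace.
                pendingSpace = out.length() > 0;
            } else {
                if (pendingSpace) {
                    out.append(' ');   // collapse the whole run to one space
                    pendingSpace = false;
                }
                out.append(ch);
            }
        }
        return out.toString();         // a trailing run is never emitted
    }
}
```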
protected static void dispatchNodeData(org.w3c.dom.Node node, org.xml.sax.ContentHandler ch, int depth) throws org.xml.sax.SAXException
There are open questions regarding whitespace stripping. Currently we make no special effort in that regard, since the standard DOM doesn't yet provide DTD-based information to distinguish whitespace-in-element-context from genuine #PCDATA. Note that we should probably also consider xml:space if/when we address this. DOM Level 3 may solve the problem for us.
%REVIEW% Note that as a DOM-level operation, it can be argued that this routine _shouldn't_ perform any processing beyond what the DOM already does, and that whitespace stripping and so on belong at the DTM level. If you want a stripped DOM view, wrap DTM2DOM around DOM2DTM.
Parameters:
node - Node whose subtree is to be walked, gathering the contents of all Text or CDATASection nodes.
Throws:
org.xml.sax.SAXException
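The signature suggests the text is handed to the ContentHandler rather than collected in a buffer. A minimal sketch of such a dispatch over a plain DOM subtree is shown here. TextDispatcher and dispatchText are hypothetical names, the depth parameter of the real method is omitted, and the choice of node types handled is an assumption.

```java
import org.w3c.dom.Node;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

// Hypothetical illustration; not the DOM2DTM implementation.
final class TextDispatcher {
    // Send the contents of every Text and CDATASection node under node
    // to ch as SAX characters events, in document order.
    static void dispatchText(Node node, ContentHandler ch) throws SAXException {
        switch (node.getNodeType()) {
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                char[] chars = node.getNodeValue().toCharArray();
                ch.characters(chars, 0, chars.length);
                break;
            }
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
            case Node.ELEMENT_NODE:
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                    dispatchText(c, ch);
                }
                break;
            default:
                break; // other node types contribute nothing in this sketch
        }
    }
}
```

As the %REVIEW% note argues, nothing here strips whitespace: the handler receives exactly what the DOM holds.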
public void dispatchToEvents(int nodeHandle, org.xml.sax.ContentHandler ch) throws org.xml.sax.SAXException
dispatchToEvents in interface DTM
dispatchToEvents in class DTMDefaultBase
Parameters:
nodeHandle - The node ID.
ch - A non-null reference to a ContentHandler.
Throws:
org.xml.sax.SAXException
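For comparison, the JDK offers a similar facility for plain DOM trees: an identity transform from a DOMSource to a SAXResult replays a subtree as SAX events. The sketch below uses only standard JAXP calls. SubtreeToSax and replay are hypothetical names, and this is an analogous technique, not how dispatchToEvents is implemented.

```java
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Node;
import org.xml.sax.ContentHandler;

// Hypothetical illustration using standard JAXP; not the DTM implementation.
final class SubtreeToSax {
    // Replay the subtree rooted at node as SAX events on handler.
    static void replay(Node node, ContentHandler handler) throws TransformerException {
        TransformerFactory.newInstance()
                .newTransformer() // no stylesheet, so this is an identity transform
                .transform(new DOMSource(node), new SAXResult(handler));
    }
}
```

Passing an element rather than the Document replays just that element's subtree, which mirrors wrapping a DTM around a specific node.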
public void setProperty(String property, Object value)
Parameters:
property - a String value
value - an Object value
public javax.xml.transform.SourceLocator getSourceLocatorFor(int node)
No source information is available for DOM2DTM, so return null here.
Parameters:
node - an int value