org.apache.crimson.parser
Class InputEntity

java.lang.Object
  extended by org.apache.crimson.parser.InputEntity
All Implemented Interfaces:
org.xml.sax.Locator

final class InputEntity
extends Object
implements org.xml.sax.Locator

This is how the parser talks to its input entities of all kinds. The entities are kept in a stack.

For internal entities, the character arrays are referenced here and read from as needed (they're read-only). External entities have mutable buffers that are read into as needed.

Note: This maps CRLF (and CR) to LF without regard for whether it's in an external (parsed) entity or not. The XML 1.0 spec is inconsistent in explaining EOL handling; this is the sensible way.
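
The end-of-line mapping described in the note can be sketched as follows. This is a minimal illustration and not the Crimson code: the names EolSketch and normalize are hypothetical, and the sketch works on a complete string rather than on a buffer that is refilled as needed.

```java
// Hypothetical sketch of CRLF/CR -> LF mapping; not the Crimson implementation.
final class EolSketch {
    // Maps each CRLF pair and each lone CR to a single LF.
    static String normalize(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == '\r') {
                out.append('\n');
                // Skip the LF of a CRLF pair so the pair yields one LF.
                if (i + 1 < in.length() && in.charAt(i + 1) == '\n') {
                    i++;
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

A buffered reader has one extra case that this sketch avoids: a CR can be the last character of one buffer fill while its LF arrives in the next. The maybeInCRLF field may exist to track that case, but that reading of the field name is an assumption.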

Author:
David Brownell

Field Summary
private  char[] buf
           
private static int BUFSIZ
           
private  org.xml.sax.ErrorHandler errHandler
           
private  int finish
           
private  org.xml.sax.InputSource input
           
private  boolean isClosed
           
private  boolean isPE
           
private  int lineNumber
           
private  Locale locale
           
private  boolean maybeInCRLF
           
private  String name
           
private static char[] newline
           
private  InputEntity next
           
private  Reader reader
           
private  StringBuffer rememberedText
           
private  boolean returnedFirstHalf
           
private  int start
           
private  int startRemember
           
 
Constructor Summary
private InputEntity()
           
 
Method Summary
private  void checkRecursion(InputEntity stack)
           
private  boolean checkSurrogatePair(int offset)
           
 void close()
           
private static String convertToFileURL(String filename)
           
private  void fatal(String messageId, Object[] params)
           
private  void fillbuf()
           
 char getc()
          Gets the next Java character, which might be half of a surrogate pair representing one XML text character, or might mark the end of the entity.
 int getColumnNumber()
          returns -1; maintaining column numbers hurts performance
 String getEncoding()
          Returns the name of the encoding in use, else null; the name returned is in as standard a form as we can get.
static InputEntity getInputEntity(org.xml.sax.ErrorHandler h, Locale l)
           
 int getLineNumber()
          Returns the current line number in this input source
private  org.xml.sax.Locator getLocator()
           
 String getName()
           
 char getNameChar()
          Returns the next name char, or NUL. This is faster than getc(), and the common "name or nmtoken must be next" case won't need ungetc().
 String getPublicId()
          Returns the public ID of this input source, if known
 String getSystemId()
          Returns the system ID of this input source, if known
 boolean ignorableWhitespace(org.xml.sax.ContentHandler handler)
          Whitespace in markup (flagged to the application, discardable). The document handler's ignorableWhitespace() method is called on all the whitespace found.
 void init(char[] b, String name, InputEntity stack, boolean isPE)
           
 void init(org.xml.sax.InputSource in, String name, InputEntity stack, boolean isPE)
          Use this for an external parsed entity
 boolean isDocument()
           
 boolean isEOF()
          returns true iff there's no more data to consume ...
 boolean isInternal()
           
 boolean isParameterEntity()
           
(package private)  boolean isXmlDeclOrTextDeclPrefix()
          This method is used to disambiguate between XMLDecl, TextDecl, and PI by doing a lookahead without consuming any characters.
 boolean maybeWhitespace()
          optional grammatical whitespace (discarded)
 boolean parsedContent(org.xml.sax.ContentHandler contentHandler, ElementValidator validator)
          Normal content; whitespace in markup may be handled specially if the parser uses the content model.
 boolean peek(String next, char[] chars)
          Returns true and skips the text if the upcoming input matches the 'next' string; otherwise returns false. NOTE: two alternative string representations are both passed in, since one is faster.
 boolean peekc(char c)
           
 InputEntity pop()
           
 String rememberText()
           
 void startRemembering()
           
 void ungetc()
          Two-character pushback is guaranteed.
 void unparsedContent(org.xml.sax.ContentHandler contentHandler, ElementValidator validator, boolean ignorableWhitespace, String whitespaceInvalidMessage)
          CDATA -- character data, terminated by "]]>" and optionally including unescaped markup delimiters (ampersand and left angle bracket).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

start

private int start

finish

private int finish

buf

private char[] buf

lineNumber

private int lineNumber

returnedFirstHalf

private boolean returnedFirstHalf

maybeInCRLF

private boolean maybeInCRLF

name

private String name

next

private InputEntity next

input

private org.xml.sax.InputSource input

reader

private Reader reader

isClosed

private boolean isClosed

errHandler

private org.xml.sax.ErrorHandler errHandler

locale

private Locale locale

rememberedText

private StringBuffer rememberedText

startRemember

private int startRemember

isPE

private boolean isPE

BUFSIZ

private static final int BUFSIZ

newline

private static final char[] newline

Constructor Detail

InputEntity

private InputEntity()

Method Detail

getInputEntity

public static InputEntity getInputEntity(org.xml.sax.ErrorHandler h,
                                         Locale l)

isInternal

public boolean isInternal()

isDocument

public boolean isDocument()

isParameterEntity

public boolean isParameterEntity()

getName

public String getName()

convertToFileURL

private static String convertToFileURL(String filename)

init

public void init(org.xml.sax.InputSource in,
                 String name,
                 InputEntity stack,
                 boolean isPE)
          throws IOException,
                 org.xml.sax.SAXException
Use this for an external parsed entity

Throws:
IOException
org.xml.sax.SAXException

init

public void init(char[] b,
                 String name,
                 InputEntity stack,
                 boolean isPE)
          throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

checkRecursion

private void checkRecursion(InputEntity stack)
                     throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

pop

public InputEntity pop()
                throws IOException
Throws:
IOException
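
The class description says the entities are kept in a stack, and init(), checkRecursion() and pop() all deal with that stack. The sketch below shows one way a linked stack with a recursion check could look. It is not the Crimson code: the class EntityStackSketch is hypothetical, and the rule that an entity name must not appear twice in the chain is an assumption about what checkRecursion() tests.

```java
// Hypothetical sketch of a linked entity stack; not the Crimson implementation.
final class EntityStackSketch {
    final String name;             // entity name; null for the document entity (assumption)
    final EntityStackSketch next;  // entity that was being read when this one was pushed

    EntityStackSketch(String name, EntityStackSketch stack) {
        // Assumed check: an entity must not appear in its own expansion chain.
        for (EntityStackSketch e = stack; e != null; e = e.next) {
            if (name != null && name.equals(e.name)) {
                throw new IllegalStateException("recursive entity reference: " + name);
            }
        }
        this.name = name;
        this.next = stack;
    }

    // Returns the entity to resume reading once this one is exhausted.
    EntityStackSketch pop() {
        return next;
    }
}
```

Linking each entity to the one below it means no separate stack object is needed: the parser only has to hold the top entity.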

isEOF

public boolean isEOF()
              throws IOException,
                     org.xml.sax.SAXException
returns true iff there's no more data to consume ...

Throws:
IOException
org.xml.sax.SAXException

getEncoding

public String getEncoding()
Returns the name of the encoding in use, else null; the name returned is in as standard a form as we can get.


getNameChar

public char getNameChar()
                 throws IOException,
                        org.xml.sax.SAXException
Returns the next name char, or NUL. This is faster than getc(), and the common "name or nmtoken must be next" case won't need ungetc().

Throws:
IOException
org.xml.sax.SAXException
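
The "name char or NUL" contract can be sketched as below. This is an illustration only: NameCharSketch and readName are hypothetical names, isNameChar is a simplified stand-in for the XML 1.0 NameChar production, and leaving a non-name character unconsumed is an assumption drawn from the remark that ungetc() is not needed.

```java
// Hypothetical sketch of the getNameChar() contract; not the Crimson implementation.
final class NameCharSketch {
    private final char[] buf;
    private int start; // index of the next unread char

    NameCharSketch(String text) {
        buf = text.toCharArray();
    }

    // Simplified stand-in for the XML 1.0 NameChar production.
    static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '.' || c == '-' || c == '_' || c == ':';
    }

    // Consumes and returns the next char if it is a name char; otherwise
    // returns NUL and leaves the input untouched, so no pushback is needed.
    char getNameChar() {
        if (start < buf.length && isNameChar(buf[start])) {
            return buf[start++];
        }
        return 0;
    }

    // Typical caller: collect name chars until NUL comes back.
    String readName() {
        StringBuilder sb = new StringBuilder();
        for (char c = getNameChar(); c != 0; c = getNameChar()) {
            sb.append(c);
        }
        return sb.toString();
    }
}
```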

getc

public char getc()
          throws IOException,
                 org.xml.sax.SAXException
Gets the next Java character, which might be half of a surrogate pair representing one XML text character, or might mark the end of the entity.

Throws:
IOException
org.xml.sax.SAXException
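
Because getc() returns Java chars, an XML character outside the Basic Multilingual Plane arrives as two calls' worth of data. The sketch below only illustrates that relationship using standard java.lang.Character methods; the class SurrogateSketch and its method are hypothetical and do not reflect how Crimson validates pairs.

```java
// Illustration: one supplementary XML character occupies two Java chars.
final class SurrogateSketch {
    // Counts XML characters (code points) in text delivered as Java chars.
    static int countXmlChars(char[] chars) {
        int count = 0;
        for (int i = 0; i < chars.length; i++) {
            if (Character.isHighSurrogate(chars[i])
                    && i + 1 < chars.length
                    && Character.isLowSurrogate(chars[i + 1])) {
                i++; // consume the second half of the pair
            }
            count++;
        }
        return count;
    }
}
```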

peekc

public boolean peekc(char c)
              throws IOException,
                     org.xml.sax.SAXException
Throws:
IOException
org.xml.sax.SAXException

ungetc

public void ungetc()
Two-character pushback is guaranteed.
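
Pushback over an array buffer can be as simple as moving the read index back. The sketch below is a hypothetical illustration (PushbackSketch is not a Crimson class) that uses a fixed array, so any number of characters can be pushed back.

```java
// Hypothetical sketch of index-based pushback; not the Crimson implementation.
final class PushbackSketch {
    private final char[] buf;
    private int start; // index of the next unread char

    PushbackSketch(String text) {
        buf = text.toCharArray();
    }

    char getc() {
        if (start >= buf.length) {
            throw new IllegalStateException("end of input");
        }
        return buf[start++];
    }

    // Un-reads the most recently returned char.
    void ungetc() {
        if (start == 0) {
            throw new IllegalStateException("nothing to push back");
        }
        start--;
    }
}
```

With a buffer that is refilled from a Reader, a two-character guarantee would require each refill to keep at least the last two consumed characters available; whether Crimson does it that way is not stated here.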


maybeWhitespace

public boolean maybeWhitespace()
                        throws IOException,
                               org.xml.sax.SAXException
optional grammatical whitespace (discarded)

Throws:
IOException
org.xml.sax.SAXException

parsedContent

public boolean parsedContent(org.xml.sax.ContentHandler contentHandler,
                             ElementValidator validator)
                      throws IOException,
                             org.xml.sax.SAXException
Normal content; whitespace in markup may be handled specially if the parser uses the content model.

Content terminates with markup delimiter characters, namely ampersand (&) and left angle bracket (<).

The document handler's characters() method is called on all the content found.

Throws:
IOException
org.xml.sax.SAXException
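
The termination rule for parsed content can be sketched as a scan for the first markup delimiter. ContentScanSketch and endOfContent are hypothetical names; the real method also reports the run to the content handler and the validator, which this sketch leaves out.

```java
// Hypothetical sketch of finding where a run of character content ends.
final class ContentScanSketch {
    // Returns the index of the first '&' or '<' at or after 'from',
    // or chars.length if the buffer holds no delimiter.
    static int endOfContent(char[] chars, int from) {
        int i = from;
        while (i < chars.length && chars[i] != '&' && chars[i] != '<') {
            i++;
        }
        return i;
    }
}
```

The characters between 'from' and the returned index are the run that would be passed to characters().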

unparsedContent

public void unparsedContent(org.xml.sax.ContentHandler contentHandler,
                            ElementValidator validator,
                            boolean ignorableWhitespace,
                            String whitespaceInvalidMessage)
                     throws IOException,
                            org.xml.sax.SAXException
CDATA -- character data, terminated by "]]>" and optionally including unescaped markup delimiters (ampersand and left angle bracket). This should otherwise be exactly like character data, modulo differences in error report details.

The document handler's characters() or ignorableWhitespace() methods are invoked on all the character data found

Parameters:
contentHandler - gets callbacks for character data
validator - text() or ignorableWhitespace() methods are called appropriately
ignorableWhitespace - if true, whitespace characters will be reported using contentHandler.ignorableWhitespace(); implicitly, non-whitespace characters will cause validation errors
Throws:
IOException
org.xml.sax.SAXException
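
Unlike parsed content, a CDATA section ends only at "]]>", so ampersands and left angle brackets are ordinary data. The following sketch shows the terminator search in isolation; CdataScanSketch and findTerminator are hypothetical names, and the handler callbacks and validation are omitted.

```java
// Hypothetical sketch of locating the end of a CDATA section.
final class CdataScanSketch {
    // Returns the index of the "]]>" terminator at or after 'from',
    // or -1 if the buffer does not contain it.
    static int findTerminator(char[] chars, int from) {
        for (int i = from; i + 2 < chars.length; i++) {
            if (chars[i] == ']' && chars[i + 1] == ']' && chars[i + 2] == '>') {
                return i;
            }
        }
        return -1;
    }
}
```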

checkSurrogatePair

private boolean checkSurrogatePair(int offset)
                            throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException

ignorableWhitespace

public boolean ignorableWhitespace(org.xml.sax.ContentHandler handler)
                            throws IOException,
                                   org.xml.sax.SAXException
Whitespace in markup (flagged to the application, discardable).

The document handler's ignorableWhitespace() method is called on all the whitespace found.

Throws:
IOException
org.xml.sax.SAXException

peek

public boolean peek(String next,
                    char[] chars)
             throws IOException,
                    org.xml.sax.SAXException
Returns true and skips the text if the upcoming input matches the 'next' string; otherwise returns false.

NOTE: two alternative string representations are both passed in, since one is faster.

Throws:
IOException
org.xml.sax.SAXException
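
The match-and-skip behaviour of peek() can be sketched as below. PeekSketch is a hypothetical class over a fixed array. Treating the char[] argument as an optional precomputed copy of the string is an assumption about why both forms are passed, and leaving the position unchanged on a mismatch is also assumed.

```java
// Hypothetical sketch of match-and-skip lookahead; not the Crimson implementation.
final class PeekSketch {
    private final char[] buf;
    private int start; // index of the next unread char

    PeekSketch(String text) {
        buf = text.toCharArray();
    }

    // If the upcoming input equals 'next', skips it and returns true.
    // Otherwise returns false and leaves the position unchanged.
    // 'chars' may be null; when given it must hold the same text as 'next'.
    boolean peek(String next, char[] chars) {
        int len = next.length();
        if (start + len > buf.length) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            char expected = (chars != null) ? chars[i] : next.charAt(i);
            if (buf[start + i] != expected) {
                return false;
            }
        }
        start += len;
        return true;
    }

    int position() {
        return start;
    }
}
```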

isXmlDeclOrTextDeclPrefix

boolean isXmlDeclOrTextDeclPrefix()
                            throws IOException,
                                   org.xml.sax.SAXException
This method is used to disambiguate between XMLDecl, TextDecl, and PI by doing a lookahead without consuming any characters. We look for the "<?xml" prefix.

Returns:
true iff next chars match either the prefix for XMLDecl or TextDecl
Throws:
IOException
org.xml.sax.SAXException
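
A non-consuming lookahead of this kind might look like the sketch below. It assumes the prefix being tested is "<?xml" followed by a whitespace character, which is how XMLDecl and TextDecl begin in the XML 1.0 grammar; the class DeclPrefixSketch is hypothetical and the exact test Crimson performs is not shown on this page.

```java
// Hypothetical sketch of a lookahead that consumes nothing.
final class DeclPrefixSketch {
    private static final char[] PREFIX = { '<', '?', 'x', 'm', 'l' };

    // True if buf holds "<?xml" plus one whitespace char starting at 'start'.
    // The caller's read position is never modified.
    static boolean isDeclPrefix(char[] buf, int start) {
        if (start + PREFIX.length >= buf.length) {
            return false; // need the prefix plus one more char
        }
        for (int i = 0; i < PREFIX.length; i++) {
            if (buf[start + i] != PREFIX[i]) {
                return false;
            }
        }
        char c = buf[start + PREFIX.length];
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }
}
```

The whitespace check is what separates a declaration from a processing instruction whose target merely starts with "xml", such as xml-stylesheet.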

startRemembering

public void startRemembering()

rememberText

public String rememberText()

getLocator

private org.xml.sax.Locator getLocator()

getPublicId

public String getPublicId()
Returns the public ID of this input source, if known

Specified by:
getPublicId in interface org.xml.sax.Locator

getSystemId

public String getSystemId()
Returns the system ID of this input source, if known

Specified by:
getSystemId in interface org.xml.sax.Locator

getLineNumber

public int getLineNumber()
Returns the current line number in this input source

Specified by:
getLineNumber in interface org.xml.sax.Locator

getColumnNumber

public int getColumnNumber()
returns -1; maintaining column numbers hurts performance

Specified by:
getColumnNumber in interface org.xml.sax.Locator
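
Since getColumnNumber() always returns -1 here, code that formats error locations from a Locator should treat the column as optional. The sketch below uses the standard org.xml.sax.Locator interface; the class LocationSketch and its output format are hypothetical.

```java
import org.xml.sax.Locator;

// Hypothetical helper that formats a location and tolerates missing values.
final class LocationSketch {
    static String describe(Locator loc) {
        StringBuilder sb = new StringBuilder();
        sb.append(loc.getSystemId() != null ? loc.getSystemId() : "(unknown)");
        // SAX uses -1 to mean "not available" for line and column numbers.
        if (loc.getLineNumber() >= 0) {
            sb.append(':').append(loc.getLineNumber());
        }
        if (loc.getColumnNumber() >= 0) {
            sb.append(':').append(loc.getColumnNumber());
        }
        return sb.toString();
    }
}
```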

fillbuf

private void fillbuf()
              throws IOException,
                     org.xml.sax.SAXException
Throws:
IOException
org.xml.sax.SAXException

close

public void close()

fatal

private void fatal(String messageId,
                   Object[] params)
            throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException