java.lang.Object org.apache.xml.utils.FastStringBuffer
Bare-bones, unsafe, fast string buffer. No thread-safety, no parameter range checking, exposed fields. Note that in typical applications, thread-safety of a StringBuffer is a somewhat dubious concept in any case.
Note that Stree and DTM used a single FastStringBuffer as a string pool, by recording start and length indices within this single buffer. This minimizes heap overhead, but of course requires more work when retrieving the data.
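The string-pool usage described above can be sketched as follows. This is only an illustration under stated assumptions: the class StringPoolSketch and its add/get methods are hypothetical names, and a java.lang.StringBuilder stands in for the single shared FastStringBuffer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical string pool: one shared buffer plus (start, length) records. */
final class StringPoolSketch {
    // StringBuilder stands in for the single shared FastStringBuffer.
    private final StringBuilder pool = new StringBuilder();
    // Each entry is {start, length} into the pool.
    private final List<int[]> entries = new ArrayList<>();

    /** Appends text to the pool and returns a handle for later retrieval. */
    int add(String text) {
        entries.add(new int[] { pool.length(), text.length() });
        pool.append(text);
        return entries.size() - 1;
    }

    /** Retrieval costs a copy: the String is rebuilt from the shared buffer. */
    String get(int handle) {
        int[] e = entries.get(handle);
        return pool.substring(e[0], e[0] + e[1]);
    }

    public static void main(String[] args) {
        StringPoolSketch p = new StringPoolSketch();
        int a = p.add("hello");
        int b = p.add("world");
        System.out.println(p.get(a) + " " + p.get(b));
    }
}
```

Only two ints are stored per pooled string, which is the heap saving; the substring call in get() is the extra retrieval work.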
FastStringBuffer operates as a "chunked buffer". Doing so reduces the need to recopy existing information when an append exceeds the space available; we just allocate another chunk and flow across to it. (The array of chunks may need to grow, admittedly, but that's a much smaller object.) Some excess recopying may arise when we extract Strings which cross chunk boundaries; larger chunks make that less frequent.
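A minimal sketch of the chunked-buffer idea, assuming fixed-size chunks and omitting the growth strategy and nested buffers of the real class. ChunkedBufferSketch and its member names are hypothetical and chosen for illustration only.

```java
/** Minimal chunked character buffer; a sketch, not the Xalan implementation. */
final class ChunkedBufferSketch {
    private final int chunkBits;
    private final int chunkSize;
    private final int chunkMask;
    private char[][] chunks = new char[2][];  // small array of chunk references
    private int lastChunk = 0;                // index of the chunk being filled
    private int firstFree = 0;                // next free slot in that chunk

    ChunkedBufferSketch(int chunkBits) {
        this.chunkBits = chunkBits;
        this.chunkSize = 1 << chunkBits;
        this.chunkMask = chunkSize - 1;
        chunks[0] = new char[chunkSize];
    }

    void append(char c) {
        if (firstFree == chunkSize) {             // current chunk is full
            if (lastChunk + 1 == chunks.length) {
                // Only the array of references is recopied, never the text.
                char[][] bigger = new char[chunks.length * 2][];
                System.arraycopy(chunks, 0, bigger, 0, chunks.length);
                chunks = bigger;
            }
            chunks[++lastChunk] = new char[chunkSize];
            firstFree = 0;
        }
        chunks[lastChunk][firstFree++] = c;
    }

    /** Content length derived from the last chunk index and its fill level. */
    int length() { return (lastChunk << chunkBits) + firstFree; }

    /** Shift selects the chunk, mask selects the column; no range checking. */
    char charAt(int pos) { return chunks[pos >>> chunkBits][pos & chunkMask]; }

    int chunkCount() { return lastChunk + 1; }
}
```

When an append overflows, existing characters stay where they are; the only thing that ever gets recopied is the short array of chunk references.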
The size values are parameterized, to allow tuning this code. In theory, Result Tree Fragments might want to be tuned differently from the main document's text.
%REVIEW% An experiment in self-tuning is included in the code (using nested FastStringBuffers to achieve variation in chunk sizes), but this implementation has proven problematic when data is copied from the FSB into itself. We should either re-architect it to make this safe (if possible) or remove that code and clean up for performance/maintainability reasons.
Field Summary | |
private static int | CARRY_WS | Manifest constant: Carry trailing whitespace of one chunk as leading whitespace of the next chunk.
(package private) static boolean | DEBUG_FORCE_FIXED_CHUNKSIZE |
(package private) static int | DEBUG_FORCE_INIT_BITS |
(package private) char[][] | m_array | Field m_array holds the string buffer's text contents, using an array-of-arrays.
(package private) int | m_chunkBits | Field m_chunkBits sets our chunking strategy, by saying how many bits of index can be used within a single chunk before flowing over to the next chunk.
(package private) int | m_chunkMask | Field m_chunkMask is m_chunkSize-1 -- in other words, m_chunkBits worth of low-order '1' bits, useful for shift-and-mask addressing within the chunks.
(package private) int | m_chunkSize | Field m_chunkSize establishes the maximum size of one chunk of the array as 2**chunkbits characters.
(package private) int | m_firstFree | Field m_firstFree is an index into m_array[m_lastChunk][], pointing to the first character in the Chunked Array which is not part of the FastStringBuffer's current content.
(package private) FastStringBuffer | m_innerFSB | Field m_innerFSB, when non-null, is a FastStringBuffer whose total length equals m_chunkSize, and which replaces m_array[0].
(package private) int | m_lastChunk | Field m_lastChunk is an index into m_array[], pointing to the last chunk of the Chunked Array currently in use.
(package private) int | m_maxChunkBits | Field m_maxChunkBits affects our chunk-growth strategy, by saying what the largest permissible chunk size is in this particular FastStringBuffer hierarchy.
(package private) int | m_rebundleBits | Field m_rebundleBits affects our chunk-growth strategy, by saying how many chunks should be allocated at one size before we encapsulate them into the first chunk of the next size up.
(package private) static char[] | SINGLE_SPACE |
static int | SUPPRESS_BOTH | Manifest constant: Suppress both leading and trailing whitespace.
static int | SUPPRESS_LEADING_WS | Manifest constant: Suppress leading whitespace.
static int | SUPPRESS_TRAILING_WS | Manifest constant: Suppress trailing whitespace.
Constructor Summary | |
public | FastStringBuffer() | Construct a FastStringBuffer, using a default allocation policy.
private | FastStringBuffer(FastStringBuffer source) | Encapsulation c'tor.
public | FastStringBuffer(int initChunkBits) | Construct a FastStringBuffer, using default maxChunkBits and rebundleBits values.
public | FastStringBuffer(int initChunkBits, int maxChunkBits) | Construct a FastStringBuffer, using a default rebundleBits value.
public | FastStringBuffer(int initChunkBits, int maxChunkBits, int rebundleBits) | Construct a FastStringBuffer, with allocation policy as per parameters.
Method Summary | |
void | append(char value) | Append a single character onto the FastStringBuffer, growing the storage if necessary.
void | append(char[] chars, int start, int length) | Append part of the contents of a Character Array onto the FastStringBuffer, growing the storage if necessary.
void | append(FastStringBuffer value) | Append the contents of another FastStringBuffer onto this FastStringBuffer, growing the storage if necessary.
void | append(String value) | Append the contents of a String onto the FastStringBuffer, growing the storage if necessary.
void | append(StringBuffer value) | Append the contents of a StringBuffer onto the FastStringBuffer, growing the storage if necessary.
char | charAt(int pos) | Get a single character from the string buffer.
private void | getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) | Copies characters from this string into the destination character array.
String | getString(int start, int length) |
(package private) StringBuffer | getString(StringBuffer sb, int start, int length) |
(package private) StringBuffer | getString(StringBuffer sb, int startChunk, int startColumn, int length) | Internal support for toString() and getString().
boolean | isWhitespace(int start, int length) |
int | length() | Get the length of the buffer.
void | reset() | Discard the content of the FastStringBuffer, and most of the memory that was allocated by it, restoring the initial state.
static void | sendNormalizedSAXcharacters(char[] ch, int start, int length, org.xml.sax.ContentHandler handler) | Directly normalize and dispatch the character array.
(package private) static int | sendNormalizedSAXcharacters(char[] ch, int start, int length, org.xml.sax.ContentHandler handler, int edgeTreatmentFlags) | Internal method to directly normalize and dispatch the character array.
int | sendNormalizedSAXcharacters(org.xml.sax.ContentHandler ch, int start, int length) | Sends the specified range of characters as one or more SAX characters() events, normalizing the characters according to XSLT rules.
void | sendSAXcharacters(org.xml.sax.ContentHandler ch, int start, int length) | Sends the specified range of characters as one or more SAX characters() events.
void | sendSAXComment(org.xml.sax.ext.LexicalHandler ch, int start, int length) | Sends the specified range of characters as a SAX comment.
void | setLength(int l) | Directly set how much of the FastStringBuffer's storage is to be considered part of its content.
private void | setLength(int l, FastStringBuffer rootFSB) | Subroutine for the public setLength() method.
int | size() | Get the length of the buffer.
String | toString() | Note that this operation has been somewhat deoptimized by the shift to a chunked array, as there is no factory method to produce a String object directly from an array of arrays and hence a double copy is needed.
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
static final int DEBUG_FORCE_INIT_BITS
static boolean DEBUG_FORCE_FIXED_CHUNKSIZE
public static final int SUPPRESS_LEADING_WS
See Also: sendNormalizedSAXcharacters(char[],int,int,org.xml.sax.ContentHandler,int), Constant Field Values
public static final int SUPPRESS_TRAILING_WS
public static final int SUPPRESS_BOTH
See Also: sendNormalizedSAXcharacters(char[],int,int,org.xml.sax.ContentHandler,int), Constant Field Values
private static final int CARRY_WS
int m_chunkBits
int m_maxChunkBits
int m_rebundleBits
int m_chunkSize
int m_chunkMask
char[][] m_array
int m_lastChunk
The insertion point for append operations is addressed by the combination of m_lastChunk and m_firstFree.
int m_firstFree
FastStringBuffer m_innerFSB
static final char[] SINGLE_SPACE
Constructor Detail |
public FastStringBuffer(int initChunkBits, int maxChunkBits, int rebundleBits)
For coding convenience, I've expressed both allocation sizes in terms of a number of bits. That's needed for the final size of a chunk, to permit fast and efficient shift-and-mask addressing. It's less critical for the initial size, and may be reconsidered.
An alternative would be to accept integer sizes and round to powers of two; that really doesn't seem to buy us much, if anything.
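The two sizing conventions discussed above can be compared in a short sketch. The class and method names (ChunkSizeSketch, sizeForBits, maskForBits, bitsForSize) are hypothetical and not part of FastStringBuffer; the rounding helper shows one way the integer-size alternative might look, not how the class does it.

```java
/** Sketch of the two sizing conventions; names are illustrative only. */
final class ChunkSizeSketch {
    /** Size in characters for a given number of offset bits, e.g. 10 -> 1024. */
    static int sizeForBits(int bits) {
        return 1 << bits;
    }

    /** Low-order mask for in-chunk addressing, e.g. 10 -> 0x3FF. */
    static int maskForBits(int bits) {
        return (1 << bits) - 1;
    }

    /**
     * The alternative: accept a plain size, round it up to a power of two,
     * and return the bit count. Valid for 1 <= size <= 2^30.
     */
    static int bitsForSize(int size) {
        return 32 - Integer.numberOfLeadingZeros(size - 1);
    }

    public static void main(String[] args) {
        System.out.println(sizeForBits(10));    // size for 10 offset bits
        System.out.println(bitsForSize(1000));  // bits after rounding 1000 up
    }
}
```

Either way the chunk size ends up a power of two, which is what shift-and-mask addressing requires; taking the bit count directly simply skips the rounding step.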
initChunkBits
- Length in characters of the initial allocation of a chunk, expressed in log-base-2. (That is, 10 means allocate 1024 characters.) Later chunks will use larger allocation units, to trade off allocation speed of large documents against storage efficiency of small ones.
maxChunkBits
- Number of character-offset bits that should be used for addressing within a chunk. Maximum length of a chunk is 2^maxChunkBits characters.
rebundleBits
- Number of character-offset bits that addressing should advance before we attempt to take a step from initChunkBits to maxChunkBits.
public FastStringBuffer(int initChunkBits, int maxChunkBits)
public FastStringBuffer(int initChunkBits)
ISSUE: Should this call assert initial size, or fixed size? Now configured as initial, with a default for fixed.
public FastStringBuffer()
private FastStringBuffer(FastStringBuffer source)
Method Detail |
public final int size()
public final int length()
public final void reset()
public final void setLength(int l)
l
- New length. If l<0 or l>=length(), this operation will not report an error but future operations will almost certainly fail.
private final void setLength(int l, FastStringBuffer rootFSB)
public final String toString()
(It really is a pity that Java didn't design String as a final subclass of MutableString, rather than having StringBuffer be a separate hierarchy. We'd avoid a lot of double-buffering.)
Overrides: toString in class Object
public final void append(char value)
NOTE THAT after calling append(), previously obtained references to m_array[][] may no longer be valid.... though in fact they should be in this instance.
value
- character to be appended.
public final void append(String value)
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
value
- String whose contents are to be appended.
public final void append(StringBuffer value)
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
value
- StringBuffer whose contents are to be appended.
public final void append(char[] chars, int start, int length)
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
chars
- character array from which data is to be copied
start
- offset in chars of first character to be copied, zero-based.
length
- number of characters to be copied
public final void append(FastStringBuffer value)
NOTE THAT after calling append(), previously obtained references to m_array[] may no longer be valid.
value
- FastStringBuffer whose contents are to be appended.
public boolean isWhitespace(int start, int length)
start
- Offset of first character in the range.
length
- Number of characters to send.
CURRENTLY DOES NOT CHECK FOR OUT-OF-RANGE.
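This page gives no description for isWhitespace, so the following is only a guess at its intent: a sketch that reports whether every character in a range is XML 1.0 whitespace. It works on a plain char[] rather than the chunked storage, and WhitespaceRangeSketch and its methods are hypothetical names. Like the method documented above, it does no range checking.

```java
/** Sketch of a whitespace-only range test; assumptions noted in comments. */
final class WhitespaceRangeSketch {
    /** XML 1.0 whitespace: space, tab, line feed, carriage return. */
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    /** Assumed semantics: true only if every character in the range is whitespace. */
    static boolean isWhitespace(char[] buf, int start, int length) {
        // No range checking: an out-of-range request fails with an array exception.
        for (int i = start; i < start + length; i++) {
            if (!isXmlWhitespace(buf[i])) {
                return false;
            }
        }
        return true;
    }
}
```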
public String getString(int start, int length)
start
- Offset of first character in the range.
length
- Number of characters to send.
StringBuffer getString(StringBuffer sb, int start, int length)
sb
- StringBuffer to be appended to
start
- Offset of first character in the range.
length
- Number of characters to send.
StringBuffer getString(StringBuffer sb, int startChunk, int startColumn, int length)
Note that this operation has been somewhat deoptimized by the shift to a chunked array, as there is no factory method to produce a String object directly from an array of arrays and hence a double copy is needed. By presetting length we hope to minimize the heap overhead of building the intermediate StringBuffer.
(It really is a pity that Java didn't design String as a final subclass of MutableString, rather than having StringBuffer be a separate hierarchy. We'd avoid a lot of double-buffering.)
sb
- startChunk
- startColumn
- length
-
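The double copy and the preset length mentioned above can be sketched as follows, assuming fixed-size chunks held in a plain char[][]. ChunkRangeCopySketch is a hypothetical name, the chunks and chunkSize parameters are additions needed to make the sketch standalone, and StringBuilder stands in for StringBuffer.

```java
/** Sketch of extracting a range that may cross chunk boundaries. */
final class ChunkRangeCopySketch {
    /**
     * Copies length characters starting at (startChunk, startColumn) into sb,
     * one append per chunk touched. No range checking is performed.
     */
    static StringBuilder getString(StringBuilder sb, char[][] chunks, int chunkSize,
                                   int startChunk, int startColumn, int length) {
        // Presetting the capacity avoids repeated regrowth of the intermediate buffer.
        sb.ensureCapacity(sb.length() + length);
        int chunk = startChunk;
        int column = startColumn;
        while (length > 0) {
            int run = Math.min(length, chunkSize - column);
            sb.append(chunks[chunk], column, run);  // first copy: chunk -> builder
            length -= run;
            chunk++;
            column = 0;
        }
        return sb;
    }

    public static void main(String[] args) {
        char[][] chunks = { "abcd".toCharArray(), "efgh".toCharArray() };
        // Second copy happens in toString(): builder -> String.
        System.out.println(getString(new StringBuilder(), chunks, 4, 0, 2, 6).toString());
    }
}
```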
public char charAt(int pos)
pos
- character position requested.
public void sendSAXcharacters(org.xml.sax.ContentHandler ch, int start, int length) throws org.xml.sax.SAXException
Note too that there is no promise that the output will be sent as a single call. As is always true in SAX, one logical string may be split across multiple blocks of memory and hence delivered as several successive events.
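Because one logical string may arrive as several events, a receiving handler should accumulate text rather than assume a single call. The sketch below shows such a handler together with a hypothetical sender that emits one characters() event per chunk touched. CollectingHandlerSketch, send and collect are illustrative names, and the per-chunk splitting is an assumption about how a chunked buffer would naturally behave, not a statement of what this class does.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Handler that tolerates text arriving in several characters() events. */
final class CollectingHandlerSketch extends DefaultHandler {
    final StringBuilder text = new StringBuilder();
    int events = 0;

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);  // accumulate; never assume one call
        events++;
    }

    /** Hypothetical sender: one event per fixed-size chunk the range touches. */
    static void send(char[][] chunks, int chunkSize, int start, int length,
                     ContentHandler handler) throws SAXException {
        int chunk = start / chunkSize;
        int column = start % chunkSize;
        while (length > 0) {
            int run = Math.min(length, chunkSize - column);
            handler.characters(chunks[chunk], column, run);
            length -= run;
            chunk++;
            column = 0;
        }
    }

    /** Convenience wrapper that hides the checked SAXException. */
    static CollectingHandlerSketch collect(char[][] chunks, int chunkSize,
                                           int start, int length) {
        CollectingHandlerSketch h = new CollectingHandlerSketch();
        try {
            send(chunks, chunkSize, start, length, h);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return h;
    }
}
```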
ch
- SAX ContentHandler object to receive the event.
start
- Offset of first character in the range.
length
- Number of characters to send.
org.xml.sax.SAXException
- may be thrown by handler's characters() method.
public int sendNormalizedSAXcharacters(org.xml.sax.ContentHandler ch, int start, int length) throws org.xml.sax.SAXException
ch
- SAX ContentHandler object to receive the event.
start
- Offset of first character in the range.
length
- Number of characters to send.
org.xml.sax.SAXException
- may be thrown by handler's characters() method.
static int sendNormalizedSAXcharacters(char[] ch, int start, int length, org.xml.sax.ContentHandler handler, int edgeTreatmentFlags) throws org.xml.sax.SAXException
ch
- The characters from the XML document.
start
- The start position in the array.
length
- The number of characters to read from the array.
handler
- SAX ContentHandler object to receive the event.
edgeTreatmentFlags
- How leading/trailing spaces should be handled. This is a bitfield containing two flags, bitwise-ORed together: SUPPRESS_LEADING_WS and SUPPRESS_TRAILING_WS.
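A rough sketch of whitespace normalization with edge-treatment flags, assuming that "normalizing" means collapsing each run of whitespace to a single space and that the flags decide whether a leading or trailing run is dropped entirely. The numeric flag values and the NormalizeWsSketch class are hypothetical; this page does not show the real constants' values, and the sketch returns a String instead of dispatching SAX events.

```java
/** Sketch of run-collapsing normalization; flag values are assumptions. */
final class NormalizeWsSketch {
    // Hypothetical values for illustration; the real constants' values are not shown here.
    static final int SUPPRESS_LEADING_WS = 0x01;
    static final int SUPPRESS_TRAILING_WS = 0x02;
    static final int SUPPRESS_BOTH = SUPPRESS_LEADING_WS | SUPPRESS_TRAILING_WS;

    static String normalize(char[] ch, int start, int length, int flags) {
        StringBuilder out = new StringBuilder(length);
        boolean pendingSpace = false;  // a whitespace run has been seen, not yet emitted
        int end = start + length;
        for (int i = start; i < end; i++) {
            char c = ch[i];
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                pendingSpace = true;
                continue;
            }
            if (pendingSpace) {
                boolean leading = out.length() == 0;
                // Interior runs always become one space; a leading run only if not suppressed.
                if (!leading || (flags & SUPPRESS_LEADING_WS) == 0) {
                    out.append(' ');
                }
                pendingSpace = false;
            }
            out.append(c);
        }
        // A trailing run becomes one space unless suppressed.
        if (pendingSpace && (flags & SUPPRESS_TRAILING_WS) == 0) {
            out.append(' ');
        }
        return out.toString();
    }
}
```

In multi-chunk output, a caller would presumably pass different flags for the first, middle and last pieces so that whitespace at chunk edges is neither doubled nor lost.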
org.xml.sax.SAXException
- Any SAX exception, possibly wrapping another exception.
public static void sendNormalizedSAXcharacters(char[] ch, int start, int length, org.xml.sax.ContentHandler handler) throws org.xml.sax.SAXException
ch
- The characters from the XML document.
start
- The start position in the array.
length
- The number of characters to read from the array.
handler
- SAX ContentHandler object to receive the event.
org.xml.sax.SAXException
- Any SAX exception, possibly wrapping another exception.
public void sendSAXComment(org.xml.sax.ext.LexicalHandler ch, int start, int length) throws org.xml.sax.SAXException
Note that, unlike sendSAXcharacters, this has to be done as a single call to LexicalHandler#comment.
ch
- SAX LexicalHandler object to receive the event.
start
- Offset of first character in the range.
length
- Number of characters to send.
org.xml.sax.SAXException
- may be thrown by handler's comment() method.
private void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin)
srcBegin
- index of the first character in the string to copy.
srcEnd
- index after the last character in the string to copy.
dst
- the destination array.
dstBegin
- the start offset in the destination array.
IndexOutOfBoundsException
- If any of the following is true:
  srcBegin is negative.
  srcBegin is greater than srcEnd.
  srcEnd is greater than the length of this string.
  dstBegin is negative.
  dstBegin+(srcEnd-srcBegin) is larger than dst.length.
NullPointerException
- if dst is null
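The documented failure conditions can be restated as a single predicate. This is a sketch of the conditions only: GetCharsCheckSketch and inBounds are hypothetical names, and since the class as a whole advertises no parameter range checking, the exceptions may simply arise from the underlying array accesses rather than from explicit tests like these.

```java
/** Sketch restating the documented getChars preconditions as one predicate. */
final class GetCharsCheckSketch {
    /**
     * Returns true when none of the IndexOutOfBoundsException conditions hold.
     * contentLength is the length of the source text. A null dst fails with
     * NullPointerException on dst.length, matching the documented behavior.
     */
    static boolean inBounds(int srcBegin, int srcEnd, int contentLength,
                            char[] dst, int dstBegin) {
        int dstLength = dst.length;
        return srcBegin >= 0
                && srcBegin <= srcEnd
                && srcEnd <= contentLength
                && dstBegin >= 0
                // Written as a subtraction to avoid int overflow in dstBegin + count.
                && srcEnd - srcBegin <= dstLength - dstBegin;
    }
}
```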