java.lang.Object java.util.regex.Matcher
An engine that performs match operations on a character sequence by interpreting a Pattern.
A matcher is created from a pattern by invoking the pattern's matcher method. Once created, a matcher can be used to perform three different kinds of match operations:
The matches method attempts to match the entire input sequence against the pattern.
The lookingAt method attempts to match the input sequence, starting at the beginning, against the pattern.
The find method scans the input sequence looking for the next subsequence that matches the pattern.
Each of these methods returns a boolean indicating success or failure. More information about a successful match can be obtained by querying the state of the matcher.
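The difference between the three operations is easiest to see on a single input. The following is a minimal sketch; the class name MatchKindsDemo and the helper matchKinds are hypothetical names chosen for illustration, not part of this class.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchKindsDemo {
    // Hypothetical helper: runs each kind of match operation on a fresh matcher
    // and returns { matches(), lookingAt(), find() }.
    static boolean[] matchKinds(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean whole = p.matcher(input).matches();     // entire input must match
        boolean prefix = p.matcher(input).lookingAt();  // match must begin at index 0
        boolean anywhere = p.matcher(input).find();     // match may begin anywhere
        return new boolean[] { whole, prefix, anywhere };
    }

    public static void main(String[] args) {
        boolean[] r = matchKinds("cat", "catalog of cats");
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // false true true
    }
}
```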
This class also defines methods for replacing matched subsequences with new strings whose contents can, if desired, be computed from the match result. The appendReplacement and appendTail methods can be used in tandem in order to collect the result into an existing string buffer, or the more convenient replaceAll method can be used to create a string in which every matching subsequence in the input sequence is replaced.
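As an illustration of a replacement computed from the match result, the sketch below doubles every decimal number in the input. The class ComputedReplacementDemo and the method doubleNumbers are hypothetical, and the sketch assumes the numbers fit in an int.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ComputedReplacementDemo {
    // Hypothetical helper: replaces each run of digits with twice its value.
    static String doubleNumbers(String input) {
        Matcher m = Pattern.compile("\\d+").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            int n = Integer.parseInt(m.group());               // value of the current match
            m.appendReplacement(sb, Integer.toString(n * 2));  // copy gap, then the new text
        }
        m.appendTail(sb);                                      // copy whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleNumbers("3 apples and 12 pears")); // 6 apples and 24 pears
    }
}
```

The replacement here consists only of digits, so no escaping of $ or \ is needed.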
The explicit state of a matcher includes the start and end indices of the most recent successful match. It also includes the start and end indices of the input subsequence captured by each capturing group in the pattern as well as a total count of such subsequences. As a convenience, methods are also provided for returning these captured subsequences in string form.
The explicit state of a matcher is initially undefined; attempting to query any part of it before a successful match will cause an IllegalStateException to be thrown. The explicit state of a matcher is recomputed by every match operation.
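A short sketch of this rule, using the hypothetical class MatcherStateDemo: querying start() before any match throws, while the same query after a successful find() returns the match bounds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherStateDemo {
    // Returns true if start() throws IllegalStateException on a fresh matcher.
    static boolean throwsBeforeMatch() {
        Matcher m = Pattern.compile("b+").matcher("aabbb");
        try {
            m.start();          // no match attempted yet
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Returns { start, end } of the first match, or null if there is none.
    static int[] spanAfterFind() {
        Matcher m = Pattern.compile("b+").matcher("aabbb");
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        System.out.println(throwsBeforeMatch());  // true
        int[] span = spanAfterFind();
        System.out.println(span[0] + ".." + span[1]); // 2..5
    }
}
```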
The implicit state of a matcher includes the input character sequence as well as the append position, which is initially zero and is updated by the appendReplacement method.
A matcher may be reset explicitly by invoking its reset() method or, if a new input sequence is desired, its reset(CharSequence) method. Resetting a matcher discards its explicit state information and sets the append position to zero.
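The effect of both reset variants can be sketched as follows. The class ResetDemo and its method are hypothetical; the example assumes every find() call shown succeeds, which holds for the inputs used.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ResetDemo {
    // Collects matches before and after resetting the same matcher.
    static String matchesAcrossResets() {
        Matcher m = Pattern.compile("\\d+").matcher("10 20 30");
        StringBuilder out = new StringBuilder();
        m.find();
        out.append(m.group());                 // first match in the original input
        m.find();
        out.append(',').append(m.group());     // second match
        m.reset();                             // scanning starts over at index 0
        m.find();
        out.append(',').append(m.group());     // first match again
        m.reset("7 8");                        // same pattern, new input
        m.find();
        out.append(',').append(m.group());     // first match in the new input
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchesAcrossResets()); // 10,20,10,7
    }
}
```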
Instances of this class are not safe for use by multiple concurrent threads.
Field Summary
Modifier and Type | Field | Description
(package private) int | acceptMode |
(package private) static int | ENDANCHOR | Matcher state used by the last node.
(package private) int | first | The range of string that last matched the pattern.
(package private) int | from | The range within the string that is to be matched.
(package private) int[] | groups | The storage used by groups.
(package private) int | last | The range of string that last matched the pattern.
(package private) int | lastAppendPosition | The index of the last position appended in a substitution.
(package private) int[] | locals | Storage used by nodes to tell what repetition they are on in a pattern, and where groups begin.
(package private) static int | NOANCHOR |
(package private) int | oldLast | The end index of what matched in the last match operation.
(package private) Pattern | parentPattern | The Pattern object that created this Matcher.
(package private) CharSequence | text | The original string being matched.
(package private) int | to | The range within the string that is to be matched.
Constructor Summary
Modifier | Constructor | Description
(package private) | Matcher() | No default constructor.
(package private) | Matcher(Pattern parent, CharSequence text) | All matchers have the state used by Pattern during a match.
Method Summary
Modifier and Type | Method | Description
Matcher | appendReplacement(StringBuffer sb, String replacement) | Implements a non-terminal append-and-replace step.
StringBuffer | appendTail(StringBuffer sb) | Implements a terminal append-and-replace step.
(package private) char | charAt(int i) | Returns this Matcher's input character at index i.
int | end() | Returns the index of the last character matched, plus one.
int | end(int group) | Returns the index of the last character, plus one, of the subsequence captured by the given group during the previous match operation.
boolean | find() | Attempts to find the next subsequence of the input sequence that matches the pattern.
boolean | find(int start) | Resets this matcher and then attempts to find the next subsequence of the input sequence that matches the pattern, starting at the specified index.
private boolean | find(int from, int to) | Initiates a search to find a Pattern within the given bounds.
(package private) CharSequence | getSubSequence(int beginIndex, int endIndex) | Generates a String from this Matcher's input in the specified range.
(package private) int | getTextLength() | Returns the end index of the text.
String | group() | Returns the input subsequence matched by the previous match.
String | group(int group) | Returns the input subsequence captured by the given group during the previous match operation.
int | groupCount() | Returns the number of capturing groups in this matcher's pattern.
boolean | lookingAt() | Attempts to match the input sequence, starting at the beginning, against the pattern.
private boolean | match(int from, int to, int anchor) | Initiates a search for an anchored match to a Pattern within the given bounds.
boolean | matches() | Attempts to match the entire input sequence against the pattern.
Pattern | pattern() | Returns the pattern that is interpreted by this matcher.
String | replaceAll(String replacement) | Replaces every subsequence of the input sequence that matches the pattern with the given replacement string.
String | replaceFirst(String replacement) | Replaces the first subsequence of the input sequence that matches the pattern with the given replacement string.
Matcher | reset() | Resets this matcher.
Matcher | reset(CharSequence input) | Resets this matcher with a new input sequence.
int | start() | Returns the start index of the previous match.
int | start(int group) | Returns the start index of the subsequence captured by the given group during the previous match operation.
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
Pattern parentPattern
int[] groups
int from
int to
CharSequence text
static final int ENDANCHOR
static final int NOANCHOR
int acceptMode
int first
int last
int oldLast
int lastAppendPosition
int[] locals
Constructor Detail |
Matcher()
Matcher(Pattern parent, CharSequence text)
Method Detail |
public Pattern pattern()
public Matcher reset()
Resetting a matcher discards all of its explicit state information and sets its append position to zero.
public Matcher reset(CharSequence input)
Resetting a matcher discards all of its explicit state information and sets its append position to zero.
input - The new input character sequence
public int start()
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
public int start(int group)
Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.start(0) is equivalent to m.start().
group - The index of a capturing group in this matcher's pattern
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
IndexOutOfBoundsException - If there is no capturing group in the pattern with the given index
public int end()
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
public int end(int group)
Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.end(0) is equivalent to m.end().
group - The index of a capturing group in this matcher's pattern
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
IndexOutOfBoundsException - If there is no capturing group in the pattern with the given index
public String group()
For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent.
Note that some patterns, for example a*, match the empty string. This method will return the empty string when the pattern successfully matches the empty string in the input.
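Both points can be checked with a small sketch. The class GroupSubstringDemo and its helpers are hypothetical names; the equivalence check simply walks every match that find() reports.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupSubstringDemo {
    // Returns true if group() equals s.substring(start(), end()) for every match.
    static boolean groupEqualsSubstring(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        while (m.find()) {
            if (!m.group().equals(s.substring(m.start(), m.end()))) {
                return false;
            }
        }
        return true;
    }

    // Returns the text of the first match, or null if nothing matches.
    static String firstGroup(String regex, String s) {
        Matcher m = Pattern.compile(regex).matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(groupEqualsSubstring("a*b", "aabfoob")); // true
        // a* matches the empty string at index 0, because 'b' is not an 'a'
        System.out.println("[" + firstGroup("a*", "baa") + "]");    // []
    }
}
```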
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
public String group(int group)
For a matcher m, input sequence s, and group index g, the expressions m.group(g) and s.substring(m.start(g), m.end(g)) are equivalent.
Capturing groups are indexed from left to right, starting at one. Group zero denotes the entire pattern, so the expression m.group(0) is equivalent to m.group().
If the match was successful but the group specified failed to match any part of the input sequence, then null is returned. Note that some groups, for example (a*), match the empty string. This method will return the empty string when such a group successfully matches the empty string in the input.
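The distinction between a group that did not participate (null) and a group that matched the empty string ("") is sketched below. GroupNullDemo and groupOrNull are hypothetical names, and the helper assumes the pattern is found somewhere in the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupNullDemo {
    // Returns group g of the first match; throws if the pattern is not found.
    static String groupOrNull(String regex, String input, int g) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("no match for " + regex);
        }
        return m.group(g);
    }

    public static void main(String[] args) {
        // Group 1 is on the alternative that was not taken, so it is null.
        System.out.println(groupOrNull("(a)|(b)", "b", 1));            // null
        // Group 1 participates but captures zero characters.
        System.out.println("[" + groupOrNull("(a*)b", "b", 1) + "]");  // []
    }
}
```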
group - The index of a capturing group in this matcher's pattern
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
IndexOutOfBoundsException - If there is no capturing group in the pattern with the given index
public int groupCount()
Group zero denotes the entire pattern by convention. It is not included in this count.
Any non-negative integer smaller than or equal to the value returned by this method is guaranteed to be a valid group index for this matcher.
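Because group zero is excluded from the count, a loop over all groups runs from 0 through groupCount() inclusive. A minimal sketch, with the hypothetical class GroupCountDemo:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // Returns group 0 followed by every capturing group, or an empty list
    // if the whole input does not match.
    static List<String> allGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> out = new ArrayList<>();
        if (m.matches()) {
            for (int g = 0; g <= m.groupCount(); g++) { // note <=, not <
                out.add(m.group(g));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(allGroups("(\\d+)-(\\d+)", "12-34")); // [12-34, 12, 34]
    }
}
```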
public boolean matches()
If the match succeeds then more information can be obtained via the start, end, and group methods.
public boolean find()
This method starts at the beginning of the input sequence or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match.
If the match succeeds then more information can be obtained via the start, end, and group methods.
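Repeated calls to find() therefore walk through the input one match at a time. The sketch below collects every match; FindLoopDemo and findAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindLoopDemo {
    // Returns the text of every match, in order of appearance.
    static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> found = new ArrayList<>();
        while (m.find()) {          // each call resumes after the previous match
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("a*b", "aabfooaabfooabfoob")); // [aab, aab, ab, b]
    }
}
```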
public boolean find(int start)
If the match succeeds then more information can be obtained via the start, end, and group methods, and subsequent invocations of the find() method will start at the first character not matched by this match.
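A sketch of starting a scan at a given index and then continuing with find(). The names FindFromDemo and startsFrom are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFromDemo {
    // Returns the start index of every match found at or after 'from'.
    static List<Integer> startsFrom(String regex, String input, int from) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<Integer> starts = new ArrayList<>();
        boolean found = m.find(from);   // resets, then searches from 'from'
        while (found) {
            starts.add(m.start());
            found = m.find();           // continues after the previous match
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(startsFrom("cat", "cat cat cat", 1)); // [4, 8]
    }
}
```

A start index equal to the input length is accepted and simply finds nothing unless the pattern can match the empty string; an index beyond the length throws IndexOutOfBoundsException.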
IndexOutOfBoundsException - If start is less than zero or if start is greater than the length of the input sequence.
public boolean lookingAt()
Like the matches method, this method always starts at the beginning of the input sequence; unlike that method, it does not require that the entire input sequence be matched.
If the match succeeds then more information can be obtained via the start, end, and group methods.
public Matcher appendReplacement(StringBuffer sb, String replacement)
This method performs the following actions:
It reads characters from the input sequence, starting at the append position, and appends them to the given string buffer. It stops after reading the last character preceding the previous match, that is, the character at index start() - 1.
It appends the given replacement string to the string buffer.
It sets the append position of this matcher to the index of the last character matched, plus one, that is, to end().
The replacement string may contain references to subsequences captured during the previous match: Each occurrence of $g will be replaced by the result of evaluating group(g). The first number after the $ is always treated as part of the group reference. Subsequent numbers are incorporated into g if they would form a legal group reference. Only the numerals '0' through '9' are considered as potential components of the group reference. If the second group matched the string "foo", for example, then passing the replacement string "$2bar" would cause "foobar" to be appended to the string buffer. A dollar sign ($) may be included as a literal in the replacement string by preceding it with a backslash (\$).
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
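The sketch below illustrates group references and literal dollar signs in a replacement string. ReplacementEscapeDemo and its methods are hypothetical names. The last method relies on Matcher.quoteReplacement, a static helper available in Java 5 and later, so it is an assumption that the runtime in use provides it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementEscapeDemo {
    // "$2bar" expands to the text of group 2 followed by the literal "bar".
    static String groupReference(String input) {
        return Pattern.compile("(x)(foo)").matcher(input).replaceAll("$2bar");
    }

    // "\\$" in Java source is the two characters \$ : an escaped, literal dollar sign.
    static String literalDollar(String input) {
        return Pattern.compile("EUR").matcher(input).replaceAll("\\$");
    }

    // quoteReplacement escapes $ and \ so that 'literal' is inserted unchanged.
    static String quoted(String input, String literal) {
        return Pattern.compile("EUR").matcher(input)
                .replaceAll(Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        System.out.println(groupReference("xfoo"));   // foobar
        System.out.println(literalDollar("5 EUR"));   // 5 $
        System.out.println(quoted("5 EUR", "$1"));    // 5 $1
    }
}
```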
This method is intended to be used in a loop together with the appendTail and find methods. The following code, for example, writes one dog two dogs in the yard to the standard-output stream:
    Pattern p = Pattern.compile("cat");
    Matcher m = p.matcher("one cat two cats in the yard");
    StringBuffer sb = new StringBuffer();
    while (m.find()) {
        m.appendReplacement(sb, "dog");
    }
    m.appendTail(sb);
    System.out.println(sb.toString());
sb - The target string buffer
replacement - The replacement string
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
IndexOutOfBoundsException - If the replacement string refers to a capturing group that does not exist in the pattern
public StringBuffer appendTail(StringBuffer sb)
This method reads characters from the input sequence, starting at the append position, and appends them to the given string buffer. It is intended to be invoked after one or more invocations of the appendReplacement method in order to copy the remainder of the input sequence.
sb - The target string buffer
public String replaceAll(String replacement)
This method first resets this matcher. It then scans the input sequence looking for matches of the pattern. Characters that are not part of any match are appended directly to the result string; each match is replaced in the result by the replacement string. The replacement string may contain references to captured subsequences as in the appendReplacement method.
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
Given the regular expression a*b, the input "aabfooaabfooabfoob", and the replacement string "-", an invocation of this method on a matcher for that expression would yield the string "-foo-foo-foo-".
Invoking this method changes this matcher's state. If the matcher is to be used in further matching operations then it should first be reset.
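The example and the reset advice can be sketched together. ReplaceAllDemo is a hypothetical class name; the second half shows the matcher being reset before it is used for further matching.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceAllDemo {
    // Replaces every match with "-", then resets and reuses the same matcher.
    static String replaceThenReuse() {
        Matcher m = Pattern.compile("a*b").matcher("aabfooaabfooabfoob");
        String replaced = m.replaceAll("-");
        m.reset();                       // required before further match operations
        String first = m.find() ? m.group() : "";
        return replaced + " " + first;
    }

    public static void main(String[] args) {
        System.out.println(replaceThenReuse()); // -foo-foo-foo- aab
    }
}
```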
replacement - The replacement string
public String replaceFirst(String replacement)
This method first resets this matcher. It then scans the input sequence looking for a match of the pattern. Characters that are not part of the match are appended directly to the result string; the match is replaced in the result by the replacement string. The replacement string may contain references to captured subsequences as in the appendReplacement method.
Given the regular expression dog, the input "zzzdogzzzdogzzz", and the replacement string "cat", an invocation of this method on a matcher for that expression would yield the string "zzzcatzzzdogzzz".
Invoking this method changes this matcher's state. If the matcher is to be used in further matching operations then it should first be reset.
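A minimal sketch of the example above, using the hypothetical class ReplaceFirstDemo. When the pattern does not occur, the input comes back unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceFirstDemo {
    // Replaces only the first occurrence of "dog" with "cat".
    static String firstDogToCat(String input) {
        Matcher m = Pattern.compile("dog").matcher(input);
        return m.replaceFirst("cat");
    }

    public static void main(String[] args) {
        System.out.println(firstDogToCat("zzzdogzzzdogzzz")); // zzzcatzzzdogzzz
    }
}
```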
replacement - The replacement string
private boolean find(int from, int to)
private boolean match(int from, int to, int anchor)
int getTextLength()
CharSequence getSubSequence(int beginIndex, int endIndex)
beginIndex - the beginning index, inclusive
endIndex - the ending index, exclusive
char charAt(int i)