java.util.regex
Class Pattern.BnM

java.lang.Object
  extended byjava.util.regex.Pattern.Node
      extended byjava.util.regex.Pattern.BnM
Enclosing class:
Pattern

static final class Pattern.BnM
extends Pattern.Node

Attempts to match a slice in the input using the Boyer-Moore string matching algorithm. The algorithm is based on the idea that the pattern can be shifted farther ahead in the search text if it is matched right to left.

The pattern is compared to the input one character at a time, from the rightmost character in the pattern to the left. If the characters all match the pattern has been found. If a character does not match, the pattern is shifted right a distance that is the maximum of two functions, the bad character shift and the good suffix shift. This shift moves the attempted match position through the input more quickly than a naive one postion at a time check.

The bad character shift is based on the character from the text that did not match. If the character does not appear in the pattern, the pattern can be shifted completely beyond the bad character. If the character does occur in the pattern, the pattern can be shifted to line the pattern up with the next occurrence of that character.

The good suffix shift is based on the idea that some subset on the right side of the pattern has matched. When a bad character is found, the pattern can be shifted right by the pattern length if the subset does not occur again in pattern, or by the amount of distance to the next occurrence of the subset in the pattern. Boyer-Moore search methods adapted from code by Amy Yu.


Field Summary
(package private)  char[] buffer
           
(package private)  int[] lastOcc
           
(package private)  Pattern.Node next
           
(package private)  int[] optoSft
           
 
Constructor Summary
(package private) Pattern.BnM(char[] src, int[] lastOcc, int[] optoSft, Pattern.Node next)
           
 
Method Summary
(package private)  Pattern.Node dup(boolean not)
           
(package private)  boolean match(Matcher matcher, int i, CharSequence seq)
          This method implements the classic accept node.
(package private) static Pattern.Node optimize(Pattern.Node node)
          Pre calculates arrays needed to generate the bad character shift and the good suffix shift.
(package private)  boolean study(Pattern.TreeInfo info)
          This method is good for all zero length assertions.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

buffer

char[] buffer

lastOcc

int[] lastOcc

optoSft

int[] optoSft

next

Pattern.Node next
Constructor Detail

Pattern.BnM

Pattern.BnM(char[] src,
            int[] lastOcc,
            int[] optoSft,
            Pattern.Node next)
Method Detail

optimize

static Pattern.Node optimize(Pattern.Node node)
Pre calculates arrays needed to generate the bad character shift and the good suffix shift. Only the last seven bits are used to see if chars match; This keeps the tables small and covers the heavily used ASII range, but occasionally results in an aliased match for the bad character shift.


match

boolean match(Matcher matcher,
              int i,
              CharSequence seq)
Description copied from class: Pattern.Node
This method implements the classic accept node.

Overrides:
match in class Pattern.Node

study

boolean study(Pattern.TreeInfo info)
Description copied from class: Pattern.Node
This method is good for all zero length assertions.

Overrides:
study in class Pattern.Node

dup

Pattern.Node dup(boolean not)