java.lang.Object
  javatools.parsers.BloomFilter<E>
E
- Object type that is to be inserted into the Bloom filter, e.g. String or Integer.
public class BloomFilter<E>
Implementation of a Bloom filter, as described at http://en.wikipedia.org/wiki/Bloom_filter. Inspired by the SimpleBloomFilter class written by Ian Clark. This implementation provides a more evenly distributed hash function by using a proper digest instead of the Java RNG. Many of the changes were proposed in comments on his blog: http://blog.locut.us/2008/01/12/a-decent-stand-alone-java-bloom-filter-implementation/
Constructor Summary | |
---|---|
BloomFilter(int bitSetSize, int expectedNumberOfFilterElements): Constructs an empty Bloom filter. |
Method Summary | |
---|---|
void | add(E element): Adds an object to the Bloom filter. |
void | addAll(java.util.Collection<? extends E> c): Adds all elements from a Collection to the Bloom filter. |
void | clear(): Sets all bits to false in the Bloom filter. |
boolean | contains(E element): Returns true if the element could have been inserted into the Bloom filter. |
boolean | containsAll(java.util.Collection<? extends E> c): Returns true if all the elements of a Collection could have been inserted into the Bloom filter. |
int | count(): Returns the number of elements added to the Bloom filter after it was constructed or after clear() was called. |
static long | createHash(byte[] data): Generates a digest based on the contents of an array of bytes. |
static long | createHash(java.lang.String val): Generates a digest based on the contents of a String. |
static long | createHash(java.lang.String val, java.lang.String charset): Generates a digest based on the contents of a String. |
boolean | equals(java.lang.Object obj): Compares the contents of two instances to see if they are equal. |
double | expectedFalsePositiveProbability(): Calculates the expected probability of false positives based on the number of expected filter elements and the size of the Bloom filter. |
boolean | getBit(int bit): Reads a single bit from the Bloom filter. |
double | getFalsePositiveProbability(): Gets the current probability of a false positive. |
double | getFalsePositiveProbability(double numberOfElements): Calculates the probability of a false positive given the specified number of inserted elements. |
int | getK(): Returns the value chosen for K. K is the optimal number of hash functions based on the size of the Bloom filter and the expected number of inserted elements. |
int | hashCode(): Calculates a hash code for this instance. |
void | setBit(int bit, boolean value): Sets a single bit in the Bloom filter. |
int | size(): Returns the number of bits in the Bloom filter. |
Methods inherited from class java.lang.Object |
---|
getClass, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public BloomFilter(int bitSetSize, int expectedNumberOfFilterElements)
bitSetSize
- defines how many bits should be used for the filter.
expectedNumberOfFilterElements
- defines the maximum number of elements the filter is expected to contain.
Method Detail |
---|
public static long createHash(java.lang.String val, java.lang.String charset) throws java.io.UnsupportedEncodingException
val
- specifies the input data.
charset
- specifies the encoding of the input data.
java.io.UnsupportedEncodingException
- if the charset is unsupported.
public static long createHash(java.lang.String val) throws java.io.UnsupportedEncodingException
val
- specifies the input data. The encoding is expected to be UTF-8.
java.io.UnsupportedEncodingException
- if UTF-8 is not supported.
public static long createHash(byte[] data)
data
- specifies input data.
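This page does not say which digest algorithm createHash uses or how the digest is reduced to a long. The following is only a sketch of the general idea under stated assumptions: MD5 is assumed for illustration, and the class name DigestHashSketch and the byte-folding scheme are hypothetical, not taken from the javatools source.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestHashSketch {

    // Assumption: MD5 is used here only for illustration; the documented class may use another digest.
    public static long createHash(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(data);
            long h = 0;
            // Fold the first 8 digest bytes into a 64-bit value, most significant byte first.
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Mirrors the String overload: the text is encoded as UTF-8 before hashing.
    public static long createHash(String val) {
        return createHash(val.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(createHash("hello")));
    }
}
```

Because the hash is derived from a cryptographic digest rather than a seeded random number generator, equal inputs always map to the same value and nearby inputs are spread evenly over the output range.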
public boolean equals(java.lang.Object obj)
Overrides: equals in class java.lang.Object
obj
- is the object to compare to.
public int hashCode()
Overrides: hashCode in class java.lang.Object
public double expectedFalsePositiveProbability()
public double getFalsePositiveProbability(double numberOfElements)
numberOfElements
- number of inserted elements.
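The formula behind this method is not shown on this page. A common approximation for a Bloom filter with m bits, k hash functions, and n inserted elements is p = (1 - e^(-kn/m))^k. The sketch below assumes that approximation; the class name FalsePositiveSketch and the parameter names are hypothetical.

```java
public class FalsePositiveSketch {

    // m = number of bits, k = number of hash functions, n = number of inserted elements.
    // Assumption: the standard approximation p = (1 - e^(-k*n/m))^k.
    public static double falsePositiveProbability(int m, int k, double n) {
        return Math.pow(1.0 - Math.exp(-k * n / m), k);
    }

    public static void main(String[] args) {
        // The probability grows as more elements are inserted into a filter of fixed size.
        for (int n = 0; n <= 400; n += 100) {
            System.out.println(n + " elements: " + falsePositiveProbability(1000, 7, n));
        }
    }
}
```

Under this approximation, an empty filter has probability 0, and the value rises towards 1 as the number of inserted elements grows past the expected count.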
public double getFalsePositiveProbability()
public int getK()
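For a filter with m bits and n expected elements, the number of hash functions that minimises the false positive rate is k = (m / n) ln 2. The sketch below shows how such a value could be computed. The rounding rule and the lower bound of 1 are assumptions, and the class name OptimalKSketch is hypothetical.

```java
public class OptimalKSketch {

    // k = (m / n) * ln 2, rounded to the nearest integer.
    // Assumption: rounding to nearest and clamping to at least 1; the documented class may differ.
    public static int optimalK(int bitSetSize, int expectedNumberOfFilterElements) {
        double k = (bitSetSize / (double) expectedNumberOfFilterElements) * Math.log(2.0);
        return Math.max(1, (int) Math.round(k));
    }

    public static void main(String[] args) {
        System.out.println(optimalK(1000, 100));
    }
}
```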
public void clear()
public void add(E element) throws java.io.UnsupportedEncodingException
element
- is an element to register in the Bloom filter.
java.io.UnsupportedEncodingException
- if UTF-8 is unsupported.
public void addAll(java.util.Collection<? extends E> c) throws java.io.UnsupportedEncodingException
c
- Collection of elements.
java.io.UnsupportedEncodingException
- if UTF-8 is unsupported.
public boolean contains(E element) throws java.io.UnsupportedEncodingException
element
- element to check.
java.io.UnsupportedEncodingException
- if UTF-8 is unsupported.
public boolean containsAll(java.util.Collection<? extends E> c) throws java.io.UnsupportedEncodingException
c
- elements to check.
java.io.UnsupportedEncodingException
- if UTF-8 is unsupported.
public boolean getBit(int bit)
bit
- the bit to read.
public void setBit(int bit, boolean value)
bit
- is the bit to set.
value
- If true, the bit is set. If false, the bit is cleared.
public int size()
public int count()
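The add, contains, clear and count semantics documented above can be illustrated with a small standalone sketch. This is not the javatools implementation. MiniBloomFilter is a hypothetical class, and it assumes that elements are hashed via their toString() form encoded as UTF-8, that MD5 is the digest, and that the k bit positions are obtained by appending a salt to the input.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.BitSet;

public class MiniBloomFilter<E> {
    private final BitSet bits;
    private final int size;  // number of bits in the filter
    private final int k;     // number of hash functions
    private int count;       // elements added since construction or the last clear()

    public MiniBloomFilter(int bitSetSize, int expectedNumberOfFilterElements) {
        this.size = bitSetSize;
        this.bits = new BitSet(bitSetSize);
        double optimal = (bitSetSize / (double) expectedNumberOfFilterElements) * Math.log(2.0);
        this.k = Math.max(1, (int) Math.round(optimal));
    }

    // Assumption: the i-th position comes from an MD5 digest of the element text plus a salt.
    private int position(E element, int i) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] d = md.digest((element.toString() + ":" + i).getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int j = 0; j < 8; j++) {
                h = (h << 8) | (d[j] & 0xFF);
            }
            return (int) Math.floorMod(h, (long) size);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public void add(E element) {
        for (int i = 0; i < k; i++) {
            bits.set(position(element, i));
        }
        count++;
    }

    public boolean contains(E element) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(position(element, i))) {
                return false; // at least one bit is unset: the element was never added
            }
        }
        return true; // all bits set: the element may have been added (false positives are possible)
    }

    public void clear() {
        bits.clear();
        count = 0;
    }

    public int count() {
        return count;
    }

    public static void main(String[] args) {
        MiniBloomFilter<String> filter = new MiniBloomFilter<>(1024, 100);
        filter.add("apple");
        System.out.println(filter.contains("apple"));
        System.out.println(filter.count());
    }
}
```

A true result from contains means only that the element could have been inserted, since other elements may have set the same bits. A false result is definitive.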