ISSearch
Class ISParser

java.lang.Object
  extended by ISSearch.ISParser
All Implemented Interfaces:
ISParserInterface

public class ISParser
extends Object
implements ISParserInterface

This mandatory class must implement all methods prescribed by ISParserInterface.


Constructor Summary
ISParser()
          Creates a new instance of ISParser
 
Method Summary
 boolean isStopword(String who)
          Decides whether the given token is a stopword.
 ISDocumentInterface parse(Reader input)
          Performs the input analysis.
 String stem(String who)
          Applies the Porter stemming algorithm and returns the resulting word stem.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ISParser

public ISParser()
Creates a new instance of ISParser

Method Detail

isStopword

public boolean isStopword(String who)
Decides whether the given token is a stopword. This method must use the FreeWAIS stopword list and must be implemented case-insensitively (e.g., both tokens 'the' and 'ThE' must be recognized as stopwords).

Specified by:
isStopword in interface ISParserInterface
Parameters:
who - The String to be checked.
Returns:
true if the given string is a stopword, false otherwise.
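
The case-insensitive lookup described above could be sketched as follows. This is a minimal illustration, not the required implementation: the class name StopwordSketch is hypothetical, the four entries in the set are placeholders rather than the actual FreeWAIS list, and returning false for null is an assumption the documentation does not state. Lowercasing with Locale.ROOT is a choice made here to avoid locale-specific case mappings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class StopwordSketch {
    // Placeholder entries only; the real FreeWAIS stopword list must be loaded here.
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("the", "a", "and", "of"));

    // Lowercases the token before the lookup so that 'the' and 'ThE' match alike.
    static boolean isStopword(String who) {
        if (who == null) {
            return false; // assumption: null is treated as "not a stopword"
        }
        return STOPWORDS.contains(who.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(isStopword("the"));
        System.out.println(isStopword("ThE"));
        System.out.println(isStopword("parser"));
    }
}
```

Storing the list in lowercase and normalizing only the query keeps each lookup a single hash probe.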

parse

public ISDocumentInterface parse(Reader input)
Performs the input analysis. Returns the container object that implements the ISDocumentInterface and contains extracted words, word stems, and links.

Specified by:
parse in interface ISParserInterface
Parameters:
input - the input of the parser (e.g., text file or HTTP connection), represented by the Reader
Returns:
Container object with terms and links, or null if an internal error occurs.
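
The Reader-based input and the null-on-error contract might be handled along the following lines. This sketch makes several assumptions: the methods of ISDocumentInterface are not described in this section, so the hypothetical helper extractWords returns a plain list of words instead of a document container; the tokenization rule (split on anything that is not a letter) is illustrative only; and link extraction and stemming are left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ParseSketch {
    // Hypothetical helper: collects letter-only tokens from the Reader.
    // Returns null on a null input or an I/O error, mirroring the documented
    // "null if an internal error occurs" contract of parse.
    static List<String> extractWords(Reader input) {
        if (input == null) {
            return null;
        }
        List<String> words = new ArrayList<>();
        try {
            // The Reader is not closed here; the caller is assumed to own it.
            BufferedReader reader = new BufferedReader(input);
            String line;
            while ((line = reader.readLine()) != null) {
                // Split on runs of non-letter characters (illustrative rule).
                for (String token : line.split("[^\\p{L}]+")) {
                    if (!token.isEmpty()) {
                        words.add(token);
                    }
                }
            }
        } catch (IOException e) {
            return null;
        }
        return words;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a file or HTTP connection.
        List<String> words = extractWords(new StringReader("The parser reads text."));
        System.out.println(words);
    }
}
```

Because the method accepts any Reader, the same code path serves a FileReader, an InputStreamReader wrapped around an HTTP response, or a StringReader in tests.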

stem

public String stem(String who)
Applies the Porter stemming algorithm and returns the resulting word stem. The output must be normalized (using String.toLowerCase() and String.trim()).

Specified by:
stem in interface ISParserInterface
Parameters:
who - The word to be stemmed.
Returns:
word stem, trimmed and lowercase.
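
The normalization requirement can be illustrated with a short sketch. Note the limits: the class name StemSketch is hypothetical, and only step 1a of the Porter algorithm (plural suffixes) is implemented, so this is not a complete stemmer. A full implementation would apply the remaining Porter steps after step 1a. The sketch uses toLowerCase(Locale.ROOT), a locale-independent variant of the String.toLowerCase() call named above.

```java
import java.util.Locale;

class StemSketch {
    // Step 1a of the Porter algorithm only: SSES -> SS, IES -> I, SS -> SS, S -> "".
    static String porterStep1a(String word) {
        if (word.endsWith("sses")) {
            return word.substring(0, word.length() - 2); // caresses -> caress
        }
        if (word.endsWith("ies")) {
            return word.substring(0, word.length() - 2); // ponies -> poni
        }
        if (word.endsWith("ss")) {
            return word;                                 // caress -> caress
        }
        if (word.endsWith("s")) {
            return word.substring(0, word.length() - 1); // cats -> cat
        }
        return word;
    }

    // Normalizes first (trim, lowercase), then stems, so the result is
    // already trimmed and lowercase as the contract requires.
    static String stem(String who) {
        String normalized = who.trim().toLowerCase(Locale.ROOT);
        return porterStep1a(normalized);
    }

    public static void main(String[] args) {
        System.out.println(stem("  Ponies "));
        System.out.println(stem("CARESSES"));
        System.out.println(stem("cats"));
    }
}
```

Normalizing before stemming matters because the suffix rules compare against lowercase patterns; an input such as "CARESSES" would otherwise pass through unchanged.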