java.lang.Object ISSearch.ISDBCrawler
The Crawler class of the Web search engine. This class is used to start and stop the Crawler, to reset the engine and to control crawling parameters.
Field Summary | |
private ISDBinterface |
dbinterface
The built-in database interface of the crawler. |
Fields inherited from interface ISSearch.ISCrawlerInterface |
RUNNING, STOPPED |
Constructor Summary | |
ISDBCrawler()
Creates a new instance of ISDBCrawler. |
Method Summary | |
void |
addLink(URL link)
Adds a new link to the URL queue, if the link is not yet visited. |
void |
closeDB()
Closes the database connection of the built-in database interface. |
URL |
getBest()
Returns the best candidate to be visited next. |
int |
getCrawlingDepth()
Returns the current maximum allowed crawling depth. |
ISDocumentInterface |
getCurrentDocument()
Returns the last document visited by the Crawler. |
URL |
getCurrentURL()
Returns the last URL visited by the Crawler. |
ISDBinterface |
getDBInterface()
Returns the built-in database interface of the crawler. |
int |
getMaxQueueSize()
Returns the maximum allowed size of the URL queue. |
int |
getQueueSize()
Returns the current size of the URL queue. |
int |
getState()
Returns the current state of the crawler. |
boolean |
isVisited(URL doc)
Checks whether the URL of the given document has already been visited by the crawler. |
boolean |
openDB()
Initializes the internal database interface and opens its database connection. |
void |
reset()
Resets the crawler. |
void |
run()
When an object implementing interface Runnable is used to create a thread, starting the thread causes the object's run method to be called in that separately executing thread. |
void |
setCrawlingDepth(int depth)
Sets the maximum allowed crawling depth. |
void |
setQueueMaxSize(int m)
Sets the maximum allowed size of the URL queue. |
void |
start()
Starts the thread of the crawler and changes the engine state to RUNNING |
void |
stop()
Stops the crawler. |
boolean |
store(URL link, ISDocumentInterface doc)
Stores the crawled document and its URL into the database. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
private ISDBinterface dbinterface
The built-in database interface of the crawler.
Constructor Detail |
public ISDBCrawler()
Creates a new instance of ISDBCrawler.
Method Detail |
public boolean store(URL link, ISDocumentInterface doc)
Stores the crawled document and its URL into the database.
Specified by: store in interface ISDBCrawlerInterface
Parameters:
link - the URL of the crawled document
doc - extracted terms and links from the document
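public boolean openDB()
Initializes the internal database interface and opens its database connection.
Specified by: openDB in interface ISDBCrawlerInterface
public void closeDB()
Closes the database connection of the built-in database interface.
Specified by: closeDB in interface ISDBCrawlerInterface
public ISDBinterface getDBInterface()
Returns the built-in database interface of the crawler.
Specified by: getDBInterface in interface ISDBCrawlerInterface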
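The three database methods above suggest an open, store, close call order. The sketch below illustrates that order against a hypothetical in-memory stand-in. The class `InMemoryCrawlerStore`, its map, the `url` helper, and the use of a plain String in place of an ISDocumentInterface are assumptions made for illustration; none of this is the actual ISDBinterface.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the crawler's database methods; not the real ISDBCrawler.
class InMemoryCrawlerStore {
    private Map<String, String> table; // null while the "connection" is closed

    // Mirrors openDB(): returns true once the store is ready for use.
    boolean openDB() {
        if (table == null) {
            table = new HashMap<>();
        }
        return true;
    }

    // Mirrors store(URL, doc): returns false when nothing could be stored.
    boolean store(URL link, String doc) {
        if (table == null || link == null || doc == null) {
            return false;
        }
        // Key by the textual form; URL.equals/hashCode may trigger host lookups.
        table.put(link.toExternalForm(), doc);
        return true;
    }

    // Mirrors closeDB(): releases the store, so later store() calls fail.
    void closeDB() {
        table = null;
    }

    // Helper that wraps the checked exception for brevity.
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        InMemoryCrawlerStore db = new InMemoryCrawlerStore();
        if (db.openDB()) {
            try {
                boolean ok = db.store(url("http://example.com/"), "terms and links");
                System.out.println("stored: " + ok); // stored: true
            } finally {
                db.closeDB(); // always close, even if storing throws
            }
        }
    }
}
```

Checking the boolean results of openDB() and store() and closing in a finally block keeps the connection from leaking when a store attempt fails.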
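public void addLink(URL link)
Adds a new link to the URL queue, if the link is not yet visited.
Specified by: addLink in interface ISCrawlerInterface
Parameters:
link - the URL of the new target
public URL getBest()
Returns the best candidate to be visited next.
Specified by: getBest in interface ISCrawlerInterface
Returns:
null if the queue is empty.
public int getCrawlingDepth()
Returns the current maximum allowed crawling depth.
Specified by: getCrawlingDepth in interface ISCrawlerInterface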
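public ISDocumentInterface getCurrentDocument()
Returns the last document visited by the crawler.
Specified by: getCurrentDocument in interface ISCrawlerInterface
public URL getCurrentURL()
Returns the last URL visited by the crawler.
Specified by: getCurrentURL in interface ISCrawlerInterface
public int getMaxQueueSize()
Returns the maximum allowed size of the URL queue.
Specified by: getMaxQueueSize in interface ISCrawlerInterface
public int getQueueSize()
Returns the current size of the URL queue.
Specified by: getQueueSize in interface ISCrawlerInterface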
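The queue-related methods above (addLink, getBest, getMaxQueueSize, getQueueSize) can be pictured as a small bounded queue. This is only a sketch under stated assumptions: the class name `UrlQueueSketch`, first-in-first-out as the meaning of "best", dropping links when the queue is full, and marking a URL as visited when getBest hands it out are all illustrative choices. The source does not say how ISDBCrawler ranks candidates or handles overflow.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical bounded URL queue; not the real ISDBCrawler implementation.
class UrlQueueSketch {
    private final Deque<URL> queue = new ArrayDeque<>();
    // Keys are URL strings; URL.equals/hashCode may resolve host names.
    private final Set<String> queued = new HashSet<>();
    private final Set<String> visited = new HashSet<>();
    private int maxQueueSize;

    UrlQueueSketch(int maxQueueSize) {
        this.maxQueueSize = maxQueueSize;
    }

    // Adds the link unless it is visited, already queued, or the queue is full.
    void addLink(URL link) {
        String key = link.toExternalForm();
        if (visited.contains(key) || queued.contains(key)) {
            return;
        }
        if (queue.size() >= maxQueueSize) {
            return; // assumed overflow policy: drop the link
        }
        queue.addLast(link);
        queued.add(key);
    }

    // Returns the next candidate (oldest first here), or null if the queue is empty.
    URL getBest() {
        URL next = queue.pollFirst();
        if (next != null) {
            String key = next.toExternalForm();
            queued.remove(key);
            visited.add(key); // assumption: handing out a URL marks it visited
        }
        return next;
    }

    boolean isVisited(URL doc) { return visited.contains(doc.toExternalForm()); }
    int getQueueSize() { return queue.size(); }
    int getMaxQueueSize() { return maxQueueSize; }
    void setQueueMaxSize(int m) { maxQueueSize = m; }

    // Helper that wraps the checked exception for brevity.
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        UrlQueueSketch q = new UrlQueueSketch(2);
        q.addLink(url("http://example.com/a"));
        q.addLink(url("http://example.com/b"));
        q.addLink(url("http://example.com/c")); // dropped: queue is full
        System.out.println(q.getQueueSize());   // 2
        URL first = q.getBest();
        q.addLink(first);                       // ignored: already visited
        System.out.println(q.getQueueSize());   // 1
        q.getBest();
        System.out.println(q.getBest());        // null
    }
}
```

Callers of getBest() should check for null before using the result, since an empty queue is reported that way rather than by an exception.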
public int getState()
Returns the current state of the crawler. The possible states are RUNNING and STOPPED.
Specified by: getState in interface ISCrawlerInterface
Returns:
RUNNING or STOPPED
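public boolean isVisited(URL doc)
Checks whether the URL of the given document has already been visited by the crawler.
Specified by: isVisited in interface ISCrawlerInterface
Returns:
true if the engine was able to recognize the given URL as already visited, false otherwise.
public void setCrawlingDepth(int depth)
Sets the maximum allowed crawling depth.
Specified by: setCrawlingDepth in interface ISCrawlerInterface
Parameters:
depth - the maximum allowed crawling depth
public void setQueueMaxSize(int m)
Sets the maximum allowed size of the URL queue.
Specified by: setQueueMaxSize in interface ISCrawlerInterface
Parameters:
m - the maximum allowed queue size
public void start()
Starts the thread of the crawler and changes the engine state to RUNNING.
Specified by: start in interface ISCrawlerInterface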
public void stop()
Stops the crawler and changes the engine state to STOPPED.
Specified by: stop in interface ISCrawlerInterface
public void reset()
Resets the crawler.
Specified by: reset in interface ISCrawlerInterface
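The start, stop, and reset methods above, together with getState, describe a two-state lifecycle. A minimal sketch of such a lifecycle follows, assuming the state is a volatile int flag that the worker loop polls. The class `LifecycleSketch`, the constant values 1 and 0, the step counter, and the behaviour of reset() shown here are illustrative assumptions, not taken from ISDBCrawler.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical two-state worker; constant values are assumed, not from ISCrawlerInterface.
class LifecycleSketch implements Runnable {
    static final int RUNNING = 1;
    static final int STOPPED = 0;

    private volatile int state = STOPPED; // volatile so the worker sees stop() promptly
    private final AtomicInteger steps = new AtomicInteger();
    private Thread worker;

    // Switches to RUNNING and starts the worker thread; a no-op if already running.
    synchronized void start() {
        if (state == RUNNING) {
            return;
        }
        state = RUNNING;
        worker = new Thread(this, "crawler-sketch");
        worker.start();
    }

    // Switches to STOPPED and waits for the worker loop to exit.
    synchronized void stop() {
        state = STOPPED;
        Thread t = worker;
        worker = null;
        if (t != null) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Assumed meaning of reset: stop the worker, then clear accumulated progress.
    void reset() {
        stop();
        steps.set(0);
    }

    int getState() { return state; }
    int getSteps() { return steps.get(); }

    @Override
    public void run() {
        while (state == RUNNING) {
            steps.incrementAndGet(); // placeholder for "fetch and process one URL"
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LifecycleSketch c = new LifecycleSketch();
        c.start();
        Thread.sleep(20);
        c.stop();
        System.out.println("stopped: " + (c.getState() == STOPPED));
        c.reset();
        System.out.println("steps after reset: " + c.getSteps());
    }
}
```

Polling a flag lets the worker finish its current step before exiting, which is generally safer than forcibly terminating a thread.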
public void run()
Description copied from interface: Runnable
When an object implementing interface Runnable is used to create a thread, starting the thread causes the object's run method to be called in that separately executing thread.
The general contract of the method run is that it may take any action whatsoever.
Specified by: run in interface Runnable
See Also: Thread.run()
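The Runnable contract quoted above can be observed with the standard java.lang.Thread API alone. The sketch below records which thread executed run(). The class name `RunnableSketch` and the helper `runInNewThread` are illustrative and unrelated to ISDBCrawler.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative demo of the Runnable contract; unrelated to the crawler classes.
class RunnableSketch {
    // Runs a task on a new thread and returns the name of the thread that executed it.
    static String runInNewThread(String threadName) {
        AtomicReference<String> executedOn = new AtomicReference<>();
        Runnable task = () -> executedOn.set(Thread.currentThread().getName());
        Thread t = new Thread(task, threadName);
        t.start(); // causes run() to be called on the new thread
        try {
            t.join(); // wait until run() has returned
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return executedOn.get();
    }

    public static void main(String[] args) {
        System.out.println("caller thread: " + Thread.currentThread().getName());
        System.out.println("run() executed on: " + runInNewThread("crawler-thread"));
    }
}
```

Calling run() directly would execute the task on the caller's thread instead; only Thread.start() produces the separately executing thread described above.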