Interface of the main Crawler class of the Web search engine. This class is used to start and stop the Crawler, to reset the engine and to control crawling parameters.
See Also: Runnable, Thread, InetAddress, URL, HttpURLConnection, InputStreamReader, BufferedReader, Exception
Field Summary
static int | RUNNING | The Running state of the current thread
static int | STOPPED | The Idle state of the current thread
Method Summary
void | addLink(URL link) | Adds a new link to the URL queue, if the link is not yet visited.
URL | getBest() | Returns the best candidate to be visited next.
int | getCrawlingDepth() | Returns the current maximum allowed crawling depth.
ISDocumentInterface | getCurrentDocument() | Returns the last document visited by the Crawler.
URL | getCurrentURL() | Returns the last URL visited by the Crawler.
int | getMaxQueueSize() | Returns the maximum allowed size of the URL queue.
int | getQueueSize() | Returns the current size of the URL queue.
int | getState() | Returns the current state of the crawler.
boolean | isVisited(URL doc) | Checks if the URL of the given document is already visited by the crawler.
void | reset() | Resets the crawler.
void | setCrawlingDepth(int depth) | Sets the maximum allowed crawling depth.
void | setQueueMaxSize(int m) | Sets the maximum allowed size of the URL queue.
void | start() | Starts the thread of the crawler and changes the engine state to RUNNING.
void | stop() | Stops the crawler.
Methods inherited from interface java.lang.Runnable |
run |
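Read together, the summary tables above describe an interface roughly like the following. This is only a sketch reconstructed from the listed signatures: the name `CrawlerSketch` is hypothetical (this page does not state the interface's name), the constant values 0 and 1 are assumptions, and `ISDocumentInterface` is declared here as an empty placeholder because its definition is not shown.

```java
import java.net.URL;

// Placeholder only: the real ISDocumentInterface is not shown on this page.
interface ISDocumentInterface {}

// Hypothetical name; the method signatures are copied from the summary above.
interface CrawlerSketch extends Runnable {
    int STOPPED = 0; // assumed value; the page gives only the name
    int RUNNING = 1; // assumed value; the page gives only the name

    // Lifecycle
    void start();
    void stop();
    void reset();
    int getState();

    // URL queue
    void addLink(URL link);
    URL getBest();
    boolean isVisited(URL doc);
    int getQueueSize();

    // Crawling parameters
    int getMaxQueueSize();
    void setQueueMaxSize(int m);
    int getCrawlingDepth();
    void setCrawlingDepth(int depth);

    // Most recent visit
    ISDocumentInterface getCurrentDocument();
    URL getCurrentURL();
}
```

Fields declared in a Java interface are implicitly `public static final`, which matches the field declarations documented on this page.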
Field Detail |
public static final int RUNNING
The Running state of the current thread.
public static final int STOPPED
The Idle state of the current thread.
Method Detail |
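public void start()
Starts the thread of the crawler and changes the engine state to RUNNING.
public void stop()
Stops the crawler and changes the engine state to STOPPED.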
public void reset()
Resets the crawler. [...] STOPPED, [...]
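The lifecycle methods could be backed by a single worker thread and an atomic state flag, as in the minimal sketch below. The class name `LifecycleSketch`, the constant values, the `pagesSeen` counter, and the choice to have `reset()` stop the worker before clearing state are all assumptions made for illustration; the page does not show how the real crawler implements these methods.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class: models only start(), stop(), reset(), getState(), run().
class LifecycleSketch implements Runnable {
    static final int STOPPED = 0; // assumed value
    static final int RUNNING = 1; // assumed value

    private final AtomicInteger state = new AtomicInteger(STOPPED);
    private final AtomicInteger pagesSeen = new AtomicInteger(); // stand-in for crawl state
    private volatile Thread worker;

    public void start() {
        // Only the caller that flips STOPPED -> RUNNING launches the thread,
        // so calling start() twice does not create a second worker.
        if (state.compareAndSet(STOPPED, RUNNING)) {
            worker = new Thread(this, "crawler-sketch");
            worker.start();
        }
    }

    public void stop() {
        state.set(STOPPED);
        Thread t = worker;
        if (t == null || t == Thread.currentThread()) {
            return; // nothing to wait for, or called from the worker itself
        }
        t.interrupt(); // wake the worker if it is sleeping
        try {
            t.join(); // wait until the worker has left run()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public void reset() {
        stop();           // assumption: a reset leaves the crawler STOPPED
        pagesSeen.set(0); // discard accumulated crawl state
    }

    public int getState() {
        return state.get();
    }

    int getPagesSeen() {
        return pagesSeen.get();
    }

    @Override
    public void run() {
        while (state.get() == RUNNING) {
            pagesSeen.incrementAndGet(); // placeholder for fetching one page
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return; // stop() interrupted the sleep
            }
        }
    }
}
```

Because `stop()` joins the worker before returning, a caller that invokes `reset()` can rely on no page being processed after the call completes.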
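public void addLink(URL link)
Adds a new link to the URL queue, if the link is not yet visited.
Parameters: link - The URL link representation of the new target
public int getState()
Returns the current state of the crawler. The possible states are RUNNING and STOPPED.
Returns: RUNNING or STOPPED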
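public int getQueueSize()
Returns the current size of the URL queue.
public void setQueueMaxSize(int m)
Sets the maximum allowed size of the URL queue.
Parameters: m - The maximum allowed queue size
public int getMaxQueueSize()
Returns the maximum allowed size of the URL queue.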
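public void setCrawlingDepth(int depth)
Sets the maximum allowed crawling depth.
Parameters: depth - The maximum allowed crawling depth
public int getCrawlingDepth()
Returns the current maximum allowed crawling depth.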
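One way the two limits might interact with the URL queue is sketched below. Everything beyond the getter and setter names is an assumption: the class `LimitedQueueSketch`, the `offer` method, the default values, and the idea that each link carries a hop count measured from the seed page are illustrative only. The documented `addLink(URL)` takes no depth argument, so the real crawler must track depth in some other way.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical class showing how the two limits could gate new links.
class LimitedQueueSketch {
    private final Deque<String> queue = new ArrayDeque<>();
    private int maxQueueSize = 1000; // assumed default
    private int crawlingDepth = 3;   // assumed default

    void setQueueMaxSize(int m) {
        if (m < 0) {
            throw new IllegalArgumentException("m must be >= 0");
        }
        maxQueueSize = m;
    }

    int getMaxQueueSize() {
        return maxQueueSize;
    }

    void setCrawlingDepth(int depth) {
        if (depth < 0) {
            throw new IllegalArgumentException("depth must be >= 0");
        }
        crawlingDepth = depth;
    }

    int getCrawlingDepth() {
        return crawlingDepth;
    }

    int getQueueSize() {
        return queue.size();
    }

    // Hypothetical helper: accepts a link found `depth` hops from the seed.
    // Returns false when the link is too deep or the queue is already full.
    boolean offer(String url, int depth) {
        if (depth > crawlingDepth || queue.size() >= maxQueueSize) {
            return false;
        }
        queue.addLast(url);
        return true;
    }
}
```

With a maximum size of 2 and a depth of 1, for example, a link found two hops from the seed is rejected, and so is a third link at any depth once two are queued.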
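public URL getBest()
Returns the best candidate to be visited next.
Returns: the best candidate, or null if the queue is empty
public boolean isVisited(URL doc)
Checks if the URL of the given document is already visited by the crawler.
Returns: true if the engine was able to recognize the given URL as already visited, false otherwise
public ISDocumentInterface getCurrentDocument()
Returns the last document visited by the Crawler.
public URL getCurrentURL()
Returns the last URL visited by the Crawler.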
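The contract of `addLink`, `getBest`, and `isVisited` can be illustrated with a small in-memory frontier. This is a sketch under several assumptions, not the crawler's actual logic: the class `FrontierSketch` and its `url` helper are hypothetical, "best" is taken to mean the oldest queued link because the page does not say how candidates are ranked, and `getBest()` is assumed to remove the link and mark it visited.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical class: FIFO frontier with duplicate and visited checks.
class FrontierSketch {
    private final Deque<URL> queue = new ArrayDeque<>();
    private final Set<String> queued = new HashSet<>();
    private final Set<String> visited = new HashSet<>();

    // Compare URLs by their string form; URL.equals() may resolve host names.
    private static String key(URL u) {
        return u.toExternalForm();
    }

    // Hypothetical convenience helper for building URLs in examples.
    static URL url(String s) {
        try {
            return URI.create(s).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Queues the link unless it was already visited or is already waiting.
    void addLink(URL link) {
        String k = key(link);
        if (!visited.contains(k) && queued.add(k)) {
            queue.addLast(link);
        }
    }

    // Assumption: "best" is the oldest queued link, and taking it marks it visited.
    URL getBest() {
        URL next = queue.pollFirst(); // null when the queue is empty
        if (next != null) {
            queued.remove(key(next));
            visited.add(key(next));
        }
        return next;
    }

    boolean isVisited(URL doc) {
        return visited.contains(key(doc));
    }

    int getQueueSize() {
        return queue.size();
    }
}
```

Keying the sets on `toExternalForm()` rather than on `URL` objects is a deliberate choice: `URL.equals` and `URL.hashCode` can trigger host name resolution, which is slow and can treat different hosts as equal.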