Future Work

The BINGO! system is a first step towards a new generation of information search tools based on the focused crawling paradigm. Most importantly, we need far more comprehensive experiments, both long-term and large-scale, to evaluate the strengths and weaknesses of our approach and to improve the algorithms and the calibration of their control parameters. To this end, we have implemented BINGO! as a framework that can easily be extended with new building blocks for individual aspects (e.g., feature selection or authority ranking) and is thus well suited for further exploration of the design space for focused crawling.

Another goal of our future work is to combine BINGO! with the XXL search engine that we have been implementing for ranked retrieval of XML data. XXL (for "flexible XML search language") is a search language that, like XQuery and many other XML query languages, integrates SQL-style logical conditions with pattern-matching capabilities along paths that correspond to parent-child element relationships and XLink pointers. In contrast to almost all other query languages, XXL supports semantic similarity comparisons that are based on information-retrieval-style term statistics for element contents and on ontological distances for element names. The result of an XXL query is therefore a ranked list of XML element paths in descending order of estimated relevance.
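
To make the idea of a ranked result concrete, the following minimal sketch ranks element paths by combining a content score with an element-name score. It is an illustration only and does not reproduce XXL's actual scoring: the similarity table NAME_SIMILARITY, the functions name_similarity, content_similarity and rank_elements, and the choice to multiply the two scores are all assumptions made for this example. A plain term-frequency cosine stands in for the information-retrieval-style term statistics, and a hand-written lookup table stands in for ontological distances.

```python
import math
from collections import Counter

# Hypothetical stand-in for ontological distances between element names,
# expressed here as similarities in [0, 1]. Not taken from XXL.
NAME_SIMILARITY = {
    ("book", "book"): 1.0,
    ("book", "monograph"): 0.8,
    ("book", "article"): 0.4,
}


def name_similarity(query_name, element_name):
    """Look up the assumed similarity between two element names."""
    return NAME_SIMILARITY.get((query_name, element_name), 0.0)


def content_similarity(query_text, element_text):
    """Cosine similarity over raw term frequencies (a simple stand-in)."""
    q = Counter(query_text.lower().split())
    e = Counter(element_text.lower().split())
    dot = sum(q[term] * e[term] for term in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in e.values())))
    return dot / norm if norm else 0.0


def rank_elements(query_name, query_text, elements):
    """Return (path, score) pairs in descending order of score.

    `elements` is a list of (path, element_name, content) triples.
    The two similarities are multiplied, which is an assumption of
    this sketch; elements with a zero score are dropped.
    """
    scored = []
    for path, name, content in elements:
        score = (name_similarity(query_name, name)
                 * content_similarity(query_text, content))
        if score > 0:
            scored.append((path, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

Multiplying the two scores means that an element must match on both its name and its content to be returned at all; a weighted sum would be an equally plausible choice if partial matches should also be ranked.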

To speed up the evaluation of such complex queries, we have designed an ontology index, and we plan to use BINGO! to populate and maintain the contents of this index.