A knowledge base can never be complete and inevitably has gaps. Suppose a user finds the biography of a singer interesting and then wants to find all songs and albums by this singer, including the latest ones. Crawling additional Web sites on music and extracting the missing data is often infeasible because of site restrictions and because the sites' information is continuously changing. Moreover, some knowledge is inherently ephemeral: for example, the current rating of a movie (obtained by averaging user reviews) or the chart rank of a song. One approach to filling these gaps is to harness the semi-structured data that an increasing number of accurate Web databases on music, movies, books, business directories, etc. export via Web service APIs. This requires retrieving data from Web services on the fly whenever the local knowledge base does not suffice to answer a user's information needs. Such a federated architecture entails several complex problems: mapping search requests onto service interfaces, cost/benefit-oriented routing of queries to promising services, integrating results from different services, and more.
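The federated lookup described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the `Service` and `FederatedKB` classes, the numeric `cost` and `benefit` fields, the benefit/cost ranking rule, and the stub services with their data are all assumptions made for illustration. The sketch answers from the local facts when it can, and otherwise routes the request to the services that cover the relation, most promising first.

```python
from dataclasses import dataclass
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


@dataclass
class Service:
    """Hypothetical wrapper around one Web service (illustrative only)."""
    name: str
    relations: Set[str]                    # relations this service can answer
    cost: float                            # assumed cost per call, must be > 0
    benefit: float                         # assumed expected usefulness
    call: Callable[[str, str], List[str]]  # (subject, relation) -> objects


class FederatedKB:
    """Local facts plus on-the-fly fallback to registered services."""

    def __init__(self, facts: List[Triple], services: List[Service]):
        self.facts: Set[Triple] = set(facts)
        self.services = services

    def local(self, subject: str, relation: str) -> List[str]:
        # Answer from the local knowledge base only.
        return sorted(o for s, r, o in self.facts
                      if s == subject and r == relation)

    def route(self, relation: str) -> List[Service]:
        # Keep services that cover the relation, ranked by benefit/cost.
        candidates = [s for s in self.services if relation in s.relations]
        return sorted(candidates, key=lambda s: s.benefit / s.cost,
                      reverse=True)

    def query(self, subject: str, relation: str) -> List[str]:
        answers = self.local(subject, relation)
        if answers:
            return answers
        # Local knowledge does not suffice: try services in ranked order.
        for service in self.route(relation):
            results = service.call(subject, relation)
            if results:
                return sorted(set(results))
        return []


# Stub services standing in for real Web APIs (made-up data).
slow_music = Service(
    "slow_music", {"created"}, cost=5.0, benefit=5.0,
    call=lambda s, r: ["Song 1", "Song 2"] if s == "Singer A" else [])
fast_music = Service(
    "fast_music", {"created"}, cost=1.0, benefit=3.0,
    call=lambda s, r: ["Song 2", "Song 3"] if s == "Singer A" else [])

kb = FederatedKB([("Singer A", "bornIn", "City X")],
                 [slow_music, fast_music])
```

Service results are deliberately not written back into the local facts here, since ephemeral values such as ratings or chart ranks would go stale. Merging the answers of several services rather than stopping at the first non-empty one is an equally plausible design.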
This work is part of the YAGO-NAGA project at the Max Planck Institute for Informatics in Saarbrücken, Germany.