WebChild: Commonsense Knowledge from the Web

Overview

WebChild is a large collection of commonsense knowledge, automatically extracted and disambiguated from Web content. WebChild contains triples that connect nouns with adjectives via fine-grained relations such as hasShape, hasTaste, and evokesEmotion. The arguments of these assertions, nouns and adjectives, are disambiguated by mapping them onto their proper WordNet senses.
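To make the shape of such an assertion concrete, here is a minimal sketch in Python. The class name `Assertion`, the helper `by_relation`, the sense keys such as `apple#n#1`, and the confidence values are all illustrative assumptions; they are not the actual WebChild file format or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One noun-relation-adjective triple over disambiguated senses."""
    noun_sense: str        # illustrative sense key, e.g. "apple#n#1"
    relation: str          # fine-grained relation, e.g. "hasTaste"
    adjective_sense: str   # illustrative sense key, e.g. "sweet#a#1"
    confidence: float      # made-up score in [0, 1]


def by_relation(assertions, relation):
    """Return the assertions of one relation, highest confidence first."""
    matching = [a for a in assertions if a.relation == relation]
    return sorted(matching, key=lambda a: a.confidence, reverse=True)


# Hypothetical examples, not taken from the WebChild data.
EXAMPLES = [
    Assertion("apple#n#1", "hasTaste", "sweet#a#1", 0.90),
    Assertion("lemon#n#1", "hasTaste", "sour#a#1", 0.95),
    Assertion("ball#n#1", "hasShape", "round#a#1", 0.80),
]

if __name__ == "__main__":
    for assertion in by_relation(EXAMPLES, "hasTaste"):
        print(assertion)
```

Keeping the sense key rather than the bare word is the point of the disambiguation step: two senses of the same noun can then carry different properties.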

Large-scale experiments demonstrate the high accuracy (more than 80 percent) and coverage (more than four million fine-grained, disambiguated assertions) of WebChild.

Approach

Our method is based on semi-supervised Label Propagation over graphs of noisy candidate assertions. We automatically derive seeds from WordNet and by pattern matching from Web text collections. The Label Propagation algorithm provides us with domain sets and range sets for 19 different relations, and with confidence-ranked assertions between WordNet senses.
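The general idea of label propagation with seeds can be sketched as follows. This is a generic, simplified variant for illustration, not the exact algorithm or graph construction used for WebChild; the function name `propagate_labels`, the node names, and the edge weights are assumptions made up for the example.

```python
def propagate_labels(edges, seeds, iterations=20):
    """Spread seed scores over a weighted, undirected graph.

    edges: dict mapping (node_a, node_b) to a positive edge weight.
    seeds: dict mapping node to a fixed score (1.0 positive, 0.0 negative).
    Returns a dict mapping every node to a score in [0, 1].
    """
    # Build an adjacency list that also covers isolated seed nodes.
    neighbors = {node: [] for node in seeds}
    for (u, v), weight in edges.items():
        neighbors.setdefault(u, []).append((v, weight))
        neighbors.setdefault(v, []).append((u, weight))

    # Unlabeled nodes start at 0.0; seeds start at their given score.
    scores = {node: seeds.get(node, 0.0) for node in neighbors}

    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbors.items():
            total = sum(weight for _, weight in nbrs)
            if node in seeds or total == 0:
                # Seeds stay clamped to their label.
                updated[node] = scores[node]
            else:
                # Weighted average of the neighbors' current scores.
                updated[node] = sum(w * scores[nb] for nb, w in nbrs) / total
        scores = updated
    return scores


# Hypothetical toy graph for the range of a relation like hasTaste.
EDGES = {("sweet", "sugary"): 3.0, ("sugary", "rectangular"): 1.0}
SEEDS = {"sweet": 1.0, "rectangular": 0.0}

if __name__ == "__main__":
    print(propagate_labels(EDGES, SEEDS))
```

In this toy graph the unlabeled node `sugary` ends up closer to the positive seed because its edge to `sweet` carries more weight. Ranking nodes by their final score gives a confidence-ordered candidate list.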

Publications

WebChild: Harvesting and Organizing Commonsense Knowledge from the Web
Niket Tandon, Gerard de Melo, Fabian Suchanek, Gerhard Weikum (2014)
In: Proc. ACM WSDM 2014, New York City, NY, USA.
Acceptance rate: 18%.

Deriving a Web-Scale Common Sense Fact Database
Niket Tandon, Gerard de Melo, Gerhard Weikum (2011)
In: Proc. AAAI 2011. San Francisco, CA, USA.
Acceptance rate: 25%.

Downloads and Further Information

hasProperty triples (307 MB, 22 million triples)
hasProperty sample with schema sample

More datasets coming soon! For more information, please get in touch with Niket Tandon. Please see https://gate.d5.mpi-inf.mpg.de/webchild/

WebChild Commonsense Browser

Take a look at the commonsense browser here