LEILA: Learning to Extract Information by Linguistic Analysis


How to use LEILA

  1. Download the Java tools
  2. Download the Link Grammar Parser
  3. Download some recent version of Java (1.5+) if you don't have it
  4. Download the Java source, the class files and the documentation of LEILA
  5. Run Leila.class. LEILA will tell you how to set it up.

How data flows in LEILA

The flow of data with LEILA is as follows:

    Corpus      ->   Proper sentences  ->  Parsed sentences  ->  Model   ->  Output Pairs
    documents        (*.LGI)               (*.LGO)               (*.MDL)     (*.TXT)
    (*.HTML)                                     '------------------------->


The corpus can be any set of text or HTML documents. These documents can be spread across different folders or subfolders. LEILA processes them in four steps:

  1. LEILA extracts the proper sentences from the corpus documents. Each document generates one LGI-file containing the sentences.
  2. The LGI-files are given to the Link Grammar Parser, which produces parse trees for the sentences. Each LGI-file generates one LGO-file containing the parse trees.
  3. LEILA tries to find patterns for the target relation in the LGO-files. It generalizes these patterns and stores them as a model in an MDL-file.
  4. LEILA applies the model to extract output pairs for the target relation from the LGO-files. It stores them in one large plain text file.

All of these steps are done automatically in the right order.

LEILA must know the target relation. The target relation is given by a function that decides whether a pair of words is an example, a counterexample or a candidate for the relation. This function should be implemented in a class that extends the base class that LEILA provides for target relations. To LEILA, it does not matter how the function works internally. The most common way is to load a list of example pairs from a text file. To decide whether a pair of words is an example pair, the function can then just check whether the pair is in the list. Often, the counterexamples need not be present in a list, because they can be deduced algorithmically on the fly. See the experimental section of "LEILA: Learning to Extract Information by Linguistic Analysis" for examples.
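A target relation function of the kind described above could look roughly like the following sketch. It is not LEILA's actual API: the class name TargetRelationSketch, the Judgement enum, the judge method and the tab-separated "first<TAB>second" line format are all hypothetical, and the class does not extend any LEILA class. The sketch also assumes that the relation is functional (each first word has exactly one correct partner), so that a counterexample can be deduced on the fly: a pair whose first word is known but whose second word differs from the listed one. The two example pairs are illustrative data only.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a target relation; no name here comes from LEILA. */
public class TargetRelationSketch {

    /** The three judgements a target relation can give for a pair of words. */
    public enum Judgement { EXAMPLE, COUNTEREXAMPLE, CANDIDATE }

    /** Known example pairs, mapping the first word to its second word. */
    private final Map<String, String> examples = new HashMap<>();

    /**
     * Builds the relation from lines of the assumed form "first<TAB>second".
     * In practice the lines could come from a text file, for instance via
     * java.nio.file.Files.readAllLines. Malformed lines are skipped.
     */
    public TargetRelationSketch(List<String> lines) {
        for (String line : lines) {
            String[] parts = line.split("\t");
            if (parts.length == 2) {
                examples.put(parts[0].trim(), parts[1].trim());
            }
        }
    }

    /**
     * Judges a pair of words. Assumes a functional relation: if the first
     * word is known with a different second word, the pair is taken to be
     * a counterexample. Pairs with an unknown first word are candidates.
     */
    public Judgement judge(String first, String second) {
        String known = examples.get(first);
        if (known == null) {
            return Judgement.CANDIDATE;
        }
        return known.equals(second) ? Judgement.EXAMPLE : Judgement.COUNTEREXAMPLE;
    }

    public static void main(String[] args) {
        // Illustrative example pairs, as they might appear in an example file.
        TargetRelationSketch relation = new TargetRelationSketch(
                List.of("Goethe\t1749", "Einstein\t1879"));

        System.out.println(relation.judge("Goethe", "1749"));   // EXAMPLE
        System.out.println(relation.judge("Goethe", "1832"));   // COUNTEREXAMPLE
        System.out.println(relation.judge("Schiller", "1759")); // CANDIDATE
    }
}
```

For a relation that is not functional, the same lookup would still identify examples, but the rule for counterexamples would have to be replaced by something suited to that relation.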


Existing relations in LEILA

The following relations ship with LEILA: