Automatically Learning Text Patterns to Extract Relations from Domain Documents

Ruiz-Casado et al. extend WordNet by extracting hyponym, hyperonym, holonym and meronym relations from the Simple English version of Wikipedia. The authors automate

  • the extraction of relations and
  • the process of learning textual patterns modeling these relations (e.g. X is one of the PLURAL-NOUN in Y)
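
To make the example pattern concrete, the following minimal sketch matches it token by token against a sentence and returns the words bound to X and Y. The function names and the plural-noun test are assumptions made for illustration (a real system would rely on a part-of-speech tagger); this is not the authors' implementation.

```python
def is_plural_noun(token):
    # Hypothetical stand-in for a part-of-speech tagger: treats any
    # alphabetic token ending in "s" as a plural noun.
    return token.isalpha() and token.endswith("s")


def match_pattern(pattern, sentence):
    """Return (X, Y) if the sentence matches the pattern token by token, else None."""
    p_tokens = pattern.split()
    s_tokens = sentence.rstrip(".").split()
    if len(p_tokens) != len(s_tokens):
        return None
    bindings = {}
    for p, s in zip(p_tokens, s_tokens):
        if p in ("X", "Y"):
            bindings[p] = s              # placeholder: bind the sentence word
        elif p == "PLURAL-NOUN":
            if not is_plural_noun(s):    # word-class slot
                return None
        elif p.lower() != s.lower():     # literal token must match
            return None
    return bindings.get("X"), bindings.get("Y")


pattern = "X is one of the PLURAL-NOUN in Y"
print(match_pattern(pattern, "Paris is one of the cities in France."))
```

A sentence that does not fit the pattern yields `None`, so only matching sentences propose a candidate pair.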

The described approach applies the following steps:

  • word sense disambiguation
  • pattern extraction based on existing relations in WordNet
  • pattern generalization of the patterns retrieved in the step above
  • identification of new relations
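
The first step in the list above, word sense disambiguation, can be sketched with a simple vector-space model: the context of an ambiguous word and each candidate sense description are turned into bag-of-words vectors, and the sense with the highest cosine similarity wins. The sense labels and glosses below are invented for illustration and are not taken from WordNet or from the paper.

```python
import math
from collections import Counter


def vectorize(text):
    # Bag-of-words term-frequency vector over lowercased whitespace tokens.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def disambiguate(context, sense_glosses):
    # Pick the sense whose gloss is most similar to the context.
    ctx = vectorize(context)
    return max(sense_glosses, key=lambda s: cosine(ctx, vectorize(sense_glosses[s])))


# Hypothetical sense inventory for the word "bank".
senses = {
    "bank_river": "sloping land beside a body of water such as a river",
    "bank_money": "financial institution that accepts deposits and lends money",
}
print(disambiguate("he deposits money at the bank every month", senses))
```

In practice, stop-word removal and term weighting (for example tf-idf) would likely be needed before the similarity scores become reliable.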

The authors show that the precision of the automatically generated patterns is similar to that of patterns written by hand.


  • implement a simple vector-space-based component for word sense disambiguation
  • extract lexical patterns from a corpus based on known relations
  • implement an algorithm for generalizing these patterns
  • test these patterns on the corpus to identify new relations.
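
Pattern extraction and generalization can be illustrated with a small sketch. It replaces a known related pair with the placeholders X and Y, then merges two patterns of equal length by turning every differing token into a wildcard. This position-wise merge is a deliberately simplified stand-in chosen for illustration, not the generalization procedure from the paper; the word pairs and sentences are invented.

```python
def extract_pattern(sentence, x, y):
    """Replace a known related pair (x, y) with the placeholders X and Y."""
    tokens = sentence.rstrip(".").split()
    if x not in tokens or y not in tokens:
        return None  # the pair does not co-occur in this sentence
    return ["X" if t == x else "Y" if t == y else t for t in tokens]


def generalize(p1, p2):
    """Merge two equal-length patterns; differing tokens become the wildcard '*'."""
    if len(p1) != len(p2):
        return None  # simplification: no alignment of patterns of unequal length
    return [a if a == b else "*" for a, b in zip(p1, p2)]


# Illustrative known relations and sentences (assumed, not from the paper).
known = [("dog", "animal"), ("oak", "tree")]
corpus = ["A dog is a kind of animal.", "An oak is a type of tree."]

patterns = [extract_pattern(s, x, y) for s, (x, y) in zip(corpus, known)]
print(generalize(*patterns))
```

Handling patterns of different lengths would require a proper token alignment, for instance one based on edit distance.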


