GLOBAL RULE INDUCTION FOR INFORMATION EXTRACTION
Abstract
Extracting desired pieces of information from natural language text is an important task with a growing number of potential applications. This paper presents a novel pattern rule induction learning system, GRID, which uses the global distribution of features across all training instances to make better rule induction decisions. GRID uses chunks rather than tokens as contextual units and incorporates lexical, syntactic, and semantic features simultaneously. The features chosen in GRID are general and were applied successfully to both semi-structured text and free text. Our experimental results on publicly available webpage corpora and the MUC-4 test set indicate that our approach is effective.