Two Approaches for the Resolution of Word Mismatch Problem Caused by English Words and Foreign Words in Korean Information Retrieval
Abstract
In Korean text these days, the use of English words, with or without phonetic translations, is growing rapidly. To make matters worse, the Korean transliteration of a single English word may vary greatly. The mixed use of English words and their various transliterations in the same document or document collection may cause severe word mismatch problems in Korean information retrieval. There are two possible approaches to this problem: transliteration and back-transliteration. We argue that our newly proposed transliteration approach is better suited to resolving the word mismatch problem than the previously proposed back-transliteration approach. The results of our information retrieval experiments support this argument.