In this paper we investigate the usefulness of morphosyntactic information and of clustering in modeling Polish for automatic speech recognition. Because Polish is an inflectional language, we examine an N-gram model based on morphosyntactic features. We show how individual types of features influence the model and which types are best suited for building a language model for automatic speech recognition. We compare the results of applying them with those of a class-based model whose classes are derived automatically from the training corpus. We show that our approach to clustering performs significantly better than the frequently used SRI LM clustering method. However, this difference is apparent only for smaller corpora.
Analyzing the logical structure of a sentence is important for understanding natural language. In this paper, we present the task of Recognition of Requisite Part and Effectuation Part in Law Sentences (the RRE task for short), which is studied in research on Legal Engineering. The goal of this task is to recognize the structure of a law sentence. We investigate the RRE task with regard to both linguistic features and problem modeling. We also propose solutions and present experimental results in a Japanese legal text domain. On the Japanese National Pension Law corpus, we obtained Fβ=1 scores of 88.58% with a supervised learning model and 88.84% with a semi-supervised learning model.
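For readers unfamiliar with the metric, the Fβ=1 score reported above is conventionally the harmonic mean of precision and recall. The standard definition is given below; the symbols P (precision) and R (recall) are introduced here for illustration, and the abstract does not state how the two quantities are counted for the RRE task.

```latex
F_{\beta} = \frac{(1+\beta^{2})\,P\,R}{\beta^{2}P + R},
\qquad
F_{\beta=1} = \frac{2\,P\,R}{P + R}
```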