ACOUSTIC MODELING FOR MANDARIN LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION
Having reviewed the history of Mandarin speech recognition in the previous chapter, we now describe several key technologies for building a highly accurate Mandarin Large Vocabulary Continuous Speech Recognition (LVCSR) system. LVCSR is the foundation for many useful speech-based applications, such as keyword spotting, translation, and voice indexing. The core technologies developed for Western languages are readily applicable to Mandarin Chinese. However, as noted in the previous chapter, the special characteristics of the Chinese language must be taken into account in order to achieve very high accuracy. Our emphasis will be on the differences and extra features used in the Mandarin system, with a brief summary of the language-independent backbone technologies. Finally, we will present a state-of-the-art Mandarin speech recognizer jointly developed by the University of Washington (UW) and SRI International, and discuss unsolved challenges.