Abstract
Four algorithms for syllabifying phones are compared in the context of automatically scoring English oral proficiency. The first algorithm groups each consonant with the vowel nearest to it in time, taking the maximal onset principle into account. The second uses a Hidden Markov Model (HMM) to predict syllable boundaries from the sonority values of the phones. The third employs three HMMs, each tuned to a specific category of utterance. The fourth uses a genetic algorithm to identify a set of rules for syllabifying the phones. The algorithms were evaluated by (1) how well they syllabified utterances from the Boston University Radio News Corpus (BURNC) and (2) how well they worked as part of a process for automatically scoring English speaking proficiency. Syllabification quality was judged with a measure of the temporal alignment of the syllables. Suitability for proficiency scoring was assessed with the Pearson correlation between the computer's predicted proficiency scores and the scores assigned by human examiners. We found that syllabification-by-genetic-algorithm performed best in syllabifying the BURNC, but that syllabification-by-grouping (i.e., forming syllables by grouping non-syllabic consonant phones with the vowel or syllabic consonant phone nearest to them in time) performed best in the English oral proficiency rating application.
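As a rough illustration of the syllabification-by-grouping idea described in the abstract, the sketch below assigns each phone to the temporally nearest syllable nucleus. It is not the authors' implementation. The `Phone` class, the `NUCLEI` set (a simplified ARPAbet-style vowel list with no syllabic consonants), the use of phone midpoints as the distance measure, and the tie-break toward the following nucleus (a crude stand-in for the maximal onset principle) are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical, simplified set of syllable nuclei (ARPAbet-style vowel labels).
# Syllabic consonants would need to be added to this set.
NUCLEI = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
          "EY", "IH", "IY", "OW", "OY", "UH", "UW"}


@dataclass
class Phone:
    label: str
    start: float  # seconds
    end: float    # seconds

    @property
    def mid(self) -> float:
        # Assumption: temporal distance is measured between phone midpoints.
        return (self.start + self.end) / 2


def syllabify_by_grouping(phones):
    """Group time-ordered phones into syllables, one per nucleus."""
    nuclei = [p for p in phones if p.label in NUCLEI]
    if not nuclei:
        # No nucleus found: return everything as a single group.
        return [list(phones)] if phones else []

    syllables = [[] for _ in nuclei]
    for p in phones:
        # Pick the nearest nucleus; on an exact tie prefer the later one,
        # so the consonant becomes an onset rather than a coda.
        best = min(range(len(nuclei)),
                   key=lambda i: (abs(p.mid - nuclei[i].mid), -i))
        syllables[best].append(p)
    return syllables


def labels(syllables):
    """Helper for display: phone labels per syllable."""
    return [[p.label for p in syl] for syl in syllables]


# Illustrative timings for "basket" (made-up values, not corpus data).
basket = [
    Phone("B", 0.00, 0.05), Phone("AE", 0.05, 0.20), Phone("S", 0.20, 0.30),
    Phone("K", 0.30, 0.35), Phone("AH", 0.35, 0.45), Phone("T", 0.45, 0.52),
]
basket_syllables = labels(syllabify_by_grouping(basket))
```

Because phones are visited in time order and distance to each nucleus changes monotonically between two adjacent nuclei, every resulting syllable is a contiguous run of phones.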
Original language | English (US) |
---|---|
Pages (from-to) | 1781-1804 |
Number of pages | 24 |
Journal | Artificial Intelligence Review |
Volume | 52 |
Issue number | 3 |
State | Published - Oct 1 2019 |
Keywords
- ASR phone recognition
- Automatic speaking proficiency scoring
- Automatic syllabification
- Maximal onset principle
- Sonority sequencing principle
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language
- Artificial Intelligence