Abstract
In this paper, we examine the performance of automatically detecting Brazil’s prominent syllables using five machine learning classifiers and seven feature sets formed from three features (pitch, intensity, and duration) taken one at a time, two at a time, and all three together. Prominent syllables are the foundation of Brazil’s prosodic intonation model. We found that using pitch, intensity, and duration together as features produces the best results. Our findings also revealed that, in terms of accuracy, F-measure, and Cohen’s kappa coefficient, bagging an ensemble of decision tree learners performed best (accuracy = 95.9 ± 0.2 %; F-measure = 93.7 ± 0.4; κ = 0.907 ± 0.005). The performance of our current model proves to be significantly better than that of any other existing automatic detection software or of human transcription experts of prosody.
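As an illustration only, the following minimal sketch shows how the seven feature sets described above can be enumerated from the three features, and how accuracy, F-measure, and Cohen’s kappa can be computed for a binary prominent/non-prominent labeling. The function names `feature_subsets` and `binary_metrics` are hypothetical and are not taken from the paper; the sketch does not reproduce the authors' classifiers, data, or evaluation procedure.

```python
from itertools import combinations

# The three acoustic features named in the abstract.
FEATURES = ("pitch", "intensity", "duration")


def feature_subsets(features=FEATURES):
    """Return every non-empty subset of the features, smallest subsets first."""
    return [
        combo
        for size in range(1, len(features) + 1)
        for combo in combinations(features, size)
    ]


def binary_metrics(y_true, y_pred):
    """Return (accuracy, F-measure, Cohen's kappa) for labels where 1 = prominent."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # Agreement expected by chance, from the marginal label frequencies.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    return accuracy, f_measure, kappa


if __name__ == "__main__":
    for subset in feature_subsets():
        print(subset)
```

Three features yield 2³ − 1 = 7 non-empty subsets, which matches the seven feature sets mentioned in the abstract. Kappa is reported alongside accuracy because it corrects for the agreement that would be expected by chance.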
Original language | English (US) |
---|---|
Pages (from-to) | 583-592 |
Number of pages | 10 |
Journal | International Journal of Speech Technology |
Volume | 18 |
Issue number | 4 |
State | Published - Dec 1 2015 |
Keywords
- Brazil’s prosodic intonation model
- Machine learning
- Prominent syllable detection
- ToBI
ASJC Scopus subject areas
- Software
- Language and Linguistics
- Human-Computer Interaction
- Linguistics and Language
- Computer Vision and Pattern Recognition