Automatic prominent syllable detection with machine learning classifiers

David O. Johnson, Okim Kang

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

In this paper, we examine how well five machine learning classifiers automatically detect Brazil’s prominent syllables, using seven feature sets built from three features (pitch, intensity, and duration) taken one at a time, two at a time, and all three together. Prominent syllables are the foundation of Brazil’s prosodic intonation model. We found that using pitch, intensity, and duration together as features produces the best results. Our findings also revealed that, in terms of accuracy, F-measure, and Cohen’s kappa coefficient, bagging an ensemble of decision tree learners performed the best (accuracy = 95.9 ± 0.2%; F-measure = 93.7 ± 0.4; κ = 0.907 ± 0.005). The performance of our current model proves to be significantly better than that of any other existing automatic detection software or of human expert transcribers of prosody.
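The seven feature sets and the three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the confusion-matrix counts in the usage line are hypothetical, and the metrics are the standard binary-classification definitions (prominent vs. non-prominent syllable), which the abstract does not spell out.

```python
from itertools import combinations

# The three acoustic features named in the abstract.
FEATURES = ("pitch", "intensity", "duration")


def feature_subsets(features=FEATURES):
    """Return every non-empty subset of the features, smallest first."""
    return [
        combo
        for size in range(1, len(features) + 1)
        for combo in combinations(features, size)
    ]


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F-measure, and Cohen's kappa from a 2x2 confusion matrix.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    # Agreement expected by chance, from the row and column marginals.
    expected = (
        ((tp + fp) / total) * ((tp + fn) / total)
        + ((fn + tn) / total) * ((fp + tn) / total)
    )
    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, f_measure, kappa


subsets = feature_subsets()  # 3 singles + 3 pairs + 1 triple = 7 sets
# Hypothetical counts, chosen only to exercise the function.
accuracy, f_measure, kappa = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

Each of the seven subsets would be paired with each classifier and scored with these three measures; the classifier training itself is outside the scope of this sketch.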

Original language: English (US)
Pages (from-to): 583-592
Number of pages: 10
Journal: International Journal of Speech Technology
Volume: 18
Issue number: 4
State: Published - Dec 1 2015

Keywords

  • Brazil’s prosodic intonation model
  • Machine learning
  • Prominent syllable detection
  • ToBI

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Human-Computer Interaction
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
