TY - JOUR
T1 - Examining the effect of high-frequency information on the classification of conversationally produced English fricatives
AU - Kharlamov, Viktor
AU - Brenner, Daniel
AU - Tucker, Benjamin V.
N1 - Publisher Copyright: © 2023 Acoustical Society of America.
PY - 2023/9/1
Y1 - 2023/9/1
N2 - This study examines the role of frequencies above 8 kHz in the classification of conversational speech fricatives [f, v, θ, ð, s, z, ʃ, ʒ, h] in random forest modeling. Prior research has mostly focused on spectral measures for fricative categorization using frequency information below 8 kHz. The contribution of higher frequencies has received only limited attention, especially for non-laboratory speech. In the present study, we use a corpus of sociolinguistic interview recordings from Western Canadian English sampled at 44.1 and 16 kHz. For both sampling rates, we analyze spectral measures obtained using Fourier analysis and the multitaper method, and we also compare models without and with amplitudinal measures. Results show that while frequency information above 8 kHz does not improve classification accuracy in random forest analyses, inclusion of such frequencies can affect the relative importance of specific measures. This includes a decreased contribution of center of gravity and an increased contribution of spectral standard deviation for the higher sampling rate. We also find no major differences in classification accuracy between Fourier and multitaper measures. The inclusion of power measures improves model accuracy but does not change the overall importance of spectral measures.
UR - http://www.scopus.com/inward/record.url?scp=85172759983&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85172759983&partnerID=8YFLogxK
U2 - 10.1121/10.0021067
DO - 10.1121/10.0021067
M3 - Article
C2 - 37756577
AN - SCOPUS:85172759983
SN - 0001-4966
VL - 154
SP - 1896
EP - 1902
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
IS - 3
ER -