TY - JOUR
T1 - Do all roads lead to Rome?
T2 - Modeling register variation with factor analysis and discriminant analysis
AU - Egbert, Jesse
AU - Biber, Douglas
N1 - Funding: National Science Foundation, Directorate for Social, Behavioral and Economic Sciences, Division of Behavioral and Cognitive Sciences (Grant/Award Number: 1147581).
PY - 2018/9/25
Y1 - 2018/9/25
N2 - Previous theoretical and empirical research on register variation has argued that linguistic co-occurrence patterns have a highly systematic relationship to register differences, because they both share the same functional underpinnings. The goal of this study is to test this claim through a comparison of two statistical techniques that have been used to describe register variation: factor analysis (as used in Multi-Dimensional analysis, MDA) and canonical discriminant analysis (CDA). MDA and CDA have different statistical bases and thus give priority to different analytical considerations: linguistic co-occurrence in the case of MDA and the prediction of register differences in the case of CDA. Thus, there is no statistical reason to expect that the two techniques, if applied to the same corpus, will produce similar results. We hypothesize that although MDA and CDA approach register variation from opposite sides, they will produce similar results because both types of statistical patterns are motivated by underlying discourse functions. The present paper tests this claim through a case-study analysis of variation among web registers, applying MDA and CDA to analyze register variation in the same corpus of texts.
KW - discriminant analysis
KW - factor analysis
KW - linguistic co-occurrence
KW - multi-dimensional analysis
KW - register variation
KW - text classification
KW - web registers
UR - http://www.scopus.com/inward/record.url?scp=85040552442&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85040552442&partnerID=8YFLogxK
U2 - 10.1515/cllt-2016-0016
DO - 10.1515/cllt-2016-0016
M3 - Review article
AN - SCOPUS:85040552442
SN - 1613-7027
VL - 14
SP - 233
EP - 273
JO - Corpus Linguistics and Linguistic Theory
JF - Corpus Linguistics and Linguistic Theory
IS - 2
ER -