Exploring potential unknown subgroups in your data: An introduction to finite mixture models for applied linguistics

Tove Larsson, Gregory R. Hancock

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

This article provides an introduction to finite mixture models in an applied linguistics context. Mixture models can be used to address questions relating to whether there are unknown subgroups in one's data, and if so, which participants/texts are likely to belong to which subgroup. Put differently, the technique enables us to assess whether our data might come from a heterogeneous population that is made up of latent classes. As such, mixture models offer a model-based framework to answer research questions for which the field previously has either attempted to use nonparametric heuristic techniques (e.g., cluster analysis) or has left entirely unanswered. An example of such research questions would be, ‘Does the treatment work equally well for all the participants, or are there unknown subgroups in the data that respond differently to the treatment?’ The article starts by introducing univariate mixture models and then broadens the scope to cover bivariate and multivariate mixture models. It also discusses some known pitfalls of the technique and how one might ameliorate these in practice.
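To make the univariate case concrete, the following is a minimal sketch of a two-class normal mixture fitted with the EM algorithm in plain Python. It is an illustration only and is not taken from the article: the simulated scores, the class means of 50 and 70, and the function names `normal_pdf` and `fit_two_class_mixture` are all assumptions made for the example. It returns, for each observation, the estimated probability of belonging to each latent class, which is the kind of output that speaks to "which participants/texts are likely to belong to which subgroup".

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_class_mixture(data, n_iter=200):
    """Fit a two-class univariate normal mixture by EM (hypothetical helper).

    Returns (weights, means, sds, posteriors); posteriors[i][k] is the
    estimated probability that observation i belongs to class k.
    """
    data = list(data)
    n = len(data)
    ordered = sorted(data)
    half = n // 2
    # Crude starting values: lower and upper halves of the sorted data.
    means = [sum(ordered[:half]) / half, sum(ordered[half:]) / (n - half)]
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)
    sds = [max(overall_sd, 1e-6)] * 2
    weights = [0.5, 0.5]
    posteriors = []
    for _ in range(n_iter):
        # E-step: class membership probabilities given current parameters.
        posteriors = []
        for x in data:
            joint = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = sum(joint)
            posteriors.append([j / total for j in joint])
        # M-step: update class proportions, means, and standard deviations.
        for k in range(2):
            nk = sum(p[k] for p in posteriors)
            weights[k] = nk / n
            means[k] = sum(p[k] * x for p, x in zip(posteriors, data)) / nk
            var = sum(p[k] * (x - means[k]) ** 2 for p, x in zip(posteriors, data)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return weights, means, sds, posteriors


# Simulated (not real) scores from two hidden subgroups of 150 participants each.
rng = random.Random(1)
scores = [rng.gauss(50, 5) for _ in range(150)] + [rng.gauss(70, 5) for _ in range(150)]

weights, means, sds, posteriors = fit_two_class_mixture(scores)
print("class proportions:", [round(w, 2) for w in weights])
print("class means:", [round(m, 1) for m in means])
print("first participant's membership probabilities:", [round(p, 3) for p in posteriors[0]])
```

In practice one would also fit a one-class model and compare the two with a fit index such as BIC before concluding that subgroups exist; the sketch omits that step and fixes the number of classes at two.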

Original language: English (US)
Article number: 100117
Journal: Research Methods in Applied Linguistics
Volume: 3
Issue number: 2
DOIs
State: Published - Aug 2024

Keywords

  • Data heterogeneity
  • Latent classes
  • Mixture modeling
  • Population subgroups
  • Underlying groupings

ASJC Scopus subject areas

  • Social Sciences (miscellaneous)
  • Linguistics and Language
