How operationalizations of word types affect measures of lexical diversity

Scott Jarvis, Brett James Hashimoto

Research output: Contribution to journal, Article, peer-review

16 Scopus citations

Abstract

This study tests three measures of lexical diversity (LD), each using five operationalizations of word types. The measures are MTLD (measure of textual lexical diversity), MTLD-W (moving average MTLD with wraparound measurement), and MATTR (moving average type-token ratio). Each measure is tested with types operationalized as orthographic forms, lemmas using automated POS tags, lemmas using manually corrected POS tags, flemmas (list-based lemmas that do not distinguish between parts of speech), and word families. The measures are applied to 60 narrative texts written in English by adolescent native speakers of English (n=13), Finnish (n=31), and Swedish (n=16). Each individual LD measure is evaluated by how well it correlates with the mean LD ratings of 55 human raters whose inter-rater reliability was exceedingly high (Cronbach's alpha=.980). The overall results show that the three measures are comparable, but two of the operationalizations of types produce mixed results across measures.
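To make the interaction between a measure and a type operationalization concrete, the following is a minimal Python sketch of a moving-average type-token ratio in which the mapping from token to type is a pluggable function. It is an illustration only and not the authors' implementation: the function names, the window size, and the toy lemma table are all assumptions introduced here, and the tiny lemma lookup merely stands in for the lemmatization described in the abstract.

```python
from typing import Callable, Sequence


def mattr(tokens: Sequence[str], window: int = 50,
          type_of: Callable[[str], str] = str.lower) -> float:
    """Mean type-token ratio over every window of `window` consecutive tokens.

    `type_of` maps a token to its type key, so swapping it changes the
    operationalization of word types (assumed interface, for illustration).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    types = [type_of(t) for t in tokens]
    if not types:
        return 0.0
    # Texts no longer than the window fall back to a plain type-token ratio.
    if len(types) <= window:
        return len(set(types)) / len(types)
    ratios = [len(set(types[i:i + window])) / window
              for i in range(len(types) - window + 1)]
    return sum(ratios) / len(ratios)


# Hypothetical toy lemma table; a real study would use a lemmatizer or word list.
TOY_LEMMAS = {"runs": "run", "ran": "run", "running": "run"}


def toy_lemma(token: str) -> str:
    """Map a token to its toy lemma, or to its lowercase form if unlisted."""
    return TOY_LEMMAS.get(token.lower(), token.lower())


tokens = "she runs and he ran while they keep running".split()
orthographic = mattr(tokens, window=4)                     # types = word forms
lemmatized = mattr(tokens, window=4, type_of=toy_lemma)    # types = toy lemmas
print(orthographic, lemmatized)
```

With orthographic forms every window in this toy sentence contains four distinct types, whereas collapsing "runs" and "ran" into one lemma lowers the ratio in the window that contains both, so the lemma-based score is smaller.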

Original language: English (US)
Pages (from-to): 163-194
Number of pages: 32
Journal: International Journal of Learner Corpus Research
Volume: 7
Issue number: 1
State: Published - Mar 1 2021
Externally published: Yes

Keywords

  • Lexical diversity
  • MATTR
  • MTLD
  • MTLD-W
  • Word types

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Linguistics and Language
