Abstract
This study tests three measures of lexical diversity (LD), each using five operationalizations of word types. The measures include MTLD (measure of textual lexical diversity), MTLD-W (moving average MTLD with wraparound measurement), and MATTR (moving average type-token ratio). Each of these measures is tested with types operationalized as orthographic forms, lemmas using automated POS tags, lemmas using manually corrected POS tags, flemmas (list-based lemmas that do not distinguish between parts of speech), and word families. These measures are applied to 60 narrative texts written in English by adolescent native speakers of English (n=13), Finnish (n=31), and Swedish (n=16). Each individual LD measure is evaluated in relation to how well it correlates with the mean LD ratings of 55 human raters whose inter-rater reliability was exceedingly high (Cronbach's alpha = .980). The overall results show that the three measures are comparable, but two of the operationalizations of types produce mixed results across measures.
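To make the moving-average idea concrete, the following is a minimal Python sketch of MATTR as it is commonly defined: the type-token ratio is computed in every window of a fixed number of tokens, and the window values are averaged. It is an illustration only, not the implementation used in the study. The function name `mattr`, the default window of 50 tokens, and the fallback to a plain type-token ratio for texts shorter than the window are all assumptions. The choice of type operationalization is assumed to happen before the call: the input list would hold orthographic forms, lemmas, flemmas, or word-family labels, one per token.

```python
from collections import Counter


def mattr(tokens, window=50):
    """Moving-average type-token ratio (illustrative sketch).

    tokens: list of type labels, one per running word. The labels may be
            orthographic forms, lemmas, flemmas, or word-family headwords,
            depending on how types are operationalized upstream.
    window: number of tokens per window (50 is an assumed default).
    """
    n = len(tokens)
    if n == 0:
        raise ValueError("tokens must not be empty")
    if n <= window:
        # Assumed fallback: plain type-token ratio for short texts.
        return len(set(tokens)) / n

    # Count the types in the first window, then slide one token at a time.
    counts = Counter(tokens[:window])
    total = len(counts) / window
    for i in range(window, n):
        leaving = tokens[i - window]
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]  # type no longer present in the window
        counts[tokens[i]] += 1
        total += len(counts) / window

    # Average over all n - window + 1 windows.
    return total / (n - window + 1)


if __name__ == "__main__":
    sample = "the dog saw the cat and the cat saw the dog".split()
    print(round(mattr(sample, window=5), 3))
```

Running the same text through `mattr` with differently prepared token lists (for example, raw word forms versus lemmas) is one way to see how the operationalization of types changes the score while the measure itself stays fixed.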
Original language | English (US) |
---|---|
Pages (from-to) | 163-194 |
Number of pages | 32 |
Journal | International Journal of Learner Corpus Research |
Volume | 7 |
Issue number | 1 |
State | Published - Mar 1 2021 |
Externally published | Yes |
Keywords
- Lexical diversity
- MATTR
- MTLD
- MTLD-W
- Word types
ASJC Scopus subject areas
- Language and Linguistics
- Education
- Linguistics and Language