Short texts, best-fitting curves and new measures of lexical diversity

Research output: Contribution to journal, Article, peer-review

209 Scopus citations

Abstract

Following up on recent work by Malvern and Richards (1997, this issue; McKee et al., 2000) concerning the measurement of lexical diversity through curve fitting, the present study compares the accuracy of five formulae in terms of their ability to model the type-token curves of written texts produced by learners and native speakers. The most accurate models are then used to consider unresolved issues that have been at the forefront of past research on lexical diversity: the relationship between lexical diversity and age, second language (L2) instruction, L2 proficiency, first language (L1) background, writing quality and vocabulary knowledge. The participants in the study comprise 140 Finnish-speaking and 70 Swedish-speaking learners of English, and an additional group of 66 native English speakers. The data include written narrative descriptions of a silent film, and the results show that two of the curve-fitting formulae provide accurate models of the type-token curves of over 90% of the texts. The texts for which accurate models were obtained were subjected to further analyses, and the results indicate a clear relationship between lexical diversity and amount of instruction, but a more complicated relationship between lexical diversity and L1 background, writing quality and vocabulary knowledge.

Original language: English (US)
Pages (from-to): 57-84
Number of pages: 28
Journal: Language Testing
Volume: 19
Issue number: 1
State: Published - Jan 2002
Externally published: Yes

ASJC Scopus subject areas

  • Language and Linguistics
  • Social Sciences (miscellaneous)
  • Linguistics and Language
