Methodological issues in contrastive lexical bundle research: The influence of corpus design on bundle identification

Fan Pan, Randi Reppen, Douglas Biber

Research output: Contribution to journal › Article › peer-review

16 Scopus citations

Abstract

This study explores the influence of corpus design when comparing lexical bundle use across groups, examining how the number of texts and average length of texts can impact conclusions about group differences. The study compares the use of lexical bundles by L1-English versus L2-English writers, based on analysis of two sub-corpora of academic articles that are matched for discipline, writer expertise, time of publication, and audience. However, the two sub-corpora differ with respect to the number of texts and the average length of texts. Three experiments examined the influence of differences in corpus composition. The results show that differences in the number of words and number of texts across sub-corpora can have a strong effect on claimed differences in bundle use across groups. This effect is found even when the texts in the corpora are closely matched for their register and topic.
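To make the sensitivity described in the abstract concrete, the following is a minimal sketch of how lexical bundles are commonly operationalized: recurrent n-word sequences that meet both a normalized frequency threshold and a minimum number of texts (range). It is not the authors' procedure. The function name `extract_bundles`, the whitespace tokenization, and the default thresholds (4-word sequences, 20 occurrences per million words, 3 texts) are illustrative assumptions only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous n-token sequence in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_bundles(texts, n=4, min_per_million=20.0, min_texts=3):
    """Hypothetical helper: find n-grams that pass both thresholds.

    Assumed criteria (not taken from the study):
      - normalized frequency >= min_per_million occurrences per million words
      - occurrence in at least min_texts different texts (range)
    Returns {bundle: (raw_count, text_range, rate_per_million)}.
    """
    freq = Counter()        # raw occurrences across the whole sub-corpus
    text_range = Counter()  # number of texts in which each n-gram occurs
    total_words = 0

    for text in texts:
        tokens = text.lower().split()  # naive whitespace tokenization
        total_words += len(tokens)
        grams = ngrams(tokens, n)
        freq.update(grams)
        text_range.update(set(grams))  # count each text at most once

    bundles = {}
    for gram, count in freq.items():
        per_million = count / total_words * 1_000_000
        if per_million >= min_per_million and text_range[gram] >= min_texts:
            bundles[" ".join(gram)] = (count, text_range[gram], round(per_million, 1))
    return bundles


# Toy illustration: the same sequence qualifies or not depending on how many
# texts the sub-corpus contains, even though its per-text use is identical.
texts = [
    "on the other hand we see",
    "but on the other hand it is",
    "then on the other hand they go",
]
three_texts = extract_bundles(texts)
two_texts = extract_bundles(texts[:2])
```

In this toy case `three_texts` contains "on the other hand", while `two_texts` is empty because the range criterion cannot be met with only two texts. Because the frequency criterion depends on total word count and the range criterion depends on the number of texts, two sub-corpora that differ in either respect can yield different bundle lists under identical thresholds.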

Original language: English (US)
Pages (from-to): 215-229
Number of pages: 15
Journal: International Journal of Corpus Linguistics
Volume: 25
Issue number: 2
State: Published - Aug 28 2020

Keywords

  • Corpus design
  • Lexical bundle type distribution vs. token distribution
  • Topic variation

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
