Abstract
This study explores the influence of corpus design when comparing lexical bundle use across groups, examining how the number of texts and the average length of texts can affect conclusions about group differences. The study compares the use of lexical bundles by L1-English versus L2-English writers, based on analysis of two sub-corpora of academic articles that are matched for discipline, writer expertise, time of publication, and audience. However, the two sub-corpora differ with respect to the number of texts and the average length of texts. Three experiments examined the influence of these differences in corpus composition. The results show that differences in the number of words and number of texts across sub-corpora can have a strong effect on claimed differences in bundle use across groups. This effect is found even when the texts in the corpora are closely matched for register and topic.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 215-229 |
| Number of pages | 15 |
| Journal | International Journal of Corpus Linguistics |
| Volume | 25 |
| Issue number | 2 |
| State | Published - Aug 28 2020 |
Keywords
- Corpus design
- Lexical bundle type distribution vs. token distribution
- Topic variation
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language
Title
Methodological issues in contrastive lexical bundle research: The influence of corpus design on bundle identification