Abstract
Keyword analysis has become an indispensable tool for discourse analysts, who use it to identify the words that are especially characteristic of the texts in a target discourse domain. Surprisingly, however, the statistical computation of keyness makes no reference to those texts. Rather, once a corpus has been constructed, it is treated as a homogeneous whole when keyness is computed. As a result, the keywords in such lists are relatively frequent in the corpus, but they are often not widely dispersed across its texts and are thus not truly representative of the target discourse domain. This study proposes a new method for keyword analysis, text dispersion keyness, that is based on text dispersion rather than corpus frequency. We compare the effectiveness of this measure to four other methods for computing keyness in a series of case studies that identify the keywords typical of online travel blogs. A variety of quantitative and qualitative analyses compare these methods on their content-generalisability and content-distinctiveness, demonstrating that text dispersion keyness is a superior measure for generating keyword lists.
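The contrast between counting tokens and counting texts can be illustrated with a minimal sketch. The abstract does not state which test statistic is used, so the log-likelihood (G2) formula below is an assumption, as are the function name `text_dispersion_keyness` and the representation of each text as a set of word types. The only point taken from the abstract is that the counts entering the statistic are numbers of texts containing the word, not word frequencies in the corpus as a whole.

```python
import math


def text_dispersion_keyness(word, target_texts, reference_texts):
    """Signed log-likelihood score based on text counts (hypothetical helper).

    Each text is a set of word types, so a word repeated many times in a
    single text still contributes only one to the count.
    Assumes both corpora contain at least one text.
    """
    n_target = len(target_texts)
    n_reference = len(reference_texts)

    # Observed values: number of texts in each corpus that contain the word.
    in_target = sum(1 for text in target_texts if word in text)
    in_reference = sum(1 for text in reference_texts if word in text)

    # Expected text counts if the word were equally dispersed in both corpora.
    total = in_target + in_reference
    expected_target = n_target * total / (n_target + n_reference)
    expected_reference = n_reference * total / (n_target + n_reference)

    # Assumed statistic: two-cell log-likelihood; zero counts contribute nothing.
    g2 = 0.0
    if in_target > 0:
        g2 += in_target * math.log(in_target / expected_target)
    if in_reference > 0:
        g2 += in_reference * math.log(in_reference / expected_reference)
    g2 *= 2

    # Negative score: the word appears in proportionally fewer target texts.
    if in_target / n_target < in_reference / n_reference:
        return -g2
    return g2


# Toy data, invented purely for illustration.
blogs = [{"hotel", "trip"}, {"hotel", "beach"}, {"trip", "flight"}]
other = [{"the", "hotel"}, {"the", "code"}, {"the", "data"}, {"the", "model"}]
hotel_score = text_dispersion_keyness("hotel", blogs, other)
the_score = text_dispersion_keyness("the", blogs, other)
```

In this toy example `hotel_score` is positive, because "hotel" occurs in a larger share of the blog texts, and `the_score` is negative. A frequency-based measure would instead sum token counts over the whole corpus, so a word used heavily in one long text could rank as key without being widely dispersed.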
Original language | English (US) |
---|---|
Pages (from-to) | 77-104 |
Number of pages | 28 |
Journal | Corpora |
Volume | 14 |
Issue number | 1 |
State | Published - 2018 |
Keywords
- Analysis
- Distinctiveness
- Generalisability
- Lexical dispersion
- Word importance
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language