The effects of N-gram probabilistic measures on the recognition and production of four-word sequences

Antoine Tremblay, Benjamin V. Tucker

Research output: Contribution to journal › Article › peer-review

64 Scopus citations

Abstract

The present study investigates the processing and production of four-word sequences such as "I don't really know", "at the age of", and "I think it's the". Specifically, we investigate the influence of families of probabilistic measures such as unigram, bigram, trigram, and quadgram frequency of occurrence, logarithmic (log) probability of occurrence, and mutual information. Log probability of occurrence emerged as the predominant predictor family in the onset latency analysis, suggesting that recognition is mainly underpinned by competition between a target N-gram and its family members. In contrast, the amount of experience one has with an N-gram (frequency of occurrence) surfaced as the most prominent predictor in production. Further, probabilistic measures tied to trigrams surfaced as the best predictors in the onset latency analysis, while the measures tied to unigrams were most predictive of production durations. Finally, the interactions between probabilistic measures tied to unigrams, bigrams, trigrams, and quadgrams suggest that N-grams of different lengths are processed in parallel in both recognition and production.
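The abstract names three families of measures (frequency of occurrence, log probability of occurrence, and mutual information) without defining them on this page. As a rough illustration only, the sketch below computes one common version of each from raw counts over a tiny made-up token list. The corpus, the function names, and the specific estimators (a conditional log probability of the last word given the preceding words, and pointwise mutual information for a bigram) are assumptions for illustration; the study's own corpus and operationalizations may differ.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Frequency of occurrence: count every contiguous n-word sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def log_conditional_probability(tokens, ngram):
    """Assumed estimator: log P(last word | preceding words) from raw counts."""
    ngram = tuple(ngram)
    n = len(ngram)
    count_full = ngram_counts(tokens, n)[ngram]
    if n == 1:
        count_context = len(tokens)
    else:
        count_context = ngram_counts(tokens, n - 1)[ngram[:-1]]
    if count_full == 0 or count_context == 0:
        return float("-inf")  # unseen sequence: probability estimate is zero
    return math.log(count_full / count_context)


def bigram_pmi(tokens, first, second):
    """Assumed estimator: pointwise mutual information of a two-word sequence."""
    unigrams = ngram_counts(tokens, 1)
    bigrams = ngram_counts(tokens, 2)
    total_bigrams = sum(bigrams.values())
    if total_bigrams == 0 or bigrams[(first, second)] == 0:
        return float("-inf")
    p_joint = bigrams[(first, second)] / total_bigrams
    p_first = unigrams[(first,)] / len(tokens)
    p_second = unigrams[(second,)] / len(tokens)
    return math.log(p_joint / (p_first * p_second))


# Hypothetical toy corpus, not data from the study.
tokens = "i do not really know and i do not really care".split()

quadgram_frequency = ngram_counts(tokens, 4)[("i", "do", "not", "really")]
quadgram_log_prob = log_conditional_probability(
    tokens, ("do", "not", "really", "know")
)
really_know_pmi = bigram_pmi(tokens, "really", "know")
```

Under these assumptions, frequency reflects how often a sequence has been encountered, whereas the conditional log probability reflects how strongly a sequence competes with other continuations of the same context, which is the contrast the abstract draws between production and recognition.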

Original language: English (US)
Pages (from-to): 302-324
Number of pages: 23
Journal: Mental Lexicon
Volume: 6
Issue number: 2
State: Published - 2011
Externally published: Yes

Keywords

  • Frequency of occurrence
  • Log probability of occurrence
  • Logit
  • Mixed-effects regression
  • Multi-word sequences
  • Mutual information
  • N-grams
  • Speech processing
  • Speech production

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Cognitive Neuroscience
