Equating in small-scale language testing programs

Geoffrey T. LaFlair, Daniel Isbell, L. D. Nicolas May, Maria Nelly Gutierrez Arvizu, Joan Jamieson

Research output: Contribution to journal, article, peer-reviewed

6 Scopus citations

Abstract

Language programs need multiple test forms for secure administrations and effective placement decisions, but can they have confidence that scores on alternate test forms have the same meaning? In large-scale testing programs, various equating methods are available to ensure the comparability of forms. The choice of equating method is informed by estimates of quality: the preferred method is the one with the least error, as defined by random error, systematic error, and total error. This study compared seven equating methods (mean, linear Levine, linear Tucker, chained equipercentile, circle-arc, nominal weights mean, and synthetic) with no equating. A non-equivalent groups anchor test (NEAT) design was used to compare two listening and reading test forms based on small samples (one with 173 test takers, the other with 88) at a university’s English for Academic Purposes (EAP) program. The equating methods were evaluated based on the amount of error they introduced and their practical effects on placement decisions. It was found that two types of error (systematic and total) could not be reliably computed owing to the lack of an adequate criterion; consequently, only random error was compared. Among the seven methods, the circle-arc method introduced the least random error as estimated by the standard error of equating (SEE). Classification decisions made using the seven methods differed from those made with no equating; all methods indicated that fewer students were ready for university placement. Although interpretations regarding the best equating method could not be made, circle-arc equating reduced the amount of random error in scores, had reportedly low bias in other studies, accounted for form and person differences, and was relatively easy to compute. It was chosen as the method to pilot in an operational setting.
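To make the idea of equating and its random error more concrete, the following is a minimal Python sketch, not the study's procedure. It assumes a simplified random-groups setting rather than the NEAT design used in the study, it covers only mean and (unweighted) linear equating rather than the Levine, Tucker, chained equipercentile, circle-arc, nominal weights mean, or synthetic methods, and it approximates the standard error of equating with a simple bootstrap. The function names and the score lists are hypothetical and serve illustration only.

```python
import random
import statistics


def mean_equate(x, form_x, form_y):
    """Mean equating: shift a Form X score by the difference in form means."""
    return x - statistics.mean(form_x) + statistics.mean(form_y)


def linear_equate(x, form_x, form_y):
    """Linear equating: match both the mean and the standard deviation."""
    slope = statistics.stdev(form_y) / statistics.stdev(form_x)
    return statistics.mean(form_y) + slope * (x - statistics.mean(form_x))


def bootstrap_see(equate, x, form_x, form_y, reps=500, seed=0):
    """Bootstrap estimate of the standard error of equating at score x.

    Resamples each group with replacement, re-runs the equating function,
    and returns the standard deviation of the equated scores.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        boot_x = rng.choices(form_x, k=len(form_x))
        boot_y = rng.choices(form_y, k=len(form_y))
        estimates.append(equate(x, boot_x, boot_y))
    return statistics.stdev(estimates)


# Hypothetical raw scores, invented for illustration (not the study's data).
form_x_scores = [12, 15, 18, 20, 22, 25, 27, 30]
form_y_scores = [14, 16, 19, 23, 24, 26, 29, 33]

if __name__ == "__main__":
    score = 20
    print("mean-equated:  ", mean_equate(score, form_x_scores, form_y_scores))
    print("linear-equated:", linear_equate(score, form_x_scores, form_y_scores))
    print("bootstrap SEE (mean):",
          bootstrap_see(mean_equate, score, form_x_scores, form_y_scores))
```

Under these assumptions, a method with a smaller bootstrap value at the placement cut score would be the one introducing less random error, which is the sense in which the abstract compares methods by SEE.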

Original language: English (US)
Pages (from-to): 127-144
Number of pages: 18
Journal: Language Testing
Volume: 34
Issue number: 1
State: Published - Jan 1 2017

Keywords

  • English for academic purposes
  • equating
  • listening
  • placement
  • reading
  • sample size

ASJC Scopus subject areas

  • Language and Linguistics
  • Social Sciences (miscellaneous)
  • Linguistics and Language
