Post hoc evaluation of analytic rating scales for improved functioning in the assessment of interactive L2 speaking ability

Language Testing in Asia

Table 3. Summary of adherence to Linacre’s (2002a) guidelines for the 9-point rating scales (✓ = guideline met; X = guideline not met)

Rating scale	Category observations ≥ 10	Monotonic average measures	Outfit MNSQs < 2.0	Monotonic threshold calibrations	Thresholds 1.4–5.0 logits apart	Peaked probability curves
Fluency	X	✓	✓	✓	X	X
Accuracy	X	✓	✓	X	X	X
Complexity	X	✓	✓	✓	X	X
Interaction	X	X	X	X	X	X
Effectiveness	X	X	✓	✓	X	X
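
The first five column criteria in Table 3 are simple numeric checks, so they can be illustrated with a short sketch. The code below is only an illustration under stated assumptions: the CategoryStats container, the check_guidelines function, and the example figures are all hypothetical and are not taken from the study's data or from any particular Rasch software. The bounds of the 1.4–5.0 logit range are assumed to be inclusive. The sixth criterion (peaked probability curves) is left out, since it is usually judged by inspecting the plotted category probability curves.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CategoryStats:
    """Hypothetical container for one rating scale's category statistics."""
    counts: List[int]          # observations per category, lowest to highest
    avg_measures: List[float]  # average measure per category (logits)
    outfit_mnsq: List[float]   # outfit mean-square per category
    thresholds: List[float]    # threshold calibrations between categories (logits)


def is_strictly_increasing(values: List[float]) -> bool:
    """True if each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def check_guidelines(stats: CategoryStats) -> Dict[str, bool]:
    """Apply the five numeric criteria named in the table's column headers."""
    # Distance between each pair of adjacent thresholds, in logits.
    gaps = [b - a for a, b in zip(stats.thresholds, stats.thresholds[1:])]
    return {
        "Category observations >= 10": all(c >= 10 for c in stats.counts),
        "Monotonic average measures": is_strictly_increasing(stats.avg_measures),
        "Outfit MNSQs < 2.0": all(m < 2.0 for m in stats.outfit_mnsq),
        "Monotonic threshold calibrations": is_strictly_increasing(stats.thresholds),
        # Assumption: the 1.4 and 5.0 logit bounds are treated as inclusive.
        "Thresholds 1.4-5.0 logits apart": all(1.4 <= g <= 5.0 for g in gaps),
    }


# Invented figures for a 4-category scale (NOT the study's data).
example = CategoryStats(
    counts=[12, 30, 25, 8],
    avg_measures=[-1.2, -0.3, 0.8, 1.9],
    outfit_mnsq=[0.9, 1.1, 1.0, 1.3],
    thresholds=[-2.0, 0.1, 2.3],
)
result = check_guidelines(example)

for guideline, met in result.items():
    print(f"{guideline}: {'met' if met else 'not met'}")
```

In this invented example only the first criterion fails, because the highest category has fewer than 10 observations. With nine categories per scale, sparse observations at the extremes are one plausible way for that column to fail.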