
Table 8 Multilevel model results of holistic scores in Study 2

From: Validity evidence of Criterion® for assessing L2 writing proficiency in a Japanese university context

 

| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| | Null | Null + time | Null + time + class^a |
| Fixed effects | | | |
| Level 1 (n = 243) | Coefficient (SE) | Coefficient (SE) | Coefficient (SE) |
| Intercept (γ00): initial status | 3.75*** (0.07) | 3.46*** (0.08) | 3.82*** (0.13) |
| Time (γ10): rate of change | -- | 0.29*** (0.04) | 0.31** (0.07) |
| Level 2 (n = 81) | | | |
| Class (γ01): initial status | -- | -- | −0.18** (0.06) |
| Class (γ11): rate of change | -- | -- | −0.01 (0.03) |
| Random effects | | | |
| Level 1 (n = 243) | | | |
| Within-student variance (r) | 0.34 | 0.19 | 0.19 |
| Level 2 (n = 81) | | | |
| Between-student variance (u0) | 0.27 | 0.38 | 0.32 |
| Between-student variance (u1) | -- | 0.07 | 0.07 |
| Chi-square (u0; df) | 274.17*** (80) | 280.92*** (80) | 248.53*** (79) |
| Chi-square (u1; df) | -- | 138.13*** (80) | 137.99*** (79) |
| Intraclass correlation | .44 | | |
| Reliability | | | |
| Intercept (β0) | 0.71 | 0.71 | 0.67 |
| Time (β1) | -- | 0.41 | 0.41 |
| Model fit | | | |
| Deviance (# of estimated parameters) | 522.55 (3) | 470.36 (6) | 454.10 (8) |
| Model comparison test: | | | |
| Chi-square (df) | -- | 52.19*** (3)^c | 16.26*** (2)^d |
| AIC^b | 528.55 | 482.36 | 470.18 |

  1. Note. SE = standard error. ^a Of the five classes, the highest class was coded as 0 and the lowest as 4. ^b Akaike Information Criterion (deviance + 2 × number of estimated parameters). ^c Comparison between Models 1 and 2. ^d Comparison between Models 2 and 3.
  2. The design effect for Model 1 is 1 + intraclass correlation × (average sample size within each cluster − 1) = 1 + 0.44 × (243/81 − 1) = 1.88. Values over 1 indicate a violation of the assumption of independent observations and suggest the need for multilevel models (e.g., McCoach & Adelson, 2010).
  3. *p < .05; **p < .01; ***p < .001. These notations also apply to Table 9
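
The derived quantities in the table and notes (intraclass correlation, design effect, AIC, and the deviance-difference tests) can be recomputed from the reported values. The following is a minimal sketch, not the authors' analysis code: the numbers are transcribed from Table 8, and all variable and function names are illustrative assumptions.

```python
# Values transcribed from Table 8; names are illustrative, not from the article.
within_var = 0.34    # Model 1 within-student variance (r)
between_var = 0.27   # Model 1 between-student variance (u0)
n_obs, n_students = 243, 81

# Intraclass correlation: share of total variance lying between students.
icc = between_var / (between_var + within_var)

# Design effect as defined in note 2, using the ICC rounded to two decimals
# (the table reports .44).
avg_cluster_size = n_obs / n_students
design_effect = 1 + round(icc, 2) * (avg_cluster_size - 1)

# (deviance, number of estimated parameters) for each model.
models = {
    "Model 1": (522.55, 3),
    "Model 2": (470.36, 6),
    "Model 3": (454.10, 8),
}

# AIC as defined in note 1: deviance + 2 * number of estimated parameters.
aic = {name: dev + 2 * k for name, (dev, k) in models.items()}


def deviance_difference(simpler, fuller):
    """Return (chi-square, df) for comparing two nested models."""
    dev_s, k_s = models[simpler]
    dev_f, k_f = models[fuller]
    return dev_s - dev_f, k_f - k_s


chi_12, df_12 = deviance_difference("Model 1", "Model 2")
chi_23, df_23 = deviance_difference("Model 2", "Model 3")

print(f"ICC = {icc:.2f}, design effect = {design_effect:.2f}")
for name, value in aic.items():
    print(f"{name}: AIC = {value:.2f}")
print(f"Model 1 vs 2: chi-square = {chi_12:.2f}, df = {df_12}")
print(f"Model 2 vs 3: chi-square = {chi_23:.2f}, df = {df_23}")
```

Under these assumptions the intraclass correlation, design effect, both chi-square differences, and the AIC values for Models 1 and 2 match the table. Applying the note's AIC formula to Model 3 gives 470.10, slightly different from the 470.18 printed in the table; the table alone does not show which figure carries the discrepancy.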