Table 1 Methods for evaluating the marking reliability in education

From: The analysis of marking reliability through the approach of gauge repeatability and reproducibility (GR&R) study: a case of English-speaking test

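| Authors | Pearson’s correlation coefficient | Fleiss’ kappa or Cohen’s kappa | ICC | ANOVA | Percentage of agreement | Standard deviation | Spearman’s rank correlation coefficient | Bland–Altman plot | SSR | Split-halves technique | Paired sample t-test | Infit MnSq and Outfit MnSq |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Akeju (1972) | ∕ |  |  | ∕ |  |  |  |  |  |  |  |  |
| Sullivan and Hall (1997) | ∕ |  |  |  | ∕ |  |  |  |  |  |  |  |
| Wang (2009) |  |  |  | ∕ |  |  |  |  |  |  |  |  |
| Hallgren (2012) |  | ∕ | ∕ |  |  |  |  |  |  |  |  |  |
| Mukundan and Nimehchisalem (2012) | ∕ |  |  |  |  |  |  |  |  |  |  |  |
| Bird and Yucel (2013) |  |  |  |  |  | ∕ |  |  |  |  |  |  |
| Zhao (2013) | ∕ |  |  |  |  |  |  |  |  |  |  |  |
| Davis (2016) | ∕ | ∕ |  | ∕ | ∕ |  |  |  |  |  |  |  |
| Saeed et al. (2019) |  |  | ∕ |  |  |  |  |  |  |  |  |  |
| Aprianoto and Haerazi (2019) |  | ∕ |  |  |  |  |  |  |  |  |  |  |
| Khan et al. (2020) | ∕ |  |  |  |  |  |  | ∕ |  |  |  |  |
| Marshall et al. (2020) | ∕ |  |  |  |  |  |  |  | ∕ | ∕ |  |  |
| Rashid and Mahmood (2020) |  |  |  |  |  |  | ∕ |  |  |  |  |  |
| Doosti and Safa (2021) |  |  | ∕ |  |  |  |  |  |  |  |  |  |
| Lyness et al. (2021) |  | ∕ |  |  | ∕ |  |  |  |  |  |  |  |
| Nimehchisalem et al. (2021) | ∕ |  |  |  |  |  |  |  |  |  |  |  |
| Soemantri et al. (2022) |  |  | ∕ |  |  |  |  |  |  |  |  |  |
| Li (2022) | ∕ |  |  |  |  |  |  |  |  |  |  | ∕ |
| Detey et al. (2023) |  |  |  |  |  |  | ∕ |  |  |  |  |  |
| Stuart and Barnett (2023) | ∕ | ∕ |  |  |  |  |  |  |  |  |  |  |
| Naqvi et al. (2023) |  |  |  |  |  | ∕ |  |  |  |  | ∕ |  |
| Total | 10 | 5 | 4 | 3 | 3 | 2 | 2 | 1 | 1 | 1 | 1 | 1 |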