Peer assessment in an EFL context: attitudes and friendship bias
Language Testing in Asia volume 3, Article number: 11 (2013)
This study reports a research project which compared teacher and peer assessment of university EFL students’ English compositions. In addition, it investigated possible friendship bias in peer assessment as well as the impact of this practice on learners’ attitudes towards it. To this end, a total of 38 university students of English who were taking a writing course took a proficiency test and filled in a pre-questionnaire. Training and practice sessions on using Jacobs et al.’s composition profile followed. The actual peer assessment of compositions, teacher assessment, and administration of a post-questionnaire were then carried out, in that order. To analyze the data collected from the 26 subjects who participated in all parts of the study, paired-samples t-tests and a chi-square test were applied. The results revealed no significant difference between the learners’ peer assessment and teacher assessment. No friendship bias was found in peer assessment, and the practice changed students’ attitudes towards a more positive perception of peer assessment.
Alternative assessment asks students to show what they can do, that is to say, students are evaluated on what they integrate and produce rather than on what they are able to recall (Macias, 1995, cited in Coombe et al., 2007). As one of the main forms of alternative assessment, peer assessment has gained much importance in educational learning and educational research. It is considered as "an arrangement in which individuals consider the amount, level, value, worth, quality, or success of the products or outcomes of learning of peers of similar status" (Topping, 1998, p. 250). It is "the process of having the readers critically reflect upon, and perhaps suggest grades for the learning of their peers" (Roberts, 2006, p. 80), and being judged for the quality of the appraisals made (Davies, 2006).
Assessment is critical in any instructional operation; both teachers and learners need to get involved in and have control over the assessment methods, outcomes, and their underlying rationale (Cheng and Warren, 2005). When it comes to assessing students’ writing in EFL contexts, especially in traditional teacher-centered classrooms, incorporating peer assessment as a learning tool (Lindblom-Ylänne et al., 2006) alongside the usual teacher assessment not only can change learners’ perspectives on various types of assessment, but may also lead to outcomes at least as good as teacher assessment and sometimes better (Topping, 1998). Reported attitudes toward peer assessment range from finding it helpful, beneficial, enjoyable and challenging on the one hand, to feeling threatened or unnerved by the subjectivity of assessment, or failing to develop confidence in acting fairly as an assessor (Sambell et al., 1997), on the other; students’ levels of acceptance thus vary (Topping, 1998). As "few student evaluations of peer assessment are reported" (Falchikov, 1995, p. 177), the findings on students’ attitudes to this practice remain mixed and inconclusive.
The impact of peer assessment on language learning is promising, but its efficacy seems to depend on many factors including students’ attitudes, language levels, familiarity with the assessing criteria, the type of skill being assessed, and the possible presence of bias such as gender and friendship. In line with previous studies, although not aiming at reviewing and replicating the extensive literature on peer assessment, this study was conducted to shed light on the status of peer assessment in an EFL context where teacher-centered classes are the norm. The differences between teacher and peer ratings as well as the existence of any friendship bias which has been meagerly dealt with in previous research are considered. Moreover, how this type of assessment may influence the perspective of learners at the tertiary level is examined.
The importance of assessment as an integral part of the teaching-learning cycle is apparent to many educationalists. Assessment has changed along with theories and models of learning. Constructivist teaching and learning brought assessment to the center, and it no longer serves merely as a form of measurement tied to the traditional curriculum. The teacher is no longer the centre of assessment; rather, students work hand in hand with teachers in applying such an interactive type of assessment (Wikstorm, 2007). According to modern theories, which consider assessment a part of the learning and teaching process rather than an end-of-course evaluation of student achievement, assessment is becoming the process of describing students' performance. Modern views of curriculum and constructivist learning theories called for a new type of assessment that could be used as part of instruction to help learners in the process of acquiring knowledge and thereby promote students’ understanding. Based on these developments in learning theories, teachers open up discussion of assessment with students; this presents a major challenge for assessment in the 21st century because it requires teachers to acquire the specific skills needed for this new, additional role. The process of learning should be assessed by more intensive, interactive methods, and that work should be undertaken in collaboration, either between teacher and student or among a group of peers (Wikstorm, 2007).
Coombe et al. (2007) propose several types of alternative assessment that can be used in today's language classrooms with great success: self-assessment, portfolio assessment, student-designed tests, learner-centered assessment, projects, and presentations. Similarly, Cheng and Warren (2005) believe that there are several approaches to classroom assessment such as performance assessment, portfolio assessment, self and peer assessment. They specify that teachers play a major role in traditional pen and paper and performance assessment, whereas self and peer assessments are more student-centered. They allow students to participate in the evaluation and provide opportunities for observation and modeling which help them scrutinize themselves and adjust their performance.
Peer assessment in EFL contexts
Surveying the literature in the EFL context, Cheng and Warren (2005) found that peer assessment has been more commonly incorporated into English language writing instruction, where peers respond to and edit each other's written work with the aim of helping with revision. Some of the examples they cite include Hogan (1984), Birdsong and Sharplin (1986), Lynch (1988), Devenney (1989), Jacobs (1989), Rothschild and Klingenberg (1990), Rainey (1990), Bell (1991), Mangelsdorf (1992), Murau (1993), Caulk (1994), Mendonca and Johnson (1994) and Jones (1995).
Findings suggest that student writers selectively take account of peer comments when they revise, preferring to depend more on their own knowledge. Student writers may not always trust their peers, but the same comment from a teacher will be taken into account when they revise (Mendonca and Johnson, 1994). Reviewing the literature related to the outcome of studies on peer assessment of writing, Topping (1998) found that it "appears capable of yielding outcomes at least as good as teacher assessment and sometimes better" (p. 262). Mangelsdorf (1992) reports that peer reviews were always rated negatively by Asian students, and raises the question of the effect of teacher-centered cultures on the way students regard peer comments. However, the merits attributed to applying peer assessment cannot be ignored. Being an effective tool in both group and individual projects (Matsuno, 2009), encouraging reflective learning through observing others' performances and awareness of performance criteria (Saito, 2008), immediate support in the classroom, gains for both the assessor and the assessed, and being individualized and interactive (Black and William, 1998) are some benefits to consider.
Peer assessment in writing
Peer evaluation plays an important role in both first (L1) and second language (L2) writing classrooms. It allows writing teachers to help their students receive more feedback on their papers and gives students practice with a range of skills important in the development of language and writing ability, such as meaningful interaction with peers, greater exposure to ideas, and new perspectives on the writing process. It is obvious that peer involvement creates opportunities for interaction and increases objectivity in assessment. If learners are put in a situation where they can access information about the quality and level of their peers' performances as well as their own, there is the possibility that they will be able to clarify their own understanding of the assessment criteria (either set by students themselves or by the teacher), and more importantly, of what is required of them (Patri, 2002). What seems to be important is that students use clearly defined guidelines to evaluate each other's work, so checklists with lists of points to be assessed are very useful. Although the grades may be generated by students, "the teacher should … reserve the right to make adjustments if necessary" (Kearsky, 2000, cited in Roberts, 2006, p. 91). When students are trained in how to give and use feedback (Min, 2006), peer evaluation can be extremely effective. Teachers can incorporate it as a way to present writing skills to students, ideally creating a student-centered classroom with learners capable of critically evaluating their own written work. Peer review sessions can teach students important writing skills, such as writing to a real audience, seeing ideas and points of view other than their own (Paulus, 1999), and discussing how to revise writing effectively.
Given the importance of peer assessment and its impact on language skills, and considering students' attitudes towards it, the main research questions were formally stated as follows:
How similar are teacher and peer ratings of students' English compositions?
Do students favor peer assessment?
Does friendship affect peer rating?
The 26 homogeneous subjects of the present study were selected from an initial 38 Iranian university students of English literature during the second semester of 2009. They were 24 females and 2 males, ranging in age from 19 to 27 with a mean age of 21, who were in their sixth term of study and were taking their essay writing course with the researcher.
The following tests and tasks were employed in this study:
The intermediate Nelson Language Proficiency Test (1977). A modified version of the original Intermediate Nelson Proficiency Test with the reliability of 0.83 in piloting was used. It consisted of two parts: a cloze passage and 40 discrete-point items.
The writing checklist. To score the subjects' compositions, Jacobs et al.'s (1981, cited in Hughes, 2003) writing scale was used, which follows a largely analytical (objective) procedure. According to this scale, five factors were considered in every composition: 1. Content, 2. Organization, 3. Vocabulary, 4. Language Use, and 5. Mechanics. Each subject received 5 sub-scores (at most 4 for each part), and the maximum total grade was 20.
Pre- and post-questionnaires. Two questionnaires were constructed and, before being distributed among the participants, validated by consulting several experts in the field about the items. One questionnaire was administered at the beginning and the other at the end of the writing course. These questionnaires were used to evaluate the subjects' attitudes toward peer assessment. They included 9 questions on a 3-point Likert scale, with the responses yes, no, and not sure in the pre-questionnaire and yes, no, and to some extent in the post-questionnaire. The reliability of each questionnaire was calculated and turned out to be 0.70 and 0.76 for the pre- and post-questionnaires, respectively.
The writing tasks. In three successive sessions, 3 general topics were offered for participants to write about and to submit in the next sessions to be evaluated analytically by their peers and the class teacher (the researcher) and two other teachers.
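The scoring arithmetic of the writing checklist described above can be illustrated with a minimal Python sketch. It assumes only what the text states (five traits, at most 4 points each, total out of 20); the function name, the trait keys, and the example sub-scores are hypothetical and are not taken from the study's materials.

```python
# Minimal sketch of the checklist arithmetic: five trait sub-scores,
# each capped at 4 points, summed to a total out of 20.
# Names and example values are illustrative assumptions only.

TRAITS = ("content", "organization", "vocabulary", "language_use", "mechanics")
MAX_PER_TRAIT = 4  # per the text: at most 4 points for each part


def total_score(sub_scores):
    """Validate the five sub-scores and return their sum (maximum 20)."""
    missing = [trait for trait in TRAITS if trait not in sub_scores]
    if missing:
        raise ValueError(f"missing sub-scores: {missing}")
    for trait in TRAITS:
        value = sub_scores[trait]
        if not 0 <= value <= MAX_PER_TRAIT:
            raise ValueError(f"{trait} must be between 0 and {MAX_PER_TRAIT}")
    return sum(sub_scores[trait] for trait in TRAITS)


if __name__ == "__main__":
    # Hypothetical ratings for one composition.
    example = {
        "content": 3,
        "organization": 2.5,
        "vocabulary": 3,
        "language_use": 2,
        "mechanics": 4,
    }
    print(total_score(example))  # 14.5
```

Validating the range of each sub-score before summing mirrors the role of the checklist itself: it keeps every rater, teacher or peer, on the same 20-point scale.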
In order to achieve the desired results, the researcher undertook the following procedure. At the beginning of the semester, after students’ consent to take part in the study was obtained, the modified intermediate Nelson Proficiency Test was administered to the whole population. The descriptive report from SPSS on the mean and standard deviation of the scores was used to decide on the final homogeneous group. In the following session, students filled in the first questionnaire about their attitudes toward peer assessment; they were also asked to write the names of three of their most intimate friends in the same class. The names were used to draw a sociogram and to analyze and display sets of relationships in order to discover mutual intimate friendships among students. Prior to the assessment program, this procedure led to the identification of the friend and non-friend peers who had to mark other students' compositions in the following sessions.
To mark the compositions, first, the ESL composition profile by Jacobs et al. (1981, cited in Hughes, 2003) was introduced. This profile consists of five traits which tap into different features of a written text through a set of descriptors corresponding to different quality levels. The five traits are content, organization, language use, vocabulary and mechanics, and the maximum score for each was 4 points. In the following session, students were taught how to use the profile in assessing their classmates' compositions, and for 3 sessions they practiced assessing their peers' writings.
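As an illustration of how mutual friendships can be read off such nominations, the following Python sketch treats two students as mutual friends when each names the other. This is one plausible reading of the sociogram step, not a description of the tool actually used; the function name and the five-student nomination data are hypothetical.

```python
# Sketch: find mutual friendships in a set of nominations.
# Assumption: a pair counts as "mutual intimate friends" when each
# student lists the other among his or her three named friends.


def mutual_friend_pairs(nominations):
    """Return the set of unordered pairs in which both students name each other."""
    pairs = set()
    for student, named in nominations.items():
        for friend in named:
            if friend != student and student in nominations.get(friend, ()):
                pairs.add(frozenset((student, friend)))
    return pairs


if __name__ == "__main__":
    # Hypothetical class of five students, each naming three friends.
    nominations = {
        "A": ["B", "C", "D"],
        "B": ["A", "C", "E"],
        "C": ["D", "E", "A"],
        "D": ["A", "B", "C"],
        "E": ["B", "C", "D"],
    }
    for pair in sorted(sorted(p) for p in mutual_friend_pairs(nominations)):
        print(pair)
```

Any classmate outside a student's set of mutual pairs can then serve as a non-friend rater, which is how friend and non-friend peers could be assigned under this reading.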
These practice sessions were followed by the actual peer assessment experience. Three general topics were assigned for the next three successive sessions, one at a time. For each meeting, students were required to hand in their compositions and five copies of it to be marked by the teacher, two peers and two other raters. As mentioned before, the researcher had already identified friend and non-friend peers of each student. Accordingly, names of peers (without mentioning their friendship relation) were read out so students knew whose papers they had to mark. After using the checklist to score the writing performances, the peer raters signed the papers and handed in the compositions and scoring tables to be recorded by the teacher. These papers were returned to the writers in the following sessions for discussion during which the teacher and peer corrections were reviewed and the subjects were given feedback regarding their errors in their writings and on parts which needed revision.
In addition to peer raters, the teacher (researcher) and two other EFL instructors assessed the writings. The researcher briefed the two raters on how to score the writings. The writing scores then underwent statistical analysis. In order to calculate the inter-rater reliability of the sets of scores given by the three raters, the correlation coefficient (Pearson Product Moment) was used. The coefficient alpha was computed and turned out to be 0.89, 0.82, and 0.90 for the first, second and third writings, respectively.
Finally, at the end of the course, to investigate any possible change in students' attitudes towards peer assessment, a questionnaire similar to the one completed at the beginning of the semester was administered.
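For readers unfamiliar with the Pearson product-moment coefficient, the sketch below shows how it can be computed for each pair of raters using only the Python standard library. It is an illustration under stated assumptions: the rater labels and scores are invented, and the text does not specify how the pairwise coefficients were combined into the single values reported for each writing.

```python
# Sketch: Pearson product-moment correlation between raters' score lists.
# Rater names and scores are hypothetical, for illustration only.
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a rater's scores do not vary")
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(ratings):
    """Correlation for every pair of raters, keyed by the pair of rater names."""
    return {
        (a, b): pearson_r(ratings[a], ratings[b])
        for a, b in combinations(ratings, 2)
    }


if __name__ == "__main__":
    # Hypothetical totals (out of 20) given by three raters to six compositions.
    ratings = {
        "teacher": [14, 16, 12, 18, 15, 13],
        "rater_2": [13, 17, 12, 17, 15, 14],
        "rater_3": [15, 16, 11, 18, 14, 13],
    }
    for pair, r in pairwise_correlations(ratings).items():
        print(pair, round(r, 2))
```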
Results and discussion
In order to answer the first question about the similarity of teacher and peer rating of students’ English compositions, a paired-samples t-test was applied, once for the peer raters (friends and non-friends) as a whole, then separately for friends and non-friends. For the peers in general, the results, t = .827, P = .416 > .05, indicated that there was no significant difference between the teacher and the friend and non-friend peer corrections/ratings, and the mean scores for corrections were quite close to each other (Table 1). Similarly, for the separate groups of peers, the results of the paired-samples t-tests did not reveal any significant differences between the teacher’s corrections and those of each peer group. As displayed in Table 2, for friends’ corrections, t = .048 and P = .962 > .05, and for non-friends, t = 1.685 and P = .104 > .05. The descriptive statistics for the three comparisons and the t-test results are presented in Tables 1 and 2, respectively.
To investigate whether students favored peer assessment, a chi-square analysis was run to compare the students’ attitudes as measured through the pre- and post-questionnaires. The chi-square value of 7.65 (P = .022 < .05) indicates that there were significant differences between students’ attitudes toward peer assessment before and after the study. As displayed in Table 3, the students showed more agreement on the post-questionnaire (52.9%) than on the pre-questionnaire (44.4%).
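The paired-samples t statistic used in these comparisons can be sketched as follows. The code assumes that every one of the 26 participants contributed one teacher score and one peer score, giving 25 degrees of freedom; the text does not report the degrees of freedom, so this is an assumption. The critical value 2.060 is the standard two-tailed value for alpha = .05 and df = 25, and the sample scores are invented.

```python
# Sketch: paired-samples t statistic from two matched score lists.
# Assumption: n = 26 pairs, hence df = 25 for the study's comparisons.
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF25 = 2.060  # two-tailed critical t, alpha = .05, df = 25 (standard tables)


def paired_t(first, second):
    """Return (t, df) for the paired differences first[i] - second[i]."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two matched lists with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("t is undefined when all differences are identical")
    n = len(diffs)
    return mean(diffs) / (sd / sqrt(n)), n - 1


def exceeds_critical_df25(t):
    """True when |t| exceeds the two-tailed .05 critical value for df = 25."""
    return abs(t) > T_CRIT_DF25


if __name__ == "__main__":
    # Hypothetical teacher and peer totals for four compositions.
    teacher = [15, 14, 17, 12]
    peers = [14, 14, 16, 12]
    print(paired_t(teacher, peers))
    # The t values reported in the text all fall below the critical value.
    print([exceeds_critical_df25(t) for t in (0.827, 0.048, 1.685)])
```

Under the df = 25 assumption, the reported values of .827, .048 and 1.685 all lie below 2.060, which is consistent with the non-significant P values given above.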
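The chi-square comparison can be sketched in the same way. The code below assumes a 2 × 3 contingency table (pre- versus post-questionnaire by the three response options), which has 2 degrees of freedom; the text does not state the layout of Table 3, so this is an assumption, and the counts in the example are invented. For exactly 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
# Sketch: Pearson chi-square for a contingency table, plus the p-value
# for the special case of 2 degrees of freedom (e.g. a 2 x 3 table).
# Table layout and counts are hypothetical assumptions.
from math import exp


def chi_square_statistic(table):
    """Pearson chi-square statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                raise ValueError("expected count of zero; drop empty rows or columns")
            statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_df2(statistic):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return exp(-statistic / 2)


if __name__ == "__main__":
    # Hypothetical counts: rows = pre, post; columns = yes, no, third option.
    table = [[40, 30, 20], [50, 25, 15]]
    stat = chi_square_statistic(table)
    print(round(stat, 2), round(p_value_df2(stat), 3))
    # The reported statistic, under the df = 2 assumption:
    print(round(p_value_df2(7.65), 3))  # 0.022
```

Under this assumption a statistic of 7.65 gives p ≈ .022, which agrees with the value reported in the text.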
In addition to the abovementioned findings, to show what learners thought about peer assessment and how they found it after experiencing it, Table 4 is presented. It includes frequencies and percentages of each response to 5 of the questions about peer assessment being difficult, useful, interesting, motivating, and boring. The details in this table indicate how learners’ views changed in each case; expecting the practice to be difficult changed to the opposite, not being sure about whether it would be useful, motivating and interesting changed to learners’ certainty about them, and it was found not to be boring at the end of the term.
Another question in this study was about the effect of friendship on peer rating. To investigate any possible bias, the averages of the scores given by friend and non-friend peers for the 3 writings were first calculated separately. The results of the paired-samples t-test, with a t-value of 1.55 and a p-value of .132 > .05, show that there was no significant difference between the friend and non-friend corrections (Table 5).
The findings of this study concerning peer and teacher assessment are in line with the studies of Jafarpur (1991), Hughes and Large (1993), Miller and Ng (1996), Topping (1998), Falchikov and Goldfinch (2000), Patri (2002), and Saito and Fujita (2004), who have noted high agreement between teacher and peer assessments, which indicates an overall similarity in scoring between peers and teachers. The reason behind this agreement may be found in the use of clear scoring criteria, as well as the training and practice sessions prior to the actual peer assessment experience.
Concerning friendship bias, this study revealed no significant difference between the ratings of friend and non-friend peers, while Falchikov (1995) and Morahan-Martin (1996) identified such a bias in peer assessment. The probable reason for this difference in findings may lie in the general familiarity and friendship of all the students in the class with one another. Although students named their intimate friends, they did not deny their overall friendship with others who had been their classmates for at least 2 years, so this might have affected their ratings unconsciously. Another point is the possible fear of facing friends in class the following week after giving someone a bad grade (Buchanan, 2004, cited in Roberts, 2006). This problem might be overcome by monitoring and anonymous marking (AlFallay, 2004), which was unfortunately not possible in this study.
The present study also investigated the attitudes of learners towards the use of peer assessment. The change of perception and the positive viewpoints of learners at the end of the course toward the use of peer assessment are similar to the user acceptance and positive attitudes found in Patri's (2002) and Saito and Fujita's (2004) studies. Although some students expressed their discomfort and uneasiness about acting like a teacher and were not sure about the benefits and the degree of difficulty of peer assessment, the post-questionnaire revealed their change of perception about this practice. Saito and Fujita (2004) cited a number of studies, in some of which learners expressed negative feelings, dissatisfaction and uneasiness with this experience, while in others students considered it useful, preferred it, and found value in it. These mixed feelings are to be expected, since learners usually carry mixed feelings and attitudes toward any type of classroom activity.
This study was a tripartite investigation of peer assessment. First, it focused on the differences between peer and teacher assessment in a teacher-centered foreign language learning context. Next, the presence of any friendship bias was examined; and finally, learners’ attitudes about this practice were evaluated. The results of this study revealed no significant difference between the learners’ peer assessment and teachers’ assessment. Moreover, no friendship bias was found in peer assessment. In addition, this practice changed students’ attitudes to a more positive perspective on peer assessment. While they expected the practice to be difficult, they found it not to be so; learners became assured that peer assessment was useful, motivating and interesting, and they found it not to be boring.
Making peer assessment an integral part of evaluation procedures not only encourages learners and teachers to regard assessment as a shared responsibility, but can also help shift traditional one-way, teacher-centered classes toward more learner-centered ones. It is obvious that peer involvement creates opportunities for interaction and increases objectivity in assessment. Saito (2008) believes peer assessment encourages reflective learning through observing others' performances and becoming aware of performance criteria. In general, peer assessment seems to generate positive reactions in students. Although some students have concerns and worries, it leads to the development of self-awareness, to noticing the gap between one's own and others' perceptions, and to further learning and responsibility for it. In addition, focusing on peers' strengths and weaknesses can enhance students' learning, raise their level of critical thinking, and lead them to autonomy.
The results of this study, although statistically significant, are limited by a number of factors such as the design, the instruments, and the chosen skill. In addition, the fact that the subjects were third-year undergraduate EFL students, their familiarity with one another, their proficiency level, and the impossibility of performing blind assessment could have had limiting effects on the results. Therefore, suggestions for further studies include taking a broader range and variety of participants into account, considering the link between peer assessment and skills other than writing, examining the effects of gender and various cognitive and personality factors on peer assessment, and applying other types of instruments.
Maryam Azarnoosh has a PhD in TEFL and is a faculty member and head of department of English at Islamic Azad University-Semnan Branch. She has taught in different universities for over 13 years and has presented papers in international conferences and published some in journals. Her research interests include motivation, English language skills, learning strategies, and language teaching, testing and assessment.
AlFallay I: The role of some selected psychological and personality traits of the rater in the accuracy of self- and peer-assessment. System 2004, 32: 407–425. 10.1016/j.system.2004.04.006
Bell J: Using peer responses in ESL writing classes. TESL Canada Journal 1991, 8: 65–71.
Birdsong T, Sharplin W: Peer evaluation enhances students’ critical judgement. Highway One 1986, 9: 23–28.
Black P, William D: Inside the black box: Raising standards through classroom assessment. London: Department of Education and Professional Studies, King's College; 1998.
Caulk N: Comparing teacher and student responses to written work. TESOL Quarterly 1994, 28: 181–188. 10.2307/3587209
Cheng W, Warren M: Peer assessment of oral proficiency. Language Testing 2005, 22: 93–121. 10.1191/0265532205lt298oa
Coombe C, Folse K, Hubly N: Assessing English language learners. United States of America: University of Michigan Press; 2007.
Davies P: Peer assessment: judging the quality of students’ work by comments rather than marks. Innovations in Education and Teaching International 2006, 43: 69–82. 10.1080/14703290500467566
Devenney R: How ESL teachers and peers evaluate and respond to student writing. RELC Journal 1989, 20: 77–90. 10.1177/003368828902000106
Falchikov N: Peer feedback marking: developing peer assessment. Innovations in Education and Training International 1995, 32: 175–187. 10.1080/1355800950320212
Falchikov N, Goldfinch J: Student peer assessment in higher education: a meta-analysis comparing peer and teacher marks. Review of Educational Research 2000, 70: 287–322.
Hogan P: Peer editing helps students improve written products. Highway One 1984, 7: 51–54.
Hughes AR: Testing for language teachers. Cambridge: CUP; 2003.
Hughes I, Large B: Staff and peer-group assessment of oral communication skills. Studies in Higher Education 1993, 18: 379–385. 10.1080/03075079312331382281
Jacobs G: Miscorrection in peer feedback in writing class. RELC Journal 1989, 20: 68–75. 10.1177/003368828902000105
Jafarpur A: Can naive EFL learners estimate their own proficiency? Evaluation and Research in Education 1991, 5: 145–157. 10.1080/09500799109533306
Jones N: Business writing, Chinese students and communicative language teaching. TESOL Journal 1995, 4: 12–15.
Lindblom-Ylänne S, Pihlajamäki H, Kotkas T: Self-, peer- and teacher-assessment of student essays. Active Learning in Higher Education 2006, 7: 51–62. 10.1177/1469787406061148
Lynch T: Peer evaluation in practice. In Individualization and autonomy in language learning. ELT Documents: 131. Edited by: Brookes A, Grundy P. London, UK: Modern English Publications; 1988:119–125.
Mangelsdorf K: Peer reviews in the ESL composition classrooms: what do students think? ELT Journal 1992, 46: 274–284. 10.1093/elt/46.3.274
Matsuno S: Self-, peer-, and teacher-assessments in Japanese university EFL writing classrooms. Language Testing 2009, 26: 75–100. 10.1177/0265532208097337
Mendonca CO, Johnson KE: Peer review negotiations: revision activities in ESL writing instruction. TESOL Quarterly 1994, 28: 745–769. 10.2307/3587558
Miller L, Ng R: Autonomy in the classroom: peer assessment. In Taking control: Autonomy in language learning. Edited by: Pemberton R, Edward SL, Or WWF, Pierson HD. Hong Kong: Hong Kong University Press; 1996:133–146.
Min HT: The effects of trained peer review on EFL students' revision types and writing quality. Journal of Second Language Writing 2006, 15: 118–141. 10.1016/j.jslw.2006.01.003
Morahan-Martin J: Should peers' evaluations be used in class projects?: Questions regarding reliability, leniency, and acceptance. Psychological Reports 1996, 78: 1243–1250. 10.2466/pr0.1996.78.3c.1243
Murau A: Shared writing: students’ perceptions and attitudes of peer review. Working Papers in Educational Linguistics 1993, 9: 71–79.
Patri M: The influence of peer feedback on self and peer-assessment of oral skills. Language Testing 2002, 19: 109–131. 10.1191/0265532202lt224oa
Paulus TM: The effect of peer and teacher feedback on student writing. Journal of Second Language Writing 1999, 8: 265–289. 10.1016/S1060-3743(99)80117-9
Rainey K: Teaching technical writing to non-native speakers. Technical Writing Teacher 1990, 17: 131–135.
Roberts T: Self-, peer-, and group assessment in E-learning. United States of America: Information Science Publishing; 2006.
Rothschild D, Klingenberg F: Self and peer evaluation of writing in the interactive ESL classroom: an exploratory study. TESL Canada Journal 1990, 8: 52–65.
Saito H: EFL classroom peer assessment: training effects on rating and commenting. Language Testing 2008, 25: 553–581. 10.1177/0265532208094276
Saito H, Fujita T: Characteristics and user acceptance of peer rating in EFL writing classroom. Language Teaching Research 2004, 31: 31–54.
Sambell K, McDowell L, Brown S: ‘But is it fair?’: An exploratory study of student perceptions of the consequential validity of assessment. Studies in Educational Evaluation 1997, 23: 349–371. 10.1016/S0191-491X(97)86215-3
Topping KJ: Peer assessment between students in colleges and universities. Review of Educational Research 1998, 68: 249–276.
Wikstorm N: Alternative assessment in primary years of international baccalaureate education (Master's thesis). 2007.
The author declares that she has no competing interests.