Dynamic vs. diagnostic assessment: impacts on EFL learners’ speaking fluency and accuracy, learning anxiety, and cognitive load

Abdulaal, Mohammad Awad Al-Dawoody; Alenazi, Maryumah Heji; Tajuddin, Azza Jauhar Ahmad; Hamidi, Bahramuddin

doi:10.1186/s40468-022-00179-0

Research
Open access
Published: 29 August 2022

Language Testing in Asia volume 12, Article number: 32 (2022)

Abstract

Despite their importance, dynamic assessment (DA) and diagnostic assessment (DigA) have received little attention with respect to the psychological aspects of foreign language teaching and learning; therefore, this study compared the effects of DA and DigA on Afghan EFL learners’ speaking fluency and accuracy (SFA), learning anxiety (LA), and cognitive load (CL). To do so, 90 Afghan EFL learners were recruited and assigned to two experimental groups (EGs), namely a dynamic assessment group (DAG) and a diagnostic assessment group (DigAG), and a control group (CG). After that, the three groups were pretested on SFA, LA, and CL. Then, one EG was trained based on DA, and the other EG was taught based on DigA, while the CG received common speaking instruction. After the instruction, all groups were given the posttests of SFA, LA, and CL to evaluate the impacts of the treatment on their performances. The findings of the one-way ANOVA test revealed that both EGs outperformed the CG on their posttests. In fact, using DA and DigA improved EFL learners’ SFA and CL and reduced their LA. Though both EGs outperformed the CG, the DAG improved more than the DigAG on the posttests. Finally, some implications were provided, and some suggestions were offered for future studies.

Introduction

The process of examining how effectively learners are fulfilling the requirements of a particular instructional program is called assessment, and it is a continuous process. This can be done in different ways, for example, by using DA, formative assessment, DigA, performance assessment, etc. (Kazemi & Tavassoli, 2020). Out of these different sorts, dynamic and DigA were chosen to be examined in this research. It is generally agreed that DA is an interactive method of carrying out an assessment that places an emphasis on the learners’ capabilities to react in response to an intervention. It is believed that the DA, which is founded on Vygotsky’s sociocultural theory and the idea of zone of proximal development (ZPD), is capable of uniting assessment and teaching in the improvement of the assessee (Wang, 2015). The active involvement of assessors and the test-takers’ reaction to that intervention are the two most important aspects of DA (Haywood & Lidz, 2007), both of which have the potential to significantly improve the performance of examinees.

DA offers new perspectives on assessment and highlights the fields in which the learner may make strides toward improvement. DA is described as the connection between an assessor and a student that aims to estimate the degree to which students’ modifiability may be altered, as well as the mechanisms by which cognitive functioning and positive changes can be produced and sustained (Lumettu & Runtuwene, 2018). In DA, the interactions that take place between the instructor and the learners allow for forecasts on the likely course of the learners’ future growth (Ghonsooly & Hassanzadeh, 2019).

A notable aspect of DA is the shift in emphasis from a learner’s unique performance qualities to his reactivity to the interventions given (Ebadi & Saeedian, 2015). The objective of DA is to promote students’ development, and the learners’ progress and skills are evaluated based on their growth throughout teaching. Therefore, it is development focused or development referenced (Poehner, 2008). It is not the instrument itself that determines whether a technique is dynamic or static; rather, it is whether or not an intervention is included in the process, regardless of where the intervention happens (Sternberg & Grigorenko, 2002). In other words, tests are neither static nor dynamic in and of themselves; their status is decided by the purpose of the process and the manner in which it is conducted.

The other type of assessment is called diagnostic, and its purpose is to identify a learner’s areas of strength and weakness based on both the assessment and the instruction that they receive. Once this has been accomplished, the information that has been gathered is then used to assist the participant in their learning (Jang & Wagner, 2014). Alderson and Huhta (2011) outline some of the distinguishing features of a “truly” diagnostic test. The most notable of these features is that the test is more likely to be discrete point than integrative, that it is less authentic than proficiency or other tests, and that feedback is given to the test-takers after the test has been completed.

Feedback is an essential component of DigA, and it plays a significant part in the process by supplying the students with the data they need to perform corrective measures. Although error correction is generally the first thing that comes to mind when we hear the term “feedback” in the context of teaching a second or foreign language, Hattie and Timperley (2007) note that feedback is more than information about the students’ mistakes. Feedback on mistakes is unquestionably a component of the information students receive; yet, the idea encompasses a great deal more than that. In practice, it has been thought that feedback is most successful when it is directed at identifying and correcting misunderstandings and inaccurate understanding shown by the learner’s performance, rather than a total lack of knowledge indicated through the learner’s performance (Hattie & Timperley, 2007). The vast majority of feedback formats are intended to have a positive influence on the subsequent learning activities they are associated with. This, in turn, assists test-takers in becoming self-regulating individuals who are able to independently seek out relevant feedback and self-adjust their learning processes (Kazemi & Tavassoli, 2020).

Both DigA and DA can help reduce students’ LA level. According to Zhang (2019), anxiety is thought to be related to the levels of motivation, performance, and self-confidence that learners have. It is claimed that lowering the amount of anxiety that learners are experiencing would help them become more motivated to study a foreign language (Yan & Horwitz, 2008). Regarding the influences of self-confidence, van Batenburg et al. (2019) assert that participants’ achievements in EFL oral interactions can be anticipated by the increment in their self-confidence as a result of strategically directed instruction.

According to Piniel and Csizer (2013), there are a number of factors that can cause students to feel anxious in the classroom, including presentations to the entire class, error corrections and low self-confidence, peer pressure, learners’ and instructors’ beliefs about language learning, instructor-learner interactions, teaching methods in the class, prior negative experiences with classmates, and a mismatch between the level of the teaching materials and the level of the students’ TL proficiency. Thus, we can reduce the anxiety sources by the way we assess the students.

The DigA and DA can develop EFL learners’ speaking skills. Speaking is a key skill in foreign language acquisition, according to Marashi and Dolatdoost (2016), since the ability to communicate in a foreign language is at the core of foreign language learning. Accuracy and fluency are the two most important aspects of speaking. Accuracy is defined as “the degree to which the language generated while doing a task corresponds to target language standards” (Ellis, 2003, p. 339). According to Ellis and Barkhuizen (2005, p. 139), accuracy refers to “how well the target language is generated in reference to the target language's rule system.” Numerous investigations have been performed on both accuracy and fluency (e.g., Navidinia et al., 2018; Toni et al., 2017).

Fluency is defined as “the capacity to continue speaking spontaneously with all available language resources, ignoring grammatical errors” (Gower et al., 2005, p. 100). Fluency is defined by Ellis and Barkhuizen (2005, p. 139) as “the creation of words in real-time without excessive pause or hesitation.” Many scholars have looked at speaking fluency (SF) because of its significance (Syamdianita et al., 2018; Vadivel et al., 2021; Wahyurianto, 2018).

de Vries et al. (2015) illustrate that spoken production requires control of the articulatory system and may lead to great CL. CL refers to a multidimensional construct of the cognitive system regarding the load while performing a special task (Paas et al., 2003). Intrinsic CL is considered as an inherent component of the materials themselves and individual degree of previous experience, while extraneous CL originates in the excess information processing caused by the instructional design (Leahy & Sweller, 2016; Wu et al., 2018). Due to the restricted working memory capacity of learners, it is crucial to explore the relationship between an instructional design and CL, so as to accommodate the difficulty level of the learning activities to students’ learning capabilities (Hwang et al., 2020; Lai et al., 2019).

The two psychological variables (CL and LA) of our research play an important role in language learning. In addition, the other variable (speaking skill) of the present study is one of the main skills in any language, and mastering this skill is the ultimate goal of EFL learners. Given the importance of both the independent and the dependent variables explained and defined above, this study aims to examine and compare the effects of DA and DigA on Afghan EFL learners’ SFA, LA, and CL. By doing this research, the researchers hope to help EFL learners develop their SFA and reduce their LA by using DA and DigA. Also, the present research can pave the way for future researchers to examine the effects of DA and DigA on other language skills and other psychological variables.

Review of the literature

Theoretical background

As defined by Lynch (2001), assessment is a series of processes that involves testing and measurement but is not limited to them. It is the organized data we collect to make judgments about people, following examinations or other measurement methods. To aid in the teaching/learning procedure is the basic goal of assessment. As Gipps (1994) said, assessment is to undertake a paradigm shift from a psychometric to a more extensive model of instructional assessment. DA postulates a qualitatively distinctive method of thinking about assessment from how it has been conventionally understood by researchers and classroom educators. Understanding students’ capabilities, teaching, assisting in learners’ improvement, and the pedagogical method of assessment are a dialectically combined activity named DA (Poehner, 2008; Vadivel et al., 2019).

DA is one kind of alternative assessment that integrates teaching and assessment into an interactive pedagogical approach with the provision of suitable forms of mediation (Cho et al., 2020; Ebadi & Rahimi, 2019). DA aims to portray a more complete image of learners’ cognitive structures for enhancing the diagnosis of students’ learning difficulties and for recognizing the developmental trajectory, by means of directly measuring their replies to specific interventions (Ahn & Lee, 2016; Wang & Chen, 2016). DA is capable of promoting learners’ achievements and of probing potential abilities by offering the details of their abilities to develop the intervention programs (Liu et al., 2021; Swanson & Lussier, 2001). For example, Antón (2009) declared that DA empowers a deeper characterization of learners’ actual and latent abilities and advances individualized instruction that can adapt to individual needs.

A significant benefit of DA is that it makes recommendations according to developmental capacity, which is not revealed by traditional non-dynamic assessments (Davin, 2011). In DA, the pupils are taught how to complete specific tasks and are provided with mediated support to master them. Their improvement in the ability to solve comparable tasks is then assessed (Kirschenbaum, 2008; Rezai et al., 2022). Lidz (2002) views DA as a collaboration between an assessor as an intervener and a student as an active participant, which tries to evaluate the degree of modifiability of the student and the method by which favorable modifications in cognitive functioning can be made and sustained.

DA is primarily founded on Vygotsky’s sociocultural theory of mind which strongly suggests that cognitive development is best comprehended in its cultural and social settings (Ajideh & Nourdad, 2012). It tries to account for the procedures over which improvement and learning happen. Pupils need others’ support to accomplish new tasks, and then after adopting, they can complete the tasks autonomously. So, social interactions facilitate learning. Sociocultural theory, then, offers significant insights to investigators on mental development, educational practices, and the mind. As Nassaji and Cumming (2000) reasonably conclude, outlining the dialogic nature of learning/teaching procedures in the ZPD and planning research that illustrates its nature are basic in sociocultural theory.

The ZPD is another concept that supports our study. Vygotsky (1978) described the ZPD as the distance between the real developmental level as determined by autonomous problem-solving and the level of potential development as determined through problem-solving with adult help or in association with a more proficient peer. Based on this theory, children’s cognitive development happens at the assisted or potential level (present to future) and at the real and unassisted level (past to present). At the real or independent level, the child can complete the tasks without any support, but at the potential level, the child needs another person’s (a mediator’s) help (Vygotsky, 1978, 1986). Vygotsky suggested that the procedure of scaffolding brings out capabilities that are still in the process of developing and emerging (that is, have not yet matured) and thus reveals the hidden potential of a child, which is crucial not only for diagnosis but also for prognosis. In fact, the ZPD covers the set of tasks that a child can accomplish unaided and autonomously and those finished with the help and support of more proficient peers and adults.

The ZPD was understood by Vygotsky to describe the present or actual levels of improvement of the learners and the next levels achievable by the use of mediating semiotic and environmental instruments and proficient adults’ or peers’ facilitation (Shabani et al., 2010). The idea is that students learn best when working together with others, and it is through such collaborative attempts with more capable people that learners learn and internalize new concepts, psychological instruments, and skills. Roosevelt (2008) held that the main objective of education from the Vygotskian view is to keep students in their own ZPDs as often as possible by giving them interesting and culturally meaningful learning and problem-solving activities that are somewhat harder than what they can do alone. After doing the tasks jointly, the learners will likely be able to complete the same tasks individually next time, and through that process, the learners’ ZPD for that specific task will have been raised. This process is then repeated at the higher level of task difficulty that the learners’ new ZPD requires (Chaiklin, 2003).

DigA is the other type of assessment; it aims at recognizing a student’s strengths and weaknesses in the areas on which the instruction and assessment are based and later applies the data gained to aid the student’s learning and guide the teaching (Jang & Wagner, 2014). It is a type of assessment that relies on feedback that gives pupils the information they need to monitor their development so as to get remedial instruction (Ghahderijani et al., 2021; Kazemi, 2018).

Alderson et al. (2015) stated that there are some major differences between the diagnostic test and other kinds of language tests: (1) changing the teaching procedure is the primary goal of the test; (2) in DigA, the instructor is both a diagnostic test user and a diagnostic tester; (3) the linguistic contents are determined by the curriculum; and (4) the test-taker is a foreign language student. They further asserted that diagnostic testing intends to track the implementation of the curriculum to deliver feedback to both students and educators.

Although most explanations designate both pupils’ weaknesses and strengths as equally significant in DigA, in the actual settings of the classrooms, as Alderson et al. (2015) stated, more focus is given to weaknesses and the type of feedback required to be offered according to them. As a matter of fact, the chief role of DigA is to supply the required data on the development of the students. It has been stated that feedback should be of different types, meaning that it is not justified to highlight correctness more than necessary or to deliver only negative or positive kinds of feedback. Rather, it is important for teachers to use various types of feedback (Harding et al., 2015; Jang & Wagner, 2014).

By using dynamic and DigAs, teachers can reduce the students’ level of anxiety. The most important component of anxiety is test anxiety which was defined as a propensity to drive out self-centered, interfering responses when students are concerned with testing circumstances (Sarason, 1972). Zeidner (1998) characterized test anxiety as the physiological, phenomenological, and behavioral reactions accompanied by negative consequences and failures in testing situations. Hancock (2001) stated that test anxiety is a disturbing emotional phenomenon that has behavioral and physiological dimensions and that is experienced in evaluation and testing conditions. Test anxiety has emotional, social, cognitive, and physiological manifestations. Students’ weak performances in previous exams can make them anxious, so they develop negative feelings about exams and have disparaging perspectives about evaluative conditions. Anxious learners are usually not able to show their full ability on a test since, due to anxiety about the test, they forget lesson points that they studied before (Hancock, 2001). Learners with high test anxiety perform more poorly in exams and evaluative conditions than their classmates with lower test anxiety (Cassady & Johnson, 2002). Test anxiety is connected to learners’ characteristics and emotional state and appears when they are regularly subjected to high-stakes tests in which failure or success is strongly emphasized (Sanaeifar & Nafarzadeh Nafari, 2018).

In addition to anxiety, dynamic and DigAs can affect students’ CL positively. CL refers to the quantity of information that is processed by the brain simultaneously (Sisakhti et al., 2021). In routine life, retrieving information from memory, and especially long-term memory (LTM), in order to carry out the given activities is crucial. This retrieval often occurs without intention and is able to continue under complex conditions, such as when conducting several activities at once (Fischer et al., 2007). For instance, when one reads a word in a sentence, its meanings come from LTM; nevertheless, this process is more challenging when one reads verbal compounds or multiple words in syntactic structures (McIntyre, 2007). There are reports of the retrieval process being unaffected under higher demands (e.g., concurrent task performances) (Naveh-Benjamin et al., 2000), whereas the impairment of memory retrieval in such conditions is also reported (Moscovitch, 1992).

CL theory deals with the idea that educational materials can be useful if they do not overload the working memory of the students (Assiss Hornay, 2021). In learning a language, students encounter different tasks that can overload their cognitive capacities. In learning a foreign language, students encounter multiple activities on language skills, including writing, listening, reading, and speaking, and language sub-skills such as pronunciation, vocabulary, and grammar (Sweller, 2007). In addition, the contents of language learning involve different perceptions that may cause an overload of cognitive demand (Lin & Chen, 2006). Thus, it is essential to generate instructional materials that limit unnecessary CL and develop students’ performances.

Using the DigA and DA can help EFL learners develop their speaking skills. Speaking is a productive skill that instructors should do their best to develop in EFL contexts so that students can generate utterances when communicating with others. Furthermore, speaking is characterized as a contextualized, social, and interactive communicative event. It can assist individuals to establish and maintain social relations, exchange feelings, and demonstrate their identities. Nunan (1991) stated that to most people, learning speaking is the most significant dimension of learning a foreign language, and success is assessed in terms of the ability to establish conversations in the target language. Speaking is one of the most problematic skills for learners to master since they must master all the components of speaking so as to speak fluently and clearly. There are five components of the speaking skill to master: grammar, pronunciation, vocabulary, comprehension, and fluency (Fulcher & Davidson, 2006). The focus of the current study is on fluency and accuracy.

Speaking accuracy shows “the degree to which the language generated conforms to language standards” (Yuan & Ellis, 2003, p. 2) under which the proper uses of vocabulary, grammar, and pronunciation are considered. Speaking fluency demonstrates the ability to generate the spoken language “without unnecessary hesitation or pausing” (Skehan, 1996, p. 22). The ability to speak English fluently is the goal of the majority of EFL students (Mohammadi & Enayati, 2018), which is why it has always received particular attention among language students. Putting too much attention on accuracy can cause a lack of fluency, and too much emphasis on fluency can result in a lack of accuracy (Skehan & Foster, 1999). Consequently, it is essential for Afghan EFL learners and teachers to keep a balance between speaking fluency and accuracy.

Empirical background

Some studies have examined the impacts of DigA and DA on language learning. For example, Ajideh and Nourdad (2012) investigated the effects of DA on EFL students’ reading comprehension at various proficiency levels. One hundred ninety-seven Iranian university students took part in six groups of this research. The study design was quasi-experimental. The findings of the MANOVA test showed that although DA had improving immediate and delayed effects on the reading comprehension of students at all proficiency levels, the proficiency groups did not differ meaningfully in how much they benefited from this type of assessment.

Wang (2015) explored whether DA can advance the combination of instruction and listening comprehension assessment while simultaneously improving students’ learning in listening. Five second-year English majors from a technical college in an underdeveloped area of a coastal province in China participated in the research. The assessment applied the cake format, that is, participants first listened to a segment of audio material and then were asked to reply to comprehension questions and describe their comprehension process. The investigator then intervened to mediate the task. Then, the participants were exposed to the audio material again and asked to retell it. This procedure went on until the listeners reached adequate comprehension of the audio material. An analysis of the information from pupils’ notes, the investigator’s notes, reflective reports, and pupils’ verbal reports showed that DA can offer both the participants and the researcher a better understanding of the difficulties in listening. The information also showed that the researcher’s mediation and intervention in the participants’ difficulties helped to create a mediated learning experience for them.

In two comparable kinds of research on dynamic and DigA, Nikmard (2017) and Zandi (2018) explored the positive impacts of dynamic and DigA on EFL students’ performance on productive and selective reading comprehension tasks and productive and selective listening comprehension tasks, respectively. Moreover, Ardin (2018) inspected the impacts of dynamic and DigA on EFL students’ performance on narrative and descriptive writing and found out that both diagnostic and DAs positively influenced the pupils’ writing in both narrative and descriptive writing.

Kamali et al. (2018) inspected the impacts of DA on L2 grammar learning of EFL students. Their research revealed that the students who received DA mediation meaningfully outperformed those in the CG. They concluded that the students had internalized the L2 grammar knowledge and got higher scores since they had been offered suitable feedback in the procedure of DA mediation. The research indicated the benefit of applying DA in L2 grammar instruction.

In another study, Kazemi and Tavassoli (2020) attempted to discover the efficiency of DigA and DA in developing EFL students’ speaking abilities. To do so, 82 intermediate-level EFL students were chosen according to their performance on IELTS (2016). The participants were then assigned to three groups: DigA, DA, and CG. In the DAG, the pupils took three speaking tests in the form of test-mediation-retest; in the DigAG, the participants took those three speaking tests and received feedback on their difficulties; and the students in the CG received the routine speaking course, concentrating on the same three speaking tests. Two raters recorded and scored the speaking pretest and posttest as well. To answer the research questions, a repeated-measures two-way ANOVA was run. The findings indicated progress in the three groups’ achievement from pretest to posttest. Specifically, the dynamic and DigA groups showed substantial progress; however, the difference between their gains was not significant. The study also presented pedagogical implications.

Suherman (2020) examined the impacts of DA on EFL students’ reading comprehension. Five Indonesian tertiary-level EFL students took part in this case study. It examined whether mediation in DA develops the pupils’ reading comprehension achievements and scrutinized the extent to which mediation in DA assists learning. The research procedure consisted of a pretest, mediation, and a posttest. The findings showed two principal points. First, the findings of the posttest displayed general progress for all five pupils. As shown by the effect size (0.96) and the result of the paired samples t-test (p-value = 0.0028), it can be inferred that the influence of DA on the participants’ reading skill performance was highly substantial. Second, mediation in DA seemed to benefit learning, with different features for each pupil.

Recently, Chen et al. (2022) examined the effects of integrating DA into a speech recognition learning design to support students’ English-speaking skills, LA, and CL. In this research, a DA-based speech recognition (called DA-SR) learning system was designed to facilitate pupils’ English speaking. Furthermore, a quasi-experiment was conducted to estimate the influence of the suggested method on pupils’ speaking learning efficiency by offering the DA-SR approach to the EG and the corrective feedback-based speech recognition (called CF-SR) approach to the CG. The experimental findings showed that both the CF-SR approach and the DA-SR approach can efficiently develop the pupils’ English-speaking skills and lower their English-speaking LA. In addition, this research further showed that the DA-SR approach effectively decreased pupils’ English class performance anxiety and extraneous CL compared to the CF-SR approach.

The review of the related literature showed that both dynamic and DigAs can produce positive effects on English language learning. In addition, it showed that most studies examined the effects of dynamic and DigAs on language skills and subskills, and there are few pieces of research dealing with the effectiveness of the mentioned assessments on students’ psychological variables. In fact, there have been few studies on the impacts of the DigA and DA on Afghan EFL learners’ LA and CL. Therefore, this research aimed at comparing the effects of DA and DigA on enhancing Afghan EFL learners’ SFA and CL. Besides, this research intended to investigate the impacts of using DA and DigA on reducing Afghan EFL students’ LA. Based on the objectives, this research formulated the following questions:

RQ1: Does using DA and DigA significantly lead to enhancing Afghan EFL learners’ SFA?
RQ2: Does using DA and DigA significantly lead to enhancing Afghan EFL learners’ CL?
RQ3: Does using DA and diagnostic assessment significantly lead to reducing Afghan EFL learners’ foreign language LA?
RQ4: Which type of assessment (dynamic or diagnostic) is more effective in reducing Afghan EFL learners’ foreign language LA and developing their SFA and CL?

Methodology

Design of the study

Since we could not select the participants randomly, we adopted a quasi-experimental design in this study. Accordingly, the participants of this research were selected based on a non-random sampling method. Two experimental groups (dynamic and diagnostic) and a control group were included in this study. There were two independent variables (DA and DigA) and four dependent variables (speaking fluency, speaking accuracy, CL, and LA) in the current study. The participants’ age, proficiency level, and gender were the control variables of the current research.

Participants

Ninety subjects were selected among 129 EFL learners based on the outcomes of the Oxford Quick Placement Test (OQPT). They were chosen from two English institutes in Mazar-i-Sharif, Afghanistan, according to the convenience sampling method. The English proficiency level of the subjects was measured as intermediate, and their age range was between 17 and 31 years old. All the subjects were male and were randomly assigned to two EGs (DAG and DigAG) and a CG each including 30 learners. Because of the gender segregation in the Afghan religious context, we could work only on male students. It should be noted that the ethical requirements were considered as the participants signed the given consent letters.
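The random assignment of the 90 subjects to three groups of 30 can be illustrated with a minimal sketch. The source does not report how the randomization was carried out, so the shuffle-and-split procedure, the function name assign_groups, the participant IDs 1 to 90, and the fixed seed are all assumptions made for illustration.

```python
import random


def assign_groups(participant_ids, group_names=("DAG", "DigAG", "CG"), seed=0):
    """Shuffle participants and split them into equally sized groups.

    Hypothetical helper: the shuffle-and-split procedure and the seed are
    assumptions, not the procedure reported in the study.
    """
    shuffled = list(participant_ids)
    if len(shuffled) % len(group_names) != 0:
        raise ValueError("participants cannot be split into equal groups")
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    size = len(shuffled) // len(group_names)
    return {
        name: shuffled[i * size:(i + 1) * size]
        for i, name in enumerate(group_names)
    }


# 90 placeholder participant IDs, split into three groups of 30
groups = assign_groups(range(1, 91))
```

Each participant appears in exactly one group, and fixing the seed makes the assignment repeatable for auditing.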

Instruments

Oxford Quick Placement Test (OQPT)

The first instrument that was employed in this research to make the subjects homogeneous was the OQPT which was developed by the Oxford University Press. It included 60 items that measured the students’ vocabulary knowledge, grammar knowledge, and reading comprehension. It could assist the researchers to have a better comprehension of what levels (i.e., elementary, pre-intermediate, intermediate, and advanced) their respondents were at. Based on this tool, the students whose scores were between one standard deviation (SD) above and one SD below the mean were selected as the intermediate and were regarded as the target population of the research.
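The selection rule above (scores within one SD of the mean) can be sketched as follows. This is only an illustration: the scores are invented, the function name select_within_one_sd is hypothetical, and the use of the sample SD with inclusive bounds is an assumption, since the source does not specify either detail.

```python
from statistics import mean, stdev


def select_within_one_sd(scores):
    """Return IDs whose score lies within one SD of the mean (inclusive).

    Assumption: sample SD (n - 1 denominator) and inclusive bounds.
    """
    values = list(scores.values())
    centre = mean(values)
    spread = stdev(values)
    low, high = centre - spread, centre + spread
    return [pid for pid, score in scores.items() if low <= score <= high]


# Invented placement scores for five hypothetical test-takers
example_scores = {"p1": 20, "p2": 30, "p3": 35, "p4": 40, "p5": 55}
selected = select_within_one_sd(example_scores)
```

With these invented scores, the mean is 36 and the sample SD is about 12.9, so only the three middle scorers fall inside the band.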

Anxiety questionnaire

The other tool for gathering the data was an anxiety questionnaire prepared by Horwitz et al. (1986). This questionnaire comprised 33 items in the form of a 5-point Likert scale. The answers to each item can be one of the following: completely agree, agree, neutral, disagree, and completely disagree. For each item, a score was given ranging from 1 for completely disagree to 5 for completely agree. Two items of the questionnaire are as follows: (1) “I never feel quite sure of myself when I am speaking in my foreign language class,” and (2) “I do not worry about making mistakes in language class.” The validity of this instrument was confirmed by a group of English instructors, and its reliability was measured by Cronbach’s alpha (α = 0.85). This questionnaire was used both as the anxiety pretest and the posttest of the present research.
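For readers unfamiliar with the reliability index reported here, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores). The response matrix is invented and far smaller than the 33-item questionnaire, the function name cronbach_alpha is hypothetical, and any negatively worded items are assumed to have been recoded beforehand; the source does not describe its computation.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Compute Cronbach's alpha for a respondents-by-items score matrix.

    responses: list of rows, one row of item scores per respondent.
    Assumption: negatively worded items were already reverse-coded.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    item_columns = list(zip(*responses))  # one tuple of scores per item
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - item_variance_sum / total_variance)


# Invented 5-point Likert responses: 4 respondents, 3 items
example_responses = [
    [4, 5, 4],
    [2, 3, 2],
    [3, 3, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(example_responses)
```

Values closer to 1 indicate that the items vary together across respondents.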

Cognitive load questionnaire

The third instrument was the CL questionnaire designed by Hwang et al. (2013). This questionnaire used a 5-point Likert scale and covered two dimensions, “mental load” and “mental effort.” The mental load dimension had five items, and the mental effort dimension comprised three items. One item of “mental load” was “It was hard for me to understand the learning content in the activities,” and one item of “mental effort” was “It was hard for me to realize and follow the instructional approaches in the learning activities.” The reliability of the questionnaire, estimated using Cronbach’s alpha, was 0.83, and its validity was judged acceptable by several English experts. This questionnaire was used as both the pretest and the posttest of the study.
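For readers unfamiliar with the reliability index reported for the two questionnaires, Cronbach's alpha can be computed with the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The sketch below is illustrative only: the response matrix is hypothetical, and the study itself reports using SPSS rather than this code.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of k item scores per respondent."""
    k = len(rows[0])
    # Variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Variance of the respondents' total scores.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical answers of five respondents to the eight CL items (1-5 points each).
responses = [
    [4, 4, 3, 4, 5, 4, 3, 4],
    [2, 3, 2, 2, 3, 2, 2, 3],
    [5, 4, 5, 5, 4, 5, 4, 5],
    [3, 3, 3, 2, 3, 3, 3, 2],
    [1, 2, 1, 2, 2, 1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

Values closer to 1 indicate that the items vary together, which is why coefficients such as 0.83 and 0.85 are usually read as good internal consistency.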

Researcher-developed speaking pretest and posttest

The fourth instrument was a researcher-developed speaking test with several items drawn from the participants’ coursebook (i.e., Family and Friends 5). The respondents were required to speak about the topics for around 2 to 3 min, and their speech was recorded so that a second rater could score it; two raters scored the speaking performance of each participant. The validity of the test was confirmed by three Afghan university professors of applied linguistics, each with more than 17 years of experience in teaching English. In addition, the reliability of the speaking test was estimated through Pearson correlation analysis (r = 0.86). This test was used as both the pretest and the posttest of the research.
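A Pearson coefficient of this kind can be computed as sketched below. Treating the coefficient as agreement between the two raters' scores is an assumption, since the source does not specify which two score sets were correlated, and the scores shown are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical speaking scores given by two raters to the same six learners.
rater_1 = [14, 12, 16, 11, 15, 13]
rater_2 = [13, 12, 17, 10, 15, 14]
r = pearson_r(rater_1, rater_2)
print(round(r, 2))
```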

Speaking checklist

The final instrument was Hughes’s (2003) speaking checklist, which the researchers used to score the participants’ speaking accuracy and fluency.

Data collection procedures and analyses

To conduct this research, the OQPT was first given to 129 EFL students to determine their general English language ability. Ninety intermediate learners were chosen as the participants of the current investigation. They were then randomly divided into two experimental groups (DAG and DigAG) and a control group, and all groups were pretested on SFA, LA, and CL.

After the pretests, the groups were taught differently. One experimental group was instructed based on DA. In this group, the students received interventions from the teacher during their speaking tasks, which served both to evaluate and to develop their speaking abilities. These DA-based interventions followed Lantolf and Poehner’s (2011) scale, which was used to provide mediation on the basis of each learner’s answers. If the learner’s response was right, no mediation was offered. If the response was wrong, the teacher chose one of the eight forms of mediation on the scale: (1) pausing; (2) repeating the entire phrase questioningly; (3) repeating only the erroneous part of the sentence; (4) asking a question, for instance, what is wrong with this sentence; (5) pointing out the wrong word; (6) asking either…or… questions; (7) identifying the right answer; and (8) explaining why. The scale thus moves from the most implicit to the most explicit form of mediation for the students in the DAG.
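The implicit-to-explicit ordering of the scale can be made concrete with a small sketch. The sequential stepping shown here, in which each further wrong answer moves one step down the scale, is an illustrative assumption: the source states only that the teacher chose one of the eight forms. The function name `prompt_for` is hypothetical.

```python
# The eight forms of mediation, ordered from most implicit to most explicit.
MEDIATION_SCALE = [
    "Pause",
    "Repeat the entire phrase questioningly",
    "Repeat only the erroneous part of the sentence",
    "Ask a question, e.g. 'What is wrong with this sentence?'",
    "Point out the wrong word",
    "Ask an either...or... question",
    "Identify the right answer",
    "Explain why",
]

def prompt_for(wrong_answers_so_far):
    """Return the mediation to offer, or None when the answer was right.

    Assumes one step down the scale per wrong answer (not stated in the source).
    """
    if wrong_answers_so_far <= 0:
        return None  # a correct response receives no mediation
    step = min(wrong_answers_so_far, len(MEDIATION_SCALE))
    return MEDIATION_SCALE[step - 1]

for attempts in range(0, 4):
    print(attempts, prompt_for(attempts))
```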

As their treatment, the students in the diagnostic group received diagnostic feedback on their strengths and weaknesses in ten speaking tests. The most common forms of feedback and correction provided by the teacher were facial expressions and body gestures, repetition, reformulation, hinting, and echoing. This procedure was followed for all ten speaking tests administered to the diagnostic group. The control students, on the other hand, received common speaking instruction with different speaking tasks and activities. They also took the ten speaking tests during the term but received no specific feedback or mediation on their performances. After the ten speaking tests, all groups were given the posttests of SFA, LA, and CL to evaluate the effects of the instruction on their performances.

The whole study took 20 sessions; in five sessions, the OQPT, the pretests, the posttests, and the questionnaires were administered, and in fifteen sessions, the treatment was delivered differently in the three groups. Once the data collection was complete, the data were analyzed using the Statistical Package for Social Sciences (SPSS) software, version 22. Several one-way ANOVA tests were then run to measure the differences between the performances of the three groups on their posttests.
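As a reminder of what a one-way ANOVA computes, the F statistic is the ratio of the between-group mean square to the within-group mean square. The sketch below works through that ratio by hand on hypothetical scores; it is not the SPSS procedure used in the study, and it does not compute the p-value, which requires the F distribution.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three small groups (illustrative only).
f_value, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_value, df_b, df_w)
```

With three groups of 30 learners each, as in this study, the degrees of freedom would be 2 between groups and 87 within groups.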

Results

After collecting the data through the procedures described above, the researchers analyzed them to obtain the final results. First, they verified the normality of the data using the Kolmogorov-Smirnov test (p > 0.05); then, they ran the one-way ANOVA tests, the results of which are presented in the following tables:

According to the descriptive statistics in Table 1, the three groups performed almost the same on the speaking fluency (SF) pretest; their means indicate that they were at a comparable level of SF before the treatment. The mean score of the CG is 13.56, and the mean scores of the diagnostic and DA groups are 14.20 and 14.33, respectively.

Table 1 Descriptive statistics of speaking fluency pretest of all groups
