Why are some of Let's Go Learn's DORA scores unexpected?
Why do some scores seem very high or very low?
When using a computer-mediated diagnostic, formative assessment like DORA, you may come across scores that look a little different from the behavior you observe in the classroom. In this document, Let's Go Learn attempts to address some of the most common misunderstandings about DORA scoring.
High Scores on the Word Recognition Sub-test
The Word Recognition sub-test assesses students' decoding skills, using a combination of criterion-referenced real words and phonetically regular invented words. Words are presented to students orally, and they are asked to identify the correct word from four choices. While this sub-test accurately assesses students' word recognition ability, it is an out-of-context activity with an oral component. As such, teachers may sometimes see unexpectedly high scores from students who have strong decoding abilities but low comprehension scores. While students can use decoding skills to isolate the correct word, they may still struggle to read and comprehend the word in context. This sub-test is used to evaluate decoding and word analysis skills in isolation and is not assessing contextual reading skills.
These scores, however, are usually an accurate reflection of students' graphophonic skills. The test-retest reliability of this sub-test is particularly high (grade-level delta = 0.19, SE = 0.12), meaning that when the test is administered repeatedly with no instructional time between assessments, students' scores differ, on average, by less than two school months. This sub-test has also been correlated with both the Diagnostic Assessment of Reading (Riverside) (r = 0.81) and the Woodcock Word Identification Test (r = 0.92); both correlations are statistically and practically significant, indicating very high levels of validity for the sub-test.
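The "less than two school months" reading follows if one grade level is treated as a ten-month school year. That conversion is an assumption here, not something the text states, and the function name below is purely illustrative. A minimal Python sketch of the arithmetic:

```python
# Assumption: one grade level corresponds to a ten-month school year.
# The source does not state the conversion it used.
SCHOOL_MONTHS_PER_YEAR = 10


def delta_in_school_months(grade_level_delta, months_per_year=SCHOOL_MONTHS_PER_YEAR):
    """Convert a grade-level difference into school months."""
    return grade_level_delta * months_per_year


months = delta_in_school_months(0.19)
print(f"{months:.1f} school months")
```

With a nine-month school year instead, the same delta would still come out below two school months.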
High Scores on the Oral Vocabulary Sub-test
When examining scores on the Word Meaning sub-test, it is important to note that it does not assess students' reading vocabulary (or words they can both read and define). Instead, it assesses students' oral vocabulary, which is often considered predictive of future reading ability. Students cannot read words that do not exist in their oral vocabularies, so an assessment of oral vocabulary can help identify a gap that would prohibit students from achieving in reading. It is a particularly important sub-test for second language learners and for students with developmental language delays. For other students, however, the Word Meaning sub-test may appear to give unexpectedly high scores.
These scores are usually not errors. The Word Meaning sub-test has been shown to be reliable, with a statistically significant test-retest correlation (r = .60, SE = 0.19). Further, it has been correlated with the word meaning sub-test of the Diagnostic Assessment of Reading (DAR), with a moderate to high level of correlation (r = .60). In 2003, items in the Word Meaning sub-test underwent major revision to ensure that test scores were not consistently higher than expected.
Low Spelling Scores
Spelling is the most challenging sub-test on DORA, as the answers are completely student-generated as opposed to multiple-choice. If students are performing well on classroom spelling tests, consider the difference in the task. If on Monday they are given a list of 10-25 words (depending on the teacher), they spend the entire week memorizing those words: writing, re-writing, creating flash cards, drilling themselves, drilling their peers, and taking sample tests in class and at home over breakfast. The students are given many opportunities to memorize those words and most will do well on the Friday spelling test.
When they come to take DORA, however, they are seeing words that they have not just spent a week practicing, and they have only one chance to spell each word correctly. It is a sample of how the student is spelling, in general, without any practice. Do you administer a spelling pre-test at the beginning of the week? Do your students get the same score on Monday as they do on Friday? Probably not, but the Monday score can be viewed as an indicator of how well the child really spells. With practice, children will learn to memorize and spell words. Ongoing work on classroom spelling lists will improve students' diagnostic spelling scores as they continue to be exposed to more words and more complex spelling patterns.
Let's Go Learn's Spelling sub-test has been correlated with two other nationally recognized spelling assessments: the spelling sub-test of the Diagnostic Assessment of Reading (r = .78) and the spelling sub-test of the Wide Range Achievement Test (r = 0.85, SE = 0.210). Both studies found a statistically significant correlation, supporting the validity of the DORA Spelling sub-test.
Low Comprehension Scores
Many factors affect a student's ability to successfully comprehend a text. Some students struggle with decoding the text they encounter or with the language structures (i.e., phrases and idioms) used. Other students may possess limited background knowledge about the topic of the text or they may not be interested in what they're reading. While Let's Go Learn's comprehension test presents students with non-fiction topics they are likely to have encountered in school, some groups of students may have less familiarity with the subject matter in DORA than in other comprehension assessments.
Using non-fiction passages with topics taught in most classrooms across the nation means reduced variability in assessment results. The language involved in generating non-fiction passages is easier to standardize, as it does not contain conversational colloquialisms that are often regionalized in the U.S. Also, non-fiction passages offer a range of topics common to many classrooms, reducing bias due to race, gender, and culture. While non-fiction is sometimes more difficult for children to read than fiction, Let's Go Learn has made a conscious effort to control for this in two ways: by writing comprehension questions that are not too difficult, and by creating an administration protocol that ensures children only see questions within their comfort level. Under this protocol, the sub-test raises and lowers the difficulty of passages according to the student's success on DORA.
Another factor that can make scores on DORA seem lower is prior testing with traditional teacher-mediated pen-and-paper assessments. On these assessments there is more room for discrepancy, as teachers often ask follow-up questions to clarify students' responses and students often become familiar with the administration protocol. Let's Go Learn's DORA removes some of the variability associated with teacher-mediated assessments.
Often comprehension tests, like those utilized in annual state assessments, allow students to re-read the passage after they have seen the questions. This type of assessment can lead to false positive scores, as students learn strategies for skimming that may not be an indication of absolute comprehension ability. Allowing students to re-read passages introduces a new variable to the assessment that is difficult to control for. That is, some students choose to read the passage over again while others choose not to re-read the passage. Allowing students to re-read a passage thus increases the variability of the comprehension sub-test score. By allowing students to read the passages only once, DORA provides a better indicator of how well students will perform in real reading situations. This gets back to the purpose of DORA, which is to provide diagnostic data for teachers to guide instruction, but it could consequently result in scores on the comprehension sub-test that are lower than teachers or parents might expect.
Also, because DORA is criterion-referenced (that is, based on a set of criteria identified by experts), it is possible that the items might differ from other criterion-referenced assessments you have encountered. This does not preclude the utility of DORA or mean that its comprehension sub-test does not provide helpful information. It simply means that one must consider its difficulty relative to other available comprehension tests.
The avoidance of false positives, as mentioned earlier, is also a factor that can make scores appear lower. If other comprehension measures used in the past have a lower degree of false positive aversion, then the difference when comparing DORA to these other measures may appear significant. Our philosophy is that it is worth it to avoid incorrectly labeling a low comprehension student as high, even if it means on occasion labeling a high comprehension student as slightly lower than his or her real ability. Have no doubt, comprehension measures must choose one or the other possibility. There is no way to avoid biases.
Another factor that should be considered is the student's motivation. Longer assessments run a higher risk of fatiguing the student, and the factor that causes the greatest test score variance is student motivation. Therefore, students need to be properly introduced to the idea of DORA. Teachers should stress that this assessment will help them do a better job of instructing students. Also, the assessment should be broken up into manageable sessions and students should be monitored during testing. If some students seem fatigued, the teacher should consider stopping the assessment and resuming it later. The comprehension sub-test is the final test of the assessment, designed intentionally so that other sub-tests can better inform students' starting point for the comprehension test; as a result, however, the comprehension sub-test may be most affected by student fatigue or lack of motivation.
Finally, the age of the student should be considered. Sometimes a lower comprehension score is the result of a younger student taking a computer-mediated test for the first time. Unfamiliarity with the medium can result in lower scores, as students may struggle with how the test is organized. Preparing students for the assessment by showing sample questions or discussing the assessment organization will help eliminate this confound.
The Comprehension sub-test scores have also been validated to ensure that the scores a student receives are not abnormally high or low. In test-retest analysis, when students took the DORA comprehension test repeatedly, the mean grade-level delta was 0.35 (SD = 0.13). In other words, when students retake the test, about 95% of score differences fall between 0.09 and 0.61 grade levels (the mean plus or minus two standard deviations); most students will score within about half a grade level of their earlier result. This indicates the consistency of DORA's comprehension sub-test. Further, the Comprehension sub-test has been correlated with both the Diagnostic Assessment of Reading (DAR) comprehension sub-test and the Gray Oral Reading Test, with both showing medium-high to high levels of correlation with the Let's Go Learn assessment.
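The 0.09 to 0.61 range quoted above matches the reported delta of 0.35 plus or minus two standard deviations of 0.13. A minimal Python sketch of that arithmetic, assuming a two-standard-deviation band is what the 95% figure refers to (the function name is illustrative and not part of DORA):

```python
def retest_range(mean_delta, sd, z=2.0):
    """Return (low, high) for mean_delta +/- z * sd, rounded to two decimals.

    Assumption: the 95% band in the text is the mean plus or minus
    two standard deviations; the source does not spell out the method.
    """
    half_width = z * sd
    return round(mean_delta - half_width, 2), round(mean_delta + half_width, 2)


low, high = retest_range(0.35, 0.13)
print(f"{low:.2f} to {high:.2f} grade levels")
```

Changing `z` to 1.96 gives a slightly narrower band that rounds to nearly the same endpoints.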
In summary, many factors might make it appear, on occasion, that students' scores on DORA's Silent Reading sub-test are lower than their reading ability compared to other reading measures. However, when examining the biases of each measure, evaluating the statistical soundness of DORA's sub-test validity, and interpreting DORA for what it seeks to do, these discrepancies, if any, can usually be explained or accounted for. Furthermore, there is low probability that any discrepancy between measures will be large enough to negatively affect any particular student's instructional plan.