All About Accountability / Grain Size: The Unresolved Riddle
May 1, 2007 • Vol. 64 • No. 8
In the field of education, as elsewhere, computers are ubiquitous. They play increasingly prominent roles in educational assessment, for example, where they carry out testing's three-step tango—namely, test administration, test scoring, and score reporting. Computer-dispensed tests, now administered to thousands of students daily, will soon be dished out to millions. Some computers can even tailor the items they dispense so that a test's difficulty meshes almost perfectly with a given student's achievement level.
Computers are also becoming smarter about scoring students' tests. Today's electronic scoring devices can ingest a flock of students' test answers—some selected responses and some generated responses—and then both score and analyze the very devil out of them. Finally, like the electronic elves they are, computers can churn out a galaxy of score reports for educators, students, parents, and the public.
Although often overlooked, score reporting is a crucial component of any assessment operation. It is the reporting of students' performances on tests that leads educators to make certain pedagogical decisions about those students. So if those reports aren't helpful or meaningful, then educators may make unsound decisions regarding those students. And unsound decisions negate the reason we test students in the first place.
In human beings, one's strength is almost always one's weakness. For instance, detail-oriented people who are wondrous at keeping track of minutiae are often unable to see the big picture. Big-picture people, in turn, although completely aware of the forest, may lose sight of its individual trees. Strengths turn out to be weaknesses in much the same way whenever machines are involved. And that's precisely what's going on these days in the reporting of students' test performances.
Given the enormous capabilities that computers now give us, one would think that today's computer-based score reports would be spectacular. Regrettably, they often aren't. The main culprit in this score-reporting caper is one simple factor—namely, inappropriate grain size. Grain size refers to the breadth or scope of something. For instance, in the case of a curricular aim, a large grain size would be a significant, long-term goal that might take a full school year for students to reach. A curricular aim with a smaller grain size would be an instructional objective that students can achieve during a single classroom session.
In the arena of score reporting, grain size refers to the size of the “chunks” of information that we find in a score report. For instance, if a student's performance were reported for every item on an achievement test, one by one, this would be the smallest grain size available. In contrast, if a student's performance on a reading test were reported as a single, overall “reading comprehension” score, this would be a large, whopper-level grain size. The grain-size needs of different audiences, of course, often differ substantially. Teachers are likely to prefer smaller score-report grain sizes than do parents or education policymakers.
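To make the contrast concrete, here is a minimal sketch, using entirely hypothetical item names, strands, and scores, of how the very same item-level results can be rolled up into progressively larger reporting grain sizes:

```python
# Illustrative sketch only: hypothetical item-level results rolled up
# into three reporting grain sizes. All names and numbers are invented.

# Item-level results for one student: 1 = correct, 0 = incorrect.
item_scores = {
    "item_01": 1, "item_02": 0, "item_03": 1, "item_04": 1,  # vocabulary items
    "item_05": 0, "item_06": 1, "item_07": 1, "item_08": 0,  # inference items
}

# A hypothetical mapping of items to reporting strands (subscales).
strands = {
    "vocabulary": ["item_01", "item_02", "item_03", "item_04"],
    "inference":  ["item_05", "item_06", "item_07", "item_08"],
}

# Smallest grain size: one result per item.
print("Item level:", item_scores)

# Middle grain size: percent correct per strand.
strand_scores = {
    name: 100 * sum(item_scores[i] for i in items) / len(items)
    for name, items in strands.items()
}
print("Strand level:", strand_scores)

# Largest grain size: a single overall "reading comprehension" score.
overall = 100 * sum(item_scores.values()) / len(item_scores)
print("Overall:", overall)
```

The point of the sketch is simply that each roll-up discards detail: a teacher planning tomorrow's lesson may want the strand level, whereas a policymaker may need only the overall figure.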
If the grain size of test score reports doesn't mesh with the information needs of those who receive those reports, then the reports are almost certain to be of little use. Unfortunately, many of today's computer-based reports of test performances embody grain sizes that are way too tiny or way too big. Too-tiny reporting occurs when teachers receive reports on their students' performance on every tested item and then are asked to make sense of the ensuing profusion of performances. Too-big score reports, on the other hand, lump data on student achievement into just two or three chunks. For example, parents learn that their child has displayed "acceptable math concepts" but "weak math skills." Indeed, the immense number of computer-based options can foster score reports that are patently off target for many users.
On numerous occasions during the last few years, I've attended conferences where technology experts from testing companies rhapsodized about the consummate sophistication of their firm's score reports. Sometimes, in an alleged attempt to provide "flexibility" to users, the resulting reports are so complex that they're almost encyclopedic. In other instances, I've heard state department of education officials pat their own backs when telling their state's teachers that every student's score on every single item will be provided. You show me a teacher who wants to wade through reports at that tiny, off-putting grain size, and I'll show you a teacher in serious need of a sabbatical.
My point? Educators should never defer to the score reports pumped out for them—even though those reports have a prestigious computer pedigree. If the grain sizes of the score reports you receive are not optimally useful in making good instructional decisions, then demand redesigned reports by complaining to those educators in charge of score reporting. Those in authority need to understand that the current score reports just aren't working. The folks who designed such reports may be paying more attention to the magic of their machinery than to users' grain-size needs.