March 1, 2009 | Vol. 66 | No. 6

All About Assessment / Diagnosing the Diagnostic Test


Topics: Assessment, Curriculum
Diagnosis consists of identifying the nature of an illness or other problem through the examination of relevant symptoms. In education, a diagnostic test helps identify a student's learning problems so teachers can provide instruction to remedy those problems. But do such tests actually exist?
In fact, few legitimate diagnostic tests currently roam the education landscape. Legitimate diagnostic tests supply the sort of evidence that teachers need to make defensible instructional decisions. Students' performances on those tests let teachers know what cognitive skills or bodies of knowledge students are having trouble with. Legitimate diagnostic tests, therefore, are patently practical. Although they don't tell teachers how to carry out instruction to rectify deficits in students' achievement—that's where teachers' pedagogical prowess takes center stage—they do let teachers know what must be fixed. If they don't, they're not legitimate.
Then there are the pseudodiagnostic tests. Scads of these are peddled by commercial vendors who recognize that desperate educators will do almost anything to dodge an impending accountability cataclysm. And this "almost anything" includes buying tests that promise to help a teacher raise test scores—even if they don't. Accordingly, today's educators need to be aware of three types of pseudodiagnostic tests currently failing to live up to their claims.

The Too-Few-Items Test

Most education tests measure students' status with respect to a cognitive skill or body of knowledge, which we can refer to as assessed attributes. The test needs to include a sufficient number of items to measure each assessed attribute so that the teacher can arrive at a reasonably accurate inference about how an individual student stands with regard to each of those attributes.
For instance, if the teacher wants to know whether a student can multiply pairs of double-digit numbers, one or two items on a test just won't provide a sufficiently accurate estimate. The number of items required, of course, will depend on the nature of the skill or body of knowledge being measured, but one item per measured attribute definitely doesn't cut it. Yet there are tests currently strutting their diagnostic stuff even though they contain only a single item for each measured attribute.
Moreover, teachers must often undertake complicated and time-consuming analyses of students' responses to individual items to make sense of a student's results on these tests. Legitimate diagnostic tests permit teachers to use a test's results without having to devote hours to intricate interpretations.

The Single-Trait Vertical Scale Test

This second type of pseudodiagnostic test sounds fancier than it really is. Such tests are built to identify students' status regarding a single trait, such as "mathematical competence" or "reading comprehension." A scale based on this trait can be used to track individual students' growth across different grade levels. Indeed, such tests often identify points on the scale as numerical targets for students' grade-to-grade achievement growth.
But making such vertical scales work properly means abandoning any notions of accurate per-student diagnosis. Items that might help identify a student's specific skills or knowledge must be statistically linked to a single, often mushily described, trait. Diagnostic dividends are sacrificed to the statistical needs of a viable vertical scale.
This variety of pseudodiagnostic test often claims that students who score at a certain point on the test's vertical scale "typically" have mastered certain subskills or haven't mastered others. But such claims of typicality do not let teachers know whether particular students have or haven't mastered those subskills. Many of a teacher's students may be altogether atypical.

The Fuzzy-Measure Test

A third offending test provides imprecise descriptions of what the test aims to measure. Although from a marketing perspective it may make more sense for vendors to loosely define what the test is assessing (thereby making the test seem more relevant to a wider range of potential purchasers), fuzzy descriptions are of scant diagnostic help to teachers who want an accurate fix on a student's academic shortcomings so they can tackle them instructionally.

What to Look For

Despite the proliferation of pseudodiagnostic tests, it is possible to acquire a legitimate diagnostic test. To be truly diagnostic, such tests need to (1) measure a modest number of significant, high-priority cognitive skills or bodies of knowledge; (2) include enough items for each assessed attribute to give teachers a reasonably accurate fix on a test taker's mastery of that attribute; (3) describe with clarity what the test is assessing; and (4) not be too complicated or time-consuming. Such tests can be created by commercial firms or even—with substantial effort—by state and district assessment staffs.
But educators are not likely to find truly legitimate diagnostic tests as long as they can't tell the difference between the real ones and the rip-offs. When purchasing diagnostic tests, it's a case of "buyer beware" or "buyer be fooled."

James Popham is Emeritus Professor in the UCLA Graduate School of Education and Information Studies. At UCLA he won several distinguished teaching awards, and in January 2000, he was recognized by UCLA Today as one of UCLA's top 20 professors of the 20th century.

Popham is a former president of the American Educational Research Association (AERA) and the founding editor of Educational Evaluation and Policy Analysis, an AERA quarterly journal.

He has spent most of his career as a teacher and is the author of more than 30 books, 200 journal articles, 50 research reports, and nearly 200 papers presented before research societies. His areas of focus include student assessment and educational evaluation. One of his recent books is Assessment Literacy for Educators in a Hurry.

From our issue: Literacy 2.0