December 1, 2006 | Vol. 64, No. 4

All About Accountability / A Test Is a Test Is a Test—Not!


      I don't know about you, but certain things in education vex me. High on my list is a conceptual confusion that's displayed not only by the public at large, but also by most educators I know. It's the belief that all education achievement tests are essentially interchangeable, that one standardized achievement test is pretty much the same as any other standardized achievement test. This is simply not so.
      This misperception has serious consequences. For example, every few weeks we are apt to find newspaper reports describing investigations whose results have clear implications for education policymakers. Such studies might contrast the successes of (1) public schools versus private schools, (2) charter schools versus noncharter schools, and (3) board-certified teachers versus non-board-certified teachers.
      However, before placing confidence in such empirical investigations—especially in studies that may influence the way we educate our children—we need to be certain the researchers adhered to the fundamental canons of research design. We also need to look at the chief outcome variable that the researchers used to arrive at their conclusions. In studies of this sort, the variable consists of students' scores on standardized achievement tests.
      Even though achievement test scores are usually the deciding factor in determining such studies' conclusions, neither the general press nor the education press typically pays any attention to the nature of the tests used. Indeed, a number of reporters seem to imply that we can readily determine student “levels of achievement” no matter what sorts of achievement tests are used. In most instances, they don't even supply the names of the tests. Such reporters have almost certainly succumbed to the misguided notion that a test is a test is a test.
      Inadequate scrutiny of the tests used in key investigations is particularly galling whenever a study's results indicate that there is “no significant difference” between the achievement of students from one group and the achievement of students from another group. First of all, it has become increasingly apparent during the last decade that many standardized achievement tests are instructionally insensitive; they are unable to discern the differences between students taught effectively and students taught ineffectively. Second, these instructionally insensitive achievement tests turn out to be so highly correlated with students' socioeconomic status (SES) that variations in students' backgrounds simply fog over the effects of the instructional interventions under study.
      So when I read that students of teachers who have received Intervention X fail to outperform students of teachers who have received Intervention Y, I want to shriek out, “On which tests?” If the achievement tests being used are strongly influenced by socioeconomic status, then there's really no point in carrying out a study that uses those instructionally insensitive tests to measure the effects of instruction. Unless reporters describe the specific achievement tests being used—so that interested readers can at least consider the likely instructional sensitivity of those tests—it is folly to place any real confidence in a study's conclusions.
      Let me also register my dismay that so many educators seem to believe that the adjective “achievement” in “achievement test” actually means what it says. My dictionary indicates that “achievement” refers to a “result gained by effort” and also to “the quality and quantity of a student's work.” So wouldn't you think that an achievement test would refer to what a student has learned in school?
      Many items in achievement tests do, in fact, assess what students have learned in school, but many measure not only students' socioeconomic status but also the academic aptitudes with which those students were born. Students who, at birth, came up winners in the gene-pool lottery will tend to perform better on these aptitude-linked items than will their genetically less fortunate classmates. Therefore, the greater the proportion of SES-linked and aptitude-linked items found in a given achievement test, the less suitable this test will be for investigating the effect of any intervention on students' test scores.
      So if you find yourself pondering the results of an important education research report whose findings are based on student test scores, do not automatically assume that the tests used were appropriate. It's not only the public but also many educators who are confused about these tests. Where education achievement tests are concerned, most of the world's finest education researchers are as naïve as newborns. Even polished researchers, you see, can mistakenly believe that a test is a test is a test.

      James Popham is Emeritus Professor in the UCLA Graduate School of Education and Information Studies. At UCLA he won several distinguished teaching awards, and in January 2000, he was recognized by UCLA Today as one of UCLA's top 20 professors of the 20th century.

      Popham is a former president of the American Educational Research Association (AERA) and the founding editor of Educational Evaluation and Policy Analysis, an AERA quarterly journal.

      He has spent most of his career as a teacher and is the author of more than 30 books, 200 journal articles, 50 research reports, and nearly 200 papers presented before research societies. His areas of focus include student assessment and educational evaluation. One of his recent books is Assessment Literacy for Educators in a Hurry.

      From our issue: Science in the Spotlight