February 1, 2000
Vol. 57
No. 5

The Results We Want

The debates over the value of standardized tests have polarized educators for long enough. It is time to move toward a synthesis of assessment methods.

All standardized tests have their place, even the weak ones. Without them we would have nothing—no way to hold people accountable for teaching or learning. But we need to look at the schools that are doing well on the standardized tests and realize that those schools are focusing on the curriculum and on the students. —Asa Hilliard
Accountability follows responsibility. If there is no accountability, little by little, people lose their sense of responsibility and start blaming circumstances or others for their poor performance. —Stephen Covey
Which results have value for students and their learning? How can educators assume a greater role in defining assessment and accountability? To answer these questions, we must move beyond a counterproductive criticism of existing tests and toward a more cooperative and transitional path. Clarity on these issues comes from schools whose accomplishments show us a way out of the current confusion that too often surrounds assessment and accountability issues.
The Glendale Union High School District near Phoenix, Arizona, has met high community and state expectations with its improved performance on both standardized and home-grown, end-of-course performance assessments. These summative assessments include juried oral presentations, essays on controversial topics, complex math problems, and hands-on science performance tasks. At Adlai Stevenson High School District in Lincolnshire, Illinois, a similar system of assessments has pushed achievement on every kind of test to record highs.
At Bennett-Kew Elementary School in Inglewood, California, 78 percent of the children come from low-income families. In the 1970s, the principal's no-excuses approach and a solid, systematic literacy program raised schoolwide reading performance from the 3rd to the 50th percentile; scores are even higher now. The 1999 Stanford 9 data show the school at approximately the 62nd percentile in reading and the 74th percentile in math. First grade reading scores—so crucial to a school and its students—have reached the 85th percentile; 1st grade math achievement is at the 85th percentile.
South of Houston, Texas, in the Brazosport School District, we cannot tell the disadvantaged schools from the affluent schools solely by looking at achievement on the state's standardized test—the Texas Assessment of Academic Skills (TAAS). At every school, more than 90 percent of students achieve at or above a 90 percent proficiency level in reading, math, and writing.
We can write off these examples as aberrations or consider what these and hundreds of other schools can teach us: A range of tests has improved teaching practice and helped increasing proportions of children legitimately learn essential skills. Even imperfect tests—and all assessments are imperfect—can promote life-changing improvement and better, richer assessment systems. These schools demonstrate Grant Wiggins's contention that "the issue is not tests per se, but our failure to be results-oriented" (1994, p. 18).

Standardized Tests—Warts and All

Standardized tests have serious limitations. They do not fully measure students' critical and inventive powers. Their multiple-choice format doesn't reveal a student's ability to construct a proposal, build a case, analyze an issue in writing, or originally apply a host of mathematical processes—all things that we value. Norm-referenced tests complicate the assessment mission by seeking to sort and rank students. And their data are less reliable for students near the high end, where a correct answer to even one more item can push a student up several percentile points.
Then why use standardized tests at all? Schools and districts use them because they provide data and a results orientation that are essential to improvement. In many cases, they promote not poorer practice, but a common instructional focus and an abandonment of ineffective practices. Success may be an essential step toward earning the trust of a public that sees our rejection of these tests as a dodge. Success, therefore, may allow us to move beyond these tests' current dominance toward alternative modes of assessment.

The Case for—and Against—Standardized Tests

Standardized tests provide the following:
  • Numerical and intelligible, although approximate, data on how well a child, a teacher, a school, or a district is performing—or improving; and
  • Vital information about patterns of strength and weakness among students in a classroom, a school, or a district.
These tests help schools see how well they are doing and in which specific areas they need to get better. Even test critic W. James Popham affirms that standardized test reports are "quite informative" and can illuminate a "child's strengths and weaknesses" (1999a, p. 9) among and within the subject areas. They give "rough approximations of a student's status with respect to the content domain" (p. 10). Teachers and schools can use this information to "devise appropriate classroom instruction" (p. 9). This is precisely the kind of feedback that real schools use as the basis for improvement.
But Popham goes on to say that using these tests "to ascertain educational quality is like measuring temperature with a tablespoon" (p. 10). Much of his critique of norm-referenced tests is trenchant and illuminating, but this metaphor and his conclusion are misleading. It is one thing to point out the limitations of these tests, but another to say that "asserting that low or high test scores are caused by the quality of instruction is illogical" (p. 12).
Popham is convinced that differences in norm-referenced test scores reflect three factors: what is taught in schools, which is the only one "directly linked to educational quality" (p. 15); socio-economic factors that affect out-of-school learning; and innate intelligence, the fact that "some kids were luckier at gene-pool time" (p. 12). Having told us that we can confidently use test scores to inform teaching, he tells us in another article that they don't reflect what is taught: that in schools serving disadvantaged children, test scores "in reality, reflect what children bring to school, not what they learn there" (my emphasis, Popham 1999b, p. 32).
But what happens in the classroom—what is taught and learned there—is an enormous factor, one that can significantly mitigate and even overcome environmental and genetic factors (Haycock, 1998; Mortimore & Sammons, 1987; Bullard & Taylor, 1993; Schmoker, 1999). Hundreds of disadvantaged schools have dramatically raised standardized test scores (Haycock, 1998; Carter, 1999). And the list keeps growing. Teachers who have seen scores go up in proportion to concerted, optimistic instructional-improvement efforts know that teaching makes a difference. They didn't cave in to the notion that only advantaged kids perform well on standardized tests.

Unraveling the Myths

The issues surrounding standardized tests deserve close analysis and clear explanations—something the test manufacturers have been slow to provide. One insidious and widely held notion about norm-referenced tests is that it is futile to expect scores to go up even if kids learn more. Hosts of teachers believe that the game is rigged; even as students perform better, their scores—the percentile scores that we and the public see—will remain the same to retain a bell curve. It is a Sisyphean struggle.
But, fortunately, the scheme isn't that perfect; we can skew the bell curve—up. Whole schools, districts, and even states can earn higher scores by improving their instructional program. Why? Because their scores are being compared with a fairly stable entity: a test and a scoring scheme that are changed only every several years. Simply put, this means that higher raw scores (more items correct) will correspond to higher reported percentile scores. That is, if more students are becoming functionally literate and numerate, scores will go up and increasing numbers will start to cross, say, the 38th or 40th percentile—well within the average range. We can regard these figures with rough confidence—about all we can expect in the assessment game—until students become appreciably more proficient and the test makers renorm their tests to adjust to these improvements.
But that would be the best news of all—at that point, the National Assessment of Educational Progress (NAEP) will affirm that we have made real improvements. NAEP scores are already creeping up. As this trend continues, honesty will require that we adjust our norm-referenced thresholds for determining satisfactory or functional levels of performance.
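The fixed-norm argument above can be illustrated with a small sketch. It assumes a made-up norm group of 20 raw scores on a hypothetical 40-item test and uses one common definition of percentile rank (the percent of the norm group scoring strictly below a given raw score). The names `NORM_GROUP` and `percentile_rank` and all the numbers are invented for illustration; real test publishers use far larger samples and their own scaling procedures.

```python
from bisect import bisect_left

# Hypothetical norm-group raw scores (items correct on a 40-item test).
# The list is frozen at norming time and reused until the test is renormed.
NORM_GROUP = sorted([8, 11, 13, 15, 16, 18, 19, 20, 21, 22,
                     23, 24, 25, 26, 27, 29, 30, 32, 34, 37])


def percentile_rank(raw_score, norm_scores=NORM_GROUP):
    """Percent of the fixed norm group that scored strictly below raw_score."""
    below = bisect_left(norm_scores, raw_score)
    return 100 * below / len(norm_scores)


# Invented raw scores for four students in two successive years.
last_year = [14, 17, 19, 22]
this_year = [17, 20, 22, 25]  # each student gets three more items correct

print([percentile_rank(s) for s in last_year])  # [15.0, 25.0, 30.0, 45.0]
print([percentile_rank(s) for s in this_year])  # [25.0, 35.0, 45.0, 60.0]
```

Because the comparison group does not move between renormings, every gain in items correct shows up as a higher reported percentile. In this toy data, two of the four students cross the 40th percentile in the second year, against one in the first.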
A related misunderstanding about standardized tests is that they measure only irrelevant, low-level skills, so that students in a school with a higher-order curriculum will do poorly. On this view, a school can be full of learned, knowledgeable students despite very low standardized test scores. In fact, standardized tests are flush with items that gauge students' ability to solve essential math problems and that assess their ability to retain, interpret, and analyze text. These clearly are not trivial or lower-order skills but are, incontrovertibly, foundational abilities. Any well-educated child will have mastered them.
Standardized tests are of great benefit here, especially because they are strongest at establishing respectable levels of literacy and knowledge in the middle range. If children are functionally literate or numerate, they are bound to perform within the average range, which should include students below the 50th percentile. Mean scores, and performance at the higher ends, are dicier: significant percentile differences can pivot on just one or two items. For that reason, it is encouraging to see the trend away from a focus on mean scores and toward a concern with increasing the proportion of students at or above a reasonable cut-off score—the 40th percentile, give or take.
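The difference between tracking a mean and tracking the proportion of students at or above a cut-off can be shown with a few lines of arithmetic. The sketch below is only an illustration: the two lists of student percentile scores are invented, the cut-off of 40 follows the rough figure mentioned above, and the function names are hypothetical.

```python
CUTOFF = 40  # the rough 40th-percentile threshold discussed above


def mean_percentile(scores):
    """Simple average of the students' percentile scores."""
    return sum(scores) / len(scores)


def share_at_or_above(scores, cutoff=CUTOFF):
    """Fraction of students whose percentile score meets the cut-off."""
    return sum(1 for s in scores if s >= cutoff) / len(scores)


# Invented percentile scores for eight students in two successive years.
year_1 = [12, 25, 33, 38, 41, 55, 70, 96]
year_2 = [20, 34, 41, 43, 45, 55, 62, 70]

for label, scores in (("year 1", year_1), ("year 2", year_2)):
    print(label, mean_percentile(scores), share_at_or_above(scores))
# year 1 46.25 0.5
# year 2 46.25 0.75
```

In this made-up case the mean is identical in both years, yet the share of students at or above the cut-off rises from half to three quarters. The mean is held up in the first year by one very high score, which is the kind of high-end volatility the paragraph above warns about.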
Standardized test results have provided the essential focus and urgency for schools to improve and refine instructional programs in reading, writing, and math practices. Sometimes schools are surprised to see how clear the correlation is between improved teaching methods and higher standardized test scores (Livingston, Castle, & Nations, 1989). Test analysis empowers teachers to "affirm that they do make a difference for all children" (Bamburg & Medina, 1993, p. 36).
Imperfections notwithstanding, standardized tests have given schools focus, guidance, and better results. For Michael Fullan, assessment—including external, standardized testing—is the "coherence-maker," without which schools have little or no chance of improving (Fullan, 1999, private communication).
Ideas have consequences. An unfair and excessive criticism of standardized tests ignores the evidence of school success and alienates practitioners from a useful source of coherence. We must not, however, be complacent about their sufficiency. We have a duty—and an opportunity—to transcend them and their reigning predominance.

Beyond Standardized Tests

Our schools and districts have been painfully slow in creating their own accountability and assessment systems, even decrying their necessity (Theobald & Mills, 1995). Even districts that make big investments in learning about alternative assessment seldom go on to create data-driven accountability systems to monitor student learning or its improvement.
We have delayed long enough. It is time to create local, criterion- and performance-based assessment and accountability systems. If we can produce, monitor, and chart measurable data and improvement over time on these tests, we may be able to dethrone the standardized exams and buy back some opportunity to shape the assessment agenda.
We should give standardized assessments their due and work hard to bring more children to basic proficiency levels. This can be accelerated by replicating the focus, the goal orientation, and the collaboration that characterize improved schools, but are missing in the average school and district.
Simultaneously, we should develop—by district or consortium—our own end-of-course and formative assessments, whose substance captures but goes beyond standardized tests. Within a year or two, teacher teams meeting in the summer could complete such assessments while gradually supplementing—not replacing—objective assessments with summative projects, essential questions, and scientific experiments and proposals that reinforce and build on essential knowledge and concepts.
But this time, let's ensure that these new systems are data-driven. That is, they should clearly reveal both measurable annual progress and areas where improvement is needed. This could go a long way toward winning public trust in performance assessment.
Examples of appropriate test questions—similar to those that I have seen in the field—might demystify such assessments:
Science. With what we have learned about the environment in class and from your research, submit a study and a proposal for the best place to store [a selected toxic substance]. The proposal will be scored for practicality, scientific rigor, and knowledge of pertinent factors.
Social studies. Who do you think were the best and least effective presidents of each century? Include references to central issues and events.
Language arts. Select a work of literature read this term and write a detailed essay that demonstrates your interpretation of its importance for you and for our time.
Mathematics. Using your knowledge of mathematical principles learned in this course, create a projected family budget and financial plan that include projected costs for education, leisure, and retirement.
This is largely the stuff of an old-fashioned education. To reestablish such questions is both revolutionary and as old as the most venerable, conservative, private prep academies and colleges (Wiggins, 1994; Cookson, 1985).
Salient arguments exist on both sides in the debates about tests. Some compromise and a sensible transition could enable us to seize, rather than squander, the opportunity to give students the best teaching and testing that we can.

Bamburg, J., & Medina, E. (1993). Analyzing student achievement: Using standardized tests as the first step. In J. Bamburg (Ed.), Assessment: How do we know what they know? Dubuque, IA: Kendall-Hunt.

Bullard, P., & Taylor, B. O. (1993). Making school reform happen. New York: Allyn & Bacon.

Carter, S. C. (1999). No excuses: Seven principals of low-income schools who set the standard for high achievement. Washington, DC: Heritage Foundation.

Cookson, P. W. (1985). Preparing for power: America's elite boarding schools. New York: Basic Books.

Fullan, M. G. (1991). The new meaning of educational change. New York: Teachers College Press.

Glickman, C. D. (1993). Renewing America's schools: A guide for school-based action. San Francisco: Jossey-Bass.

Haycock, K. (1998, Summer). Good teaching matters . . . a lot. Thinking K–16, 3, 3–14.

Livingston, C., Castle, S., & Nations, J. (1989, April). Testing and curriculum reform: One school's experience. Educational Leadership, 46, 23–25.

Mortimore, P., & Sammons, P. (1987, September). New evidence on effective elementary schools. Educational Leadership, 45, 4–8.

Popham, W. J. (1999a, March). Why standardized tests don't measure educational quality. Educational Leadership, 56, 8–16.

Popham, W. J. (1999b, May 12). Assessment apathy. Education Week, 18, 32.

Rosenholtz, S. J. (1991). Teachers' workplace: The social organization of schools. New York: Teachers College Press.

Schmoker, M. J. (1999). Results: The key to continuous school improvement. Alexandria, VA: ASCD.

Theobald, P., & Mills, E. (1995, February). Accountability and the struggle over what counts. Phi Delta Kappan, 76, 462–466.

Wiggins, G. (1994). Assessing student performance: Exploring the purpose and limits of testing. San Francisco: Jossey-Bass.

Mike Schmoker is a former administrator, English teacher, and football coach. He has written dozens of articles for educational journals, newspapers, and TIME magazine as well as multiple bestselling books for ASCD. In an EdWeek survey of national educational leaders, he was identified as among the best sources of practical "nuts and bolts…advice, wisdom and insight" on effective school improvement strategies.

Schmoker is a recipient of the Distinguished Service Award by the National Association of Secondary School Principals for his publications and presentations. As a much sought-after presenter, he delivers keynotes and consults internationally throughout the United States, Canada, Australia, China, and Jordan.

From our issue
What Do We Mean by Results?