November 1, 2005 • Vol. 63, No. 3
All About Accountability / Can Growth Ever Be Beside the Point?
A school is supposed to nurture children's intellectual growth—that is, to promote students' increasing command of significant bodies of knowledge and key cognitive skills. Consequently, if a school does promote greater intellectual growth in its students each year, you would think that any annual evaluation of the school's effectiveness would reflect that growth.
Although the architects of No Child Left Behind (NCLB) clearly wanted U.S. schools to foster gobs of student growth, this law's evaluative requirements currently don't function that way. Each state's success or failure under NCLB hinges on the particular cut score that a state's officials have selected, often arbitrarily, to determine whether a student's performance on a state accountability test classifies that student as proficient. On the basis of that cut score, a state identifies its students as either “not proficient” or “proficient or above.” For simplicity's sake, I'll call this cut score the proficiency point. A state's proficiency point on each of its standardized tests becomes the most significant factor—by far—in determining how many schools will stumble during that state's annual NCLB sweepstakes.
If the percentage of a school's students who score at or above the proficiency point on the state-mandated tests falls short of the state's designated target, that school is classified as having failed to make adequate yearly progress (AYP). Yet, odd as it may sound, a school can fail to make AYP even though it has promoted substantial overall growth in its students' achievement levels.
I can illustrate the absurdity of this situation with a fictitious school I'll call Pretend Prep. Let's locate Pretend Prep in a state that uses four levels of NCLB-determined proficiency—below basic, basic, proficient, and advanced. Two years ago, 50 percent of Pretend Prep's students earned such low scores on their state's standardized NCLB tests that they were classified as below basic. The other 50 percent of this imaginary school's students, because of their higher test scores, were classified as proficient.
However, because of an intense, yearlong instructional effort on the part of Pretend Prep's teachers, last year all the below-basic students scored well enough on the tests to move up one level to the basic category. Moreover, all the school's proficient students improved their scores so that they, too, jumped up one category to the advanced classification. This represents astonishing academic growth on the part of the school's students. Yet because NCLB success is determined solely by the percentage of students who score at or above the state's proficiency point, Pretend Prep has shown no AYP-related progress. Despite the blatant evidence of remarkable growth in student achievement, 50 percent of the students remained below the proficiency point, and 50 percent remained above.
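To make the arithmetic concrete, here is a minimal sketch in Python of the Pretend Prep scenario. The score values and cut scores are invented for illustration; the column specifies neither, and actual values vary by state.

```python
# Hypothetical cut scores for the four NCLB performance levels
# (invented for illustration; real values differ state by state).
CUT_BASIC, CUT_PROFICIENT, CUT_ADVANCED = 20, 40, 60

def percent_proficient(scores):
    """Percent of students scoring at or above the proficiency point."""
    at_or_above = sum(1 for s in scores if s >= CUT_PROFICIENT)
    return 100 * at_or_above / len(scores)

# Two years ago: half the students score below basic, half score proficient.
year1 = [10] * 50 + [45] * 50

# Last year: every student moves up exactly one level --
# below basic -> basic, and proficient -> advanced.
year2 = [25] * 50 + [65] * 50

print(percent_proficient(year1))  # 50.0
print(percent_proficient(year2))  # 50.0 -- identical, despite the growth
```

Every student's score rose, yet the only number the AYP calculation inspects, the percentage at or above the proficiency point, never moved.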
Although my example is both fictitious and extreme, it illustrates an important point: In real-world school evaluations, students will often improve on state-mandated tests, sometimes dramatically, but the improved scores will not influence a school's AYP status because those students' scores don't cross the proficiency point.
The major drawback of a school evaluation system that doesn't take growth into account is that it encourages teachers to focus excessive instructional attention on students who are at the cusp of proficiency—just above or below a test's proficiency point. Because students who are well above or well below that proficiency point won't affect a school's evaluation, teachers may be tempted to neglect those two categories of students.
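A quick extension of the same sketch shows where that incentive points. The width of the “cusp” band below is an invented parameter, not anything NCLB defines; it simply marks the students whose scores sit close enough to a hypothetical proficiency point to swing next year's AYP verdict.

```python
CUT_PROFICIENT = 40  # hypothetical proficiency point, as before
BAND = 5             # invented width of the "cusp" zone around the cut score

def triage(scores):
    """Group students by their relation to the proficiency point."""
    groups = {"far below": [], "cusp": [], "far above": []}
    for s in scores:
        if abs(s - CUT_PROFICIENT) <= BAND:
            groups["cusp"].append(s)
        elif s < CUT_PROFICIENT:
            groups["far below"].append(s)
        else:
            groups["far above"].append(s)
    return groups

scores = [12, 33, 37, 39, 41, 44, 58, 71]
for label, group in triage(scores).items():
    print(label, group)
# Only the "cusp" group can change next year's AYP verdict; the
# "far below" and "far above" groups are invisible to the evaluation.
```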
Although I am urging a growth-based approach to NCLB school evaluation, I am not advocating the adoption of “value-added” evaluative models, such as those first used in Tennessee and now adopted by a number of other states. These value-added models are explicitly designed to monitor individual students' grade-to-grade achievement growth. By following individual students' growth across grade levels, value-added models can circumvent NCLB's “cross-sectional” analyses, whereby the test scores of this year's crop of 4th graders, for example, are compared with the test scores of last year's crop of 4th graders. Because the abilities of different grade-level groups can differ considerably, any evaluative approach that doesn't depend on cross-sectional analyses has great appeal.
For Tennessee's version of the value-added method to work properly, however, student test scores must be statistically converted to a special kind of analytic scale so that student achievement gains in particular content areas represent the same amount of growth at different grade levels. Thus, an analytic scale must be generated so that a 6th grader's 10 percent improvement in mastering 6th grade math content, for example, will be equivalent to a 5th grader's 10 percent improvement in mastering 5th grade math content. Without such analytic scales, most value-added approaches just won't work.
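As a rough illustration of what such a scale is supposed to do, the sketch below places two grades' raw scores onto a single vertical scale with a simple linear conversion before computing a student's year-to-year gain. This is only a toy under invented linking constants; real vertical equating derives its conversions from item response theory and anchor-item studies, not from fixed numbers like these.

```python
# Hypothetical linear linking constants that place each grade's raw
# scores onto one common vertical scale (invented for illustration).
LINKING = {5: (2.0, 300.0),   # grade: (slope, intercept)
           6: (2.0, 350.0)}

def to_vertical_scale(grade, raw_score):
    """Convert a grade-level raw score to the common vertical scale."""
    slope, intercept = LINKING[grade]
    return slope * raw_score + intercept

# A student's raw score in grade 5 math last year and grade 6 math this
# year; the gain is computed on the common scale, so that one point of
# growth is meant to mean the same thing at either grade level.
gain = to_vertical_scale(6, 38) - to_vertical_scale(5, 30)
print(gain)  # 66.0 vertical-scale points of growth
```

Everything downstream of that subtraction, including any value-added claim, is only as trustworthy as the conversion itself, which is precisely where the difficulty lies.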
These kinds of analytic scales are difficult to create, however. That's because substantial curricular variation exists between grades, even in the same content area. Moreover, children have an annoying habit of developing cognitively and emotionally in different ways at different times. Accordingly, the only statistically defensible analytic scales for value-added models are excessively general ones, such as scales measuring a student's “quantitative competence” or “language arts mastery.”
But these overly general analytic scales supply teachers with no diagnostically useful information about which skills or bodies of knowledge a student has or hasn't mastered. Consequently, any useful diagnostic information instantly evaporates with the installation of value-added approaches. Regrettably, value-added methods sacrifice effective instructional diagnoses on the altar of statistical precision. We need to find better ways of measuring students' growth for our AYP analyses.
Fortunately, the U.S. Department of Education has appointed a number of study groups to advise Secretary of Education Margaret Spellings about how best to incorporate growth models into NCLB's accountability requirements. I'm hopeful that these advisory groups can come up with reasonable ways to incorporate growth into NCLB's school evaluations and that Secretary Spellings will heed their advice. Although advocates of a value-added strategy sure know how to belt out an alluring siren song, I suggest we all steer clear of that approach.