It wasn't until I served as superintendent for research in a moderately sized school district and, at the same time, started treating my high blood pressure that I began to appreciate the complexity of data use. My experiences on these two fronts—with the district and with my physician—taught me five crucial lessons.
Lesson 1: We tend to be less critical of data that support what we already believe and more critical of data that do not.
Because the blood pressure monitor I have at home routinely gives me readings that are 8–10 points lower than those of the monitor at my doctor's office, I naturally prefer to believe that my monitor is the accurate one. For a while, this was a serious point of contention between my doctor and me. I argued that the higher blood pressure readings I kept getting at his office were the result of the stress associated with making that office visit—and that my blood pressure was actually 10 points lower. I believed that the data that supported my beliefs (or, perhaps, my hopes) were more accurate than those that suggested I needed to give up things I enjoyed.
I saw this pattern repeated in my own school district. As a result of No Child Left Behind, our district was required to offer school choice to families in two of our elementary schools. For most of the central office staff, this was an infuriating prospect. Discussions focused on the irrationality of being forced to offer school choice and the damage it would do to already-struggling schools. In support of their position, staff members referred to research that "clearly indicated" that reducing class size would be more effective in improving student achievement than providing school choice would be.
In the case of my blood pressure, I could logically use specific data to justify my position that my blood pressure wasn't as high as the doctor believed it to be. In the case of our Title I schools, some research did suggest that class size reduction might improve student achievement. However, no matter how logical our positions on the data might have been, everyone emphasized flaws in the data that conflicted with their beliefs and overlooked flaws in the supporting data. If we truly believe that what we do makes a difference, then we are obligated to be as critical of research that supports our convictions as of research that contradicts them.
Lesson 2: O = T + E explains a lot about why we do what we do.
Those of you who have taken a class in testing or measurement may recognize this equation. O represents an observation or measurement of some phenomenon. It could be a test score, a teacher's perception of a student's behavior, or a blood pressure reading. The equation conveys that this observation is the result of two factors. T represents the true state of the concept being measured. If we are measuring a student's science achievement, then T represents the accurate level of mastery. In the case of my blood pressure, T represents what my systolic and diastolic pressures really are. Our hope is that our observation is as accurate (equivalent to T) as possible.
E refers to error in our observation or measurement. In regard to blood pressure, factors other than the blood pressure itself affect the reading and make it less accurate, such as the stress of going to the doctor or the inaccuracy of the monitor. In education, the data we have on teaching, achievement, or behavior always have some level of error. Test scores, grades, and even our professional judgments are influenced by things other than the true quality of the work we are judging.
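To make the equation concrete, here is a minimal sketch in Python. The numbers, the stress offset, and the noise level are all invented purely for illustration, not taken from my actual readings. It simulates repeated blood pressure readings as a true value T plus error E, and shows why any single observation O can mislead:

```python
import random

# All numbers below are invented purely for illustration.
TRUE_SYSTOLIC = 130   # T: the true (and, in practice, unknowable) value
OFFICE_STRESS = 8     # a systematic error: the "stress of the office visit"
MONITOR_NOISE = 5     # a random error: the scatter of any single reading

def observed_reading(stress: float = 0.0) -> float:
    """Return O = T + E, where E combines a systematic offset and random noise."""
    error = stress + random.gauss(0, MONITOR_NOISE)
    return TRUE_SYSTOLIC + error

random.seed(1)  # fixed seed so the sketch is reproducible

# A single observation can land well away from T ...
print(f"One home reading:   {observed_reading():.0f}")
print(f"One office reading: {observed_reading(OFFICE_STRESS):.0f}")

# ... and averaging repeated observations cancels only the random part of E.
home_avg = sum(observed_reading() for _ in range(30)) / 30
office_avg = sum(observed_reading(OFFICE_STRESS) for _ in range(30)) / 30
print(f"Average of 30 home readings:   {home_avg:.1f} (close to T = {TRUE_SYSTOLIC})")
print(f"Average of 30 office readings: {office_avg:.1f} (still offset by stress)")
```

Note that averaging many readings cancels only the random part of E; a systematic bias, such as the stress of an office visit, survives any number of readings. That is precisely why my home monitor and my doctor's monitor could disagree indefinitely.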
However, we often emphasize or deemphasize the E—and our choices typically depend on whether the data (O) fit with our preconceptions. That was the basis of my argument against the accuracy of my doctor's blood pressure readings—I said they weren't accurate because they were taken under conditions of abnormal stress. On the flip side, I contended that my readings from home were more accurate because they were taken in a less stressful setting. In the former case, I emphasized the error; in the latter, I deemphasized it.
During my tenure with the school district, I was struck by the frequency with which educators applied this selective skepticism to data about students' performance. Our state-mandated achievement test was administered each fall and was intended to provide measures (O) of students' mastery of the previous year's academic standards (T). The test is among the best in use today. Our district also administered a separate achievement test in the fall and spring of each year. This test, too, has impressive measurement credentials, and the results of the two tests are closely correlated. Each test provides specific information about a student's performance on particular standards and on the concepts and skills associated with each standard. As such, they would seem to provide accurate indications (O) of students' academic performance (T), although neither is without some level of error (E).
Despite this, teachers and administrators often argued that these measures weren't nearly as accurate as the information teachers obtained from common assessments they developed for their classrooms. They pointed out that the teacher is in the best position to make judgments (O) about a student's ability and that tests typically include measurement error (E). The position they took was that the error in traditional tests is too large to accept, whereas the error in teacher judgments is much smaller. We often choose to focus on the error in a set of observations that we don't like and overlook the error in observations that we do like.
Lesson 3: It's easier to focus on what we do than on why we do it or how we'll measure it.
School improvement plans tend to devote much more time to explaining what schools or districts will do and much less time to indicating what data they will use to measure success. The emphasis on process is understandable. First, it's much easier to collect data on processes than on outcomes: We can talk about the number of curriculums implemented, students participating, hours involved, and so on.
Second, processes are usually much easier and less risky to identify than outcomes. We select, develop, adopt, buy, and design programs or processes. We can easily note the activities we'll adopt in coming months, and this frequently results in long lists of possible improvements. I once visited a school whose initial six-page plan included four and one-half pages listing the initiatives that were to be implemented over a three-year period. All of this was directed toward two goals, which took up four lines (if you count the double-spacing).
Our focus on processes doesn't reflect a lack of caring. In the case of my blood pressure and cholesterol level, I honestly want to lower them, and I'm eager to try anything that might do this. Yet, when I go to my doctor, I would much rather focus on what I've tried than on whether it has worked. I can easily talk about the fact that I've had no red meat since my last visit, that I've eaten a lot of fish, or that I've been taking the stairs instead of the elevator at work. What is less comfortable is facing the fact that many of the things I've done haven't affected my blood pressure or cholesterol in any appreciable way.
It's also more threatening to focus on effects. Defining a set of outcomes changes the extent to which we can be held accountable, either to ourselves or to others. Stating what we intend to accomplish through a set of actions requires that (1) we actually know what we're trying to do, (2) we are genuinely convinced of its value, and (3) we accept responsibility for it. Any one of these is daunting enough and might lead us to avoid clarifying our goals, but the combination of the three is downright overwhelming.
We educators often find it intimidating to agree to accept this kind of responsibility. This is due, in part, to a fundamental contradiction: We make lofty statements about "ensuring that every child can master rigorous academic standards," but we also accept the idea that family and community factors explain more of a student's academic achievement than do schools or schooling. In our heart of hearts, we often act under the assumption that some students simply won't be academically successful no matter what we do.
As educators, we want to help students feel good about themselves, to create critical thinkers and participatory members of a democratic society, and to help students develop the academic skills that will enable them to be productive citizens. However, until we clearly, confidently, and convincingly make these ends explicit to ourselves and our audiences and agree on some method to assess the outcomes, we can never be certain whether we are progressing or simply acting. And we can never be held accountable, even to ourselves.
Lesson 4: Data are good.
Just because I made a hole in the living room wall by using a hammer to put in a screw doesn't mean that the hammer is a bad thing to use. Although we can misuse data intentionally or otherwise, data are a tool—and a valuable one at that.
When my doctor first diagnosed my high blood pressure, he prescribed medication and told me to change my diet, eliminate all alcohol, and begin exercising. Being strong-willed, I took my medicine each day but didn't make the other lifestyle changes. Not surprisingly, my next visit to the doctor revealed that my blood pressure hadn't changed at all. So I immediately started a regimen in which I ate no meat or dairy products, stopped all alcohol consumption, and began exercising two hours a day. I lost 20 pounds, my blood pressure dropped dramatically, and my cholesterol moved into the low average range. My doctor congratulated me, I was pleased with myself, my wife was happy, and after that visit … I just didn't have the energy to do all of those things anymore.
Over the next year, I worked with my doctor to determine what really did make a difference. It turned out that exercising and taking my medicine maintained my blood pressure at about the same level it would have been had I given up martinis, meat, and cheese (each of which I love). Using the data, then, I was able to achieve what I wanted and needed in a way that made sense for me.
We are also sometimes reluctant to use data out of concern that they can be manipulated or misrepresented. Certainly, this can be true. For example, some time ago my wife suggested that I keep a log of what I eat and drink as part of my cholesterol/blood pressure/weight control efforts. At the end of the first week, she checked my log and chastised me for getting second helpings at several meals. So I started taking single helpings at each meal—but each of these was larger than normal. My wife also noticed that I sometimes hadn't recorded a snack (or two). Now, I'll admit that I intentionally manipulated the first set of data—my larger-than-normal single helpings—but the second set was, honestly, just a result of poor record keeping. Notably, the health factors that I wanted to improve didn't change at all.
No Child Left Behind has emphasized the issue of data collection, and many believe its emphasis on annual testing is severely misguided. However, the argument that testing students annually is somehow damaging or morally wrong seems problematic. In fact, almost no one really believes that this is a bad thing to do. So, the issue isn't really about collecting data, but about who gets to decide what they mean—we, the education professionals, or they, the politicians.
It's possible that both we and they want what is best for students. In fact, we would probably all agree that we are not as successful as we would like to be in educating students from poor families or students with special needs. Unfortunately, we're most likely to disregard the data when they force us to acknowledge harsh realities that we'd rather not see—yet acknowledging these realities has the greatest potential to guide us toward improving our practices.
Lesson 5: Sometimes doing what is most effective isn't worth what we have to give up.
To address my health issues, I worked my tail off for six weeks. I gave up many things I enjoyed and replaced them with water or iced tea, fish and vegetables, and two hours a day at the gym. I reduced both my blood pressure and my cholesterol level. The problem is, this required sacrifices that I wasn't willing to make over the long term. To achieve optimal success, I had to find a way to balance what was possible in terms of effectiveness with what was acceptable in terms of efficiency.
This situation arises all the time in schools. For example, in a district with which I worked recently, the data indicated that almost 70 percent of students entering 9th grade were reading below grade level. As a result, the schedules of all middle schools in the district were revised to include a full period devoted to improving reading, and various resources were diverted to this area.
After one year, the data indicated that almost 50 percent of students entering 9th grade were reading at grade level, and student performance in mathematics, science, and social studies continued at levels comparable to previous years. By the end of the second year, nearly 70 percent of freshmen entered high school reading at grade level, although social studies achievement dropped slightly and greater numbers of 8th and 9th graders failed the state exam in science and mathematics. Three years into the initiative, more than 75 percent of freshmen read at or above grade level, but achievement in mathematics and science continued to decline for both freshmen and sophomores.
In terms of its primary intention—improving the reading level of students entering high school—the program was entirely successful. But this effectiveness came at a substantial cost.
The Heart of the Matter
If we believe that what we do matters, as every good teacher does, we should use every tool available to make the best possible decisions for our students. For both the educator and the physician, professional judgment and intuition are among those tools, but so is the informed and honest use of data, which can keep us from leaving the future of the students we teach to chance.