Interviews with mathematics teachers in urban middle schools underscore the dilemmas that arise as they attempt to translate the varied results of alternative assessment into letter grades.
Ms. Webster sighs as she reviews the contents of her school mailbox: a professional journal, an announcement that portfolios are the next staff development topic, and district guidelines for assigning report card grades. Perhaps one of the journal articles will discuss how to incorporate multiple sources of information into a single grade.
Her students are spending a lot of time working in small groups and writing about the process—activities not easily translated into percentages. How will the district's interest in portfolios mesh with its rigid guidelines for grading? She sighs again as she anticipates a weekend spent struggling with assigning a report card grade for each of her students.
* * *
These days teachers are receiving many mixed messages about assessment. For example, teachers are encouraged to use a number of types of alternative assessments to guide instruction and monitor student thinking. How can all this information be recorded in a single letter grade? Teachers are encouraged to challenge students to do complex tasks and to communicate effectively. And yet they realize that low grades may have a negative impact on these efforts. How can grades adequately reflect student progress to date, and still encourage students to persevere? A third tension arises from how different audiences view grades. How can teachers be fair to students and clear in reporting to parents, while fulfilling their obligation to the school district?
To determine how some teachers are dealing with these and other dilemmas, we interviewed eight urban middle school teachers during the 1991–92 school year. As participants in the QUASAR (Quantitative Understanding: Amplifying Student Achievement and Reasoning) Project, these teachers, who represent six geographically dispersed sites, are changing the nature of how mathematics is taught, learned, and assessed in their schools (Silver 1989).
Old vs. New Measures
Teachers may believe that students learn better when allowed to explore and discover, work in small groups, discuss and explain, use calculators and computers, reflect and write. Teachers may also think that assessment should be performance-based and thus collect evidence from multiple sources: projects graded on a 4-point scale, student self-assessment, interviews with students, and points from cooperative group work. Yet somehow at the end of the first quarter, teachers must translate all of this varied information into a letter grade to meet district requirements.
One teacher in our study, who had learned to grade open-ended projects according to a 4-point scale, commented on the difficulty: “The hardest part is deciding: is a 4 a 100? Is a 3 an 80? ... It's hard to look at what is equivalent to the traditional grade scale.”
Traditional educational measurement courses teach that only objective, numerical scores should be recorded and used for reporting grades. Meanwhile, researchers and professional organizations encourage teachers to use multiple assessment measures, but give little indication of how to incorporate them into a grade for report cards.
Judge vs. Advocate
A second struggle is between the need to evaluate a student's achievement and the need to encourage and motivate further effort. One QUASAR teacher was emphatic about the negative impact of grades: “I just hate grades. They are very discouraging for the children. The ones who get A's, get A's. Some kids come to school every day—in our community that's really wonderful—yet they get F's. I give them F's, because that's what they earn, I guess. That's the system.”
Teachers recognize that grades are taken seriously by students, parents, and school administrators, and that poor grades may have unintended consequences for students. As children are challenged to do more complex tasks, it becomes important to support their efforts over long periods of time. As a result, teachers often use value judgments and ethical reasoning in their grading practices. Here's how one QUASAR teacher put it: “Once I have the grades averaged, I think about, you know, do the students participate? Do they come to school? Do they mind doing oral presentations? That type of thing. I even tell the kids, ‘If your average is an 88, and you do all these things, you've got an A.’”
The special teacher/pupil relationship increases the difficulty of being impartial (Airasian 1991). Teachers are familiar with personal characteristics of their students, such as attitudes, self-esteem, motivation, and family background. So it is not surprising that grades often reflect justice tempered with mercy.
Classroom vs. District
Once teachers develop a personal grading scheme, they still face the dilemma of how different audiences will interpret the grades. Teachers are being urged to try different ways of grading, scoring, and reporting to best describe what students know and can do. Districts use this same information to make decisions on retention, promotion, and placement in special programs; participation in extracurricular activities; and admission to schools of higher education.
School districts vary widely in the content of their grading policies and procedures. For example, one district in our study established percentages to indicate the relative weight that should be given to certain types of evaluation (for example, 50 percent for district unit tests, 20 percent for teacher-made tests, 10 percent for homework, 20 percent for classwork). Another district provided only vague guidelines (for example, pupils may receive any mark if, in the teacher's judgment, the quality of their work meets the criteria established by the teacher for that subject or course).
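The arithmetic behind the first district's policy can be sketched in a few lines. This is only an illustration: the weights come from the example above, but the category names, the function `weighted_grade`, and the student scores are hypothetical, and the sketch assumes every category has already been expressed on a 0 to 100 scale (which is exactly the step teachers find difficult for rubric-scored work).

```python
# Weights from the district example in the text; the dictionary keys are
# hypothetical labels chosen for this sketch.
WEIGHTS = {
    "district_unit_tests": 0.50,
    "teacher_made_tests": 0.20,
    "homework": 0.10,
    "classwork": 0.20,
}


def weighted_grade(scores, weights=WEIGHTS):
    """Return the weighted average of category scores (each assumed 0-100)."""
    # Guard against a policy whose weights do not cover 100 percent.
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Every weighted category needs a score before a grade can be computed.
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[category] * scores[category] for category in weights)


# Hypothetical student: weak on district tests, strong on daily work.
example_scores = {
    "district_unit_tests": 70,
    "teacher_made_tests": 85,
    "homework": 95,
    "classwork": 90,
}
print(weighted_grade(example_scores))
```

With these invented scores the result is about 79.5. Because half of the weight sits on district unit tests, strong homework and classwork move the final percentage only modestly, and the policy leaves little room for the judgment the second district delegates to teachers.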
To be meaningful, however, grades must be interpreted by all members of a school community in the same way. If a B is construed to mean that a student has mastered from 80 to 90 percent of the material in a course, then the teacher's determination of that grade must be based on mastery of course content. Stiggins, Frisbie, and Griswold (1989) found that different teachers in the same building sometimes adopted different cutoff scores for the same grade, or even used different reporting schemes for the same course. Our interviews with QUASAR teachers showed the same disparity in methods for assigning grades.
Austin and McCann (1992) note that policymakers are beginning to question the various uses of grades and the messages they convey. At one level, grades are a kind of shorthand to communicate student achievement. At another level, educational leaders are concerned that current policies and practices may conflict with efforts to establish an educational system that ensures that all students acquire the knowledge, skills, dispositions, and habits needed for future learning and for productive lives.
Resolving the Dilemmas
By the second year of the project, QUASAR teachers had begun to resolve these dilemmas. In spite of the difficulties encountered, many of the teachers we interviewed preferred grading based on alternative assessments to the use of traditional quizzes and unit tests. “When there was only a right or wrong answer, students wouldn't try,” one teacher commented. “Now, they know that if they can show their thinking, they have a better chance to get a decent grade.” Another teacher who now uses portfolios said he assigns fewer D's and F's: “All students know something, so if they can show their thinking, few answers are an F.”
Most of the portfolio projects include some form of self-assessment so that students can see how the quality of their work has evolved. As students select papers for their portfolios, some ask to complete assignments. As one teacher told us, “One of the troublemaker students last semester got an A on his portfolio, because he went back and corrected all of the work that was wrong. All the other students were amazed!” Parents also like the portfolios, teachers reported, and find them helpful in parent/teacher conferences.
justify answers,
communicate thinking processes,
build on others' reasoning,
voluntarily extend a project, and
use mental mathematics strategies to solve appropriate problems.
Another QUASAR site received permission to develop its own mathematics report card. The first page explains the content covered and is revised each marking period to reflect the concepts taught. The first semester it was used, the report card consisted of a checklist of “Learning and Problem-Solving Strategies” (for example, student justifies methods and solutions, organizes work, sees connections, and so on). After parents and students complained that they needed to see a numerical or letter grade, a grading continuum was added to the report card, which the teachers continue to revise.
Another QUASAR teacher explained why alternative assessments may motivate students to higher levels of understanding of mathematics: “Even if a student gets a D, that doesn't tell the whole story. In fact it gets in the way of the whole story. When we talk about behaviors and concepts—that is assessment. If we go to all the work of changing, and the only result is a more complicated way of coming up with a grade, nothing much will be accomplished.... My hope is that alternative assessment will lead to kids demonstrating that their work is quality, and if it isn't, they keep working at it without being labeled failures.”
One school in the project has made an intensive effort to ensure that its high school mathematics teachers are aware of the capabilities of QUASAR students—as portrayed in portfolios, not by grades. These teachers have offered not only to send along the portfolios with their graduating middle school students, but also to sit down with the high school teachers and explain the contents. They have also written letters to high school counselors requesting pre-algebra or algebra placement for some students whose standardized test scores alone would have excluded them.
The Tides of Change
Adrift in the tides of change, teachers need assistance to ensure that they are not swept out to sea in the process! Classrooms are moving from a testing culture—where teachers are the sole authority, students work alone, and learning is done for the test—to an assessment culture—where teachers and learners collaborate about learning, assessment takes many forms for multiple audiences, and distinctions between learning and assessment are blurred (Kleinsasser et al. 1992). The challenge remains for teachers—with the support of their districts, their professional organizations, and the educational measurement community—to devise grading systems that adequately reflect this shift.
References
Airasian, P. W. (1991). Classroom Assessment. New York: McGraw-Hill.
Austin, S., and R. A. McCann. (April 1992). “Here's Another Arbitrary Grade for Your Collection”: A Statewide Study of Grading Policies. Philadelphia: Research for Better Schools.
Kleinsasser, A., E. Horsch, and S. Tastad. (April 1992). “Walking the Talk: Moving from a Testing Culture to an Assessment Culture.” Paper presented at the annual meeting of the American Educational Research Association, Atlanta.
Silver, E. A. (1989). “QUASAR.” The Ford Foundation Letter 20: 1–3.
Stiggins, R. J., D. A. Frisbie, and P. A. Griswold. (1989). “Inside High School Grading Practices: Building a Research Agenda.” Educational Measurement: Issues and Practice 8, 2: 5–14.
End Notes
1 Directed by Edward A. Silver, QUASAR (launched in 1989) is housed at the Learning Research and Development Center at the University of Pittsburgh. Educators from urban middle schools collaborate with a local university or education agency to develop innovative mathematics programs for students in economically disadvantaged communities.