October 1, 2007 • Vol. 65 • No. 2
Ask About Accountability / Report Cards, Test Gaps, and Item Types
Question: In our school, we plan to replace traditional letter grades with standards-based report cards and regular parent-teacher conferences that will sometimes be student led. What are your thoughts as we move in this direction?
—Jennifer Chaffman, Teacher's Assistant, Children's Community School, Davidson, North Carolina
Answer: I like the idea of having regular parent-teacher conferences, with some of them student led. But to get the most out of this approach, you folks will need to prepare all participating teachers, parents, and students. Everyone needs to learn how to make these conferences truly productive instead of just genial get-togethers. Set aside planning time to work through a set of guidelines regarding how these conferences ought to function and get those guidelines in the hands of all participants. Rick Stiggins is a strong proponent of student-involved grading conferences and has written often on how that process should work.
Because of your move toward standards-based report cards, your school should pay serious attention to how accurately teachers, parents, and students can determine students' status regarding specific content standards. For example, everyone needs to know how to apply the rubrics used to evaluate a student's mastery of those standards.
But two caveats here. First, the quality of your standards-based report card initiative depends on the clarity and rigor of the reporting system you adopt. Standards-based reports that are less than clear will be less than useful. Second, if parents, students, and teachers are obliged to evaluate a student's progress with respect to an excessive number of content standards, such a report card approach is certain to stumble. Standards-based clarity is a good thing. Standards-based clarity about too many standards is a contradiction in terms. The upper-limit number of standards to include in such report cards should be based on what the participants can easily keep in the forefront of their minds—somewhere between 6 and 12 standards. I lean toward the lower end of that range.
Question: The students in our district must complete state-administered assessments in grades 3–8. We currently use teacher-made tests and textbook assessments to document student learning at lower grades. Are those good enough?
—Kimberly Lisanby-Barber, Principal, Spring Valley, Illinois
Answer: The tests your teachers currently use may be just fine. Then again, they may not. I know it is difficult for many teachers to look at nicely printed, end-of-chapter tests in published textbooks and to suspect that those tests are flawed. Unfortunately, some of those tests may have been turned out by graduate students in need of pocket change. Of course, decent tests exist in textbooks, but your colleagues need to know which ones to use. The same holds true for teacher-made tests.
Your best approach to ensuring effective assessments is to ladle out a lump or two of assessment literacy for those teachers, paying special attention to the testing of early-grade students. As a result, those teachers will not only be able to judge the worthiness of their own classroom tests but also collaboratively evaluate other teachers' tests. As is often the case when teachers tangle with tests, the solution to most problems is to enhance teachers' assessment literacy.
Question: What does research tell us about the reliability of measuring learning using only multiple-choice tests?
—Sharon Stockbauer, Teacher, Alternative Certification Program, Austin, Texas
Answer: Test reliability refers to the consistency with which we measure students. By using only one type of test item, such as all multiple-choice items, we usually end up with consistent measurement. What I think you're getting at with your question, Sharon, deals with the validity of the interpretations we make about students. We use tests to arrive at inferences about the unseen skills and knowledge that our students possess. If those inferences are valid, we can make more astute instructional decisions.
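For readers who want a concrete sense of what "consistency" means here, one standard index of internal-consistency reliability (not named in this column, but widely used by test developers) is Cronbach's alpha:

```latex
% Cronbach's alpha: internal-consistency reliability of a k-item test
% k          = number of items on the test
% \sigma^2_{Y_i} = variance of scores on item i
% \sigma^2_X     = variance of total test scores
\alpha \;=\; \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^2_{Y_i}}{\sigma^2_X}\right)
```

Values near 1.0 indicate that the items hang together and measure students consistently; note that a test can be highly reliable in this sense while still supporting invalid interpretations, which is the distinction drawn below.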
If we assess students using only one type of test item, almost all assessment authorities agree that we rarely get an accurate fix on a student's generalized skills or knowledge. If a student is learning a high-level mathematical skill, we need to know that the student can use that skill in all kinds of settings, that he or she can correctly answer a wide variety of test items calling for the use of that skill. To reach a more valid interpretation about a student's true level of achievement, we need to employ a range of different types of items, such as selected-response and constructed-response items. Selected-response items call for students to select an answer from a set of presented options; multiple-choice items and true-false items fall in this group. Constructed-response items require students to respond by generating an answer; essay items and short-answer items fall in this group.
The official policy positions of almost all professional assessment groups, such as the American Educational Research Association and the National Council on Measurement in Education, assert that educators should never base an important instructional decision on the results of a single test. The same is true for types of test items. We need to rely not on multiple-choice tests alone but on measuring our students in multiple ways.