How do we know whether our instruction has made any difference? Even when we employ a range of formative assessments and use the assessment results to guide our instructional decisions, we haven't necessarily answered the question of whether our teaching increased students' learning, and by how much.
With a group of our colleagues, we set out to assess our impact. We wanted to see whether a statistical tool—effect size—could give us a more accurate picture of how much student learning had occurred as a result of our instruction.
A Starting Point
Determining impact requires baseline information. So after studying the development of quality assessments and sharing ideas with one another about ways to check for understanding, our team created pre-assessments for selected instructional units. For instance, for a unit that covered plagiarism, summarizing, paraphrasing, and quoting sources, the pre-assessment included multiple-choice questions like the following:
Plagiarism can best be defined as:
A. paraphrasing and summarizing the work of others in your own work
B. capturing or trapping something
C. presenting the words and ideas of others as your own
D. quoting and using citations when writing essays
It also included open-ended questions like this:
Marla and her friends like the poems of Shel Silverstein, so she copied a bunch of the poems using the school photocopier, stapled them together, and made plans to sell the booklet to anyone who wanted it. Is this fair use? Why or why not?
Measuring Growth
Now that we knew where students stood in their knowledge of the subject matter, we were ready to teach the unit and assess their growth. We theorized that using a specific statistical tool, effect size, would give us a quantifiable measure for that growth.
In his research on the relative degree to which different instructional approaches influence learning, John Hattie (2009) demonstrated that an effect size of .40 was about equal to one year of growth for a year of schooling. Ideally, teachers produce more than a year's growth each year for their students. Hattie also demonstrated that teachers could use the same statistical tool to estimate the effect of their own classroom instruction on student learning.
We decided to give it a try. We administered a post-test on the plagiarism unit, using differently worded but corresponding items to ensure that our pre-test and post-test were aligned. We then used the effect size formula to determine impact:

Effect size = (post-assessment average - pre-assessment average) / average standard deviation
On an Excel spreadsheet, we listed each student's pre-assessment and post-assessment scores. We had the program calculate the averages and the standard deviations. We averaged the standard deviations of the pre- and post-assessments to serve as our denominator. In a matter of minutes, we had an effect size of .77. It would seem that students had made sufficient progress to consider the plagiarism unit a success.
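The same calculation is easy to reproduce outside a spreadsheet. Here is a minimal sketch in Python, with hypothetical score lists standing in for our class data; as in our spreadsheet, the averaged standard deviation of the pre- and post-assessments serves as the denominator.

```python
from statistics import mean, stdev

# Hypothetical pre- and post-assessment scores (one pair per student).
pre_scores  = [62, 70, 55, 81, 74, 68, 59, 77]
post_scores = [78, 85, 70, 92, 88, 80, 75, 90]

# Average the two standard deviations to serve as the denominator,
# mirroring the spreadsheet calculation described above.
avg_sd = (stdev(pre_scores) + stdev(post_scores)) / 2

effect_size = (mean(post_scores) - mean(pre_scores)) / avg_sd
print(f"Class effect size: {effect_size:.2f}")  # compare against the .40 benchmark
```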
Before moving on, however, we wanted to know whether there were any gaps in students' understanding that we should address. Our item analysis revealed that many students missed two specific items:
If you quote your friend in an interview, you don't have to cite him/her or use quotation marks. True or false?
Downloading music from the Internet without paying for it is
A. illegal and may result in being fined
B. unethical
C. unauthorized
D. no big deal
For the first question, we talked about why students might not understand the need to cite or quote the words of a specific individual with appropriate attribution. We developed a quick set of examples to clarify this concept for our students. For the second question, we realized that the answer we wanted, A, was conflated with answers B and C. Downloading music from the Internet without paying is illegal, but it's also unethical and unauthorized. We realized that the question was unfair, even though we had tried to emphasize the illegal nature of the action in our instruction. We deleted this question from the analysis, and the effect size increased to .89.
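For teams who keep item-level results, a short script can surface frequently missed items and handle the rescoring once a flawed item is dropped. The sketch below is illustrative only: it assumes each student's post-assessment is stored as a list of 0/1 item scores, and the data and the flagged item are hypothetical.

```python
# Hypothetical 0/1 item scores for four students on a ten-item post-assessment.
post_items = [
    [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 0, 1, 1, 1],
    [1, 1, 0, 1, 0, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1, 0, 1, 0, 1],
]

# Fraction of students answering each item correctly; low values flag items to review.
students = len(post_items)
for number, column in enumerate(zip(*post_items), start=1):
    pct = sum(column) / students
    print(f"Item {number}: {pct:.0%} correct")

# Rescore without a flawed item (hypothetically item 3), then recompute the
# pre- and post-assessment averages and the effect size exactly as before.
flawed_item = 2  # zero-based index of the item to drop
rescored = [sum(scores) - scores[flawed_item] for scores in post_items]
print("Rescored totals:", rescored)
```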
Targeted Help for Individual Students
The same tool can be used to determine individual student effect sizes. Instead of the class averages, use each student's pre- and post-assessment scores, still dividing by the average standard deviation; this quickly shows which students need to develop their understanding further. Of the more than 150 students who participated in the plagiarism unit of study, 19 had effect sizes below our desired minimum value of .40.
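A short extension of the earlier sketch illustrates this, again with placeholder names and hypothetical scores: each student's gain is divided by the same averaged standard deviation, and anyone below the .40 benchmark is flagged for follow-up.

```python
from statistics import stdev

# Hypothetical (pre, post) scores keyed by placeholder student names.
scores = {
    "Student 1": (62, 78),
    "Student 2": (70, 72),
    "Student 3": (55, 75),
    "Student 4": (81, 92),
}

pre  = [p for p, _ in scores.values()]
post = [q for _, q in scores.values()]
avg_sd = (stdev(pre) + stdev(post)) / 2  # same denominator as the class calculation

for name, (p, q) in scores.items():
    effect_size = (q - p) / avg_sd
    flag = "  <-- below .40, needs follow-up" if effect_size < 0.40 else ""
    print(f"{name}: effect size {effect_size:.2f}{flag}")
```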
As a team, we made plans to meet with these students to review their assessments and talk about each incorrect answer. We asked each student to tell us about his or her thinking so we could address misunderstandings. For example, in response to the following question, Horacio had chosen answer D:
Summarizing is
A. using the exact words of an author, copied directly from a source, word for word
B. putting the main idea(s) of one or several writers into your own words, including only the main point(s)
C. asking for help from a reliable source
D. rephrasing the words of an author, putting his/her thoughts in your own words
In the discussion with Horacio, we found that he understood that using exact words required a citation, but was confused about the difference between summarizing and paraphrasing. This clarification of his thinking provided an opportunity for additional instruction.
Mid-Course Corrections
Determining impact does not have to wait until the end of a unit. When teachers use multiple versions of an assessment, or a tool that identifies different levels of performance, such as a rubric, they can determine impact during the unit of study.
In the video that accompanies this column, 5th grade writing teacher Lisa Forehand teaches her students about peer feedback and critique to strengthen their writing. Earlier, she developed a rubric describing the skills needed to provide robust peer critiques. As a pre-assessment, she observed her students providing feedback and scored them according to the rubric. She then taught the elements of providing critiques over several lessons, and observed students providing critiques again to see whether their skills had improved.
Ms. Forehand was not satisfied with the results of the instruction. She observed that students were more focused on being nice than on looking closely at one another's work. The effect size she calculated confirmed her findings.
In response to this information, Ms. Forehand revisited the areas that required additional focus, especially making sure that feedback was specific. She asked students to note the difference between giving compliments and providing actionable suggestions. As students worked in pairs, she observed them again using the rubric, and later she recalculated their scores to confirm that the additional instruction had had an impact. The following day, she remarked, "I've always collected student observation information, but analyzing the data this way gives me more specific, timely feedback about the progress I'm making with them. I can teach much more responsively."
A Tool for Making Better Decisions
By comparing pre- and post-assessment results, teachers can estimate the impact of an instructional unit and identify students still in need of instruction. Looking together at the results, collaborative teams can focus their conversations on what they can do to increase their effect on student learning. And they can make better decisions when they have accurate evidence of how much students have grown.