I have bad news, worse news, and a bit of good news. The bad news: Education leadership evaluation is a mess. Our national survey of leadership evaluation instruments reveals an astonishing disregard for what we know about effective feedback and meaningful evaluation. Newspapers provide box scores purporting to evaluate superintendents and principals on the basis of student test scores, but we rarely see any analysis of the impact of leadership on teaching and curriculum (Reeves, 2002a, 2002b).
The worse news: Improving leadership evaluation will be difficult. As the continued use of the grading system and the seven-period high school schedule attests, schools tend to cling to long-standing practices despite mountains of evidence pointing to the need for change.
The potentially good news: A better way exists. School systems can reject their dependence on ambiguous, demoralizing, and destructive leadership evaluation systems. One alternative—Multidimensional Leadership Assessment—has the potential to transform leadership evaluation from a blight on the education landscape to a constructive instrument of education policy.
Ineffective Evaluation
Author Sebastian Junger defined the perfect storm in his book of the same name (1998) as one in which many different variables came together at the same time to create particularly destructive consequences. The Center for Performance Assessment, in our National Leadership Evaluation Study, found a “perfect storm” of failure: The acute and growing shortage of education leaders is accompanied by a leadership evaluation system that simultaneously discourages effective leaders, fails to sanction ineffective leaders, and rarely even considers the goal of improved leadership performance. We reviewed hundreds of leadership evaluation systems and studied thousands of pages of documents in search of an example worthy of emulation.
These leadership evaluation systems do not come from the pens of incompetent bureaucrats; they come from intelligent and thoughtful people. But in almost every case, the evaluation systems are deeply flawed. These systems tolerate mediocrity, fail to recognize excellence, turn a blind eye to abuses, accept incompetence, and systematically demoralize courageous and committed leaders. Despite the exemplary work of such groups as the Council of Chief State School Officers and its Interstate School Leaders Licensure Consortium (ISLLC) (1996), the reality of leadership evaluation remains far removed from the ideal.
Thanks to such educators as Danielson (2002) and Darling-Hammond and Sykes (1999), schools have made significant strides in transforming teacher evaluation standards into practice. Unfortunately, although more than two dozen states and many school systems claim to have adopted the ISLLC standards, many of their administrator evaluation systems fail to implement those standards with any degree of precision. Although we are certain that examples of constructive and specific administrator evaluation systems exist, our failure to find them was not for lack of effort. The National Leadership Evaluation Study findings suggest that such stellar leadership evaluation systems are the exception rather than the rule.
Findings of the Study
More than 18 percent of the leaders we studied had never received an evaluation in their current position. In the words of one of our research subjects, “The worst evaluation experience was no evaluation at all. The message was that I was not important enough for my supervisor to take time to give me an evaluation.”
Of the leaders who were evaluated, 82 percent found leadership evaluation to be inconsistent, ambiguous, and counterproductive.
Fewer than half of the respondents (47 percent) agreed that their most recent leadership evaluation was related to student achievement.
Only 54 percent of the leaders said that their evaluation was based on clear standards.
Only 47 percent of the leaders said that their evaluation was sufficiently specific to help them improve their performance.
The higher the level of leadership responsibility, the lower the satisfaction with leadership evaluation instruments. New administrators more frequently received helpful and constructive coaching and feedback. Evaluation was least helpful for veteran administrators and central office directors. Leadership evaluation was at its worst when school boards were evaluating superintendents.
The narrative comments from respondents in our study reveal feelings of anger, betrayal, and despair. For a nation that will lose about half of its current school leaders to retirement within the next eight years (DiPaola & Tschannen-Moran, 2003), the systematic demoralization of the current leadership pool is destructive and foolish. With the growing number of unfilled leadership positions and an alarming number of leaders leaving the field of education, it is time for fundamental reform. The United States needs a new form of leadership evaluation, and it needs it now.
Ambiguous Leadership Standards
The problem starts with the definition of leadership, particularly in the context of education. Our survey reveals that the expectations articulated in most evaluation systems are at best ambiguous. At worst, they are contradictory, impossible, and inconsistent with common values and mountains of research.
Most of the evaluation systems we reviewed eschewed descriptive rigor in favor of education jargon. The following statements come from local, state, and national performance expectations for school leaders. Each statement is followed by a challenge that any leader being evaluated by such a standard would want to consider.
Expectation: “The administrator facilitates processes and engages in activities ensuring that curricular, cocurricular, and extracurricular programs are designed, implemented, evaluated, and refined.”
Challenge: What in the world does this mean? How would we know if this standard has been met? Do evaluation and refinement refer to what is popular or what is effective?
Expectation: “Stays current with research and theory regarding motivation. Keeps abreast of the latest developments in the field of education.”
Challenge: Any research and theory? Much of it could well contradict the goals and values of the school system. This goal appears to endorse a collection of fads; school leaders could fail to distinguish what is “current” from what is important, valid, tested, and trustworthy.
Expectation: “Provides information on curriculum/instruction.”
Challenge: Is there a single school administrator who can fog a mirror who does not do this? The issue is not whether or not the leader provides information; rather, the issue centers on the quality of that information, and whether it will lead to good decisions and improve student achievement.
Expectation: “Facilitates processes and engages in activities ensuring that relevant demographic data pertaining to students and their families are used in developing the school mission and goals. Diversity is considered in developing learning experiences.”
Challenge: Does this mean that good leaders have different goals for low-income schools than for affluent schools? Is it a good idea to develop different goals for schools on the basis of their ethnic composition? If the families have a culture of low expectations, should schools mirror those expectations?
Expectation: “Participates in professional development activities.”
Challenge: My 4th grader's hamster can participate in professional development activities. What does this tell us about the impact of using new knowledge and skills to become a more effective leader?
Sometimes the expectations are internally contradictory. One district's leadership evaluation instrument requires its effective leader to “carefully weigh consequences of contemplated action”; a few sentences later, the evaluation assesses the same leader on whether he or she “is action-oriented; presses for immediate results” and “is decisive; doesn't procrastinate on decisions.” The same evaluation form requires the leader to simultaneously “hold to personal opinions,” “exhibit a need to control most situations,” and “demonstrate adaptability and flexibility.” We might gently suggest that an administrative certificate and a doctorate are not the criteria sought by this district, but rather some combination of divinity and multiple personality disorder.
Incoherent Leadership Evaluations
Not every leadership evaluation instrument that we examined is so deeply flawed in establishing clear leadership standards. But even those with clear standards often suffer from ambiguous descriptions of performance levels. Typical performance levels include “exceeds expectations,” “meets expected performance levels,” “superior,” or “average”—without any clear indication of which specific leadership behaviors deserve such labels.
Without specification, the leader's rating on these performance levels depends on the idiosyncratic judgment of the evaluator. However wise and insightful an individual evaluator may be, these judgments are doomed to be inconsistent and practically useless for coaching. The person evaluated only knows that one evaluator regarded him or her as “outstanding,” another evaluator believed that the same leadership traits and behaviors merited a rating of “meets standards,” and yet a third evaluator said that the same performance “exceeds expectations.” We should not expect leadership wisdom to emerge from such an ambiguous pool of linguistic slop.
Even if the standards themselves are clear, descriptions of performance devolve into the linguistic quicksand of “sometimes” compared with “seldom,” or “frequently” compared with “often,” or “exceeds expectations” compared with “satisfactory.” Intelligent people of goodwill can disagree about what any of these descriptions mean.
Perhaps the least-helpful performance ratings are such descriptions as “growth needed.” This rating is invariably a negative comment in the context of evaluation, yet I strain to think of a single leader—from Alexander the Great, to Napoleon, to Churchill, to Eleanor Roosevelt, to Martin Luther King, Jr., to the best school leaders I have observed in more than a million miles of travel—who would not enthusiastically check the box next to “growth needed” when describing himself or herself. To put it bluntly, when is growth not needed? Presumably, when one is dead.
Effective evaluation systems enable both the evaluator and the one being evaluated to understand clearly the differences between various levels of performance. Michael Jordan, for example, was acutely aware of the difference between putting the ball in the basket and hitting the rim. His fans shared his perceptions of clarity in evaluation. Sarah Chang, along with the vast majority of her audience, knows the difference between an F-natural and an F-sharp. But do school leaders, to whom we entrust our children and billions of dollars in resources, know the difference between performance that is exemplary, proficient, and below expectations?
Authority/Responsibility Disequilibrium
We wish our leaders to be some mythical combination of folk heroes, in which they have the insight of Lao-tzu, the courage of a New York firefighter, and the work ethic of Paul Bunyan. In the real world of school leadership, however, the relationship between demands and authority leads to results that are more prosaic. This does not stop the developers of leadership evaluations from making the grand presumption that the school principal or district superintendent enjoys enormous powers.
The most glaring examples of the authority/responsibility disequilibrium occur when we hold education leaders responsible for the actions of others—ranging from the most recalcitrant employee to the most apathetic community member—even though they lack the authority to control the actions of either of these stakeholders. One set of leadership standards reviewed in our study, for example, required the leader to “ensure that staff and community understand the analysis of student data.” Leaders can provide information to the staff and community and can even assess the staff's knowledge, but they cannot “ensure” understanding. Another leadership evaluation standard required the leader to “ensure a balanced budget.” Meeting this standard might require controlling the local property tax rate, the price of oil in Iraq, the impact of hail on the roofs of schools in Kansas, or the number of snow days in Idaho. To put it gently, snow happens, along with a host of other natural events that affect the budget and are far beyond the control of school leaders.
The desirable outcomes in a school or district fall along a continuum: Some areas are subject to the leader's control; others are subject only to the leader's influence; and still others are beyond the leader's influence. For example, the leader may directly control the timing and content of a faculty meeting. He or she may directly influence the quality and content of teacher evaluations, working within the constraints of the collective bargaining agreement. He or she can only indirectly influence the quality of teaching in the classroom and the motivation of students. And the extent to which students received adequate prenatal care and early childhood education is far beyond the control or influence of most school leaders. An honest leadership evaluation system will specify expectations for leaders that are appropriate, recognizing the amount of influence or control that the leaders can exert over each area.
A Better Way
Schools need an alternative to the vacuous exercises now called leadership evaluation. A better model would provide specific, accurate, and timely feedback. Rather than an event that occurs once a year (or in the case of senior leaders, every three or four years, always too late to influence performance), evaluation should consist of frequent feedback and provide multiple opportunities for continuous improvement. Rather than providing meaningless performance levels—such as “meets expectations,” “above average,” or “progressing toward standards”—the ideal leadership evaluation system would describe in specific terms the difference between distinguished performance and performance that is proficient, progressing, or failing to meet standards.
We have developed a model for more-effective leadership evaluation that meets these requirements and reflects best practices in performance assessment: the Multidimensional Leadership Assessment (MLA). The MLA model encompasses 10 dimensions of leadership: resilience, personal behavior, student achievement, decision making, communication, faculty development, leadership development, time/task/project management, technology, and learning. This list, although hardly exhaustive, represents a compromise between the very extensive list of leadership requirements in such documents as the ISLLC standards and the vague assessments used in many school districts. For each dimension of leadership, we developed subcategories of specific leadership behaviors. Figure 1 shows the 10 dimensions and their subcategories.
Figure 1. Major Dimensions for Constructive Leadership Evaluation
Resilience
1.1. Constructive reaction to disappointment and failure
1.2. Willingness to admit error and learn from it
1.3. Constructive management of disagreement with leadership and policy decisions
1.4. Constructive management of dissent from subordinates
1.5. Explicit improvement of specific performance areas after considering the previous leadership evaluation
Personal Behavior
2.2. Emotional self-control
2.3. Compliance with legal and ethical requirements in relationships with employees
2.4. Compliance with legal and ethical requirements in relationships with students
2.5. Tolerance of different points of view within the boundaries of the values and mission of the organization
2.6. Organization, including calendar, desk, office, and building(s)
Student Achievement
3.1. Student achievement results
3.2. Student achievement reporting to students, parents, teachers, and school leaders
3.3. Use of student achievement data to make instructional leadership decisions
3.4. Understanding of student requirements and academic standards
3.5. Understanding of present levels of student performance based on consistent assessments reflecting local and state academic standards
3.6. Decisions in teacher assignment, course content, schedule, and student curriculum based on specific needs for improved student achievement
Decision Making
4.1. Factual basis for decisions, including specific reference to internal and external data on student achievement and objective data on curriculum, teaching practices, and leadership practices
4.2. Clear identification of decision-making structure, including which decisions are made by consensus and which are made by the leader with advice from others
4.3. Decisions linked to vision, mission, and strategic priorities
4.4. Decisions evaluated for effectiveness and revised when necessary
Communication
5.1. Two-way communication with students
5.2. Two-way communication with faculty and staff
5.3. Two-way communication with parents and community
Faculty Development
6.1. Understanding of faculty proficiencies and needs for further development
6.2. Individual consideration of faculty needs linked to vision, mission, and strategic priorities
6.3. Personal participation in leading professional development initiatives
6.4. Congruence of strategic objectives and professional development content
6.5. Recognition and rewards strategically linked to most-important faculty and staff behaviors
6.6. Inclusion of faculty in decision making, including collaboration and advice on major leadership decisions
6.7. Formal and informal feedback to colleagues with the exclusive purpose of improving individual and organizational performance
Leadership Development
7.1. Strong assistant administrators who are capable of immediately assuming leadership responsibility in this school or other buildings
7.2. Consistent identification of potential future leaders
7.3. Evidence of delegation and trust in subordinate leaders
Time/Task/Project Management
8.1. Consistent maintenance of daily prioritized task list
8.2. Choices for time management focused on the most-important priorities
8.3. Clear objectives and coherent plans for complex projects
8.4. History of completion of projects on schedule and within budget
Technology
9.1. Demonstrated use of technology to improve teaching and learning
9.2. Personal proficiency in electronic communication
9.3. Coherent management of technology resources, technology staff, and information
Learning
10.1. Personal understanding of research trends in education and leadership
10.2. Evidence of personal growth and learning
Source: Multidimensional Leadership Assessment, Center for Performance Assessment.
To make the dimensions and subcategories meaningful and useful for their specific needs, districts must develop detailed descriptions of leadership performance that range from exemplary, to proficient, to progressing, to not meeting standards. Figure 2 provides an example of such a continuum of performance for three selected subcategories under the dimension of resilience.
Figure 2. Continuum of Performance for Selected Subcategories of Leadership Behavior
Dimension 1: Resilience
Leadership Subcategory | Exemplary (Systemwide Impact) | Proficient (Local Impact) | Progressing (Leadership Potential) | Not Meeting Standards |
---|---|---|---|---|
1.1. Constructive reaction to disappointment and failure | Public reports, including accountability documents, plans, and oral presentations, include frank acknowledgment of prior personal and organizational failures and clear suggestions for systemwide learning resulting from those lessons. | Readily acknowledges personal and organizational failures. | Acknowledges personal and organizational failures when confronted with evidence. | Defensive and resistant to the acknowledgment of error. |
1.3. Constructive management of disagreement with leadership and policy decisions | Articulates disagreements with policy and leadership decisions and advocates for a point of view based on the best interests of the organization. Challenges executive authority and policy leaders appropriately with evidence and constructive criticism, but once the decision is made, fully supports and enthusiastically implements organizational decisions. | Accepts and implements leadership and policy decisions. | Sometimes challenges executive and policy leadership without bringing those concerns to appropriate executive and policy authorities. Sometimes implements unpopular policies unenthusiastically or because “I'm just following orders, but I don't like it.” | Ignores or subverts executive and policy decisions that are unpopular or distasteful. |
1.4. Constructive management of dissent from subordinates | Creates constructive contention, assigning roles if necessary to deliberately generate multiple perspectives and consider different sides of important issues. Recognizes and rewards thoughtful dissent, thus conveying broader support for the final decision. Uses dissenting voices to learn, grow, and, when appropriate, acknowledge the leader's error. | Uses dissent to inform final decisions, improve the quality of decision making, and broaden support for final decisions. | Tolerates dissent, but there is little dissent in public because subordinates do not understand the leader's philosophy about the usefulness of dissent. | Dissent is absent as a result of a climate of fear and intimidation. |
A specific, accurate evaluation system, such as the Multidimensional Leadership Assessment, enables leaders to understand clearly their expected performance and to engage in frequent reflection and self-correction. This system also enables school boards and senior leaders to make their expectations clear before a leader is hired. It encourages proactive evaluation—starting the evaluation process before the first day on the job rather than as a reaction to disappointing performance. Most important, this model offers clarity and consistency. Because the standards are explicit and the performance continuum is unambiguous, leaders receive consistent, fair, and constructive feedback.
Time for a Change
The past three decades have witnessed tremendous strides in every area of education assessment. It is time for us to apply those lessons to the assessment of education leaders.
If schools persist in using the current unfair, ambiguous, and demoralizing system of leadership evaluation, then a generation of education leaders will remain subject to the whims of personal preference and newspaper headlines. The accompanying burnout of leaders and national shortage of people willing to occupy key leadership positions will have grave consequences throughout the system.
The Multidimensional Leadership Assessment offers a fairer and more constructive alternative than do the vast majority of existing leadership evaluation systems. While we await studies of the effectiveness of this model, we can be clear about the alternatives before us. The risk of failing to improve leadership evaluation is grave; the risk of providing a fair, specific, and constructive leadership evaluation system is minimal. Multidimensional Leadership Assessment is worthy of serious consideration and critical evaluation.