THE AMERICAN UNIVERSITY CHAPTER   0151

                                                   WASHINGTON, DC 
Research Committee:
Dorrie Arrigo
Helen Ballard
Udean Mars
Gwendolyn Means, Chair


Amid the many current controversies in our profession, one that continues to be debated is the matter of teacher evaluation.  The Research Committee of Phi Delta Kappa, The American University Chapter, seeks to discover how teachers view their current evaluation system and what recommendations they have for a newly designed evaluation process and instrument.  We begin with the position that evaluation (also called assessment, review, and other terms) is an appropriate and necessary function of most employment, including that of educators.  Its purpose should be to provide a fair method of giving feedback to the employee about his or her job performance, with both recognition for areas of strength and attention given to areas where improvement may be desired.  We also acknowledge that those who are being evaluated are not always comfortable or satisfied with the method and/or the instrument used for this process.

A front-page article entitled “D.C. teacher evaluation becomes a delicate conversation” by Stephanie McCrummen in The Washington Post (March 18, 2011) described teacher evaluation as a “…hypersensitive center of an education reform movement that has taken aim squarely at teachers” (A1).  Given such an adversarial position on the part of their supervisors, teachers are eager to bring about some reform themselves, specifically in the area of how they are evaluated.

One goal of this inquiry is to establish a type of evaluation that teachers have designed from their own perspective, by which they are willing to be judged, and which is also acceptable to the administrators who perform the evaluations.  We acknowledge that even if such a document is created, other aspects of evaluation may still need to be addressed (e.g., the degree of interaction used before and during the process, follow-up conferences, consideration of any outcomes or consequences, degree of objectivity vs. subjectivity, and the like).

For this study, teachers in various types of positions were interviewed, representing Early Childhood Education, middle school/junior high and high school levels, primarily in public schools; veteran teachers, newer teachers, specialty areas (English as a Second Language, Special Education), classroom teachers, and resource teachers.  Different school districts were included.  Although the number was limited, representation was diverse enough to provide a summative response.

Focus Questions and the Answers Given

1. When asked about the students with whom they worked in terms of demographics, academics, skill levels, and other factors, teachers described children of diverse backgrounds, cultures, and languages; socio-economic levels were not mentioned.  Regarding academics, most teachers replied that students came to them without being well prepared for the level being entered.  Even very young children lacked simple knowledge such as letter sounds.

2. With regard to standardized testing, most teachers reported that students’ scores do impact upon their evaluations; some teachers stated that even though their own students did not participate in standardized testing (e.g., at the pre-kindergarten or kindergarten level), the overall test results of the student population did impact on their evaluations.  A smaller number reported that if their students did not participate in the testing, the evaluation process did not include a consequence for these teachers.      

3. When asked to comment on the aspects of the current evaluation process which they considered appropriate, teachers mentioned degree of accomplishment in the following areas:

-level of instruction
-delivery of instruction
-learning styles
-differentiated instruction
-using Standards
-attendance, punctuality
-recognition for projects, extra programs, extra time spent at school and at home
-bringing in outside resources
-acknowledgement of growth of individual students in various areas (e.g., improved attendance, higher grades, other indicators of personal and academic increases)

4. When asked to comment on the aspects of the current evaluation process which they considered inappropriate, teachers mentioned the following:

-standardized testing
-“Reflection Questions” (used by some jurisdictions as part of the process), which are not aligned with the grade levels
-lack of validity of certain points
-use of “Satisfactory” or “Unsatisfactory” only—not using a more specific range to communicate to the teacher where he/she stands
-failure of principal or designate to perform regular evaluations and provide effective, clear feedback
-insufficient numbers of observations on which to base evaluations
-the exceptionally detailed and micro-cited actions of the teacher during observations, reducing the teacher to a rather robotic delivery
-the fact that certain indicators could be interpreted inaccurately
-the lack of freedom in teaching, due to the excessive number of required factors upon which a teacher is judged
-the focus is on the teacher rather than the students’ progress and activities
-the lengthy, cumbersome instrument being employed rather than a simpler interaction between teacher and observer
-inconsistent personnel to observe; same individual should come in fall and spring
-lack of pre-conference
-lack of time frame, scheduling
-the fact that the evaluation process usually puts all teachers in the same box rather than allowing them to “think and teach outside the box” while judging their performance based on the effectiveness of their artistry; the effectiveness can be ascertained by observing the students, their eagerness to learn and work, their participation, their expressions, and their work.

Claude Nadir, a high school English teacher in the Washington metropolitan area, takes issue with the IMPACT instrument because of its lack of depth and breadth.  He believes that it is missing a component by which the performance of an entire school district can be measured: namely, a Code of Conduct by which appropriate teacher behavior can be assessed, addressing an elusive intangible in school culture.  Such an element also contributes to promoting a positive school culture.  Given carte blanche in the hiring and firing of teachers, Nadir would require adherence to a teacher code of conduct, including professional dress guidelines, maintaining credibility as a trusted authority in the school community regarding social activities and use of appropriate language, proper use of school resources, and support of the school’s mission.

5. When asked to give recommendations on a new design for the evaluation process and the instrument to be used, teachers suggested the following:

-much observation of the classroom, including the atmosphere in the classroom, teacher interaction with students, minimal lecturing, presenting objectives, moving around and observing student understanding;

-(two opposing opinions:) (a) teachers need to move away from setting a date for observation; any time the administrator comes in should be all right; (b) observations should be scheduled, and the observer should honor the time set, not arrive after the lesson has begun
-administrators should demonstrate, model examples for everything
-use videotapes for self-evaluation
-concentrate on student progress rather than long checklists for teachers
-evaluations should include such indicators as lesson plans, students’ portfolios, parent contact log, teacher’s attendance, committee participation, classroom environment, documentation of student growth, student attendance, student behavior charts; for Special Education teachers, IEP components
-use a numerical scale from 1-5 to allow the teacher to visualize where there is strength and weakness;
-provide opportunities for improvement without affecting teacher morale or discouraging the teacher in a negative way
-it is important to motivate employees to strive for improvement

Other voices

Valerie Strauss, who writes The Answer Sheet education column for The Washington Post, notes that critics of Michelle Rhee, previous Chancellor of District of Columbia Public Schools, were hoping for changes in the standardized testing policies with the appointment of a new chancellor.  Teachers were hoping for a change in the evaluation system, for many of the reasons noted in this report.  The chief concern of Rhee’s critics was that “her reform program unfairly focused on standardized tests as the key way to judge students, schools and teachers.  This resulted in the seriously flawed teacher assessment model called IMPACT that emerged from a department overseen by [Kaya] Henderson” (B2), who has now been named as the new Chancellor.  In a recent television interview, Henderson affirmed that use of the IMPACT instrument would continue, much to the dismay of many DC teachers.

David B. Cohen, a teacher for 16 years, is now in his 13th year of teaching in California public high schools.  He earned a Master’s Degree in Education at Stanford University and achieved National Board Certification in 2004.  He is a founding member of Accomplished California Teachers (ACT) and co-authored the group’s first policy report, which proposes a multiple-measure teacher evaluation system.  He is quoted here by Valerie Strauss in The Answer Sheet (9/23/10):  “I am adamantly opposed to the idea of using state test scores in teacher evaluations.  I have argued that point repeatedly in various ways in various publications, but in this letter I have focused on the entirely real situation in which I find myself this year.  So, here’s my final question for you: do you honestly believe that the combined effects of this much change can be measured?”  Cohen also took the role of teacher in putting forward very specific factors, all based on research, that need to be considered in what he described as his own evaluation.  They include: a new principal, new colleagues, the need for an additional administrator for support, new class scheduling, starting time for students, his own teaching schedule, the use or misuse of tutorial time by students, and a new data management system.  Cohen maintains that research shows that each of these factors strongly influences a teacher’s performance, and his examples bear out that contention.

Richard Rothstein, a research associate with the Economic Policy Institute, stated that if principals are judged by how many “highly effective” teachers are in their schools, and if “highly effective” is defined by test scores, then teachers skilled at narrow test preparation could be kept on while teachers who were more skilled at developing critical reasoning skills could be let go (McCrummen, Post, 2/14/11, p. B2).

Mike McGrew, a former teacher, counselor and adjunct university professor, wrote recently of three outstanding teachers in his life.  He emphasized passion, direction, critical thinking, writing skill, probing discussions, toughness blended with fun, expectations, dedication, and the fact that “They developed important relationships with their students, in whom they cultivated mature thinking, cogent writing, personal responsibility, and love of learning” (C5).  He concluded by saying that “School systems must adequately assess all key teacher qualities as major components of their evaluations.  In this process, feedback from students, parents, administrators and master teachers is essential.  Otherwise, [efforts to race to the top] may become another monotonous march toward standardized mediocrity” (B2).

In March 2011 the National Research Council stated in a report that the Council did not accept the idea that test scores alone are “…a good way to judge school improvement.  Yet the belief that test scores are synonymous with school quality has become deeply ingrained in our national consciousness since the passage of No Child Left Behind in 2002.  Based on that dubious metric, schools across the nation are closing or being privatized” (B7).  The report concluded that the media and the public should be “immediately skeptical of dramatic score gains produced in a year or two” (B7).

The term “value added” has crept into evaluations, ostensibly to credit teachers who go beyond the printed page of evaluation indicators; however, the definition of “value added” is not always clear.  Although it is based on a technique developed by William Sanders, a statistician at the University of Tennessee, it has a number of variables and interpretations, and thus proves counterproductive to the intent of defining the “value” that a teacher “adds” to the picture.

Jason Kamras, who designed the IMPACT document, was asked by education reporter Bill Turque, “How do you explain value-added to a lay audience when you’re dealing with PhD-level statistics and math?”  Kamras replied:

Basically, think of this way: A teacher has a set of students.  We can, through this formula, figure out what the typical ending score is for kids like this.  By kids like this, I mean kids who had a similar performance history and some of the similar demographic characteristics, like free and reduced-price lunch, special education, [English language learners] and so forth.  And through this regression formula [a statistical tool for studying relationships between multiple variables] we can figure out that on average, kids like this tend to end up here.  Then we can calculate how did your kids actually end up, and then we compare the two.  And that’s it.  That is essentially what it comes down to.  Now that first piece is a bit of black box, surely.  And there’s a lot of math in there.  But the math is in there to make it as fair as possible, so that we’re taking into consideration where the kids started, what they’ve done previously and the other things we know are outside the teacher’s control.  By doing all that we’re isolating the impact of the teacher.  (Washington Post, May 16, 2011, page B2)

Are all lay audience members clear?
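
For readers who find a worked example easier to follow than a verbal account, the comparison Kamras describes can be sketched in a few lines of Python.  This is only a minimal illustration under stated assumptions, not the actual IMPACT formula: it uses a single predictor (a prior score) where Kamras mentions several, the function names fit_line and value_added are hypothetical, and every score shown is invented for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def value_added(reference, classroom):
    """Average gap between actual and predicted ending scores for one class.

    reference: (prior, ending) score pairs for a comparison group of students
    classroom: (prior, ending) score pairs for one teacher's students
    """
    priors = [prior for prior, _ in reference]
    endings = [ending for _, ending in reference]
    intercept, slope = fit_line(priors, endings)
    # Predicted ending score for each student, given where that student started.
    gaps = [ending - (intercept + slope * prior) for prior, ending in classroom]
    return mean(gaps)


# Hypothetical scores, invented for illustration only.
reference = [(40, 45), (50, 55), (60, 65), (70, 75), (80, 85)]
classroom = [(50, 58), (60, 66), (70, 74)]
print(round(value_added(reference, classroom), 2))
```

In this invented data the comparison group typically gains five points, and the teacher's three students finish three points above, one point above, and one point below their predicted scores, for an average of one point above prediction.  Note what the sketch leaves out: nothing in it observes the classroom, which is precisely the concern raised in this report.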

One might ask whether or not the mathematical formula(s) for value-added are oblivious to the human element, particularly when the relationship between learner and teacher is so critical.  In reviewing the value-added component, Ravitch notes, “Value-added assessment is the product of technology; it is also the product of a managerial mind-set that believes that every variable in a child’s education can be identified, captured, measured, and evaluated with precision….With their sophisticated tools and their capacity to do multivariate longitudinal analysis, [statisticians and economists] did not need to enter the classroom, observe teachers, or review student work to know which teachers were the best and which were the worst, which were effective and which were ineffective” (180).  How can the relationship factor be considered valuable if it is not observed in a real classroom?

A detailed article by Stephanie McCrummen in the Washington Post (March 2011) addresses the current system of evaluation in the public schools of the District of Columbia, now in its second year, by following the process of evaluation of a mathematics teacher in a fairly average school.  This individual has been a teacher in DCPS since 1989.  He described the IMPACT instrument as “a sword hanging over my head.”  Despite his many years of experience, he worried that his “freewheeling teaching style would bomb under the new system.  And in his case, classroom observations would count for 75% of his overall rating.”  An important factor mentioned by Nathan Saunders, president of the Washington Teachers Union, is that poverty is among the reasons that IMPACT is unfair: It is more difficult for teachers in such schools to earn a decent rating than for those in wealthier areas.  The school involved in this article is located in one of the poorest wards in the city.  The teacher is well aware of this factor and tries to compensate for his students’ challenges by serving as a mentor and a tutor on weekends.  However, IMPACT does not take these efforts into account in rating the teacher.  The final score of the evaluation, calculated by crunching many pages of numerical ratings from one to five, was low; the teacher realized that he could lose his job, “[i]f the trend continued.”  The teacher made these comments:

“What I’m trying to convey to you, [name of evaluator], is that most lesson plans, the best ones, no matter how pinpoint-precise I plan it, the lesson will deviate.  It will deviate because there is always some other rock I have to overturn to look at….I don’t feel that I’m putting in [only] ‘minimally effective’ effort at all….”

The evaluator’s comments included the following:

“I’m sorry…and I hate that this is discouraging.  I really do.  You’ve had good ideas, really….This does not measure your effort….But I do see your effort….[The evaluation] is measuring the effectiveness of that effort….This is not a reflection of your passion for education, your love for students.  Not at all.”  (McCrummen, The Washington Post, March 2011)

Again, the lack of freedom to deviate appropriately from a lesson plan reduces the teacher to a robotic type of performance and robs the children of the all-important relationship between student and teacher that cultivates the learning experience.

Diane Ravitch in her book The Death and Life of the Great American School System (2010) treats the matters of testing and choice, claiming that they undermine education.  Focusing on evaluation, she states: “To impart a love of learning, [teachers] should love learning and love teaching what they know,” using “signposts, such as their education, their command of the subject, and their skill in the classroom” (238-239).  A recurring theme in the book is that test scores should not be used in evaluating teachers.  “Our schools will not improve if we rely exclusively on tests as the means of deciding the fate of students, teachers, principals, and schools” (226).

There are certain indicators for which all teachers should be accountable, such as attendance, punctuality, preparation, proper interaction with students, proper comportment, and the like, and some of these indicators are difficult to measure.  We must also note with emphasis that one cannot quantify quality; for that reason, it is difficult to categorize teachers as “good,” “outstanding,” “effective,” or with other descriptors.  Teachers bring many intangibles into the classroom, and there are many variables.  Categorizing or judging teachers puts all in the same box, when in reality each one has his or her own box, as noted above.  We must open each box with discernment and fairness, hoping for a gifted teacher but at least finding the gift of a teacher that students will receive and respond to readily as they learn how to learn, learn how to live, learn how to think, and learn how to contribute to the world in positive ways.

Summarizing the findings; recommendations

The information provided by those persons who were interviewed, along with current research on the subject of teacher evaluation, appears to promote the following findings:

1. Most teachers want to be observed and to receive feedback on their performance, including areas where they show strength and areas that need improvement.  As Colvin documented, “…adults who aspire to professional levels of expertise require frequent observation, coaching, and feedback in order to make substantive improvement” (Reeves 45).

2. Most teachers in this survey stated that they actually wanted more visits and observations by their principals and that the observations should cover a wider period of time, both during a single visit and over the school year (e.g., up to an hour in the classroom and a number of times at intervals during the year).

3. Standardized testing should not be a primary factor in the evaluation.  Teachers whose students do not participate in such testing should not be held accountable in any way for testing results of the school in general.  Teachers whose students are identified as special needs or English language learners should not be held accountable for test scores that are “below the norm” but rather on student progress and growth shown by other methods.  In fact, standardized test scores can be removed from the body of the instrument and used as a possible criterion for bonus pay, which is an attractive alternative to merit pay.

4. Additional areas of instruction should be allowed to be taught and, likewise, observed (e.g., social studies and civics, literature, geography); if not taught separately by other teachers, classroom teachers may wish to receive comments regarding their instruction of computer science, music, physical education, and other subjects.

5. Flowing narrative commentary should be used rather than checklists; allowance for individual styles is critical.  While specificity is important, teachers should not be nailed to limited descriptors that they must satisfy according to the observer’s interpretations.

6. Observers must perceive and interpret well the intangibles present in the classroom/learning community of students with that particular teacher.  They must note and properly characterize the relationship that is cultivated between teacher and students, often identified by teachers as “classroom atmosphere.”

7. Consider teachers as artists who are developing their craft and their skills to the benefit of the learners; allow freedom of expression in instruction; recognize that spontaneity and deviation from the written plan may in fact be more beneficial to the students.

8. Evaluators should not simply sit in one part of the classroom; they should move around and observe how students demonstrate understanding.

9. Administrators should demonstrate what they consider excellent teaching to be, modeling examples for everything.

10. Videotapes can be used for teachers to evaluate themselves, with or without the evaluator.

11. Evaluations should include such indicators as lesson plans, students’ portfolios, parent contact log, teacher’s attendance, committee participation, classroom environment, documentation of student growth, student attendance, and student behavior documentation.

An evaluation instrument that teachers endorse and that provides administrators with essential information about teachers: What does it look like?

Rationale: Teaching is an art; art is not confined to a checklist or a box; the teacher artists in a school create the school’s art gallery, bringing various styles and interpretations to their craft.  While there are particular elements to be honored in the practice of the craft, teachers, like artists, need to be free to be spontaneous, imaginative, and creative to do their best work and to have the strongest influence on others.  As Carbonneau, Vallerand, Fernet, and Guay noted, what teachers need most is “…passionate engagement in which they find meaning and purpose….teachers…benefit from the expectation that they can and will have a profound effect on the lives of students and colleagues” (Reeves 80).

A common problem in evaluation of teachers is trying to quantify quality; but quality is often intangible and, as such, is not measurable in ways that satisfy a static instrument—for, after all, teaching and learning are dynamic, not static.  

In considering the use of the instrument entitled IMPACT, participants in this study were almost unanimous in stating that this tool was not well received by the teaching corps, primarily because of the length and extreme detail of the factors used, many of which were arguable and/or subject to interpretation.  Some educators did feel that IMPACT contained useful points and could serve as a guide, especially for new teachers, in planning and carrying out their instruction.  As a result, the recommendation is that IMPACT be used as a resource and a reference document, similar to a manual or an encyclopedia.

Nevertheless, there are some specific, concrete concepts that can be used to assess teachers, as teachers themselves have noted.  This evaluation instrument is a combination of the following:

-“signposts,” as identified by Ravitch, “…such as their education, their command of the subject, and their skill in the classroom” (239).  One might also take note of those things that show a person’s interest in teaching and the continuous pursuit of professional development to hone skills, such as certification and re-certification documents; attendance at appropriate workshops and seminars; presentations made to colleagues; articles and other works that are published.

-indicators, as identified by teachers themselves, such as attendance, punctuality, how much time is effectively spent in the building, projects in addition to classroom work, participation in/development of extra programs for students such as after-school mentoring or tutoring, creativity (e.g., making up games, development of lessons in special ways), extra time reported as spent at home in such projects, home visits, ability to bring in outside resources for children’s benefit (e.g., career day speakers, historical persons), committee work, collegiality with peers, interaction with school partnerships, support of administration, parent communication.*

[*Parent communication has certain factors associated with it.  For example, in one high school in Maryland, a social studies and literature teacher told his students that he wanted to confine his communication to the students and himself; he did not want to engage in parent communication or answer any complaints from parents, as he felt that student and teacher should resolve any differences.  Yet the students in his classes consistently rated him the most effective of all their teachers.  On the other hand, a teacher in a DC elementary school addressed her class parents on the first day and told them that her job was to teach, and their job was to parent.  She expected them to send their children to school ready to learn and not create any disciplinary problems.  When the first disciplinary incident would occur, she would immediately—during class—call the parent of the miscreant on her cell phone and have the child report what he or she had done; disciplinary problems were virtually non-existent in this teacher’s classroom.  At the same time, she held pool parties at her home for whole families, encouraged parents to call her by her first name, and developed a more informal rapport with them.  Which approach is better?  Each has its merits and difficulties.  Another factor to consider is the availability of email communication; is it a help or a hindrance?  Are there dangers along with advantages?  Evaluating a teacher’s communication with parents can be ambiguous.]

-teacher portfolios, giving teachers an opportunity to demonstrate the fruit of their labors rather than simply being responsive to a tool.
-a description of this artist: works and style, impact on those affected by these ways of teaching, and what students do as a result.
-how the teacher reaches all students and enables each one to progress from where he/she began to a higher level; how the teacher arranges and performs differentiated instruction or individualized instruction, as well as how the teacher accomplishes unified outcomes for the class.

Furthermore, teachers should be given the option of including certain information in their evaluations.  Reporting test scores should be an option, as testing is just one piece of the pie; but at the very least, test scores should never be used to determine whether or not a teacher keeps his or her job.

The Process

Part I: Preparation: Teacher and observer receive and review the instrument, interact with each other about the content, and talk about the classroom visit.  The teacher may choose to mention any particular factors for which the observer might be prepared (e.g., an emotionally disturbed student in the class), and these matters could be discussed in advance.

Part II: The Visit/Observation:

* Quantifiable factors (most or all of which are facts and thus not arguable):

* Written narrative commentary

Part III: Formulating the outcome: The teacher and the observer meet together to determine the result of the visit.  While the observer will take the lead in explaining the factors noted and the narrative commentary, no final rating will have been pre-determined.  The two persons discuss the visit and come to consensus on how to express the result.  The teacher must be given a “safe” opportunity to respond and ask questions of the observer; the observer must be prepared to answer questions and clarify any points that are not well understood.  Generally, the quantifiable factors will be easy to rate since they are based on facts.  The written commentary may require more discussion, explanation, and interpretation.  Both persons may finish by writing their opinions of the observation and post-discussion.  If the “canvas” from that day is deemed by both parties to be an unproductive work, another date should be agreed upon for a re-visit.

Teachers who are quality professionals will no doubt be recognizable and receive appropriate “numbers” and comments.  Teachers who are lacking will not be so easily recognizable; but they should receive strong, positive support in any areas of weakness or where improvement is needed.  If the experience results in a teacher determining that the field of pedagogy is not the best choice, counseling needs to be provided in assisting such a person to pursue another field.

The Instrument

                                    TEACHER EVALUATION INSTRUMENT

TEACHER’S NAME __________________      GRADE OR SUBJECT AREA ______________



SPECIAL CONSIDERATIONS __________________________________________________

PART I. Quantifiable factors:

                        -Attendance (number of days present out of a total)

                        -Punctuality (number of times punctual out of a total)

                        -Discharge of duties (supervision, reports, etc.) – (number of times in default or tardy)


                        -Degrees earned

                        -Professional memberships

                        -Professional development (e.g., workshops, presentations)

            Practical “value added” to the students, colleagues, administration, self

                        -Tutoring, mentoring, home visits

                        -Service on committees


            Awards and recognitions in the education field

While these indicators are quantifiable or answerable by “yes” or “no,” they should not be assigned numbers or poured into a total figure; the observer views the overall picture and makes written notes.

PART II.  Responsive commentary on areas noted (three or more sentences on each item):

            -Relationship of students and teacher

            -Classroom environment

            -Intangibles (e.g., evidence of passion, creativity)

Again, the observer is looking for a total picture rather than numbers of some kind.  Upon conclusion, the observer summarizes the written commentary.

PART III.  If a videotape has been made, as recommended by some teachers, the observer and the teacher view it together and discuss it during and afterward.

PART IV.  If a teacher portfolio is used, as recommended by some teachers, the observer and the teacher examine it together and discuss it throughout.

PART V – Conclusion: In response to the request of some interviewees, a designation from a Likert-type scale can be affixed (Excellent, Good, Fair, Needs Support, etc.) in order to go beyond a simple “Satisfactory” or “Unsatisfactory.”  Alternatively, teachers may be allowed to choose which designation system they prefer.

The purpose of evaluation is to support a teacher, make a good teacher even better, and discern if a teacher needs assistance or counseling into a different field.  Conversation between the two parties, supported by the written documents, both of which must be acceptable to teachers and supervisors, can accomplish the purpose.


Carole Laughinghouse

Timothy Leonard

Barbara McClurkin

Theresa McClurkin

Bettye Miller, Ed.D.

Stacie Morton-Carter

Claude Nadir

Taiese Robinson

Marianna Zimmerman

                                                            Works Cited

Cohen, David B.  “Teacher: What my evaluation must include.”  The Washington Post.  19 Sept. 2010, n.p. Web.

Elia, Mary Ellen.  The Economist.  8 Jan. 2011, 26. Print.

McCrummen, Stephanie.  “D.C. teacher evaluation becomes a delicate conversation.”  The Washington Post.  18 Mar. 2011, A1, A4. Print.

McGrew, Mike.  “How do you quantify the teachers of a lifetime?”  The Washington Post.  10 Apr. 2011, C5. Print.

National Research Council.  The Washington Post.  10 Apr. 2011, B7. Print.

New York State United Teachers publication (n.d.). Print.

Ravitch, Diane.  The Death and Life of the Great American School System: How Testing and Choice Are Undermining Education.  New York: Basic Books, 2010. Print.

Reeves, Douglas B.  Transforming Professional Development into Student Results.  Alexandria, VA: Association for Supervision and Curriculum Development, 2010. Print.

Turque, Bill.  “D.C. Schools Insider.”  The Washington Post.  16 May 2011, B2. Print.