Among teachers, deep concern over limits of using test scores in evaluations

Math teacher Craig Dana got good marks from his principals in his annual evaluation at Orchard Park Middle School, just as he’s always gotten during his 15-year career.

But he also received a 2 out of a possible 20 on the portion of his review based on student test scores. That score was well below what he received the year before, even though he taught many of the same students.

Dana wants to know how that’s possible.

“There’s no way that you can say, alright, let’s see the beginning, let’s see all of the calculations, and let’s see you press equal and get a 2,” Dana said.

The state does publish lengthy explanations for how it uses student test scores in teacher evaluations, but because the formulas account for so many variables – such as whether a student is poor or is still learning English – the process is not transparent to many educators.

Lawmakers in Albany are considering more than doubling the importance of student test scores in measuring how teachers perform – a key measure Gov. Andrew Cuomo is pushing as part of his budget. In the meantime, deep concern has set in among teachers about how that will be done.

It’s not just teachers who have raised concerns. Studies of teacher evaluations paint a nuanced picture of relying on student test scores to weed out ineffective teachers. The American Statistical Association last year cautioned policy makers about the limitations of using statistical models in deciding which teachers should get fired or get raises. Even supporters of the idea acknowledge that there can be statistical flaws in the process and wide variations in how a teacher is scored from year to year.

“A critical mass of scientists in this field either, A, say that it’s completely inappropriate to use standardized tests as they’re currently being used in high-stakes contexts for making personnel decisions about teachers, or, B, raise critical concerns about doing so in the current situation,” said Mark Garrison, a D’Youville professor who has written about education accountability and has criticized the state’s attempts to revamp public education.

Cuomo – who is pushing state lawmakers to agree to overhaul the state’s teacher evaluation system – tends to talk about grading teachers in broad terms.

“Now, everyone will tell you nationwide, the key to education reform is a teacher evaluation system,” Cuomo said in January as he unveiled his 2015 budget proposal. “Why? So you know what teachers are doing well, what teachers need work and what teachers are struggling.”

He has deemed the current system – which has rated the vast majority of teachers in the state as “effective” or better – “baloney.”

Grading teachers

Since 2012, teachers across the state have been rated based on three factors: 20 percent of a teacher’s evaluation is based on growth in student standardized test scores in math and English compared to similar students; 20 percent is based on student test scores chosen by local school districts; and 60 percent is based on classroom observations and other local measures.
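To make the weighting concrete, here is a minimal sketch of how the three components could combine. It assumes each component is scored in points up to its percentage weight and that the points are simply added into a 100-point total, which is consistent with Dana's "2 out of a possible 20" but is not spelled out in the state's explanation quoted here. The component names and the local and observation scores in the example are hypothetical; only the 2 on the state portion comes from this story.

```python
# Maximum points per component, taken from the 20/20/60 split described above.
# Treating the percentages as point caps that sum to 100 is an assumption.
WEIGHTS = {"state_growth": 20, "local_tests": 20, "observations": 60}


def composite_score(points):
    """Add up the points earned in each component, checking each against its cap."""
    total = 0
    for component, cap in WEIGHTS.items():
        earned = points[component]
        if not 0 <= earned <= cap:
            raise ValueError(f"{component} must be between 0 and {cap}")
        total += earned
    return total


# Hypothetical teacher: 2 of 20 on the state portion (the score Dana reported),
# with made-up values for the two locally controlled components.
example = {"state_growth": 2, "local_tests": 18, "observations": 56}
print(composite_score(example))  # 76
```

Under these assumptions, a very low state score moves the total by at most 20 points, which is why the locally controlled portions can outweigh it.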

Under that system, most teachers in the state have been rated “effective” or better. School districts, meanwhile, have tended to give their teachers even better scores on the local portions of the evaluations.

After seeing the results of that system, Cuomo wants local school districts to have less control in deciding how teachers are graded.

He has proposed making student test scores count for half of a teacher’s evaluation and making districts bring in independent observers to judge teachers in the classroom – a proposal that has prompted loud pushback from teachers who to this point have been able to negotiate the details of their schools’ evaluation systems. Lawmakers this week continued to negotiate over Cuomo’s proposal.

Like others who hope teacher evaluations can help identify struggling teachers, Cuomo sees state tests as a key to allowing for standard comparisons of teachers across the state.

In a February interview with The News, the governor acknowledged that there is “no silver bullet” for evaluating teachers, but said the state needed a “bona fide objective review.”

“The test is really the only easy answer because it is objective numerical data and it was the same test with the same demographic,” Cuomo said. “How you appeal to students, your disposition in the classroom, your rapport in the classroom, the energy you bring in the classroom – all relevant but hard to judge and hard to evaluate and hard to come up with a statewide system that doesn’t lead to collusion.”

Studying student test scores

Studies of teacher evaluation models based on student test scores point to their limits.

One study that tracked more than a million children in a large school district between fourth grade and adulthood found that good teachers had a lasting impact on their students’ educations and found benefits to using test scores to evaluate and fire teachers.

But that study, like others, also noted that statistical models that attempt to assess a teacher’s impact on test scores can be “unreliable” when based on just a few classes.

Other research has found wide variations from year to year in how some teachers are rated based on student test scores, even though the systems account for variables like poverty and the ability of a student to speak English.

Then there are the new standardized tests.

While New York is considering plans to make student performance on state tests an even more important part of a teacher’s evaluation, other states are scaling back their use as teachers and students adjust to new standardized tests aligned to new learning standards known as the Common Core.

Tennessee, a pioneer in creating a statewide teacher evaluation system, bases 35 percent of teacher evaluations on student growth on tests. But lawmakers in Tennessee are considering temporarily changing that to 10 percent as the new tests come online.

Garrison, the D’Youville professor, points to concerns about the reliability and validity of new Common Core-aligned tests in math and English that have been administered in New York for only two years.

“We’re really amplifying what unreliability and invalidity they have when we use them to make inferences about teachers,” Garrison said. “The teachers aren’t taking the test. It’s the students taking the test.”

One teacher’s experience

In Orchard Park, where schools typically rank among the region’s best on math and English scores, teachers this week twice marched through the village to show their concern about Cuomo’s proposal for teacher evaluations and other education reform proposals. The rallies echoed similar events at schools across the region in recent weeks.

Dana, who is one of the few teachers in the region to publicly reveal the details of his score, was among the educators holding signs critical of Cuomo and wearing union buttons as they walked in the drizzling rain. Earlier this month, he stood before his district’s Board of Education and questioned the disconnect between the low score he got based on how his students had done on the state standardized tests and the “effective” rating he got from local administrators.

His wife videotaped the speech and posted it on Facebook. Within days, it had been viewed more than 200,000 times.

Dana said he was optimistic last year when Cuomo considered giving teachers a two-year “safety net” on teacher evaluations. Then Cuomo reversed course and proposed increasing the importance of standardized test scores in the evaluations.

“At that point, I said to my wife, ‘people need to know what he wants to do could cause me to lose my job,’ ” said Dana, whose father is president of the Kenmore-Tonawanda Board of Education and has led a movement there to consider a district-wide boycott of state tests.

Overall, because of his local scores, Dana was rated “effective.” But some of Dana’s teacher colleagues were also flummoxed by his low score on the state-provided portion of the evaluation.

“I teach science. He teaches math. There’s a great parallel between science and math in terms of understanding and concepts,” said Margaret Staebell, a middle school science teacher who taught the same team of students as Dana last year, yet was rated “highly effective.” “There’s no way he could score so low and I could score so high teaching the exact same thing. I just have an issue with that.”

While teachers, superintendents and lawmakers will all be focused on how to change the existing evaluation system during the next few months, it likely won’t be the last discussion.

In February, Cuomo noted that even the current proposals for changing the teacher evaluation system, if implemented, are likely to change as the state has more information about how it affects schools.

“I think,” Cuomo said as he championed his evaluation proposal, “this is going to get more refined over the years.”