It’s educational data season here in Tennessee. State test scores have major impacts on students, teachers, and principals, and in the run-up to major state tests, educators and building leaders ask themselves, “How can we achieve the best proficiency and growth scores possible?” They are also faced with questions about whether some strategies pay off in the short term by pumping up certain metrics while not serving all students well now, or any students well in the long run.
To think about the nature of these questions, let’s start with a simple thought experiment: I am a teacher with a class of 4 students. Based on their interim assessment data, they are predicted to earn scale scores of 20, 28, 40, and 52, and the proficiency cut score on this particular test is 38. Which student(s) are most likely to affect my school’s accountability and performance measurement (proficiency, or “success rate”), and which student(s) are most likely to affect my own TVAAS (student growth) numbers? If you think this through carefully, you’ll understand why some teachers in some schools are being told to prioritize teaching and tutoring a subset of students (those close to passing), even though doing so is unethical.
Below is a video that explains how TVAAS works, why a statewide growth metric is necessary, and why TVAAS is so often misinterpreted. We will finish with a bit of public policy puzzling: how do the unintended consequences of poorly designed, poorly aligned education accountability systems create perverse incentives for leaders to prioritize some students and deprioritize others?
Answer to the questions at the beginning: the student with a predicted scale score of 40 is most likely to affect the proficiency rate, because they are predicted to land close to the cut score of 38. On a good day they’ll clear the cut score and be counted proficient; on a bad day they’ll fall just below it. For the students predicted to do very well or very poorly, by contrast, whether they learn a lot or a little in the month before the test, they are almost certainly going to end up where they are predicted to be: proficient or not proficient. This is the root of the practice of making “bubble” lists, where teachers are told to focus their energies on the students predicted to score within ___ points of the cut score, or to have a 20-80% chance of passing.
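The bubble logic can be made concrete with a small sketch. Assume (purely for illustration; the spread of 6 scale-score points and the normal-noise model are my assumptions, not anything published about the test) that each student’s actual score is their predicted score plus some normally distributed day-of-test noise. Then the chance of clearing the cut score of 38 looks like this:

```python
import math

def pass_probability(predicted, cut=38, sd=6.0):
    """P(actual score >= cut), assuming actual ~ Normal(predicted, sd).
    The sd of 6 scale-score points is a hypothetical, for illustration only."""
    z = (cut - predicted) / sd
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2)))

predicted_scores = [20, 28, 40, 52]
for mu in predicted_scores:
    p = pass_probability(mu)
    on_bubble = 0.20 <= p <= 0.80  # the 20-80% "bubble" band
    print(f"predicted {mu}: P(pass) = {p:.2f}, on bubble: {on_bubble}")
```

Under these assumptions only the student predicted to score 40 lands in the 20-80% band; the other three are nearly locked in as proficient or not, which is exactly why the bubble list singles that student out.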
Teaching to the students on the bubble list implicitly deprioritizes both those with a low chance of passing and those who are all but certain to pass, and it turns out this strategy does work for pumping up a school’s percentage of students reaching proficiency, that is, clearing the cut score between “Approaching” mastery and “On Track.” In my opinion, it is also highly unethical. Cut scores are a public policy mistake that was codified nationwide with No Child Left Behind (NCLB), and although NCLB has since been reformed, none of those reforms have significantly shifted accountability away from cut scores and toward growing every student in the building as much as possible.
With regard to TVAAS, it turns out that all students have an equal chance to impact a teacher’s TVAAS. A growth metric necessarily prioritizes the total growth of the whole classroom, which is great. The problem with TVAAS isn’t that it’s a growth measure; the problem is that the 1-5 “TVAAS Levels” are widely misinterpreted. TVAAS is still a very good thing, because growth-based accountability is the right kind of accountability, unlike cut-score accountability. If teachers, building and district leaders, and legislators properly understand the benefits of growth measures like TVAAS, which incentivize teaching and nurturing all students, they can start to put greater emphasis on these measures and reward teachers and schools for supporting all of their students, not just those predicted to land a few points above or below the proficiency line.
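The equal-chance point can be illustrated with a deliberately simplified growth measure. TVAAS itself is a far more sophisticated statistical model, so the mean-gain calculation below is my own toy stand-in, not the actual TVAAS computation; it only shows why a growth metric weights every student the same:

```python
def class_growth(predicted, actual):
    """Toy growth measure (NOT the real TVAAS model): the class-average gain.
    Each student's (actual - predicted) gain counts equally."""
    gains = [a - p for p, a in zip(predicted, actual)]
    return sum(gains) / len(gains)

predicted = [20, 28, 40, 52]

# Four extra points of real learning move the class average by 4/4 = 1.0,
# whether they go to the student far below the cut score...
low_gain  = class_growth(predicted, [24, 28, 40, 52])
# ...or to the student far above it.
high_gain = class_growth(predicted, [20, 28, 40, 56])
print(low_gain, high_gain)  # identical: 1.0 and 1.0
```

In a proficiency-rate metric those two scenarios look completely different (one flips nothing, and neither moves the pass rate), but under a growth metric they count the same, which is the incentive structure the paragraph above argues for.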