TISA Rules and the Power of the Fuzzy Match

In May 2022, Tennessee’s public-school funding model experienced a seismic shift. Governor Bill Lee signed the “Tennessee Investment in Student Achievement” Act (TISA), establishing a new comprehensive school funding model that replaces the “Basic Education Plan” (BEP) and updates the state’s education funding rules for the first time in over thirty years.

Schools had a year to prepare their systems and budget expectations for the new TISA world. The official TISA guide the state provides is 54 pages long, but the most important shift is the move from what the state calls a resource-based funding model to a student-based funding model. The BEP relied on complex formulas built on assumptions about the resources each school needed for staffing, textbooks, technology, and many other education costs, adjusted for location and other unique needs. The TISA system instead allocates a base dollar amount per student in the state, with a short list of adjustments.

We will quickly outline TISA, the implications for schools, the importance of using multiple matching methods for databases, and finally an advanced matching method called “fuzzy matching” which can help schools in this new funding environment. The key takeaway is that fuzzy matching helps when datasets contain typos or missing data. In this particular case, fuzzy matching can have a huge return on investment by ensuring schools claim all the funding that students are eligible for.

The TISA Template

The amount of funding a school receives under TISA starts at a base of $6,860 per student. That base number is then adjusted based on a few important criteria:

+5% per student for very small districts
+5% for rural districts with sparse population
+5% for schools in neighborhoods of concentrated poverty
+15-150% based on special learning needs like Dyslexia, English Language Learners, and other special education classifications
+25% for each student categorized as economically disadvantaged by certain criteria
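As a rough illustration of how the base and the adjustments above combine, here is a minimal sketch. It assumes the weights are simply added together and applied to the base, which is consistent with the $1,715 per-student figure used later in this article but is a simplification of the official formula. The names `WEIGHTS`, `student_funding`, and `learning_needs_weight` are hypothetical.

```python
BASE_PER_STUDENT = 6860  # base amount per student stated above

# Adjustments from the list above, expressed as fractions of the base.
WEIGHTS = {
    "small_district": 0.05,
    "sparse_district": 0.05,
    "concentrated_poverty": 0.05,
    "economically_disadvantaged": 0.25,
}


def student_funding(flags, learning_needs_weight=0.0):
    """Estimate funding for one student.

    Assumption: weights are additive and applied to the base amount.
    learning_needs_weight is the 0.15-1.50 adjustment for special
    learning needs, passed in directly because it varies by student.
    """
    total_weight = sum(WEIGHTS[flag] for flag in flags) + learning_needs_weight
    return BASE_PER_STUDENT * (1 + total_weight)


base_only = student_funding([])
with_ed = student_funding(["economically_disadvantaged"])
print(with_ed - base_only)  # 1715.0
```

Under this assumption, a missed "economically disadvantaged" flag costs the school 25% of the base for that student.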

The TISA system is intended to be simpler and more transparent for schools than the previous BEP program, but there is a high-stakes catch to the new funding model: school funding depends on data quality and on accurately capturing all these special student characteristics. In the past, incomplete capture of the above characteristics affected the quality of statistical reports and education research, but now it also determines which funding adjustments students qualify for. If a school or district fails to properly capture these student characteristics, the money intended to help support those students is effectively forfeited.

The “weights” segment above relies on schools’ ability to understand the new law and claim their funding.

While several of these classifications are easy to handle (for example, by marking all students in a “sparse” district as sparse in the student information system), those which vary from student to student require more work. One of the ways that students are classified as economically disadvantaged under the funding formula is if they are “directly certified” as receiving public benefits like SNAP (food assistance) or TANF (cash assistance). Currently the state provides districts with a list of directly certified students for each county and relies on schools and districts to match this list to their student roster, marking the matched students as directly certified in the student information system. Doing this well is not as easy as it might seem.

Typically this task falls to someone on the district or school data team, someone who can use Excel and run a VLOOKUP to attempt to match students on a key identifier like Social Security Number (SSN). If every student’s SSN were known and accurate in both the state DHS/SNAP/TANF database and the school student information system, this would find 100% of matches in an hour of work. Unfortunately, like all databases, one or both will contain inaccuracies and missing data, resulting in missed matches. A second pass through the databases is warranted, and typically this is done by looking for exact matches on First Name, Last Name, and Date of Birth. In the past, the decision on how many passes and creative matching attempts to make was left to the data team, but today school and district CFOs, superintendents, and other leaders need to be involved in ensuring that the very best efforts are made, possibly 5 or more passes with different matching criteria and algorithms.

Each additional student matched results in $1,715 of additional funding for the school (25% of the $6,860 base). It is likely that most schools and districts will under-invest in matching efforts by calling it “good enough” a bit too soon. If 10 hours of additional work could turn up even a single additional match, the return on investment makes it an easy win to spend the time and effort ensuring that as close to 100% of possible matches are found. The reality is that most schools and districts will turn up 10 or 100 additional matches, making this additional effort pay off tens or hundreds of times over.
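The multi-pass idea described above (SSN first, then First Name, Last Name, and Date of Birth) can be sketched in a few lines of Python as an alternative to repeated VLOOKUPs. This is a minimal sketch, not a production system. The field names, the `PASSES` table, and the sample records are all hypothetical, and a real roster would need more careful cleaning.

```python
def normalize(value):
    """Lowercase and trim so trivial formatting differences don't block a match."""
    return (value or "").strip().lower()


# Each pass builds a key from a record. Passes run in order, strictest first.
PASSES = {
    "ssn": lambda r: (normalize(r.get("ssn")),),
    "name_dob": lambda r: (
        normalize(r.get("first")),
        normalize(r.get("last")),
        normalize(r.get("dob")),
    ),
}


def match_all_passes(roster, cert_list):
    """Return {student id: name of the first pass that matched}."""
    matched = {}
    for pass_name, key_fn in PASSES.items():
        # Skip certification records with a blank field in this key.
        cert_keys = {key_fn(c) for c in cert_list if all(key_fn(c))}
        for student in roster:
            if student["id"] in matched:
                continue  # already found by an earlier pass
            key = key_fn(student)
            if all(key) and key in cert_keys:
                matched[student["id"]] = pass_name
    return matched


# Made-up sample data for illustration only.
roster = [
    {"id": 1, "first": "Ana", "last": "Lopez", "dob": "2012-03-04", "ssn": "000-00-0001"},
    {"id": 2, "first": "Ben", "last": "Carter", "dob": "2011-09-15", "ssn": ""},
    {"id": 3, "first": "Cy", "last": "Nguyen", "dob": "2013-01-20", "ssn": "000-00-0003"},
]
cert_list = [
    {"first": "Ana", "last": "Lopez", "dob": "2012-03-04", "ssn": "000-00-0001"},
    {"first": "ben", "last": "Carter ", "dob": "2011-09-15", "ssn": "000-00-0002"},
]

matches = match_all_passes(roster, cert_list)
print(matches)  # {1: 'ssn', 2: 'name_dob'}
```

Student 2 has no SSN on the roster, so the first pass misses them and the second pass picks them up. Adding further passes is a matter of adding entries to `PASSES`.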

The direct certification database includes first and last name, date of birth, address, and Social Security Number. By way of illustration, for a school where about half of students meet the definition of economic disadvantage, running VLOOKUPs on a variety of match criteria would typically return results like these:

A few students may be found through searching for siblings at other schools who appear on the list

Based on our experience, schools that run one or two of the suggested lookups are likely to catch only 80-85% of the students who are eligible for additional funding. The key column in the above graph is the last one, showing that no single match criterion is as good as all of them put together: finding every student who matches on any of the criteria.

For a school or district with 1,000 students in a moderately disadvantaged area, the extra effort of running 4 or more matching rules instead of 1 can easily turn up 70 additional students who qualify for this funding, or about $120,000.
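One way to arrive at a figure of this size is shown below. It is a back-of-the-envelope sketch under stated assumptions: that half of the 1,000 students are eligible, that a single rule finds 85% of them, and that several rules together find 99%. These rates echo the estimates given elsewhere in this article, but the exact combination is our assumption for illustration.

```python
BASE_PER_STUDENT = 6860
ED_WEIGHT_PERCENT = 25  # economically disadvantaged adjustment

# Assumptions for illustration (integer percentages keep the arithmetic exact).
students = 1000
eligible_percent = 50      # "about half" of students eligible
single_rule_percent = 85   # share of true matches found by one rule
multi_rule_percent = 99    # share found by 4 or more rules

eligible = students * eligible_percent // 100
found_single = eligible * single_rule_percent // 100
found_multi = eligible * multi_rule_percent // 100

extra_students = found_multi - found_single
per_match = BASE_PER_STUDENT * ED_WEIGHT_PERCENT // 100
extra_funding = extra_students * per_match

print(extra_students)  # 70
print(extra_funding)   # 120050
```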

The Final Step to 100%: The Power of Fuzzy Matching

By examining multiple matching criteria, more matches will be caught despite the typos and missing data in any database. One final tool remains: we can do even better with the power of “fuzzy” matching! You can imagine a fuzzy match as a tool that “squints” at two datasets, and sees if the shape of any row of entries looks similar to another when you blur your eyes a bit. It is tolerant of multiple types of typos and can match two rows even if there is a typo in almost every column. It can rate potential matches by their likelihood and tell you that the following two people are indeed likely the same person:

John Dow, 10/12/1955, SSN: 123-45-6789, Address: 1488 Main Street, Memphis TN 38122
Jon Doe, 10/12/1995, SSN: 123-54-6789, Address: 1488 Maine Street, Memphis TN 39122

In recent work with a charter school client, we tested this theory by using a fuzzy match algorithm which would generate a dissimilarity score from 0 (exact match) to 1 (no chance of a match) for all possible or even remotely plausible matches between school records and the direct certification list. This could help us catch students with misspelled names, social security numbers that were off by a digit or two, and new addresses.
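To make the idea of a dissimilarity score concrete, here is a minimal sketch using `difflib.SequenceMatcher` from Python’s standard library. This is a stand-in for illustration, not the algorithm used in the client work described here, so its scores are not comparable to the thresholds discussed in this article. The `dissimilarity` function, the equal weighting of fields, and the third record (`rec_c`) are all assumptions.

```python
from difflib import SequenceMatcher

FIELDS = ("name", "dob", "ssn", "address")


def dissimilarity(a, b):
    """Average per-field dissimilarity: 0.0 is identical, 1.0 shares nothing.

    Assumption: every field counts equally. A real system would likely
    weight SSN and date of birth differently from address.
    """
    scores = [
        1 - SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio()
        for f in FIELDS
    ]
    return sum(scores) / len(scores)


# The two example records from above, plus an unrelated made-up record.
rec_a = {"name": "John Dow", "dob": "10/12/1955", "ssn": "123-45-6789",
         "address": "1488 Main Street, Memphis TN 38122"}
rec_b = {"name": "Jon Doe", "dob": "10/12/1995", "ssn": "123-54-6789",
         "address": "1488 Maine Street, Memphis TN 39122"}
rec_c = {"name": "Maria Alvarez", "dob": "03/27/2010", "ssn": "987-65-4320",
         "address": "52 Oak Lane, Nashville TN 37203"}

print(round(dissimilarity(rec_a, rec_b), 2))  # near-duplicate pair: low score
print(round(dissimilarity(rec_a, rec_c), 2))  # unrelated pair: higher score
```

Even with a typo in every field, the near-duplicate pair scores far closer to 0 than the unrelated pair, which is what lets a fuzzy match rank candidates for review.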

Running the fuzzy match paid off! Every student who had previously been identified as economically disadvantaged scored 0.60 or below, and most students who met none of the matching criteria scored above 0.70. Ten students scored between 0.60 and 0.70, warranting further investigation and some human judgment. Manual inspection of these 10 possible fuzzy matches yielded one student who was clearly a match. The student’s last name had been slightly misspelled, his address had changed, and his social security number had been entered with one incorrect digit, but his birthday, name, and new address were all verified, making it a clear match. It seems likely that about 1 in 250 true matches cannot be found without fuzzy matching, and the larger the school or district, the more imperative it is that their matching systems work hard for them. A large urban district in Tennessee could easily find 250 additional directly certified students this way; a small or medium-sized CMO could easily find 5-15. Ultimately, this adds up to anywhere from $10,000 to $500,000 in funding that could be claimed but would be forfeited without this advanced matching step.
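The review process described above amounts to sorting candidate pairs into three buckets by score. A minimal sketch follows. The 0.60 and 0.70 cutoffs come from this one engagement and would need to be recalibrated for any other dataset or scoring algorithm, and the `triage` function and sample scores are hypothetical.

```python
AUTO_MATCH_MAX = 0.60  # at or below: treat as a match
REVIEW_MAX = 0.70      # above 0.60 up to 0.70: send to a person


def triage(score):
    """Sort a dissimilarity score (0 = exact match, 1 = no match) into a bucket."""
    if score <= AUTO_MATCH_MAX:
        return "match"
    if score <= REVIEW_MAX:
        return "review"
    return "no match"


# Made-up scores for illustration only.
candidate_scores = {"student A": 0.12, "student B": 0.64, "student C": 0.91}
buckets = {name: triage(score) for name, score in candidate_scores.items()}
print(buckets)
# {'student A': 'match', 'student B': 'review', 'student C': 'no match'}
```

Keeping a human in the loop for the middle band is the design choice that matters here: it limits manual work to a handful of records while avoiding both missed funding and false claims.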

Our current belief is that schools should be able to find 85% of true matches with a single exact matching criterion, 97-99% with 4 or 5 strategically selected and diverse exact matching criteria, and perhaps 99.8% of true matches with fuzzy matching.

United InfoLytics Can Help

Helping a client succeed in this matching game is something we enjoy: we know that not everyone can have advanced analytics in house, and we love ensuring that great schools get all the funding they deserve. Because this matching task continues throughout the school year as students come on and off the direct certification rolls, an automatic system can help schools turn it into a “solved” problem instead of an ongoing challenge to see if their team can manually run all the matching criteria correctly each month. In the new TISA world, every school will need to adjust its systems to match the shifting landscape. We are happy to work with you to automate effective systems so you can keep your focus on serving students and their families. If you are interested in automated fuzzy matching systems, or if you are curious about how dashboards, data solutions, and custom tools can serve your school or organization, we would love to talk. Set up a time today.

Nate Mulder

Nate Mulder is a history teacher turned data professional focused on helping schools and other organizations succeed in organizing and analyzing data to make better decisions and serve all stakeholders well.