baseball win expectancy finder now has balls and strikes!

Here it is!

This wasn’t too much work, and as I mentioned last time, it was nice to add a new feature to something. I did recreate the app’s shell with create-react-app, although I didn’t port it to TypeScript 🙂

One annoying part is that there’s just not that much data, so you can pretty easily get into situations where the stats are probably “wrong”. For example, in the top of the 8th inning with no outs, no runners on, the home team up by 2, and a 2-0 count, the home team has an 87.08% chance to win. But if the batter gets another ball to make the count 3-0 (which should be good for the visiting team), the home team’s chance to win goes up to 88.11%. I guess I should add a warning when the sample size gets too small or something (although I don’t know what “too small” is).
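One possible way to decide what “too small” means is to look at the margin of error on the win percentage. This is only a sketch: the function names, the 2-percentage-point threshold, and the normal-approximation formula are all assumptions for illustration, not what the app actually does.

```python
import math


def win_pct_margin(wins, games, z=1.96):
    """Return (win fraction, approximate 95% margin of error).

    Uses the normal approximation to the binomial, which is itself
    rough when the sample is small or the fraction is near 0 or 1.
    """
    if games <= 0:
        raise ValueError("need at least one game")
    p = wins / games
    margin = z * math.sqrt(p * (1 - p) / games)
    return p, margin


def is_sample_too_small(wins, games, max_margin=0.02):
    """Flag situations whose margin of error exceeds max_margin.

    The 0.02 default (2 percentage points) is an arbitrary placeholder.
    """
    _, margin = win_pct_margin(wins, games)
    return margin > max_margin
```

With a check like this, a roughly 1-point difference between two counts would get a warning whenever either situation's margin of error is larger than the threshold.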

I was a little worried that adding balls and strikes would bloat the size of the data files, which did happen. The worst case is that the file size would increase by 12x (4 possible ball counts * 3 possible strike counts), but in practice it’s more like 9x. But it turns out that computers are fast, so doing the lookups is only barely slower than before.
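To make the 12x figure concrete, here is a small sketch of how the count multiplies the number of situations. The key layout and the `(wins, games)` table format are hypothetical, chosen for illustration; the real data files may be organized differently.

```python
from itertools import product

BALLS = range(4)    # 0 to 3 balls
STRIKES = range(3)  # 0 to 2 strikes

# Every possible count: 4 * 3 = 12 combinations per game situation.
COUNTS = list(product(BALLS, STRIKES))


def make_key(inning, is_home_batting, outs, runners, score_diff, balls, strikes):
    """Build a lookup key for one situation (hypothetical layout)."""
    return (inning, is_home_batting, outs, runners, score_diff, balls, strikes)


def lookup(table, key):
    """Return the win fraction for a situation, or None if there is no data.

    `table` is assumed to map keys to (wins, games) tuples.
    """
    entry = table.get(key)
    if entry is None:
        return None
    wins, games = entry
    return wins / games if games else None
```

A dictionary lookup like this stays fast as the table grows, which fits with lookups being only barely slower than before.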

My original plan was to add the balls and strikes data to the mobile app (and make it an in-app purchase to unlock), but the increase in data size and corresponding memory usage make me less excited about it. Maybe at some point…

How common are walk-off walks (on four pitches!) in baseball?

I’ve been following the Astros pretty closely this season (since they’re very good this year!), and so when I saw they had won a game on a walk-off walk, I was curious how common that was.

A walk-off is when the home team wins the game on that play, by scoring a run to go ahead in the ninth inning or later. So a walk-off walk is when that happens because the batter is walked with the bases loaded, forcing in the winning run. It’s somehow very dramatic and anticlimactic at the same time!

In this case, the walk was on four pitches, which seemed exceptionally rare, because you might as well throw at least one strike, right? At first I thought “maybe this has never happened before!”, but (spoiler!) it turns out there are a lot of baseball games that have been played. So I wanted to at least know how common it was.

My baseball win expectancy finder is powered by a Python script that can parse games, so I extended the parsing code to make it easier to run these sorts of reports and ran it. (source available on GitHub, see WalkOffWalkReport in parseretrosheet.py)
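For a rough idea of what the four-pitch check involves, here is a minimal sketch that works on a Retrosheet-style pitch string and event string. It is not the actual WalkOffWalkReport code: the function names are made up, and treating intentional balls and pitchouts as balls (and intentional walks as walks) is an assumption.

```python
import re

# Retrosheet pitch codes treated as balls here: ball, intentional ball,
# pitchout, and ball called because the pitcher went to his mouth.
BALL_CODES = set("BIPV")

# Characters in the pitch string that are not pitches to the batter:
# pickoff throws, "runner going" and "blocked pitch" markers, and so on.
NON_PITCH_CODES = set("+*.123>N")


def is_walk(event):
    """True if the event is a walk (W) or an intentional walk (IW or I)."""
    # Take the basic play code before any modifiers or runner advances,
    # so that "WP" (wild pitch) is not mistaken for a walk.
    base = re.split(r"[+./]", event, maxsplit=1)[0]
    return base in {"W", "IW", "I"}


def is_four_pitch_walk(pitches, event):
    """True if the plate appearance was a walk on exactly four pitches."""
    if not is_walk(event):
        return False
    thrown = [c for c in pitches if c not in NON_PITCH_CODES]
    return len(thrown) == 4 and all(c in BALL_CODES for c in thrown)
```

Games without pitch-by-pitch data have an empty pitch string, so they never count as four-pitch walks here.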

So, the numbers: in the ~128000 games I have data for, a walk-off walk has happened only 442 times. That sounds like a lot but it’s only around 7 times a season. Not all the games have pitch-by-pitch data, but ~73500 of them do, and walk-off walks on four pitches have happened only 60 times. (not including data from this year)

Since there are roughly 2100 games per season (including the playoffs), the ~73500 games with pitch-by-pitch data cover about 35 seasons, so we’d expect this to happen a bit less than 2 times per season. Which is indeed pretty rare!

In fact, Altuve got his walk-off walk with 2 outs. Walk-off walks with 2 outs have only happened 257 times (~4 times/season), and ones on four pitches have only happened 41 times, which is a little more than once a season!
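The per-season rates are just the counts scaled to a 2100-game season. Here is a quick sketch of that arithmetic using the figures quoted above. The function name is made up for illustration, and it assumes the four-pitch counts should be scaled by the ~73500 games that have pitch-by-pitch data rather than by all ~128000 games.

```python
GAMES_PER_SEASON = 2100  # rough figure from the post, including playoffs

ALL_GAMES = 128_000        # games in the data set (approximate)
PITCH_DATA_GAMES = 73_500  # games with pitch-by-pitch data (approximate)


def per_season(events, games, games_per_season=GAMES_PER_SEASON):
    """Scale an event count over `games` games to a per-season rate."""
    return events / games * games_per_season


rates = {
    "walk-off walks": per_season(442, ALL_GAMES),
    "walk-off walks, 2 outs": per_season(257, ALL_GAMES),
    "four-pitch walk-off walks": per_season(60, PITCH_DATA_GAMES),
    "four-pitch walk-off walks, 2 outs": per_season(41, PITCH_DATA_GAMES),
}

for label, rate in rates.items():
    print(f"{label}: {rate:.1f} per season")
```

Since the game counts are approximate, the resulting rates are only ballpark figures.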

When I was in the middle of this work I remembered that baseball-reference has an incredibly powerful Event Finder, and lo and behold, it can do this search as well. In fact, at first our numbers were pretty far off, which led me to find some bugs in my script 🙂 (the numbers are still off by a few because the Event Finder counts a walk where the fourth ball was a wild pitch and a runner scored, while my script doesn’t count those)

My original thought was that I could make it easier to use my script to find stuff like this, but the baseball-reference Event Finder is so incredibly powerful and relatively easy to use that I probably won’t bother. Kudos to them!

adding expected runs per inning to baseball win expectancy finder

So…yup, I did that!

I already had all the data for this, so it was more mechanical than anything else. Actually, most of the time I spent was turning it into a proper React app. Before, I just had a bunch of inline React code, which was great for testing and simplicity, but meant that a bunch of compilation had to happen on every page load. So I bit the bullet and figured out how to use nwb and all the build tooling for compiling React code, which I don’t really care about. But it builds a pretty small JS file, which is nice.

Still a big fan of React, though!