baseball win expectancy finder now has balls and strikes!

Here it is!

This wasn’t too much work, and like I mentioned last time it was nice to work on adding a new feature to something. Although I guess I did recreate the app’s shell with create-react-app, I didn’t port it to TypeScript 🙂

One annoying part is that there’s just not that much data, so you can pretty easily get into situations where the stats are probably “wrong”. For example, in the top of the 8th inning with no outs, no runners, the home team up by 2, and a 2-0 count, the home team has an 87.08% chance to win. But if the batter takes another ball to make the count 3-0 (good for the visiting team), the home team’s chance goes up to 88.11%. I guess I should add a warning when the sample size gets too small or something (although I don’t know what “too small” is).
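One possible way to decide what “too small” means is to put a confidence interval around each win percentage and warn when two neighboring situations overlap. The sketch below uses a Wilson score interval. It is not what the app does. The win/game counts are invented for illustration (chosen only so they reproduce 87.08% and 88.11%), because the real sample sizes aren't given here.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win proportion (z=1.96 is roughly 95%)."""
    if games == 0:
        return (0.0, 1.0)  # no data: anything is possible
    p = wins / games
    denom = 1 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return (center - half, center + half)

def intervals_overlap(a, b):
    """True if two (low, high) intervals share any point."""
    return a[0] <= b[1] and b[0] <= a[1]

# HYPOTHETICAL counts, not real data: (home wins, games) for each count.
count_2_0 = (391, 449)  # 391/449 is about 87.08%
count_3_0 = (126, 143)  # 126/143 is about 88.11%

interval_2_0 = wilson_interval(*count_2_0)
interval_3_0 = wilson_interval(*count_3_0)

if intervals_overlap(interval_2_0, interval_3_0):
    print("Warning: the difference between these counts may just be noise")
```

With samples of this (made-up) size the two intervals overlap by a lot, so a one-point gap between a 2-0 and a 3-0 count wouldn't mean much.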

I was a little worried that adding balls and strikes would bloat the size of the data files, and it did. The worst case is that the file size would increase by 12x (4 possible ball counts * 3 possible strike counts), but in practice it’s more like 9x. But it turns out that computers are fast, so doing the lookups is only barely slower than before.
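To make the 12x figure concrete, here is a minimal sketch of adding the count to a lookup key. The key layout, the `lookup` helper, and the sample entries are all assumptions for illustration, not the actual data file format.

```python
from itertools import product

# Every (balls, strikes) pair a plate appearance can be in: 4 * 3 = 12.
COUNTS = list(product(range(4), range(3)))

# HYPOTHETICAL table: the key is
# (inning, home_batting, outs, runners, home_lead, balls, strikes)
# and the value is (home wins, games). The numbers are made up.
table = {
    (8, False, 0, "none", 2, 2, 0): (391, 449),
    (8, False, 0, "none", 2, 3, 0): (126, 143),
}

def lookup(table, situation, balls, strikes):
    """Return (home wins, games) for a situation and count, or None if unseen."""
    return table.get(situation + (balls, strikes))

situation = (8, False, 0, "none", 2)
print(len(COUNTS), "possible counts per situation")
print(lookup(table, situation, 3, 0))
```

A dictionary lookup takes about the same time no matter how many keys there are, which fits with lookups being only barely slower. A situation/count pair that never occurred simply has no entry, which is one reason the real growth can be less than the 12x worst case.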

My original plan was to add the balls and strikes data to the mobile app (and make it an in-app purchase to unlock), but the increase in data size and corresponding memory usage make me less excited about it. Maybe at some point…

How common are walk-off walks (on four pitches!) in baseball?

I’ve been following the Astros pretty closely this season (since they’re very good this year!), and so when I saw they had won a game on a walk-off walk, I was curious how common that was.

A walk-off is a play that ends the game with a home-team win: the home team scores the go-ahead run in the bottom of the ninth inning or later. So a walk-off walk is when that happens because the batter is walked. It’s kind of very dramatic and anticlimactic at the same time!
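The definition above can be written as a small check. This is only a sketch under my own assumptions: the function names and the fields of the `play` dictionary are hypothetical, and it is not the code from WalkOffWalkReport.

```python
def is_walk_off(inning, bottom_half, home_score_before, away_score, runs_on_play):
    """True if the home team goes ahead on this play in the bottom of the 9th or later."""
    if not bottom_half or inning < 9:
        return False
    was_tied_or_behind = home_score_before <= away_score
    now_ahead = home_score_before + runs_on_play > away_score
    return was_tied_or_behind and now_ahead

def is_walk_off_walk(play, four_pitches_only=False):
    """`play` is a HYPOTHETICAL dict describing one plate appearance."""
    if not play["is_walk"]:
        return False
    if four_pitches_only and play["pitch_count"] != 4:
        return False
    return is_walk_off(
        play["inning"],
        play["bottom_half"],
        play["home_score_before"],
        play["away_score"],
        play["runs_on_play"],
    )

# Made-up example: tie game, bottom of the 9th, bases-loaded walk on four pitches.
example = {
    "is_walk": True,
    "pitch_count": 4,
    "inning": 9,
    "bottom_half": True,
    "home_score_before": 3,
    "away_score": 3,
    "runs_on_play": 1,
}
print(is_walk_off_walk(example, four_pitches_only=True))
```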

In this case, the walk was on four pitches, which seemed exceptionally rare, because you might as well throw at least one strike, right? At first I thought “maybe this has never happened before!”, but (spoiler!) it turns out there are a lot of baseball games that have been played. So I wanted to at least know how common it was.

My baseball win expectancy finder is powered by a Python script that can parse games, so I extended the parsing code to make it easier to run these sorts of reports and ran it. (source available on GitHub, see WalkOffWalkReport in parseretrosheet.py)

So, the numbers: in the ~128000 games I have data for, a walk-off walk has happened only 442 times. That sounds like a lot but it’s only around 7 times a season. Not all the games have pitch-by-pitch data, but ~73500 of them do, and walk-off walks on four pitches have happened only 60 times. (not including data from this year)

Since there are roughly 2100 games per season (including the playoffs), those ~73500 games with pitch data cover about 35 seasons, so we’d expect this to happen somewhere between 1 and 2 times per season. Which is indeed pretty rare!

In fact, Altuve got his walk-off walk with 2 outs. Walk-off walks with 2 outs have only happened 257 times (~4 times/season), and ones on four pitches have only happened 41 times in the games with pitch data, which is a little more than once a season!
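For anyone who wants to check the per-season arithmetic, here is a quick sketch using only the numbers quoted above. The one choice I’m making is the denominator: the four-pitch counts are scaled by the ~73500 games that have pitch-by-pitch data, so they come out higher than they would if divided by the total number of seasons.

```python
GAMES_PER_SEASON = 2100    # rough figure from the post, including playoffs
TOTAL_GAMES = 128_000      # games with any data
PITCH_DATA_GAMES = 73_500  # games with pitch-by-pitch data

def per_season(events, games, games_per_season=GAMES_PER_SEASON):
    """Scale an event count over `games` games to an expected count per season."""
    return events / games * games_per_season

rates = {
    "walk-off walks": per_season(442, TOTAL_GAMES),
    "walk-off walks, 2 outs": per_season(257, TOTAL_GAMES),
    "four-pitch walk-off walks": per_season(60, PITCH_DATA_GAMES),
    "four-pitch walk-off walks, 2 outs": per_season(41, PITCH_DATA_GAMES),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per season")
```

That works out to about 7.3, 4.2, 1.7 and 1.2 per season respectively.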

When I was in the middle of this work I remembered that baseball-reference has an incredibly powerful Event Finder, and lo and behold it can do this search as well. In fact, at first our numbers were pretty far off, which is how I found some bugs in my script 🙂 (the numbers are still off by a few because the Event Finder counts walks where the fourth ball was a wild pitch and a runner scored, while my script doesn’t count those)

My original thought was that I could make it easier to use my script to find stuff like this, but the baseball-reference Event Finder is so incredibly powerful and relatively easy to use I probably won’t bother. Kudos to them!

adding expected runs per inning to baseball win expectancy finder

So…yup, I did that!

I already had all the data for this, so it was more mechanical than anything else. Actually, most of the time I spent was turning it into a proper React app. Before, I just had a bunch of inline React code, which was great for testing and simplicity, but meant that a bunch of compilation had to happen on every page load. So I bit the bullet and figured out how to use nwb and the rest of the build tooling for compiling React code, which I don’t really care about. But it builds a pretty small JS file, which is nice.

Still a big fan of React, though!

why baseball’s current wild card system is not crazy

Background: Since 1994 each league has had 3 divisions, and each division winner makes the playoffs. Since 3 is not a convenient number of teams to have in a playoff, from 1994-2011 there was also one wild-card slot for the team with the best record in each league that wasn’t a division winner. This meant that of the roughly 15 teams in each league, 4 would make the playoffs.

In 2012 they added another wild-card slot, and the first “round” of the playoffs is one game between the two wild-card teams. Whichever team wins moves on, and now they’re down to 4 teams again.

When I first heard about this (i.e. when I started following baseball again after the Astros stopped losing 100 games in a season) I thought this was pretty stupid. Baseball is a sport played over many games – their regular season of 162 games is the longest of any major sport by far – so having a one-game playoff to determine who advances seems unsporting and random. However, after a bit more thought:

– Yeah, a one-game playoff is fairly arbitrary. But it is exciting! And one of the two teams wouldn’t have made the playoffs at all under the old system, so it’s clearly in a better position.

– I like baseball’s emphasis on the regular season (as opposed to basketball, where more than half of the teams make the playoffs), and this change actually makes winning your division even more desirable, since division winners get to skip the one-game playoff.

– Having 5 out of 15 teams make the playoffs instead of 4 out of 15 isn’t that big a change, and 5 out of 15 feels reasonable to me.

So, on the whole, yay baseball!

P.S. This has nothing to do with the fact that the Astros are currently in the second-place wild-card spot 🙂