State Election Map – fixing the “votes to change the election” calculation

As I mentioned at the end of my “Lucky” book review, the book said that Trump in 2020 was only 43K votes away from winning the election. In my 2020 updates to the State Election Map, I had also done this calculation, but came up with a different number. (76K) I already knew there was one problem that I hadn’t addressed, which was breaking the vote down by congressional district for Maine and Nebraska, so I decided to bite the bullet and fix it up.

Along the way I discovered there was another issue causing wrong results: the difference between an electoral vote tie and a win can be substantial in some cases, and I was only calculating for a win. (in the case of a tie the election goes to the House of Representatives, where each state gets one vote, so it’s complicated, but in 2020 Trump probably would have won)

Anyway, I used Dave Leip’s Atlas of U.S. Presidential Elections to gather the data by congressional district, improved the Rust code to do both of those calculations, and made the JavaScript code show both calculations when they’re different. And lo and behold, Trump really was only 43K votes away from winning in 2020!

Here’s the commit for these changes. It was a lot more work than I expected, and I’m actually in the middle of another project that I put on pause because I was so irritated these numbers were wrong 🙂

As an aside, the Rust code to calculate this got a bit more complicated and the more that happens the more I sprinkle in .clone() calls just to make things work. Which is fine for correctness but not great for performance. In this case it’s fine, though, the script still runs in ~1 second, and I guess it’s nice to have markers where I can try to optimize things if necessary later!
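
To be concrete, here’s the kind of thing I mean – a made-up toy example (not the actual election code) where a clone() trades an extra allocation for a quick way past the borrow checker:

    fn main() {
        let mut margins: Vec<String> = vec!["GA: +0.2%".to_string(), "AZ: +0.3%".to_string()];
        // Holding a reference into the Vec while pushing to it wouldn't compile,
        // so clone the String to end the borrow immediately. Correct, but it's
        // an allocation a more careful design could avoid.
        let closest = margins[0].clone();
        margins.push(closest);
        println!("{:?}", margins);
    }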

Adding 2020 baseball games to the win expectancy finder

You can see the results on the win expectancy finder. (although adding one year of data doesn’t change much, it’s the principle of the thing!) Updated apps will be coming soon.

Usually adding a year’s worth of games is a pretty quick task: run the scripts and update a few things on the web page. (thankfully I made a list of what to do a few years back) But this year was different because of the rule changes in 2020. Not only that, but now that I have two versions of the parsing script (the faster one in Rust and the original in Python), I wanted to keep both scripts up to date. And it ended up being quite a journey!

The rule changes in 2020 were:

  • in extra-inning games, a runner starts on second base
  • doubleheader games only went 7 innings instead of 9

This didn’t sound too hard, but it meant I had to add a set of rules by which to parse the game. The first one was pretty easy (since I just had to know whether the game was in 2020 or not), but I was worried about figuring out whether a game was a doubleheader or not. Luckily Retrosheet added the number of innings to their event file format, and in fact also added whether a runner starts on second base in extra innings, which I didn’t discover until later and should probably go back and use!
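
To give an idea of what those parsing rules look like, here’s a hand-wavy sketch – the GameRules struct and its fields are made up for illustration and aren’t the actual parser’s types:

    // Hypothetical sketch of year-dependent parsing rules, not the real API
    struct GameRules {
        // 2020: a runner starts on second base in extra innings
        runner_on_second_in_extras: bool,
        // 2020: doubleheader games were scheduled for 7 innings instead of 9
        scheduled_innings: u8,
    }

    impl GameRules {
        fn for_game(year: u32, is_doubleheader: bool) -> Self {
            Self {
                runner_on_second_in_extras: year == 2020,
                scheduled_innings: if year == 2020 && is_doubleheader { 7 } else { 9 },
            }
        }
    }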

Once I got all the games parsing in Rust, then the fun began:

  • I took a quick look at the resulting statistics and noticed that the situation at the start of a game (top of the 1st, bases empty, etc.) had around 700 new games, which sounded reasonable, and less than 100 of them had the visiting team winning, which did not! After some thought (and coming back to it the next day), I found and fixed the bug, which had to do with the difference between the final game situation and the last actual game situation; you can see the fix in this commit.
  • So then I made similar changes in Python, and after running the script the results were off by exactly one game. (just like last time; what are the odds?) Anyway, by looking at the differences in the stats file I could see what situations the mystery game went through, and added a special Report type to find the game that went through those situations. It turns out only one playoff game in 2020 went into extra innings, and the Python script was handling that wrong, and it was pretty easy to fix!
  • That led me to discover that the Python script wasn’t throwing an exception by default if it failed to parse a game, which is bad, so I fixed that in this commit. (notice my Rust style of not putting parentheses around if conditions is starting to slip into my Python style…)
  • Running the Python script showed a major difference in the runsperinningstats file – in fact, the Rust script had never been updating it! The bug was a simple copy/paste error, and I made a later change to use “Self” instead of explicit type names to avoid some of these problems in the future. (see the sketch after this list)
    • So how did I never notice this before? The way I validated my Rust script when I was developing it was to run it and see if the results differed from what was in git. This has the now-obvious consequence that if the script didn’t do anything, it would seem like it was working! I guess the lesson I took away from this is, don’t do that 🙂
  • Both scripts print out the number of games parsed at the end, and I noticed when I was debugging some of these problems that the numbers were slightly different between Python and Rust. There are 7 games that the scripts can’t parse correctly and I list them explicitly in both scripts (the event files seem wrong to me) so we can skip them, and the Rust script was correctly not counting them while the Python script was counting them.
  • The way to actually update everything is getting ridiculous – as I mentioned above I have it written down, but there are 16 steps to run! I really need to make this easier…
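
Here’s a tiny illustration of the “Self” change mentioned above (with a made-up report struct, not the real one):

    struct RunsPerInningReport {
        games_processed: u32,
    }

    impl RunsPerInningReport {
        // Using Self instead of spelling out RunsPerInningReport means this
        // body stays correct even if it started life pasted in from another
        // report's impl block – exactly the copy/paste trap mentioned above
        fn new() -> Self {
            Self { games_processed: 0 }
        }
    }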

Parsing baseball files in Rust instead of Python for an 8x speedup!

Linked to by This Week in Rust! Also see the Hacker News discussion here.

Since I’ve been quickly becoming a Rustacean (Rust enthusiast) – see previous posts here – I decided to take a crack at the parser for my baseball win expectancy finder. It’s written in Python, and when I made it parse files in parallel (previous writeup) it sped up processing time by ~4x. I thought doing it in Rust would be even faster, although a lot of the work is in regular expressions, which I didn’t expect to differ much in speed between the two languages.

Here’s the PR for the change, and I guess less time is spent in regular expressions than I thought, because here’s how long it takes to parse ~130,000 baseball games (all MLB games from 1957-2019):

  • Python single core implementation: 165 seconds
  • Python multicore implementation: 35 seconds
  • Rust single core implementation (commit e76de817): 20 seconds
  • Rust multicore implementation: 4 seconds (!)

(Time to parse 130,000 baseball games on my desktop i5-8600K CPU, which has 6 cores)

So that’s about an 8x speedup in both the single core and multicore code! And the final product runs in 4-5 seconds, which is just unbelievably fast.

As always, I learned a lot about Rust along the way:

  • Rust makes me want to write more performant code. This shouldn’t be a surprise, because performance is one of Rust’s raisons d’être (it’s first on the “Why Rust?” list at rust-lang.org!), but every time I have to call collect() or clone(), I think about whether I really need to do it. It’s much different than when I’m writing code in Python!
  • That being said, I didn’t do anything major algorithmically to speed things up – the biggest thing was probably keeping the mapping of which runners end up at which base (RunnerDests in the Rust code) in an array instead of a HashMap. But I’m sure that I avoided a ton of copies/allocations along the way.
  • One of the nicest little ways to speed things up is the entry() method on HashMap. It gives you an Entry for the passed-in key, and you can call something like or_insert() on it to get a mutable reference to the existing value, or insert a default if the key is missing. This is nice because you only have to do the lookup one time! (see the sketch after this list)
  • Still a huge fan of the ? operator which makes it very simple to propagate errors up. This time I used the anyhow crate to make it easy to return String errors, although I didn’t really take advantage of its ability to attach context to errors as I got a bit lazy. Maybe next time!
  • The Rust single core implementation in debug mode took 673 seconds, which is 33x slower than in release mode. This is my usual reminder to never benchmark anything in debug mode!
  • The nice thing about this project is that, other than writing tests for parsing tricky plays, you can just run the parser on all the games and see if the resulting stats files match. After implementing the report for win expectancy including the balls/strikes count, I was dismayed to see the stats file was wrong – there was one game somewhere (out of 130,000!) that wasn’t using the balls/strikes count correctly. Of course, I could have narrowed it down by only running the parser on certain years and going from there, but luckily after looking at the Python code more closely I realized that it handled cases where the pitches were lowercase, and the Rust code did not, which was easy to fix. I guess there’s one plate appearance somewhere that has lowercase pitches!
  • I did try some optimizations by using SmallVec (see commit dd6ed7a) to store small vectors on the stack instead of using heap allocations. It did seem to help a little bit – the single core runtime went from 20 seconds to 19 seconds, although I’m not 100% sure that’s significant. I also used smol_str (see commit 48b47dfe) to do the same thing for strings after verifying that most of the strings in files were 22 characters or less, although again it didn’t show much/any improvement.
  • I also went ahead and rewrote the script that the web app calls to look up the data in Rust. I’m still clearly slower at writing Rust code than Python code – it took me a little over an hour when I already had a working Python script to look at. I assume it also runs faster than the Python one but they’re both fast enough so I didn’t bother to benchmark it.
  • Like with the clue solver and population centers projects, I used Rayon for the multicore implementation, which worked pretty well. One complaint I have is that I had to create a new copy of each report for each file we process and then merge them all together. Ideally I would just create one copy per thread since each thread can safely update its own copy, and that would reduce the overhead of merging so many reports together. But I couldn’t find a way of doing this with Rayon, and I guess I can’t complain since it ended up so fast anyway!
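
Here’s a minimal sketch of the entry() pattern from the list above (toy data, not the real parser code):

    use std::collections::HashMap;

    fn main() {
        let mut games_per_team: HashMap<&str, u32> = HashMap::new();
        for team in ["HOU", "NYA", "HOU"] {
            // entry() does a single lookup; or_insert(0) returns a mutable
            // reference to the existing count, inserting 0 first if needed
            *games_per_team.entry(team).or_insert(0) += 1;
        }
        println!("{:?}", games_per_team); // {"HOU": 2, "NYA": 1} in some order
    }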

One area of problems I ran into this time was with traits. For some background, the Python code has a Report class which acts as an interface – a list of Reports is passed into the code that parses a file, and after each game a method is called so the report can accumulate whatever statistics it wants. And there’s a subclass of that called StatsReport which assumes that you’re writing the data out in a certain format to a file, so it’s even easier to write new reports.

Rust doesn’t have inheritance, but it does have traits which are kinda similar, so I optimistically made a Report trait and a StatsReport trait, and made StatsReport have a supertrait of Report, so anything that implements StatsReport also has to implement Report. It’s kinda the same thing! But unlike with real inheritance, StatsReport can’t provide implementations for methods on Report, which is kind of annoying. Not hard to work around, since you can just make the methods on the concrete struct call helper methods on StatsReport, but it does mean there’s more boilerplate needed for concrete structs.
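
Here’s roughly what that workaround looks like – a simplified sketch with made-up method names, not the actual code:

    trait Report {
        fn process_game(&mut self);
    }

    // StatsReport can require Report as a supertrait, but unlike a base
    // class it can't implement Report's methods itself
    trait StatsReport: Report {
        // A default helper that concrete reports can share
        fn process_game_as_stats(&mut self) {
            // ...accumulate stats in the common format here
        }
    }

    struct WinExpectancyReport;

    impl StatsReport for WinExpectancyReport {}

    impl Report for WinExpectancyReport {
        // The boilerplate: every concrete struct has to write this little
        // forwarding method itself
        fn process_game(&mut self) {
            self.process_game_as_stats();
        }
    }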

Another problem I ran into is that writing the types for the merge_into() method on Report is hard, since ideally it would take a parameter of the same type as the concrete type. To be fair, this is tricky in a lot of languages. (although Python types are optional, so it’s easy there!) What I ended up doing was having the method take something of type Any, adding a method to every concrete implementation that did

    fn as_any_mut(&mut self) -> &mut dyn Any { self }

to convert a Report to something of type Any (??), then adding a line to the top of merge_into() like

    let other = other.downcast_mut::<Self>().unwrap();

which seems like more than should be necessary, but obviously I don’t fully understand what’s going on. (thanks Stack Overflow as usual!) I had some other problems with making Report require the Clone trait, so I gave up and added a constructor method to the Report trait.
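
Putting those two snippets together, here’s a self-contained sketch of the pattern with a single made-up report type (the real Report trait has more methods than this):

    use std::any::Any;

    trait Report {
        fn as_any_mut(&mut self) -> &mut dyn Any;
        // Takes dyn Any so each implementation can downcast back to its own type
        fn merge_into(&self, other: &mut dyn Any);
    }

    struct GameCountReport {
        games: u32,
    }

    impl Report for GameCountReport {
        fn as_any_mut(&mut self) -> &mut dyn Any { self }

        fn merge_into(&self, other: &mut dyn Any) {
            // Recover the concrete type; this panics if other isn't
            // actually a GameCountReport
            let other = other.downcast_mut::<Self>().unwrap();
            other.games += self.games;
        }
    }

    fn main() {
        let a = GameCountReport { games: 3 };
        let mut b = GameCountReport { games: 4 };
        a.merge_into(b.as_any_mut());
        println!("{}", b.games); // prints 7
    }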

I’m thinking about trying out Rust and WebAssembly next when I have more spare time!

State Election Map updates for 2020!

I made some updates to my State Election Map! The most obvious one is including the 2020 data, just in time for the Electoral College vote today. But wait, there’s more!

Each year now includes:

  • Tipping point state – this is the “decisive” state that provides the Electoral College win. More specifically, I use FiveThirtyEight’s definition, which is to sort every state by the election winner’s margin, and add up electoral votes until they get to 270 – see the sketch after this list. (with one exception, see note below about Maine/Nebraska) If you subtract this state’s margin from the overall national popular vote margin you can calculate the split between the popular vote and the Electoral College. This is one way to look at how close the election was.
  • Closest state (by percentage) – Simply the state that was closest to a tie. (by percentage and not raw votes)
  • Minimum number of additional losing-party votes so the losing party would have won: If you had a magic wand and could conjure up votes for the losing party to make them win, in which states would you add them so as to conjure up the minimum number possible? This is another measure of how close the election was; for example, in 1976 Carter won 297-241, and the tipping point state was Wisconsin at D+1.7%, but a mere 18,490 more Republican votes in Ohio and Hawaii(!) would have let Ford win, which is extremely close.
  • Graph of how each state changed since the last election: This data was already available on a per-state basis (by clicking on the state), but seeing it on a per-year basis can show some interesting trends.
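
Here’s a sketch of the tipping point calculation as I understand FiveThirtyEight’s definition – the State struct is hypothetical, and it ignores the Maine/Nebraska wrinkle from the note below:

    struct State {
        name: &'static str,
        electoral_votes: u32,
        // the national winner's margin in this state (negative if they lost it)
        winner_margin_pct: f64,
    }

    // Sort states from most to least favorable for the national winner, then
    // accumulate electoral votes; the state that reaches 270 is the tipping point
    fn tipping_point(mut states: Vec<State>) -> Option<State> {
        states.sort_by(|a, b| b.winner_margin_pct.partial_cmp(&a.winner_margin_pct).unwrap());
        let mut evs = 0;
        for state in states {
            evs += state.electoral_votes;
            if evs >= 270 {
                return Some(state);
            }
        }
        None
    }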

Note about Maine and Nebraska – these states are alone in allocating their electoral votes per congressional district, which is an irritant for visualizations like these 🙂 (and why I hadn’t wanted to add electoral vote data before) My solution was lazy, simple, and only a little wrong: I show the correct EV total for each election, but for the rest of the EV calculations ignore the split votes. Edit: this has been fixed now, see this update.

Some interesting observations from the data:

  • I added the “minimum number of votes to change the result” metric because I remember hearing that about Trump’s win in 2016 – if Clinton had gotten just 77,000 or so more votes in Michigan, Pennsylvania, and Wisconsin, she would have won, which is much closer than the 306-232 electoral vote victory suggests. Which is true! But it turns out that by this metric, Biden’s win in 2020 was even closer – he had the same 306-232 electoral vote victory, but only ~43,000 more votes would have made Trump tie in the electoral college, and presumably win, given that the House of Representatives then votes on the winner with each state getting one vote.
  • The split between the popular vote and tipping point state, which was just under 3% in 2016 and talked about a lot, increased to almost 4% in 2020, which is pretty scary! Part of the problem is that Pennsylvania and Wisconsin got a little more Republican-leaning relative to the popular vote. (although Michigan became a little more Democratic-leaning)
  • You can compare the 2020 election to 2016 and get an idea of which states disliked Clinton the most. Obviously this isn’t an exact science, but the clear winner, to my surprise, is Vermont! It’s the state with the biggest D shift from 2016 to 2020 (+9.0%), and it doesn’t seem to be because the state is generally trending Democratic. (like Colorado, which had the second highest D shift) Conveniently, the third highest D shift was in Delaware, Biden’s home state.
  • The new graph makes it clear how much Utahns didn’t like Trump – the shift from 2012 (when, admittedly, Romney was running) to 2016 was D+30%! That’s the biggest shift between elections since Hawaii moved D+36.5% between 2004 and 2008. I think there’s research showing that home state effects are larger if your state is smaller, which these numbers seem consistent with.

The “minimum number of votes to change the result” calculation turned out to be more interesting than I thought. (I’m going to assume that an R won the election, for the sake of concreteness) At first I thought it would be a simple matter of taking the closest state that the R won by percentage, adding however many votes it takes to make the D win, and repeating with the next closest state until the overall result changes. But there are two reasons this won’t work:

  • You might end up adding unnecessary votes. If the R won the electoral votes 280-258, and the closest state the R won had 3 EV, and the next closest state the R won had 12 EV, there’s no need to flip the votes in the 3 EV state – the 12 EV state is sufficient by itself!
  • Since a state’s EVs aren’t exactly proportional to its population (much less the number of people who voted), just looking at the closest states by percentage may not give the right order in which to flip them.

After thinking about this a little, it sounded a lot like a 0-1 knapsack problem, but not quite. In the knapsack problem you’re picking items to put in a knapsack that can only hold a certain weight, and you want to maximize the value of the items. In this case, we’re trying to pick states so that their “weight” (EVs) is greater than a certain value instead of less than, and we’re trying to minimize the “value” (number of votes to flip) instead of maximizing it.

I thought about trying to use negative weights and values to turn this into the knapsack problem, but I couldn’t get it to work out. Eventually I realized that we could flip it around – look for the states to keep in the R’s column that have at most 268 EVs, and maximize the number of flip votes the R keeps. Since the total number of flip votes is fixed, maximizing what the R keeps minimizes what has to be flipped, which is exactly what we want.
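
Here’s a sketch of that reformulation as a standard 0-1 knapsack, with toy types and numbers (my actual implementation is linked in the next paragraph):

    // A state the R won; "flipping" it means the D gains its electoral votes
    struct WonState {
        electoral_votes: usize,
        votes_to_flip: u64, // votes the D would need to win this state
    }

    // Keep states totaling at most max_evs_kept EVs in the R's column,
    // maximizing the flip cost we keep; whatever isn't kept gets flipped
    fn min_votes_to_change_result(states: &[WonState], max_evs_kept: usize) -> u64 {
        // dp[w] = max total votes_to_flip keepable using at most w electoral votes
        let mut dp = vec![0u64; max_evs_kept + 1];
        for s in states {
            // Iterate weights downward so each state is used at most once
            for w in (s.electoral_votes..=max_evs_kept).rev() {
                dp[w] = dp[w].max(dp[w - s.electoral_votes] + s.votes_to_flip);
            }
        }
        // Minimizing the flipped states' cost is the same as subtracting the
        // maximum kept cost from the total
        let total: u64 = states.iter().map(|s| s.votes_to_flip).sum();
        total - dp[max_evs_kept]
    }

    fn main() {
        // Toy version of the 280-258 example above: flipping the 12 EV state
        // alone (20,000 votes) beats flipping the 3 EV state as well
        let states = [
            WonState { electoral_votes: 3, votes_to_flip: 5_000 },
            WonState { electoral_votes: 12, votes_to_flip: 20_000 },
            WonState { electoral_votes: 265, votes_to_flip: 1_000_000 },
        ];
        println!("{}", min_votes_to_change_result(&states, 268)); // 20000
    }

(Presumably the win-versus-tie distinction from the first post above just changes the cap: keeping the R at 268 EVs or fewer forces an outright D win, while 269 allows a tie.)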

The knapsack problem is NP-complete, and I was worried it would be slow to run, and also I love Rust now (see my Rust posts), so I decided to calculate it offline and just load the results in the UI. I found an example implementation of the knapsack problem in Rust and adapted it for my use (see my implementation here). It turns out that with a maximum weight of 268 and at most 50 states, the whole algorithm runs in less than a second, so this might have been overkill. But it was fun!