As I was out picking up some milk for Vanessa one night, I noticed that “STOP” is a pretty unusual word – “S” and “T” are adjacent in the alphabet, and so are “O” and “P”. In fact, all the letters are within 6 of each other: they all fit inside the run of 6 consecutive letters from “O” to “T”!

So: what’s the probability that a random set of four letters would all be within 6 of each other?

(A reminder: `(n choose k)` represents the number of ways to choose `k` items from a set of `n` items if order doesn’t matter. So `n choose 1` is `n`, `n choose n` is `1`, etc.)

My first thought was as follows: first pick the range of letters (could be “A” to “F”, or “B” to “G”), then pick which letters are in that range. There are `26-6+1=21` ranges, and inside each range there are `(6 choose 4)=15` ways to choose the letters, so this would give `21*15=315` combinations.

But, this actually counts the same set of letters multiple times! Consider the letters CDEF. These are counted under the “A” to “F” range, but also under the “B” to “G” range and the “C” to “H” range. So our answer of 315 is too high.
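To see the double counting concretely, here is a small sketch that enumerates every (range, set) pair the range-first method counts. The names `WINDOW`, `PICK`, `pairs`, and `distinct_sets` are just illustrative choices, not anything from the argument above.

```python
from itertools import combinations
from string import ascii_uppercase

WINDOW = 6  # width of each letter range, e.g. "A" to "F"
PICK = 4    # number of letters chosen from each range

# Every (range start, letter set) pair that the range-first method counts.
pairs = [
    (start, letters)
    for start in range(len(ascii_uppercase) - WINDOW + 1)
    for letters in combinations(ascii_uppercase[start:start + WINDOW], PICK)
]

# Collapsing to a set removes letter sets that showed up under several ranges.
distinct_sets = {letters for _, letters in pairs}

# How many ranges produced the set CDEF?
times_cdef_counted = sum(1 for _, letters in pairs if letters == tuple("CDEF"))

print(len(pairs))          # 315
print(times_cdef_counted)  # 3
```

The length of `distinct_sets` is the count we actually want, with each set of letters counted once.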

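We could account for this by using the Inclusion-Exclusion principle, but there’s a simpler way. Instead of counting the number of combinations that are 6 letters apart or less, we can count the number of combinations that are exactly 4 letters apart (meaning the earliest and latest letters span 4 consecutive letters, like “A” to “D”), and exactly 5 letters apart, and exactly 6 letters apart. Every set of letters has exactly one such span, so nothing gets counted twice.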

Let’s look at the 4 letters apart case. First, let’s pick the outer letters. These could be “A” and “D”, or “B” and “E”, all the way to “W” and “Z”. There are `26-4+1=23` choices for this. Then we just need to pick the two inner letters from the `4-2=2` letters strictly between the outer ones, and there is `((4-2) choose 2)=1` way to do this.

We can use similar formulas for any number of letters apart, which gives us

`(26-4+1)*(2 choose 2) + (26-5+1)*(3 choose 2) + (26-6+1)*(4 choose 2)=`**215**

possibilities. Since there are `(26 choose 4)=14950` ways to choose 4 letters in total, the probability that a random set of 4 letters is within 6 of each other is `215/14950`, or around **1.4%**. That’s pretty unlikely!
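As a sanity check, the count can also be brute-forced by listing all 14950 sets and grouping them by how many letters apart they are. This is only a sketch: letters are represented as positions 0 to 25, and `span`, `span_counts`, and `within_six` are names made up for the example.

```python
from collections import Counter
from itertools import combinations
from math import comb

ALPHABET_SIZE = 26

def span(positions):
    """Number of consecutive letters from the earliest to the latest, inclusive."""
    return max(positions) - min(positions) + 1

# Tally every 4-letter set by its span.
span_counts = Counter(span(c) for c in combinations(range(ALPHABET_SIZE), 4))

# Sets that fit inside a run of 6 consecutive letters.
within_six = sum(count for s, count in span_counts.items() if s <= 6)
probability = within_six / comb(ALPHABET_SIZE, 4)

print(span_counts[4], span_counts[5], span_counts[6])  # 23 66 126
print(within_six)                                      # 215
print(f"{probability:.1%}")                            # 1.4%
```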

There’s also a nice identity we discovered here. When you choose 4 letters from the alphabet, the number of letters they’re apart is between 4 and 26. So this means that

`(26 choose 4) = sum from j=4 to 26 of (26-j+1)*((j-2) choose 2)`

Pretty neat!
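The same counting argument doesn’t depend on the alphabet having 26 letters, so the identity should hold with 26 replaced by any `n` of at least 4. Here is a quick numerical check of that (not a proof); `span_sum` is a hypothetical helper name for the right-hand side.

```python
from math import comb

def span_sum(n):
    """Right-hand side of the identity for an alphabet of n letters."""
    return sum((n - j + 1) * comb(j - 2, 2) for j in range(4, n + 1))

# Compare against (n choose 4) for a range of alphabet sizes.
checks = {n: span_sum(n) == comb(n, 4) for n in range(4, 60)}

print(span_sum(26))          # 14950
print(all(checks.values()))  # True
```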