# An algorithm for Wordle

If you’ve used Twitter during the begining of 2022 you’ll almost certainly have seen people posting tweets like this:

```Wordle 202 3/6

⬜⬜⬜⬜🟨
🟨⬜⬜⬜⬜
🟩🟩🟩🟩🟩
```

This is the result of playing the latest viral game “Wordle”. The concept is very easy, you have to guess a 5-letter word each day. For each guess you get the basic “Mastermind” reply, green for the correct character, yellow if it’s in the wrong spot.

This game was fun, for a couple of days, but my curious mind started to wonder… what are the BEST words to play?

So I downloaded a huge list of 5-character words and started playing around with the dataset.

Update: I’ve updated the post using the ‘correct’ Wordle word-list, there are two lists, one with possible secret words, and one list with all words valid for guessing.

# Eliminate by frequency

The first thing I tried was to try and eliminate the most frequently used characters first. With 5 guesses we can try and guess the most frequently present letters in the English language.

To get the frequency of the letters I used the following code:

This results in the following distribution:

```s 6665
e 6662
a 5990
o 4438
r 4158
i 3759
l 3371
t 3295
n 2952
u 2511
d 2453
y 2074
c 2028
p 2019
m 1976
h 1760
g 1644
b 1627
k 1505
... (etc)
```

Using some simple code it’s easy to find pretty good words to start with, for example:

`laser -> tonic -> dumpy`

These three guesses will tell you information about the 15 most common characters in all of the 5-letter words. This is “a” strategy, but it’s far from perfect. We can do much better!

# Most information per guess

Each time we guess a word, we get a reply back, it can be something like “⬜⬜⬜⬜⬜” or “🟨⬜⬜🟩🟨”. What we want to do is optimize for the information we’re getting in return. Each guess we do should eliminate most options.

If we go over the entire list of words, and compare all the replies we could get back against all other words, we get a distribution of the returned patterns.

I’ve decided to write some code to do this. For example if we pick the word `smile`, these patterns can be returned:

```🟨⬜⬜⬜⬜: 1464 ⬜⬜⬜⬜⬜: 1352 ⬜⬜⬜⬜🟨: 1122 🟨⬜⬜⬜🟨:  967 ⬜⬜🟨⬜⬜:  690 ⬜⬜⬜🟨⬜:  610
⬜⬜⬜⬜🟩:  504 🟨⬜🟨⬜⬜:  468 🟩⬜⬜⬜⬜:  467 ⬜🟨⬜⬜⬜:  417 🟨⬜⬜🟨⬜:  263 🟩⬜⬜⬜🟨:  252
⬜⬜🟨⬜🟨:  251 ⬜⬜🟩⬜⬜:  196 🟨🟨⬜⬜⬜:  191 🟨⬜🟩⬜⬜:  187 ⬜⬜⬜🟨🟨:  183 ⬜⬜🟨🟨⬜:  154
⬜🟨⬜⬜🟨:  154 🟨⬜🟨⬜🟨:  147 🟨⬜⬜⬜🟩:  136 ⬜⬜🟨⬜🟩:  136 ⬜⬜⬜🟩⬜:  128 🟨⬜⬜🟩⬜:  117
⬜🟨🟨⬜⬜:  115 ⬜⬜⬜🟨🟩:  115 🟩⬜🟨⬜⬜:  112 🟩⬜⬜⬜🟩:  109 🟩⬜⬜🟨⬜:  107 ⬜🟨⬜🟨⬜:  106
⬜🟨⬜⬜🟩:   99 🟩⬜🟩⬜⬜:   87 ⬜⬜🟩⬜🟩:   82 ⬜⬜⬜🟩🟩:   79 🟩🟨⬜⬜⬜:   63 ⬜⬜🟩🟨⬜:   63
🟨⬜⬜🟨🟨:   62 ⬜⬜🟩⬜🟨:   58 🟨🟨⬜⬜🟨:   53 ⬜⬜⬜🟩🟨:   52 🟨⬜⬜🟩🟨:   49 ⬜⬜🟨🟩⬜:   46
🟩⬜⬜🟩⬜:   43 🟨⬜🟩⬜🟨:   41 🟩⬜🟨⬜🟨:   41 🟨⬜🟨🟩⬜:   36 ⬜⬜🟩🟩⬜:   35 ⬜🟨🟩⬜⬜:   31
🟩⬜🟩⬜🟩:   27 ⬜🟩⬜⬜🟨:   27 🟩⬜⬜🟨🟨:   27 🟨⬜🟩🟩⬜:   25 🟨⬜🟩🟨⬜:   25 ⬜🟨⬜🟩⬜:   23
🟩⬜⬜🟩🟨:   20 🟨🟩⬜⬜⬜:   20 🟨⬜🟩⬜🟩:   19 🟨🟨🟩⬜⬜:   19 ⬜⬜🟨🟨🟩:   18 🟩⬜🟩⬜🟨:   18
🟩⬜⬜🟩🟩:   18 🟩🟩⬜⬜⬜:   18 🟩⬜🟨🟨⬜:   17 ⬜⬜🟨🟨🟨:   17 ⬜⬜🟩🟨🟩:   15 ⬜🟩⬜⬜⬜:   15
⬜🟩🟨⬜⬜:   15 ⬜⬜🟩🟩🟩:   14 🟩⬜⬜🟨🟩:   14 🟩⬜🟩🟩⬜:   13 🟩⬜🟩🟨⬜:   13 ⬜🟨🟩⬜🟩:   13
🟩🟨⬜⬜🟨:   13 🟩🟨🟩⬜⬜:   12 ⬜⬜🟨🟩🟩:   12 🟨🟨⬜🟨⬜:   11 ⬜🟩🟩⬜⬜:   11 ⬜🟩⬜⬜🟩:   11
⬜🟨🟨⬜🟩:   10 🟩🟨🟨⬜⬜:   10 🟨🟩🟩⬜⬜:   10 ⬜🟨⬜🟩🟩:    9 🟨⬜🟨⬜🟩:    9 🟨⬜⬜🟨🟩:    9
🟩⬜🟨⬜🟩:    8 🟩🟨⬜⬜🟩:    7 🟨⬜🟨🟨⬜:    7 ⬜🟨🟨🟨⬜:    7 🟨🟨🟨⬜⬜:    7 🟨⬜🟩🟩🟨:    6
🟩⬜🟨🟩⬜:    6 ⬜🟩🟩⬜🟩:    5 🟩🟩⬜⬜🟨:    5 ⬜🟩⬜🟩🟩:    5 🟩🟩⬜⬜🟩:    5 🟨🟩⬜⬜🟨:    5
🟩🟩🟩⬜⬜:    5 ⬜🟨⬜🟨🟩:    4 🟨⬜⬜🟩🟩:    4 ⬜🟩🟨⬜🟨:    4 🟩🟩⬜🟩⬜:    4 🟨🟨⬜⬜🟩:    4
🟩⬜🟩🟨🟩:    4 ⬜🟩🟨⬜🟩:    3 ⬜🟩⬜🟨⬜:    3 🟩🟨⬜🟨⬜:    3 🟩⬜🟩🟩🟩:    3 ⬜🟨🟩🟩🟩:    2
🟩🟨🟩⬜🟩:    2 🟨⬜🟨🟩🟩:    2 ⬜⬜🟩🟩🟨:    2 🟩🟩🟨⬜⬜:    2 🟩🟩⬜🟩🟨:    2 🟨🟩🟨⬜⬜:    2
⬜🟩⬜🟩⬜:    2 ⬜⬜🟩🟨🟨:    2 🟩🟨⬜🟩⬜:    2 ⬜🟨⬜🟩🟨:    1 ⬜🟩🟨🟨🟨:    1 ⬜🟨🟨⬜🟨:    1
🟩⬜🟨🟩🟩:    1 ⬜🟨⬜🟨🟨:    1 🟩🟨🟨⬜🟩:    1 ⬜🟨🟨🟩⬜:    1 🟨🟩⬜🟨⬜:    1 🟩🟩🟩🟩🟩:    1
🟨🟩⬜⬜🟩:    1 🟩🟨⬜🟩🟨:    1 ⬜⬜🟨🟩🟨:    1 🟩🟩🟩⬜🟩:    1 🟨🟩⬜🟩⬜:    1 🟨⬜🟩🟩🟩:    1
🟨🟩🟩⬜🟨:    1 ⬜🟩🟨🟩⬜:    1 🟨🟨⬜🟩⬜:    1 ⬜🟨🟩🟩⬜:    1 🟩⬜🟩🟩🟨:    1 ⬜🟩🟨🟨⬜:    1
```

This means that, if we pick `smile`, we can get 101 different resulting patterns. The largest group we’re left with contains 316 words. So by guessing this single word, we can eliminate most of the words.

If we do this for all the words, my algorithm gives back that `raise` is the absolute best first word. This word returns 107 different patterns and the worst case is that we have 168 words left after the initial guess.

Another good candidate would be `slate`, this splits the words into more groups: 129. But the worst-case is a little bit worse, leaving just 221 words.

## After `raise`, repeat.

After our first guess, we just continue with the following guesses. We calculate which words are still left and we go over all the words to find the one that gives us most information about the target word. Also: Don’t limit yourself to using/guessing just the possible words. For example look at the following case:

```tares: 🟨⬜⬜⬜⬜
pilot: ⬜🟩⬜⬜🟨
dunsh: ⬜⬜⬜⬜🟩
```

There are now still eight possible words:

```hitch, fifth, witch, aitch, bitch, fitch, gitch, mitch
```

Instead of guessing one of these words, often getting just a single piece of information… it’s much better to guess the word `awful`. This guess splits the possible words into 5 groups:

```🟩⬜⬜⬜⬜: aitch
⬜🟨⬜⬜⬜: witch
⬜⬜🟩⬜⬜: fifth
⬜⬜🟨⬜⬜: fitch
⬜⬜⬜⬜⬜: hitch, bitch, gitch, mitch
```

If we get ⬜🟨⬜⬜⬜ in return we can conclude that the only option left to guess is `witch` and we’re done 🟩🟩🟩🟩🟩.

This algorithm allows me to solve all words within the given 6 tries you get with Wordle. For a single move I think this algorithm is optimal. However, we might be able to do better if we consider multiple moves into the future. It should be possible to extend this, build a tree and check the depth.

Perhaps `raise` is great for a single move, eliminating a lot of options, but what if the largest (worst-case) group is very hard to break up after that?

I haven’t tried to figure this out… but perhaps you will? What other algorithms can we come up with?

# Bot tournament…?

Perhaps we can host a tournament where we have bots playing Wordle, taking turns solving and giving the opponent a next word to solve. We can even make the bots smart enough to recognise and counter certain words and/or tactics.

Sounds fun :)