i'm not great at wordle, which can probably be evidenced by this game
but other than that, i think that wordle is very interesting regarding information theory (just like 3b1b x3), so this will be my attempt at an explanation on how we, realizing it or not, employ information theory while playing a game of wordle.
imagine every possible word that is valid in wordle (about 14.9k words) - everything from aahed (the simple past and past participle of aah) to zymic (pertaining to, or produced by, fermentation). now imagine they all somehow exist in one solid block of words. to make a guess in wordle would be to chop off a section of this block with all the words that don't match the colours you got back for your guess.
to give a simpler example: imagine you only had 4 words to guess from - phone, cobra, sling, and night.
if you guess radar, and you get all gray, you can only eliminate cobra as an option (ie chopping it off the block). but if you get the first two letters yellow, you can eliminate phone, sling, and night! very efficient. once you chop all 3 off the metaphorical block, only cobra is left, so you can be sure it's it.
and so a game of wordle is just repeatedly removing sections from all possible answers until you remain with just 1 word.
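here's a minimal sketch of that chopping for the 4-word example. the names feedback and chop are made up for this illustration, and it assumes the usual wordle colouring rules (greens first, then yellows only while the answer still has spare copies of the letter).

```python
from collections import Counter

def feedback(guess, answer):
    """Colour pattern wordle would show for `guess` if `answer` were the solution."""
    pattern = ["gray"] * len(guess)
    # letters of the answer that are not already matched as green
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "green"
        elif spare[g] > 0:
            pattern[i] = "yellow"
            spare[g] -= 1
    return pattern

def chop(words, guess, seen):
    """Keep only the words that would have produced the pattern we saw."""
    return [w for w in words if feedback(guess, w) == seen]

block = ["phone", "cobra", "sling", "night"]
# all gray: only cobra gets chopped off
print(chop(block, "radar", ["gray"] * 5))
# first two yellow: everything except cobra gets chopped off
print(chop(block, "radar", ["yellow", "yellow", "gray", "gray", "gray"]))
```

whatever survives the chop is the block you carry into the next guess.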
some of you may have heard of bits as they are in computer science (1s and 0s) - they are somewhat related to our bits here, but only somewhat. even though most people just call them bits, i'll call them shannons to remove ambiguity and to honour Claude Shannon (the "father" of information theory) i guess x3
a shannon is, really crudely said, a measure of how much you chopped off the possible answers, and it's base 2 logarithmic. said in more understandable words, a shannon is basically how many consecutive times you chopped the space in half. if you remove ("chop off") half of the possible answers, that's 1 shannon (Sh for short). if you halve the possible answers twice in a row (shrinking your possible answer space to a quarter of its size), that's 2 Sh.
back to the game with only 4 possible words: guessing radar and getting the first two yellow means that you just gained 2 Sh of information (4 possible answers -> 1 possible answer), but if you got all gray, you've only gained log2(4/3), or about 0.42 Sh of information. how can you chop something in half 0.42 times? this has to do with logarithms and fractional exponentiation and the like, which you can read about here if you're interested.
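the arithmetic above fits in a couple of lines. this is just a sketch, and the function name shannons is made up for the illustration: it takes how many answers were possible before and after a guess.

```python
from math import log2

def shannons(before, after):
    """Information gained when `before` possible answers shrink to `after`."""
    return log2(before / after)

print(shannons(4, 2))  # one halving, 1 Sh
print(shannons(4, 1))  # radar with the first two yellow, 2 Sh
print(shannons(4, 3))  # radar all gray, about 0.415 Sh

# going the other way: 2 to the power of the shannons gained is the factor
# the answer space shrank by, so "0.42 halvings" means shrinking by about 4/3
print(2 ** shannons(4, 3))
```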
now, back to the real world. let's measure how good my guesses are in shannons with the full word list
as you can see, salon and posts are similarly good guesses, while boost barely eliminates anything. morse narrows it down more, then horse only manages to eliminate itself as an option, which finally leads me to guessing worse.
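one way a measurement like this could be computed is sketched below. it's not necessarily how the plots here were made: feedback and score_game are hypothetical names, the 4-word list stands in for the full word list, and the answer is assumed to be in the list (otherwise nothing would survive the chop).

```python
from collections import Counter
from math import log2

def feedback(guess, answer):
    """Colour pattern wordle would show for `guess` if `answer` were the solution."""
    pattern = ["gray"] * len(guess)
    # letters of the answer that are not already matched as green
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "green"
        elif spare[g] > 0:
            pattern[i] = "yellow"
            spare[g] -= 1
    return pattern

def score_game(guesses, answer, words):
    """Shannons gained by each guess, in order, for one finished game."""
    remaining = list(words)
    gains = []
    for guess in guesses:
        seen = feedback(guess, answer)
        # keep the words that would have shown the same colours
        kept = [w for w in remaining if feedback(guess, w) == seen]
        gains.append(log2(len(remaining) / len(kept)))
        remaining = kept
    return gains

# stand-in word list (the real one has about 14.9k words)
words = ["phone", "cobra", "sling", "night"]
print(score_game(["radar", "sling", "night"], "night", words))
```

the gains of a finished game always add up to log2 of the size of the word list (2 Sh for this tiny list), since by the end you've gone from every word down to exactly 1.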
i was also wondering how much information each individual letter gives us, so i wrote a bit more code
as you can notice, most letters gave absolutely no information, but also each guess is worth more than the sum of its parts. for example, the sum of the letter values of the second guess, posts, is barely 1 Sh, while the guess's actual value is 4 Sh! this is due to stuff like the Ss not giving information individually (since we already know that there's an s in the word), but together in the guess they drastically narrow down the possible answers.
you may notice that the word list that i'm using has some words that will never actually be the answer to an official wordle game. which is true! these are all the valid words (those you can play), not the possible answers. i've decided to use them instead of the answers for 2 reasons
the code wasn't too difficult to write, although i fear there still might be errors in my wordle logic
from collections import defaultdict

def wordle(guess, answer):
    pattern = []
    # how many copies of each letter have already been coloured
    used = defaultdict(lambda: 0)
    # first pass: mark the greens, everything else starts out gray
    for i in range(5):
        if guess[i] == answer[i]:
            pattern.append("green")
            used[answer[i]] += 1
            continue
        pattern.append("gray")
    # second pass: gray turns yellow while the answer still has spare copies
    for i in range(5):
        if guess[i] in answer:
            if (used[guess[i]] < answer.count(guess[i]) and
                    pattern[i] != "green"):
                pattern[i] = "yellow"
                used[guess[i]] += 1
    return pattern
everything is done very bruteforce-ishly, but it runs fast enough so i don't care that much. all the visualizations were made in matplotlib, aligning them was hell lol. this is hella awkward without an outro
© nicole