Wordle 1091, June 14, 2024

I falsely believed that I had found a Wordle state where an elaborate regular expression with alternations gives fewer possible answers than a simple regular expression in a pipeline with extra greps.

I was wrong.

I try not to use outside help to solve Wordle brand video game puzzles. I occasionally use Linux tools like grep (which implements regular expression matching) to see how good my guesses are. In this case I’m looking at the regular expressions themselves to see what quantities of possible solutions various regular expressions match.

When I did my tentative examination of Wordle 1091, it appeared that a simpler regular expression matched more possible answers than a more elaborate regular expression. I decided to write a blog post about that. During preparation of this very post, I discovered that my tentative examination had a few flaws. Both the simpler and more elaborate regular expressions gave the same possible answers.

I already had 75% of this post complete, so I decided to admit to failure, finish the post and put it out there anyway.

Here’s what the screen looked like after my third guess:

[Image: smartphone screenshot of Wordle after the 3rd guess]

Here are the facts from which to construct regular expressions:

  1. A in column 2
  2. S, E, R, M and Y do not appear in any column
  3. L appears in column 4 or column 5
  4. T appears in column 1 or column 5, which interacts with fact 3
  5. A does not appear in column 3
  6. L does not appear in columns 1 or 3
  7. T does not appear in columns 3 or 4

Based on facts 1, 2, 5, 6 and 7, the regular expression that matches the solution (and lots of other words) looks like this:

grep -E '^[^sermyl]a[^sermyatl][^sermyt][^sermy]$' /usr/share/dict/words

That regular expression matches 97 five-letter words in the /usr/share/dict/words file. With that many matches, it’s no help to me, a human.
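As a rough sketch of how the facts turn into that pattern, the shell snippet below builds one bracket expression per column and counts matches with grep -c. The variable names and the six sample words are made up for illustration; they stand in for /usr/share/dict/words so the snippet runs anywhere.

```shell
# Letters ruled out in every column (fact 2).
gray='sermy'

# One piece per column, built from facts 1, 2, 5, 6 and 7.
c1="[^${gray}l]"     # column 1: not gray, not L (fact 6)
c2="a"               # column 2: A (fact 1)
c3="[^${gray}atl]"   # column 3: not gray, not A, T or L (facts 5, 6, 7)
c4="[^${gray}t]"     # column 4: not gray, not T (fact 7)
c5="[^${gray}]"      # column 5: not gray

pattern="^${c1}${c2}${c3}${c4}${c5}\$"
echo "$pattern"

# Hypothetical sample words, not the real dictionary.
sample='vault
tabby
later
malty
wagon
table'

# grep -c prints the number of matching lines instead of the lines.
count=$(printf '%s\n' "$sample" | grep -c -E "$pattern")
echo "$count"
```

Keeping the gray letters in one variable means a new gray letter from a later guess only has to be added in one place.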

The easiest change is to require that both L and T appear in the matches, which is not quite congruent with facts 3 and 4. On their own, the extra grep commands for l and t don’t require the letters to appear in the correct columns, so I expected this pipeline to match some words that aren’t possible answers.

grep -E '^[^sermyl]a[^sermyatl][^sermyt][^sermy]$' /usr/share/dict/words | grep l | grep t

That slightly improved pipeline matches 3 five-letter words in the /usr/share/dict/words file. The extra greps only guarantee that an L and a T appear somewhere in the matched words; by themselves they say nothing about which columns those letters occupy.
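To see how each stage of the pipeline narrows the list, here is a small sketch that counts the survivors after every grep. The six sample words are hypothetical stand-ins for /usr/share/dict/words, so the counts are not the 97 and 3 from the real dictionary.

```shell
pattern='^[^sermyl]a[^sermyatl][^sermyt][^sermy]$'

# Hypothetical sample words, not the real dictionary.
words='vault
wagon
banal
taboo
tabla
later'

# Count the lines left after each stage of the pipeline.
stage1=$(printf '%s\n' "$words" | grep -c -E "$pattern")
stage2=$(printf '%s\n' "$words" | grep -E "$pattern" | grep -c l)
stage3=$(printf '%s\n' "$words" | grep -E "$pattern" | grep l | grep -c t)

echo "column pattern only: $stage1"
echo "plus an L:           $stage2"
echo "plus an L and a T:   $stage3"
```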

If an L appears in column 4, T can appear in columns 1 or 5. That’s two sub-expressions. If an L appears in column 5, T can only appear in column 1. That’s a third sub-expression.

Here’s an extended (egrep-style) regular expression that has those 3 alternatives:

grep -E '^(ta[^sermyatl]l[^sermy]|[^sermyl]a[^sermyatl]lt|ta[^sermyatl][^sermyt]l)$' /usr/share/dict/words

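This regular expression matches the same three words that the simpler pipeline above matches. Adding the extra information about where the L and T could appear made no difference. In hindsight that makes sense: the bracket expressions in the simple regular expression already exclude L from columns 1 and 3 and T from columns 3 and 4, and column 2 is fixed as A, so any L or T that the extra greps find must already sit in an allowed column.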
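One way to check that kind of agreement on any word list is to run both approaches and compare their output. The sketch below assembles the alternation from the three placements of L and T, then compares it with the simple pipeline. The variable names and the eight sample words are hypothetical; swapping in /usr/share/dict/words should repeat the real comparison.

```shell
base='^[^sermyl]a[^sermyatl][^sermyt][^sermy]$'

# The three placements of L and T allowed by facts 3 and 4.
case_a='ta[^sermyatl]l[^sermy]'     # T in column 1, L in column 4
case_b='[^sermyl]a[^sermyatl]lt'    # L in column 4, T in column 5
case_c='ta[^sermyatl][^sermyt]l'    # T in column 1, L in column 5
alts="^(${case_a}|${case_b}|${case_c})\$"

# Hypothetical sample words, not the real dictionary.
words='vault
talon
tabla
wagon
later
malty
tails
banal'

simple=$(printf '%s\n' "$words" | grep -E "$base" | grep l | grep t)
elaborate=$(printf '%s\n' "$words" | grep -E "$alts")

if [ "$simple" = "$elaborate" ]; then
    echo "same matches"
else
    echo "different matches"
fi
```

A matching result on a sample list is only evidence, not a proof; the column-by-column reasoning is what shows the two approaches have to agree for this puzzle state.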

Looking at my answers to this puzzle in detail in order to write this blog post, I see that my second guess, LATER, was less than useful. LATER re-uses E, which we know from the first guess isn’t in the solution. Going with MALTY on my second guess would also have been redundant, in that the T, known to exist from the first guess, does not change column. I’m going to have to remember to use FAULT as my second guess if L, A and T are yellow letters in my first guess.