How the Wordle Optimizer Works

This project is a data-driven approach to solving the daily Wordle puzzle. Instead of guessing randomly, it scores every candidate word and picks the starting word that is expected to reveal the most information.

1. The Method: Information Theory

The core of the optimizer is a concept from information theory called 'entropy'. In simple terms, entropy measures the amount of uncertainty in a piece of information. When applied to Wordle, it measures how much a guess is expected to reduce the number of possible answers. The goal is to find the word that, on average, provides the most information and eliminates the largest number of possible answers.

The solver uses the Shannon entropy formula, which calculates the average amount of information gained from a single guess. A high entropy score means the word is excellent at narrowing down the possibilities, while a low score means it doesn't give you much new information.
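
Written out, the score for a guess is the Shannon entropy of the feedback patterns it can produce. The symbols below are notation chosen for this explanation rather than names from the code: g is the guess, p ranges over the feedback patterns, N is the number of remaining possible answers, and n_p is how many of those answers would produce pattern p.

```latex
H(g) = -\sum_{p} \frac{n_p}{N} \log_2 \frac{n_p}{N}
```

For example, a guess that splits 4 remaining answers into groups of 2, 1, and 1 scores 0.5 × 1 + 0.25 × 2 + 0.25 × 2 = 1.5 bits. With five tiles and three colors there are at most 3^5 = 243 patterns, so no guess can score more than log2(243) ≈ 7.92 bits.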

2. The Logic: Step-by-Step

Load Word Lists: The function first connects to a Google Cloud Storage bucket I created and loads three lists: all possible Wordle answers, all valid guesses, and all previous answers. Reading the lists from the bucket on each run ensures the solver works with up-to-date data. The list of previous answers is updated daily, and the function then recalculates the best word with those previous answers removed from the candidate pool, since Wordle answers do not repeat.
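
The filtering part of this step can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function names `parse_word_list` and `exclude_previous` are hypothetical, and it assumes each list is stored as plain text with one word per line. The download from Cloud Storage is left out so the example runs locally.

```python
def parse_word_list(text):
    """Split raw file contents into lowercase five-letter words."""
    words = (line.strip().lower() for line in text.splitlines())
    return [w for w in words if len(w) == 5 and w.isalpha()]


def exclude_previous(possible_answers, previous_answers):
    """Drop answers that have already appeared, preserving order."""
    used = set(previous_answers)  # set lookup keeps the filter linear
    return [w for w in possible_answers if w not in used]


# Tiny stand-in data; the real lists would come from the bucket.
raw_answers = "crane\nslate\nabide\n"
raw_previous = "slate\n"
remaining = exclude_previous(parse_word_list(raw_answers),
                             parse_word_list(raw_previous))
print(remaining)  # ['crane', 'abide']
```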

The `calculate_entropy` function: This is the heart of the algorithm. For every possible guess word, the function calculates an entropy score. It does this by computing the feedback pattern (the sequence of green, yellow, and gray tiles) that the guess would produce against each remaining possible answer, and counting how many answers share each pattern. The entropy of that distribution is the guess's score: the more evenly a guess spreads the answers across many patterns, the more information it gives on average.

import math  # needed for math.log2

def calculate_entropy(self, guess, possible_answers):
    """Calculates the information entropy (in bits) for a guess."""
    # Count how many remaining answers produce each feedback pattern.
    patterns = {}
    for answer in possible_answers:
        pattern = self.get_pattern(guess, answer)
        patterns[pattern] = patterns.get(pattern, 0) + 1

    # Shannon entropy of the pattern distribution.
    entropy = 0.0
    total_answers = len(possible_answers)
    for pattern_count in patterns.values():
        probability = pattern_count / total_answers
        entropy -= probability * math.log2(probability)

    return entropy
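
The snippet above relies on `self.get_pattern`, which is not shown here. One plausible way to write it, as a standalone function, is the two-pass approach below. This is a sketch under assumptions, not the project's implementation: the return format (a string of `G`, `Y`, and `X` for green, yellow, and gray) is chosen for illustration. Any hashable value works, since the pattern is only used as a dictionary key.

```python
from collections import Counter


def get_pattern(guess, answer):
    """Return feedback as a string: G = green, Y = yellow, X = gray."""
    pattern = ["X"] * len(guess)
    # First pass: mark exact matches, and count the answer letters that
    # were not matched exactly so they stay available for yellows.
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    # Second pass: a misplaced letter is yellow only while unmatched
    # copies of it remain in the answer.
    for i, g in enumerate(guess):
        if pattern[i] != "G" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)


print(get_pattern("speed", "abide"))  # XXYXY
```

The second pass is what keeps repeated letters correct: in the example, only the first `e` of "speed" turns yellow because "abide" contains a single `e`.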

Find the Best Word: After calculating the entropy for all potential starting words, the function selects the word with the highest entropy score. Under this measure, that word is the first guess with the highest expected information gain.

Store Results: Finally, the function saves the best word and its score to a Firestore database. This allows the website to display the result without having to rerun the calculation every time a user visits the page.

Understanding The Score: Scores are measured in bits. A good entropy score is typically in the range of 4.5 to 6, which indicates the word is highly effective at eliminating possibilities. An average word might score between 3.5 and 4.5, while a bad word could score below 3.5, meaning it provides comparatively little information for the next guess.
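
The selection step and the score bands described above can be sketched as follows. The names `best_guess` and `score_band` are hypothetical, and the scoring function is passed in as a parameter so the example stays self-contained. Treating exactly 4.5 as "good" and exactly 3.5 as "average" is an assumption, since the ranges in the text do not say which side the boundaries fall on.

```python
def best_guess(guesses, possible_answers, score):
    """Return (word, score) for the highest-scoring guess.

    `score` is any callable taking (guess, possible_answers).
    On ties, the earliest word in `guesses` wins.
    """
    scores = {g: score(g, possible_answers) for g in guesses}
    best = max(scores, key=scores.get)
    return best, scores[best]


def score_band(entropy):
    """Map an entropy score in bits to the bands used in the text."""
    if entropy >= 4.5:
        return "good"
    if entropy >= 3.5:
        return "average"
    return "bad"


# Toy scoring function for demonstration only: number of distinct letters.
def toy_score(guess, answers):
    return len(set(guess))


word, value = best_guess(["aaaaa", "crane", "sleep"], [], toy_score)
print(word, value)  # crane 5
```

In the real solver, the entropy calculation would take the place of `toy_score`.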

3. The Tech Stack

This project demonstrates a fully serverless, cloud-native architecture on Google Cloud Platform:

Python: The core logic of the Wordle solver is written in Python.
Google Cloud Storage: This is used to store the word lists, which the function reads from during its execution.
Google Cloud Functions: The Python code is deployed as a serverless function, which means it only runs when someone requests the best word. No server sits idle between invocations, which keeps costs low and lets the platform handle scaling.
Google Cloud Firestore: This database is used to store the daily result, allowing the website to retrieve the word quickly without re-running the heavy computation.
HTML/CSS: A simple, static webpage displays the result and an explanation of the project.
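
As a rough illustration of how the storage pieces could fit together, the sketch below reads a word list from Cloud Storage and writes a result to Firestore using the official `google-cloud-storage` and `google-cloud-firestore` client libraries. Everything project-specific here is an assumption: the function names, the collection name `wordle_results`, the document ID `today`, the field names, and the one-word-per-line file format are all hypothetical. The client libraries are imported inside the functions so the pure helper can be tried without them installed.

```python
def load_word_list(bucket_name, blob_name):
    """Download a whitespace-separated word list from Cloud Storage."""
    from google.cloud import storage  # requires google-cloud-storage

    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    return blob.download_as_text().split()


def build_result_document(word, entropy):
    """Shape of the stored record; the field names are hypothetical."""
    return {"word": word, "entropy": round(entropy, 4)}


def store_result(word, entropy, collection="wordle_results", doc_id="today"):
    """Write the daily result to Firestore for the web page to read."""
    from google.cloud import firestore  # requires google-cloud-firestore

    db = firestore.Client()
    db.collection(collection).document(doc_id).set(
        build_result_document(word, entropy)
    )
```

Writing to a fixed document ID means each run overwrites the previous result, so the page always has exactly one record to fetch.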