Learn how a Wordle-solving robot was built using a 3D printer and a Raspberry Pi, tackling challenges like locating the phone screen, mapping printer bed coordinates, and selecting the best possible guesses.
[0:51] We’ve made a Wordle-solving robot!
[0:54] I’m actually just using my 3D printer as the robot and a Raspberry Pi for the brains.
[0:59] As usual, we’re sponsored by PCBWay - I’ve got a bunch of PCBs on their
[1:04] way for our next project so check out their link in the description.
[1:08] Let’s dive into how the robot works.
[1:10] We have several challenges to solve:
[1:12] First, we need to locate the phone’s screen.
[1:15] Once we have the screen location, we know where the boxes are and
[1:19] we know where each keyboard key is in the image.
[1:22] Our next challenge is working out where the printer bed is.
[1:25] If we know where the printer bed is, and we know where the phone screen is,
[1:29] we can tell the robot the coordinates it needs to enter guesses on the phone’s keyboard.
[1:34] Once a guess has been entered we need to work out
[1:36] what colour the boxes are for the letters we have entered.
[1:39] With these colours found we then need to refine our guesses until we hit the correct answer.
[1:45] We’ve done a similar project to this before with
[1:47] our augmented reality Sudoku solver - I’ve put a link to that video in
[1:51] the description and at the end of this video - it’s definitely worth watching.
[1:56] Let’s start off with the first challenge - how do we locate the phone screen?
[2:00] I’ve set my Wordle app to light mode, so I know the screen should be pretty bright.
[2:06] We don’t really care about the colour of the phone screen yet, so we can convert our
image to greyscale. The brightest object in the image should be the phone, so we apply a
[2:15] simple threshold to each pixel of the image. This gives us blobs of pixels that could be the phone.
[2:21] We examine each of these blobs and create a rectangle that approximates the shape.
[2:26] We then filter out any shapes that are the wrong aspect ratio
[2:30] or are too small. The largest one that remains should be the phone screen.
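As a rough sketch of that screen-finding step, here’s how it might look with OpenCV in Python - the threshold, minimum area and aspect-ratio range below are illustrative values, not the exact ones the robot uses:

```python
import cv2

def find_phone_screen(frame, thresh=200, min_area=10000):
    """Return the bounding box of the brightest phone-shaped blob, or None."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Keep only the brightest pixels - the light-mode screen should survive
    _, mask = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    best = None
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        aspect = h / float(w)
        # Reject blobs that are too small or not roughly phone-shaped (portrait)
        if w * h < min_area or not (1.5 < aspect < 2.5):
            continue
        if best is None or w * h > best[2] * best[3]:
            best = (x, y, w, h)
    return best
```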
[2:36] Since we know the size of the phone’s screen and we now have the corners of
[2:40] the screen in the image we have enough information to create a transform from
[2:44] points on the phone screen to points on the image.
[2:47] This transform lets us locate where the grid boxes and the keyboard keys are in the image.
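A minimal sketch of that transform using OpenCV’s homography helpers - the screen size and corner coordinates here are made-up example values:

```python
import cv2
import numpy as np

screen_w, screen_h = 390, 844  # assumed screen size in the phone's own units
screen_pts = np.float32([[0, 0], [screen_w, 0], [screen_w, screen_h], [0, screen_h]])
image_pts = np.float32([[112, 80], [410, 85], [405, 730], [108, 725]])  # detected corners

# Maps any point on the phone screen to its pixel location in the camera image
screen_to_image = cv2.getPerspectiveTransform(screen_pts, image_pts)

def screen_point_to_image(pt):
    p = np.float32([[pt]])  # shape (1, 1, 2), as perspectiveTransform expects
    return cv2.perspectiveTransform(p, screen_to_image)[0][0]
```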
[2:53] Finding the printer bed is a little bit more tricky. The printer bed is black
[2:57] and there’s quite a bit of black in the image from the printer structure and the printer base.
[3:02] To help us out a bit I’ve added some coloured dots to the print bed.
[3:06] There’s a nice trick that you can do if you are trying to
[3:08] find objects of a particular colour in an image.
[3:11] Normally, when we are dealing with colour images, each pixel is represented by three numbers
[3:16] giving the strength of the red, green and blue components of the colour.
[3:20] I’ve put a sample image up here with these three components split out.
[3:24] Mixing these three components together gives all the different colours in the image.
[3:29] There are however other ways of representing a colour.
[3:32] A good choice when trying to segment colours in an image is to convert the pixels to HSV colours.
[3:38] HSV colours also contain three values: the H, or hue, is the angle of the colour around the
[3:44] colour wheel; the saturation is how strong the colour is - 0% means no colour, 100% means
[3:48] full colour; and the value is how bright the colour is.
[3:51] With the image in HSV colours, it’s quite straightforward to filter out any pixels that are
[3:56] not our purple dots. We simply look for pixels with hue values that are in the correct range
[4:01] and filter out any pixels that are either too dark or are not saturated enough.
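Here’s what that HSV filtering might look like; the hue, saturation and value ranges are assumptions you’d tune for your own camera and dots:

```python
import cv2
import numpy as np

def find_purple_dots(frame):
    """Return the centre of each purple dot found in a BGR frame."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # OpenCV hue runs 0-179; purple sits very roughly around 130-160.
    # The saturation/value minimums reject washed-out and too-dark pixels.
    mask = cv2.inRange(hsv, np.array([130, 80, 80]), np.array([160, 255, 255]))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centres = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:  # use the contour's centroid as the dot centre
            centres.append((m["m10"] / m["m00"], m["m01"] / m["m00"]))
    return centres
```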
[4:06] Now that we’ve located our printer bed in the image we can create a
[4:09] mapping from the image to the printer bed locations.
[4:13] We now know how to map from coordinates on the phone’s screen to coordinates in the image.
[4:17] And we can map from coordinates in the image to coordinates on the printer bed. This lets us tell
[4:23] the printer where to move to so that it can touch the keys on the keyboard and tap in our guesses.
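Chaining the two mappings could look something like this. Here image_to_bed is assumed to be a second homography built from the detected dot centres and their known bed positions, and the G-code tap over a pyserial connection is an assumed mechanism rather than the robot’s exact code:

```python
import cv2
import numpy as np

def key_to_bed(key_screen_pt, screen_to_image, image_to_bed):
    """Map a keyboard key's screen coordinates all the way to bed coordinates."""
    p = np.float32([[key_screen_pt]])
    img_pt = cv2.perspectiveTransform(p, screen_to_image)        # screen -> image
    return cv2.perspectiveTransform(img_pt, image_to_bed)[0][0]  # image -> bed

def tap_key(serial_port, x, y, tap_z=2.0, lift_z=8.0):
    # Move above the key, dip the stylus to tap the screen, then lift again
    serial_port.write(f"G0 X{x:.1f} Y{y:.1f} Z{lift_z}\n".encode())
    serial_port.write(f"G0 Z{tap_z}\n".encode())
    serial_port.write(f"G0 Z{lift_z}\n".encode())
```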
[4:29] With our guess entered we now need to know what colour each letter is in the grid.
[4:34] To do this we can use the same technique we used to find the printer bed dots.
[4:38] This time we are looking for the colours green and orange.
[4:41] Here’s a sample image with some guesses filled in. And here’s the image filtered for green pixels
[4:46] and the image filtered for orange pixels.
[4:50] Since we know the location of each grid square in the image, we just need to count how many green
[4:54] and how many orange pixels there are, and that will tell us what colour the box is. If there
[4:59] are not enough green or orange pixels in the box then we know that there is no colour there.
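A sketch of that counting step, assuming green_mask and orange_mask come from the same cv2.inRange filtering used for the bed dots; the minimum pixel count is an illustrative threshold:

```python
import numpy as np

def classify_box(green_mask, orange_mask, box, min_pixels=50):
    """Return 'green', 'orange' or 'grey' for one grid box (x, y, w, h)."""
    x, y, w, h = box
    green = np.count_nonzero(green_mask[y:y + h, x:x + w])
    orange = np.count_nonzero(orange_mask[y:y + h, x:x + w])
    if green >= min_pixels and green >= orange:
        return "green"   # letter in the correct position
    if orange >= min_pixels:
        return "orange"  # letter in the word, wrong position
    return "grey"        # letter not in the word
```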
[5:04] So, we can now enter guesses, and we can see how good our guesses are.
[5:08] How do we actually pick the best words to guess?
[5:12] There’s been a lot of research into which word is the best first guess. I’ve linked to a couple
[5:17] of these in the description. According to the 3Blue1Brown channel, the best word is
[5:22] “salet” which apparently is “a light round helmet extending over the back of the neck”.
[5:28] My own top words that I found from experimenting are these ones, but I am not
an expert in information theory, so do your own research and watch the videos that I linked to.
[5:38] Every time we make a guess, we get feedback from wordle on the letters
[5:42] that are not in the word, the letters that are in the word but in the wrong position,
[5:46] and the letters that are in the word but in the correct position.
[5:49] From the wordle source code, we can extract all the possible words that it will accept
as a guess - there are almost 13,000 in total. Every time we make a guess, we use
[5:59] the feedback from wordle to filter this list so it only contains the words that are possible.
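The filtering step could be sketched like this, assuming the feedback is encoded as a 5-character string of ‘g’ (green), ‘o’ (orange) and ‘.’ (grey):

```python
def score(guess, answer):
    """Compute Wordle-style feedback for a guess against a candidate answer."""
    result = ["."] * 5
    remaining = list(answer)
    # First pass: mark exact matches and use up those answer letters
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
            remaining.remove(g)
    # Second pass: mark right-letter-wrong-place against the leftovers
    for i, g in enumerate(guess):
        if result[i] == "." and g in remaining:
            result[i] = "o"
            remaining.remove(g)
    return "".join(result)

def filter_words(words, guess, feedback):
    """Keep only the words that would have produced the observed feedback."""
    return [w for w in words if score(guess, w) == feedback]
```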
[6:04] You can see this happening here. Every time we make a guess,
[6:07] the number of possible words is reduced.
[6:10] I’ve hardcoded the first guess the robot makes to be one of the top five
[6:13] first guesses. For subsequent guesses, we can look at all the possible words that remain
[6:18] and pick the word that on average gives us the smallest number of follow-up guesses.
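That greedy choice could look like this, reusing score() from the sketch above. This is one plausible reading of “smallest number of follow-up guesses on average” - minimising the expected size of the surviving word list - rather than the robot’s exact scoring:

```python
from collections import Counter

def best_guess(candidates):
    def expected_remaining(guess):
        # Partition the candidates by the feedback each possible answer would
        # give; the expected surviving set size is sum(n^2) / total.
        buckets = Counter(score(guess, answer) for answer in candidates)
        return sum(n * n for n in buckets.values()) / len(candidates)
    return min(candidates, key=expected_remaining)
```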
[6:23] The image processing and solving work really well -
[6:26] we have quite a nice, controlled environment, so you would expect this to be the case.
[6:30] If you like this kind of project then you should definitely take a look at my augmented
[6:34] reality Sudoku solver - it does a lot more image processing and even uses some machine learning.
[6:39] Thanks for watching and I’ll see you in the next video.