Learn how to use ChatGPT to enhance your blog's homepage, create summaries and tags, find related articles, and generate post images with ease, leveraging AI to save valuable time and effort.
[0:00] Hey everyone. So I wasn’t going to do a video on this, but I ended up doing a bunch of interesting
[0:04] things that I thought were worth sharing. You may not be aware of it, but alongside the channel
[0:09] we’ve got a blog. I normally post text versions of the videos, along with posts that don’t really
[0:13] make sense as a video. It’s a pretty fun side project; the site is built using a tool
[0:18] called Jekyll. This takes a bunch of markdown files and compiles them into a set of static
[0:23] HTML pages. All this is hosted on AWS using an S3 bucket behind CloudFront. The nice thing about
[0:29] this is there’s no backend, so when a lot of traffic hits the site, it doesn’t really cost me
[0:34] anything. There have been a few annoying things that I’ve been meaning to fix for ages. The first one
[0:39] is that there are no related posts. This means that people tend to come and read one article and then
[0:44] disappear. There are also no tags on the posts, so there’s no way to organise anything on the blog.
[0:50] The final annoying thing, at least to me, is that the homepage of the blog is arranged in
[0:54] tiles with nice images that represent each post. Lots of the older posts don’t have images,
[0:59] so if you go back in history, you end up with quite a boring homepage. I’m also using the
[1:04] first few lines from each post as teaser text on this page. This works, but they aren’t really
[1:09] good representations of the actual post content. All these problems are fixable with a bit of
[1:14] manual labour. I could go through each post, create a little summary, create some tags,
[1:19] trawl through all the other posts identifying similar posts, and I could create all the missing
[1:23] post images. But really, life is way too short. Let’s get our friendly AI to do it all for us.
[1:29] I’ve made this really simple prompt for ChatGPT. We give it our post content and then just ask it
[1:35] to output some JSON with the fields that we want. We’ve got a couple of summaries. I thought it might
[1:40] be interesting to have a simple summary and one that is optimised for search engines. Now we’ve
[1:44] got the list of tags. I’ve run this against all the posts and I’ve written the results out to the
[1:49] front matter of each post. It’s worked really well. What I found interesting was that to me,
[1:54] the SEO summary was a bit nicer than the normal summary, so I’ve ended up using that.
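Here’s a rough sketch of the kind of script involved. The exact prompt wording, the front-matter field names, and the use of the python-frontmatter library are just one way to do it, not necessarily what I ran:

```python
# Sketch: ask ChatGPT for summaries and tags for a post, then write
# them back into the post's front matter. Field names are assumptions.
import json
import frontmatter  # pip install python-frontmatter
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = """Here is a blog post. Reply with JSON only, in this shape:
{"summary": "...", "seo_summary": "...", "tags": ["...", "..."]}

POST:
"""

def annotate_post(path):
    post = frontmatter.load(path)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT + post.content}],
    )
    # Assumes the model actually returns pure JSON - worth validating.
    fields = json.loads(response.choices[0].message.content)
    post["description"] = fields["seo_summary"]  # used as the teaser text
    post["tags"] = fields["tags"]
    with open(path, "w") as f:
        f.write(frontmatter.dumps(post))

annotate_post("_posts/2023-04-01-example-post.md")
```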
[1:59] After a bit of scripting and messing around, we’ve got tags on all our posts
[2:03] and we’ve got a tag index page. I do think it needs a bit more work to rationalise the tags,
[2:08] but it ticks the box and it works. We can visualise the tags using a word cloud.
[2:12] You can definitely see my recent interest in ChatGPT showing up.
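The word cloud itself is only a few lines with the wordcloud package. A sketch, assuming the tags are already sitting in each post’s front matter:

```python
# Sketch: build a word cloud from tag frequencies across all posts.
from collections import Counter
import glob
import frontmatter
from wordcloud import WordCloud  # pip install wordcloud

tag_counts = Counter()
for path in glob.glob("_posts/*.md"):
    tag_counts.update(frontmatter.load(path).metadata.get("tags", []))

cloud = WordCloud(width=800, height=400, background_color="white")
cloud.generate_from_frequencies(tag_counts)
cloud.to_file("tag-cloud.png")
```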
[2:17] Our next challenge is the related articles. Here we can leverage some pretty powerful technology
[2:22] called embeddings. Embeddings are surprisingly simple. They are just an array of numbers that
[2:27] encode the semantic meaning of some text. We can get embeddings for all our article summaries.
[2:32] Since these embeddings are just vectors of numbers, we can use a simple formula, cosine similarity, to work
[2:36] out how similar they are to each other. If the embeddings are similar, then the article content
[2:41] should be similar or at least a close topic. It’s really fun to visualise this. I’ve
[2:45] created a dendrogram plot of the embeddings and you can see just how well articles cluster together.
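Here’s a sketch of the whole related-articles pipeline. The embedding model name and the choice of three related posts are just illustrative:

```python
# Sketch: embed the summaries, compare them with cosine similarity,
# and plot a dendrogram to see how the posts cluster.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram
from openai import OpenAI

client = OpenAI()

titles = ["Post one", "Post two", "Post three"]
summaries = ["Summary one...", "Summary two...", "Summary three..."]

response = client.embeddings.create(
    model="text-embedding-ada-002", input=summaries
)
vectors = np.array([item.embedding for item in response.data])

# Cosine similarity: dot products of the unit-length vectors.
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
similarity = unit @ unit.T

# For each post, the most similar other posts become the related articles.
for i, title in enumerate(titles):
    order = np.argsort(-similarity[i])
    related = [titles[j] for j in order if j != i][:3]
    print(title, "->", related)

# Hierarchical clustering of the embeddings, drawn as a dendrogram.
plt.figure(figsize=(8, 6))
dendrogram(linkage(vectors, method="average", metric="cosine"), labels=titles)
plt.tight_layout()
plt.savefig("dendrogram.png")
```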
[2:51] So we’ve got our new summaries, tags and we’ve got related articles. We just need the images now.
[2:56] Once again, we can turn to our friend ChatGPT and ask it to give us a prompt that could be fed
[3:01] into an image generation system. It’s not always great at doing this. It often strays into results
[3:06] that are more like briefing a design agency. Here’s the output for one of my posts. It’s a
[3:11] bit verbose, but we can feed it into something like DALL·E 2.
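Feeding the prompt in takes just a few lines with the OpenAI library. A sketch; the prompt here is a placeholder:

```python
# Sketch: generate four candidate images for a post's prompt with DALL·E 2.
from openai import OpenAI

client = OpenAI()

response = client.images.generate(
    model="dall-e-2",
    prompt="A flat illustration of ...",  # the prompt ChatGPT wrote for the post
    n=4,
    size="1024x1024",
)
for image in response.data:
    print(image.url)  # short-lived URLs to download from
```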
[3:16] The results are not particularly fantastic, especially when you compare them against something like MidJourney. MidJourney always seems
[3:21] to produce amazing images, even when the prompt is pretty rubbish. The problem is, there’s no API
[3:27] for MidJourney and scripting MidJourney is strictly against the terms of service. So we’re not allowed
[3:33] to use MidJourney unless we copy and paste each prompt into it. But it’s fun to think about how
[3:38] you would script something like MidJourney. Not that I would recommend doing this. The following
[3:43] is purely a thought experiment. What are our options? MidJourney can only be driven via Discord.
[3:52] This runs as either an app on your desktop or in a web browser. One thing we could try is
[3:57] automating the GUI on the desktop. There are a few libraries for Python that would let us do this.
[4:02] You can drive the mouse and keyboard, sending clicks and key presses. It works, but it’s
[4:07] extremely fragile. There are some libraries that will attempt to do OCR and image recognition to
[4:12] find GUI elements, but this seems a bit too complicated for me.
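Purely for the thought experiment, the desktop approach with a library like pyautogui would look something like this. The screen coordinates are guesses, which is exactly why it’s so fragile:

```python
# Thought experiment only: drive Discord's desktop app with pyautogui.
# The screen coordinates are guesses and will break on any layout change.
import time
import pyautogui  # pip install pyautogui

pyautogui.click(400, 980)  # click the message box (position assumed)
pyautogui.write("/imagine ", interval=0.05)
time.sleep(1)  # wait for Discord's slash-command popup to appear
pyautogui.write("a flat illustration of a blog post about Jekyll")
pyautogui.press("enter")
```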
[4:18] The other alternative is to use tools that have been developed for testing websites. A tool like
[4:23] Selenium lets us script Chrome through the WebDriver protocol. We can then locate GUI elements in the Discord web page and simulate
[4:28] clicks and typing. This is still quite fragile, but for a one-off job, it should work well enough.
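The Selenium version of the same thought experiment might look like this. The CSS selector is a placeholder; Discord’s real markup is obfuscated and changes often:

```python
# Thought experiment only: script the Discord web app with Selenium.
# The CSS selector is a placeholder, not Discord's real markup.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

driver = webdriver.Chrome()
driver.get("https://discord.com/channels/SERVER_ID/CHANNEL_ID")
input("Log in manually, then press enter...")  # avoids scripting the login

box = driver.find_element(By.CSS_SELECTOR, "div[role='textbox']")
box.send_keys("/imagine a flat illustration of a blog post about Jekyll")
box.send_keys(Keys.ENTER)
```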
[4:33] Once again though, I must remind you not to do this. You may lose access to your MidJourney account
[4:38] if they detect your scripting attempts. Assuming you get all your images generated,
[4:43] you now have another problem. You need to download the results. It may be possible to do this using
[4:48] the WebDriver approach, but it’s a bit complicated. However, if we look at the MidJourney website,
[4:54] we can see all our generated images. Maybe we can get them from here.
[4:58] Whenever I am faced with a web scraping problem, my first port of call is the Network tab.
[5:03] If we do this on the MidJourney site, we can see some interesting traffic. These network requests,
[5:08] hidden amongst all the tracking pings that are blocked by my browser, look very interesting.
[5:12] These are the results from all our image generation attempts. Expanding the results out,
[5:16] we can see that we’ve got our image URLs and we’ve got the prompt. As we scroll down the page,
[5:21] we can see that it hits this endpoint to get more images. We could try and get clever here
[5:26] and script the API, but we only want a couple of pages, so we can just copy the responses
[5:30] straight out of Chrome and stick them into a file.
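With the responses saved to a file, downloading the images is straightforward. The JSON field names here are guesses based on what the Network tab shows; the real ones may differ:

```python
# Sketch: download every image from the responses copied out of the
# Network tab. The JSON field names are assumptions.
import json
import os
import requests

os.makedirs("images", exist_ok=True)

with open("midjourney-responses.json") as f:
    jobs = json.load(f)

for job in jobs:
    for i, url in enumerate(job["image_urls"]):
        image = requests.get(url, timeout=30)
        with open(f"images/{job['id']}-{i}.png", "wb") as out:
            out.write(image.content)
```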
[5:35] Our final problem is a pretty nice one to have. We’ve got too many images. Each prompt has generated
[5:41] four results. Which one should we use? So I asked ChatGPT how to detect whether an image was interesting. It gave me a lot of suggestions,
[5:47] ranging from pretty easy to really difficult. In the end, I just went with picking the image
[5:52] with the highest contrast. It’s pretty arbitrary, but all the images look nice, so it doesn’t really matter.
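Measuring contrast as the standard deviation of the grayscale pixel values, the picking script is just a few lines. A sketch, with a hypothetical file naming scheme:

```python
# Sketch: pick the image with the highest contrast, measured as the
# standard deviation of its grayscale pixel values.
import glob
import numpy as np
from PIL import Image  # pip install pillow

def contrast(path):
    pixels = np.asarray(Image.open(path).convert("L"), dtype=float)
    return pixels.std()

candidates = glob.glob("images/some-post-id-*.png")  # the four results
best = max(candidates, key=contrast)
print("Using", best)
```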
[5:57] With the images picked, we’ve now got a thing of beauty. The blog is looking a lot nicer
[6:02] and has a lot more functionality. If you want to learn a bit more about ChatGPT, then you should
[6:06] watch this video that should be appearing right now, somewhere on the screen.
[6:11] I’ll be back soon with some more electronics-focussed videos, so I’ll see you then.