In March 2025, OpenAI released a feature called 4o Image Generation. This update to ChatGPT’s image generation capabilities brings a number of improvements, such as more accurate text, better instruction adherence, and improved photorealism.
The process isn’t instantaneous, however. Watching the images appear in real time takes me back to the good old days of dial-up.
ChatGPT Images and the Slow Reveal
Many AI images are generated by starting with random noise, like the static you see in the intro to HBO shows. The AI model then refines that noise based on the prompt, with each iteration looking less like random noise and more like the intended image. Eventually, after enough iterations, the image should resemble the prompt.
This means that generating an image takes time. With some AI models, you can watch the process happen, seeing the image go from fuzzy static to a finished image. Each step shows the state of the full image before the next iteration takes place.
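To make the idea of step-by-step refinement concrete, here is a toy sketch in Python. It is not how any real image model works: a real model uses a trained neural network to decide how to clean up the noise, whereas this example cheats by already knowing the finished image. The names `refine_from_noise` and `mean_abs_error` are made up for illustration. What it does show is the pattern described above, where every intermediate frame is the whole image, just a little less noisy than the one before.

```python
import random


def refine_from_noise(target, steps=10, seed=0):
    """Toy refinement loop: start from random noise, then nudge every
    pixel toward the target a little more on each step.

    ASSUMPTION: a real diffusion model predicts the cleanup with a trained
    network; here we simply blend toward a known target for illustration.
    """
    rng = random.Random(seed)
    image = [rng.random() for _ in target]  # pure static to begin with
    frames = [list(image)]                  # keep every intermediate frame
    for step in range(1, steps + 1):
        # Close an equal share of the remaining gap on each step,
        # so the final step lands on the target.
        weight = 1.0 / (steps - step + 1)
        image = [px + weight * (t - px) for px, t in zip(image, target)]
        frames.append(list(image))
    return frames


def mean_abs_error(image, target):
    """Average distance between an intermediate frame and the target."""
    return sum(abs(a - b) for a, b in zip(image, target)) / len(target)


if __name__ == "__main__":
    target = [0.0, 0.5, 1.0, 0.25]  # a tiny four-pixel "image"
    for i, frame in enumerate(refine_from_noise(target, steps=5)):
        print(i, round(mean_abs_error(frame, target), 3))
```

Running it prints an error value that shrinks with every step, which is the numerical version of watching static turn into a picture.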
4o Image Generation is a little different. It first shows a very blurry depiction of what the final image will look like, and then the image gradually clarifies. Rather than this happening to the entire image at once, however, it happens from the top down.
The top of the image is finished first, while the rest remains a blur. The boundary between the completed and fuzzy image slowly moves down the image so that you don’t see the completed image until it reaches the bottom.
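The reveal itself is easy to mimic. The sketch below only imitates what you see on screen and assumes nothing about how OpenAI actually produces the frames. It takes a finished image and a blurry stand-in, both as lists of rows, and builds each frame by taking everything above a moving boundary from the finished image and everything below it from the blurry one. The function name `top_down_frames` is hypothetical.

```python
def top_down_frames(final_rows, blurry_rows):
    """Build the sequence of frames for a top-down reveal.

    Rows above the boundary come from the finished image, rows below it
    from the blurry preview. ASSUMPTION: this imitates the visible effect
    only and is not a description of OpenAI's implementation.
    """
    if len(final_rows) != len(blurry_rows):
        raise ValueError("both images need the same number of rows")
    frames = []
    for boundary in range(len(final_rows) + 1):
        frames.append(final_rows[:boundary] + blurry_rows[boundary:])
    return frames


if __name__ == "__main__":
    final = ["#..#", ".##.", ".##.", "#..#"]  # the finished picture
    blurry = ["~~~~"] * 4                      # the fuzzy preview
    for frame in top_down_frames(final, blurry):
        print("\n".join(frame))
        print()
```

The first frame is all blur, the last is the finished image, and each one in between has the boundary one row lower.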
A Flashback to the Dial-Up Days
The first time I saw this happen, I was immediately thrown back 30 years to the days of dial-up internet. Back then, the fastest speeds you could get were 56 Kbps, and the reality was usually much slower. These speeds were so slow that downloading a 100 KB image could easily take 30 seconds or more.
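If you want to check the math, here is a quick back-of-the-envelope calculation. It assumes 1 KB is 1,000 bytes and ignores modem compression and protocol overhead, so treat the results as rough estimates. The 28.8 kbps figure is just an example of a slower real-world connection that I picked for illustration, and `download_seconds` is a made-up helper name.

```python
def download_seconds(size_kb, speed_kbps, bytes_per_kb=1000):
    """Rough download time in seconds for a file of size_kb kilobytes
    over a link running at speed_kbps kilobits per second.

    ASSUMPTIONS: 1 KB = 1000 bytes, no compression, no protocol overhead.
    """
    bits = size_kb * bytes_per_kb * 8   # kilobytes -> bits
    return bits / (speed_kbps * 1000)   # kilobits/s -> bits/s


if __name__ == "__main__":
    print(round(download_seconds(100, 56), 1))    # 14.3 at the theoretical maximum
    print(round(download_seconds(100, 28.8), 1))  # 27.8 on a slower connection
```

Even at the full 56 kbps, a 100 KB image needs around 14 seconds, and at half that speed you are already close to the 30-second mark.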
The way that images downloaded over dial-up is very similar to how ChatGPT’s new images appear. Each row of pixels would load from the top down, meaning you would see the top of the image first and have to wait for the rest of the image to load before you could see it.
Why the Slowdown?
It’s not entirely clear why ChatGPT’s new image-generation feature uses this new top-down method. DALL-E, the previous image-generation model from OpenAI, didn’t behave in the same way.
The images generated using 4o Image Generation are certainly far superior to those generated using DALL-E, and producing better images is likely to take more time. According to a tweet from OpenAI’s CEO Sam Altman, a lot of ChatGPT users are using the feature quite heavily, to the point where the company is considering limiting its use temporarily. If OpenAI’s GPUs are “melting,” then image generation is likely to take longer than it otherwise would.
This would explain why the images load slowly, but not why they are refined from the top down. Whether this is a consequence of the way the images are generated or because someone at OpenAI really misses the dial-up days is unclear.
There’s Something to be Said For Having to Wait
We live in a world of instant gratification. We have access to the sum total of all human knowledge in our back pockets, and we mostly take it for granted. We never really have to wait for things anymore, except when companies like Apple cruelly dish out episodes of Severance at a rate of one a week.
I hate the fact that if I have to wait 30 seconds for a lift or for the commercials to finish, my hand will automatically be reaching for my phone, to fill those seconds with some mindless scrolling. I have to go to extreme lengths to stop myself from doomscrolling at every available opportunity.

But there’s something to be said for having to wait for something good. The slow loading of images in the dial-up days was frustrating, especially if the information you needed (or the bit of the image you most wanted to see) was at the bottom and was the last thing to load.
There was something quite magical about watching the image appear before your eyes, however, and I didn’t realize how much I missed that until ChatGPT reminded me of it.
Slow Generation May Not Be Around For Long
While I’m really enjoying the experience of watching my images slowly appear before my eyes, I may not be able to enjoy it for long. The pace of AI developments shows no sign of slowing down. It wasn’t long ago that AI images were hilariously easy to detect just by looking at the mangled hands, but current AI-generated images are getting seriously hard to spot.
As this technology improves, it’s likely that image generation will get even quicker, and the slow reveal will be gone forever. I plan to enjoy it while I can, because you don’t know what you’ve got until it’s gone.