How They Work and What They Can Do


Summary

  • AI image generators rely on diffusion: they learn to remove noise step by step, which lets them turn pure noise into a realistic picture.
  • Ongoing training and refinement of AI models, with user input, have greatly improved image quality over the years.
  • Text prompts are used to generate images, with additional parameters and generative fill tools enhancing results.

AI-generated images are everywhere now, and the very best of them look so good you’d never know they were made by a machine and not by a human. But how is this possible? The answer to how AI image generation works is both simple and very complicated.

It’s All About Diffusion

At the heart of AI-generated images is the concept of “diffusion.” This is the basic process behind most of today’s image-generating AI, and it goes something like this:

  1. The diffusion process begins with a dataset of existing images. Noise, or random distortions, is gradually added to these images until they become nearly unrecognizable.
  2. The AI model learns to reverse this process by removing the noise step by step. This involves training the model to predict what the image looked like before the noise was added.
  3. Once trained, the model can start with pure noise and apply what it learned to generate entirely new, realistic images by reversing the noise process.
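To make step 1 a little more concrete, here is a minimal sketch of the noising half of the process in plain Python. The “image” is just a short list of brightness values, and the linear noise schedule (`beta_start`, `beta_end`) is an assumption borrowed from common diffusion setups, not something any particular product is confirmed to use. All function names are made up for illustration.

```python
import math
import random

def noise_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return how much of the original signal survives after each step."""
    alpha_bars = []
    running = 1.0
    for i in range(num_steps):
        # Assumed linear schedule: a little more noise is mixed in at each step.
        beta = beta_start + (beta_end - beta_start) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(pixels, alpha_bar, rng):
    """Blend clean pixels with Gaussian noise; return (noisy_pixels, noise)."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    noisy = [signal_scale * p + noise_scale * n for p, n in zip(pixels, noise)]
    return noisy, noise

rng = random.Random(0)
image = [0.0, 0.25, 0.5, 0.75, 1.0]  # a toy five-pixel "image"
alpha_bars = noise_schedule(num_steps=1000)

slightly_noisy, _ = add_noise(image, alpha_bars[9], rng)   # early step: still recognizable
mostly_noise, _ = add_noise(image, alpha_bars[-1], rng)    # last step: almost pure noise
```

During training, a model would be shown the noisy pixels and asked to guess the noise that was added (or, equivalently, the clean image), which is why `add_noise` hands back both.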

Generative AI image generators use a special type of neural network to learn from this data. When you generate an image, the network itself isn’t run backward; it’s the noising process that gets reversed. You start with pure noise, and the model removes a little of it at each step, guided by your text prompt, until a finished image emerges.
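The generation loop itself can be sketched in a few lines. This is a toy under heavy assumptions: the “model” below is a stand-in that cheats by already knowing the target image, the schedule is invented for the example, and the update rule is a simple deterministic one chosen for readability. A real generator replaces the cheat with a trained neural network that is steered by the text prompt.

```python
import math
import random

def make_alpha_bars(num_steps):
    """Made-up schedule: signal fraction falls linearly from about 0.98 to 0.001."""
    return [1.0 - 0.999 * (i + 1) / num_steps for i in range(num_steps)]

def oracle_noise_predictor(noisy, alpha_bar, target):
    """Stand-in for a trained network: it cheats by knowing the target image."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - signal_scale * t) / noise_scale for x, t in zip(noisy, target)]

def sample(target, num_steps=50, seed=0):
    """Start from pure noise and remove a little of it at each step."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for step in reversed(range(num_steps)):
        ab = alpha_bars[step]
        ab_prev = alpha_bars[step - 1] if step > 0 else 1.0
        eps = oracle_noise_predictor(x, ab, target)
        # Estimate the clean image implied by the predicted noise.
        clean_guess = [(xi - math.sqrt(1.0 - ab) * e) / math.sqrt(ab)
                       for xi, e in zip(x, eps)]
        # Move to the slightly less noisy version for the previous step.
        x = [math.sqrt(ab_prev) * g + math.sqrt(1.0 - ab_prev) * e
             for g, e in zip(clean_guess, eps)]
    return x

target = [0.0, 0.25, 0.5, 0.75, 1.0]
result = sample(target)
```

Note that the predictor is always called in the normal forward direction. What runs “in reverse” is the sequence of noise levels, from the noisiest step back to the clean image.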

Every Day Is Training Day for AI Image Generators

The above process makes it seem simple, but AI image generation models are constantly being refined and improved, using as much data as possible. For example, when you vote for which images you like best on sites like Midjourney, you’re providing data that can help refine the model.

Early AI image generators were pretty awful. For example, here is an image of a woman eating an apple created using Midjourney V1 versus the latest (as of this writing) V6.

We went from nightmare fuel to “is that a real photo?” in just a few short years, all thanks to continued refinement and training of the model, as well as tweaking of the underlying neural nets that make this possible.

Turning Prompts Into Pictures

I alluded to this above, but when you as the user create images using AI, what you actually provide as input is a text prompt. This is simply a description like “a woman eating an apple”, which is the exact prompt I used to generate the two images above.

It takes a fair amount of experimentation with prompts to get the results you want, and sometimes you’ll hit upon a set of words or phrases that really creates something new and interesting.

Parameters, Generative Fill, and Other Neat Tricks

Of course, knowing how to prompt the right way and having a few specialized commands under your belt can make great results from cutting-edge models even better. Tweaking the options and making use of the post-generation tools that modern models offer are key to making perfect AI images.

The Midjourney web image option panel.

Generative fill is one of the most useful aspects of this AI technology. This allows you to erase a part of an image, and then use the AI to fill in something new based on a prompt or simply the context of an image.

Personally, I use this all the time to fix issues such as characters with too many fingers. You can also find this built into modern photo editors, in programs like Adobe Photoshop and in Canva’s Magic Erase feature.
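The erase-and-fill idea comes down to a mask: pixels you erased get replaced, and everything else is left alone. Here is a rough sketch of that bookkeeping. The `fill_from_context` function is a hypothetical stand-in that just averages nearby pixels; a real tool would ask the image model to generate the missing region instead.

```python
def fill_from_context(image, mask, row, col):
    """Hypothetical stand-in for the model: average the unmasked neighbors."""
    neighbors = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(image) and 0 <= c < len(image[0])
            if (dr or dc) and inside and not mask[r][c]:
                neighbors.append(image[r][c])
    return sum(neighbors) / len(neighbors) if neighbors else 0.0

def generative_fill(image, mask):
    """Replace only the masked (erased) pixels; keep the rest untouched."""
    return [
        [fill_from_context(image, mask, r, c) if mask[r][c] else image[r][c]
         for c in range(len(image[0]))]
        for r in range(len(image))
    ]

# A 3x3 grayscale image with one unwanted bright pixel in the middle.
image = [[0.2, 0.2, 0.2],
         [0.2, 0.9, 0.2],
         [0.2, 0.2, 0.2]]
mask = [[False, False, False],
        [False, True, False],
        [False, False, False]]  # True marks the erased area

filled = generative_fill(image, mask)
```

Only the masked pixel changes, which is why generative fill can fix one hand without redrawing the whole picture.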


Generative AI has now advanced to the point where it can create video, and the models are becoming much better at producing exactly what we ask for, including details about poses, objects, and how they should be arranged in the image.

While this technology still isn’t perfect, it has advanced so much in such a short time that I expect it to be fully mature sooner rather than later.


