Google shows AI tool that converts text into photo-realistic images

Google researchers have built an AI tool that generates realistic images from text input. The researchers call their tool ‘Imagen’ and state that people find its results more realistic than those of the comparable tool DALL-E 2 from OpenAI.

Imagen can generate images based on a text description. The result can be, for example, an ‘oil painting’ or a photo-realistic image. The latter is much more challenging to do convincingly with artificial intelligence. Imagen excels at this, its makers say.

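Imagen relies on a large pre-trained language model of the same kind as GPT-3 to interpret the text (the paper uses a T5 text encoder). That model is ‘frozen’, meaning its weights are not updated while the image model is trained; according to the researchers, that produces the best results. A diffusion model then turns random noise into an image, guided by the encoded text input.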
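As a rough illustration of how a diffusion model can turn random noise into an output steered by text, here is a minimal toy sketch in Python. It is not Imagen's method: the "image" is a single number, `toy_text_encoder` is a hypothetical stand-in for a frozen text encoder, the linear noise schedule is an assumption, and the noise estimate is computed exactly (an "oracle") rather than learned by a neural network as in a real system.

```python
import math
import random


def toy_text_encoder(prompt: str) -> float:
    """Hypothetical stand-in for a frozen text encoder.

    Maps a prompt to one number in [0, 1]. It has no trainable state,
    so it is trivially 'frozen'.
    """
    return (sum(ord(ch) for ch in prompt) % 256) / 255.0


def sample(prompt: str, steps: int = 1000, seed: int = 0) -> float:
    """Run DDPM-style reverse diffusion for a single 'pixel'."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    rng = random.Random(seed)
    target = toy_text_encoder(prompt)

    # Linear noise schedule (an assumption of this sketch).
    betas = [1e-4 + (0.02 - 1e-4) * t / (steps - 1) for t in range(steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars = []
    running = 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)

    x = rng.gauss(0.0, 1.0)  # start from pure random noise
    for t in reversed(range(steps)):
        sigma = math.sqrt(1.0 - alpha_bars[t])
        # Oracle noise estimate: exact only because the toy "dataset" is the
        # single value `target`. A real model learns this estimate.
        eps_hat = (x - math.sqrt(alpha_bars[t]) * target) / sigma
        mean = (x - betas[t] / sigma * eps_hat) / math.sqrt(alphas[t])
        # Fresh noise is added at every step except the last one.
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0
        x = mean + math.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    prompt = "A photo of a raccoon wearing an astronaut helmet"
    print(toy_text_encoder(prompt), sample(prompt))
```

Whatever random starting point is used, the final step of this toy lands on the value derived from the prompt. That is the sense in which the text guides the denoising: the noise is random, the destination is not.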

Initially, Imagen creates a small image of 64×64 pixels. Super-resolution diffusion models then enlarge it to a final result of 1024×1024 pixels. The AI tool can thus generate convincing images of things that do not exist, based on sentences such as “A dragonfruit wearing a karate belt in the snow” and “A photo of a raccoon wearing an astronaut helmet, looking out of the window at night”.
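To get a feel for the jump from 64×64 to 1024×1024 pixels, the short Python sketch below works out the intermediate sizes and the increase in pixel count. It only does arithmetic and involves no model. The function names are made up for this example, and the factor of four per upscaling stage is an assumption of the sketch, not something stated in the article.

```python
def cascade_resolutions(base: int = 64, final: int = 1024, factor: int = 4) -> list[int]:
    """Return the image side length after each stage, from base up to final."""
    if base <= 0 or factor < 2:
        raise ValueError("base must be positive and factor at least 2")
    sizes = [base]
    while sizes[-1] < final:
        sizes.append(sizes[-1] * factor)
    if sizes[-1] != final:
        raise ValueError("final is not reachable from base with this factor")
    return sizes


def pixel_ratio(small: int, large: int) -> int:
    """How many times more pixels a large x large image has than a small x small one."""
    return (large * large) // (small * small)


if __name__ == "__main__":
    sizes = cascade_resolutions()
    print(" -> ".join(f"{s}x{s}" for s in sizes))  # 64x64 -> 256x256 -> 1024x1024
    print(pixel_ratio(sizes[0], sizes[-1]))        # 256
```

Under that assumption the final image has 256 times as many pixels as the first one, so most of the visible detail has to be filled in during upscaling.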

The researchers have published a paper explaining how Imagen works. In it, they also compare their AI tool with other tools that generate images. According to the researchers, people prefer Imagen’s creations.

Imagen is not the first AI tool that can generate images based on text input. OpenAI previously introduced DALL-E 2. According to its makers, this is a tool that can generate realistic images and art based on text. DALL-E 2 can also make variations of existing artworks.
