Stable Diffusion XL Turbo can generate images in real time


Stability AI has introduced its Stable Diffusion XL Turbo model. This AI model can generate images from text in real time. According to the maker, the number of steps needed to generate an image has been reduced from fifty to between one and four.

The SDXL Turbo model from Stability AI is based on a new distillation technique called Adversarial Diffusion Distillation, or ADD. This technique enables the model to generate images with virtually no waiting time while maintaining high quality. Because the model works in real time, it can display images while the user is still typing a prompt. According to the maker, SDXL Turbo is capable of generating an image in a single sampling step, whereas the current Stable Diffusion XL model uses fifty steps.
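To make the single-step claim concrete, here is a minimal sketch of what a one-step generation call could look like. It is an illustration, not Stability AI's own code: it assumes the Hugging Face `diffusers` and `torch` packages, a CUDA GPU, and the `stabilityai/sdxl-turbo` checkpoint. The names `TURBO_SETTINGS` and `generate` are hypothetical helpers introduced for this example, and the `guidance_scale` of 0.0 is an assumption based on the public model card rather than on this article.

```python
# Sketch only. Assumes: `diffusers` + `torch` installed, a CUDA GPU, and
# the "stabilityai/sdxl-turbo" checkpoint on Hugging Face.

# Hypothetical defaults reflecting the article's single-step claim.
TURBO_SETTINGS = {
    "num_inference_steps": 1,  # one sampling step instead of fifty
    "guidance_scale": 0.0,     # assumption: guidance disabled for Turbo
}


def generate(prompt: str, steps: int = 1):
    """Generate one image from `prompt` using 1 to 4 sampling steps."""
    if not 1 <= steps <= 4:
        raise ValueError("this sketch only covers the 1-4 step range")

    # Heavy imports are done lazily so the module loads without a GPU stack.
    import torch
    from diffusers import AutoPipelineForText2Image

    # Loading on every call is wasteful; a real application would load
    # the pipeline once and reuse it.
    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    pipe.to("cuda")

    settings = {**TURBO_SETTINGS, "num_inference_steps": steps}
    return pipe(prompt=prompt, **settings).images[0]


# Example (needs the GPU setup described above):
# generate("a photo of a red fox in the snow").save("fox.png")
```

Raising `steps` toward four trades some speed for additional refinement, which matches the one-to-four step range the maker mentions.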

Outputs from SDXL Turbo. Source: Hugging Face

To achieve that, Stable Diffusion XL Turbo combines score distillation with an adversarial loss. For the adversarial part, the model generates an image with the aim of fooling a discriminator, which is trained to distinguish AI-generated images from real ones. For the distillation part, a large, pre-trained diffusion model serves as a kind of 'teacher'. The company has published a research paper that explains in detail how the technology works.
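The way the two training signals fit together can be illustrated with a toy calculation. The sketch below is not the implementation from the research paper: it works on plain lists of numbers instead of images, uses a simple hinge-style generator term and a mean squared error as stand-ins, and the function names and the `distill_weight` default are hypothetical choices made for this example.

```python
from typing import Sequence


def generator_adversarial_loss(disc_scores_on_fakes: Sequence[float]) -> float:
    """Hinge-style generator term: lower when the discriminator rates
    the generated samples as more 'real' (higher score)."""
    return -sum(disc_scores_on_fakes) / len(disc_scores_on_fakes)


def distillation_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between the student's output and the teacher's."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def add_student_loss(
    disc_scores_on_fakes: Sequence[float],
    student: Sequence[float],
    teacher: Sequence[float],
    distill_weight: float = 1.0,  # arbitrary placeholder weight
) -> float:
    """Total student loss: adversarial term plus weighted distillation term."""
    adversarial = generator_adversarial_loss(disc_scores_on_fakes)
    distill = distillation_loss(student, teacher)
    return adversarial + distill_weight * distill


# Toy inputs: discriminator scores for two generated samples, and
# two-value 'images' from the student and the teacher.
loss = add_student_loss([0.5, -0.5], [1.0, 0.0], [0.0, 0.0], distill_weight=2.0)
print(loss)
```

The student is pushed in two directions at once: toward outputs the discriminator accepts as real, and toward outputs the teacher model would have produced.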

Stability AI has also published the results of a human preference test. Test subjects judged the output of two AI generators given the same prompt. The SDXL Turbo images scored relatively well compared to other AI models that use more steps. The company also reports speed figures: the model can generate a 512×512 pixel image in about 0.2 seconds on an Nvidia A100 data center GPU.

The SDXL Turbo model is available now, free of charge for personal use. Stability AI has also published a demo version, which can be used with an account via the Clipdrop website. SDXL Turbo is not yet available for commercial use at the time of writing.

A demo of real-time image generation with SDXL Turbo