Diffusion Model
The AI architecture behind most modern image generators: it works by learning to gradually remove noise from random static until a coherent image emerges.
Diffusion models are the technology powering Stable Diffusion, DALL-E, Midjourney, and most other AI image generators. The core idea is counterintuitive: during training, the model learns to reverse the process of adding noise to images. Given pure static, it can iteratively 'denoise' it into a coherent picture.
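The noising process being reversed can be sketched numerically. This is a minimal toy illustration (not any library's actual API): it uses the common DDPM-style convention of a linear beta schedule and the closed-form expression for a noised sample, where the signal coefficient shrinks toward zero as timesteps increase.

```python
import numpy as np

# Toy sketch of the forward (noising) process a diffusion model learns to
# reverse. Variable names (T, betas, alpha_bar) follow the usual DDPM
# convention; the "image" here is just a random 8x8 array.
rng = np.random.default_rng(0)

T = 1000                                # number of diffusion timesteps
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)          # cumulative signal retention per step

x0 = rng.standard_normal((8, 8))        # stand-in for a clean image

def add_noise(x0, t):
    """Closed-form sample of x_t given x_0: mostly signal early, mostly noise late."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

x_early = add_noise(x0, 10)     # still close to the original image
x_late = add_noise(x0, T - 1)   # essentially pure static

# By the final timestep almost no signal remains:
print(float(np.sqrt(alpha_bar[T - 1])))
```

During training, the model sees these noised samples and learns to predict the noise that was added; at generation time it runs the process in reverse, starting from pure static like `x_late`.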
The text-to-image pipeline works by encoding your text prompt into a numerical representation, then using that embedding to guide the denoising process so the resulting image matches your description. The number of 'steps' in this denoising process affects quality — more steps generally mean better images but slower generation.
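The effect of the step count can be shown with a hypothetical toy denoiser. In a real pipeline the denoiser is a neural network predicting noise; here it is just a function that nudges the sample toward a fixed target pattern (standing in for prompt guidance), which is enough to show why more refinement steps land closer to the intended result.

```python
import numpy as np

# Hypothetical sketch, not a real model: each "denoising" step pulls the
# sample a fraction of the way toward a target pattern that stands in for
# what the text prompt describes.
rng = np.random.default_rng(42)

target = np.ones(16)  # stand-in for "what the prompt describes"

def generate(num_steps, step_size=0.2):
    x = rng.standard_normal(16)           # start from pure static
    for _ in range(num_steps):
        x = x + step_size * (target - x)  # each step removes a bit of noise
    return x

err_few = np.abs(generate(5) - target).mean()
err_many = np.abs(generate(50) - target).mean()
print(err_many < err_few)  # more steps leave less residual error
```

The same trade-off appears in real generators: each extra step costs another pass through the network, which is why step count is the main quality-versus-speed dial.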
Diffusion models replaced earlier approaches like GANs (Generative Adversarial Networks) for most image generation tasks because they produce more diverse, higher-quality results and are easier to control with text prompts.
Real-World Example
When Midjourney generates an image from your prompt, it's using a diffusion model — starting from random noise and progressively refining it into your requested image.
FAQ
What is Diffusion Model?
The AI architecture behind most modern image generators: it works by learning to gradually remove noise from random static until a coherent image emerges.
How is Diffusion Model used in practice?
When Midjourney generates an image from your prompt, it's using a diffusion model — starting from random noise and progressively refining it into your requested image.
What concepts are related to Diffusion Model?
Key related concepts include Stable Diffusion, Checkpoint, LoRA (Low-Rank Adaptation), and Negative Prompt. Understanding these together gives a more complete picture of how diffusion models fit into the AI landscape.