Lesson 4 • 2 min
The Pipeline
Text → Numbers → Image
It's like a recipe
When you follow a recipe, you go: Recipe (text) → Gather ingredients → Cook → Dish. An image generator works similarly: Prompt (text) → Convert to numbers → Process through the model → Image.
Here's what happens at each stage:
The pipeline stages
1. TOKENIZE: "a cat on a beach" → [1, 5847, 23, 1, 8921]
(Words become number IDs)
2. ENCODE: [1, 5847, ...] → [[0.2, -0.5, 0.8, ...], ...]
(IDs become meaning vectors)
3. DENOISE: Start with random noise, guided by the text vectors
(8 cleanup steps, each removing a little more noise)
4. DECODE: Latent representation → RGB pixels
(Decompress to actual image)
Quick Win
You now understand the data flow: text is converted to numbers, the numbers guide the denoising process, and the denoised result is decoded into an image.
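To make the four stages concrete, here is a toy sketch in Python. It is not how a real image generator is implemented: the vocabulary, the vector size, and every function name are made up for illustration, the "encoder" produces pseudo-random vectors instead of learned ones, the "denoiser" simply moves noise toward an average of the text vectors instead of running a neural network, and the "decoder" only rescales numbers into the 0-255 range. Only the token IDs and the 8-step count come from the lesson.

```python
import random

# Hypothetical toy vocabulary; the IDs are copied from the lesson's example.
VOCAB = {"a": 1, "cat": 5847, "on": 23, "beach": 8921}
EMBED_DIM = 4   # assumed size; real encoders use far larger vectors
NUM_STEPS = 8   # the lesson's "8 steps of cleanup"


def tokenize(prompt):
    """Stage 1: words become number IDs."""
    return [VOCAB[word] for word in prompt.lower().split()]


def encode(token_ids):
    """Stage 2: each ID becomes a vector (pseudo-random here, learned in a real model)."""
    vectors = []
    for token_id in token_ids:
        rng = random.Random(token_id)  # the same ID always yields the same vector
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)])
    return vectors


def denoise(text_vectors, steps=NUM_STEPS, seed=0):
    """Stage 3: start from random noise and step toward a text-derived target."""
    # Stand-in for text guidance: average the text vectors into one target latent.
    target = [sum(column) / len(text_vectors) for column in zip(*text_vectors)]
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]
    for _ in range(steps):
        # Each step removes half of the remaining difference from the target.
        latent = [x + 0.5 * (t - x) for x, t in zip(latent, target)]
    return latent


def decode(latent):
    """Stage 4: rescale latent values from roughly [-1, 1] to 0-255 channel values."""
    return [max(0, min(255, round((x + 1.0) * 127.5))) for x in latent]


def generate(prompt):
    """Run all four stages in order."""
    return decode(denoise(encode(tokenize(prompt))))


print(tokenize("a cat on a beach"))   # [1, 5847, 23, 1, 8921]
print(generate("a cat on a beach"))   # four channel values between 0 and 255
```

Notice that each function's output is the next function's input, which is the whole point of a pipeline. Both occurrences of "a" get the same ID and therefore the same vector, and the denoising loop starts from noise that has nothing to do with the prompt until the text vectors pull it into shape.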