Lesson 2 • 2 min
Why Latent?
Speed and memory benefits
Editing a sketch vs a mural
It's easier to edit a small sketch and then scale it up than to paint a huge mural directly. Working in latent space is like working on the sketch—same creative process, much less effort.
Self-attention compares every element to every other element. For a 1024×1024 image, that's 1M × 1M comparisons—impossible. For a 128×128 latent, it's only 16K × 16K—4000× fewer operations!
Compare compute requirements: pixel space vs latent space
The computational savings
Self-attention complexity: O(n²)
Pixel space (1024×1024 = 1M tokens):
1M² = 1,000,000,000,000 operations 💀
Latent space (128×128 = 16K tokens):
16K² = 262,144,000 operations ✓
That's 4,000× less compute per attention layer!
Plus: 48× less memory for activations
Plus: Fewer channels to process
Result: Feasible on consumer GPUsQuick Win
You understand why latent space: it makes self-attention computationally tractable.