Lesson 1 • 2 min
Latent Space
Working in compressed form
Sheet music vs audio
Sheet music is a compressed representation of a song—it captures the essence in a fraction of the size. Musicians can reproduce the full audio from it. Latent space is like sheet music for images.
A 1024×1024 RGB image has ~3 million numbers. Processing that directly is expensive. The variational autoencoder (VAE) compresses it to a 128×128 latent with 4 channels (about 66K numbers), which is 48× smaller while preserving essential information.
Compression ratios
Original image: 1024 × 1024 × 3 = 3,145,728 values
Latent: 128 × 128 × 4 = 65,536 values
Compression: ~48× smaller!
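The arithmetic above can be checked with a few lines of Python. This is a minimal sketch that only counts values from the shapes given in this lesson. It does not run a real VAE, and the variable names are illustrative.

```python
import math

# Shapes from the lesson: (height, width, channels).
image_shape = (1024, 1024, 3)   # RGB image
latent_shape = (128, 128, 4)    # latent with 4 channels

# Total number of values is the product of the dimensions.
image_values = math.prod(image_shape)
latent_values = math.prod(latent_shape)

# How many times fewer values the latent holds.
compression_ratio = image_values / latent_values

# How much each spatial side shrinks (1024 -> 128).
spatial_factor = image_shape[0] // latent_shape[0]

print(f"Image values:  {image_values:,}")
print(f"Latent values: {latent_values:,}")
print(f"Compression:   {compression_ratio:.0f}x fewer values")
print(f"Each side is {spatial_factor}x smaller")
```

Note that each side shrinks 8×, which alone would cut the count 64×. The latent keeps 4 channels instead of 3, so the overall ratio comes out to 48×.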
The diffusion process happens entirely in latent space.
Only at the end do we decompress to pixels.
Quick Win
You understand latent space: a compressed representation where diffusion operates efficiently.