Lesson 2 • 2 min
Distribution Matching
Learning the style, not the specifics
Learning to cook "in the style of"
You don't learn to replicate every dish a chef makes. You learn their style—their flavor profiles, techniques, presentation. Then you can make your own dishes in their style. That's distribution matching.
DMD (Distribution Matching Distillation) doesn't just match individual outputs. It trains the student so that its outputs follow the same distribution as the teacher's: the same range of quality, diversity, and characteristics.
Distribution vs point matching
Point matching:
"This specific image from teacher = this specific student output"
Problem: Student might overfit to specific examples
Distribution matching:
"The set of all teacher outputs ≈ the set of all student outputs"
Benefit: Student learns the general capability
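The contrast above can be sketched with a toy example. This is not DMD itself, only a minimal illustration that assumes each "output" is a single number; the function names point_matching_loss and distribution_gap are hypothetical, and the gap used here is a simple 1-D sorted-sample (Wasserstein-1) distance chosen for clarity.

```python
import numpy as np


def point_matching_loss(teacher_out, student_out):
    """Mean squared error between paired outputs (sample i vs. sample i)."""
    return float(np.mean((teacher_out - student_out) ** 2))


def distribution_gap(teacher_out, student_out):
    """Empirical 1-D Wasserstein-1 distance for equal-sized sample sets.

    Sorting both sets first means only the overall spread of values
    matters, not which student output is paired with which teacher output.
    """
    return float(np.mean(np.abs(np.sort(teacher_out) - np.sort(student_out))))


rng = np.random.default_rng(0)

# Toy stand-in for "many valid teacher outputs for the same prompt".
teacher = rng.normal(loc=0.0, scale=1.0, size=1000)

# Student A: produces exactly the same set of outputs, in a different order.
student_shuffled = rng.permutation(teacher)

# Student B: always produces the teacher's average output (no variety).
student_collapsed = np.full_like(teacher, teacher.mean())

for name, student in [("shuffled", student_shuffled),
                      ("collapsed", student_collapsed)]:
    print(name,
          "point loss:", round(point_matching_loss(teacher, student), 3),
          "distribution gap:", round(distribution_gap(teacher, student), 3))
```

In this sketch the shuffled student has a distribution gap of zero but a large point loss, while the collapsed student scores better on point loss despite having no variety at all. That is the sense in which the two criteria reward different things.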
For the same prompt, the teacher can produce many valid images.
The student should be able to produce the same variety.
Quick Win
You understand distribution matching: learning to match the overall output distribution, not just specific examples.