Learn Diffusion

Lesson 2 • 2 min

How Machines Learn

The training concept

Think about how you learned

How did you learn what a cat looks like? You saw thousands of cats—in real life, in photos, in drawings. Your brain built an internal understanding of "cat-ness" from all these examples.

AI image generators do something similar, but simpler. They're shown millions of image-text pairs: photos with captions, artwork with descriptions. They don't "understand" cats the way you do—they learn statistical patterns: "When text says 'cat', these pixel patterns tend to appear."

See how a model learns from image-text pairs

The training data looks like this:
{
  "image": "photo_001.jpg",
  "caption": "A golden retriever playing fetch in a park"
}
{
  "image": "photo_002.jpg",
  "caption": "Sunset over the ocean with orange clouds"
}
// ... millions more pairs
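If pairs like these were stored one JSON object per line, a training script could read them in a loop. This is only a sketch; the filename pairs.jsonl and the field names are assumptions for illustration, not a real dataset.

import json

# Read image-text pairs, one JSON object per line (a common "JSONL" layout).
with open("pairs.jsonl") as f:
    for line in f:
        pair = json.loads(line)
        print(pair["image"], "->", pair["caption"])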

The model doesn't keep copies of these images. Instead, it learns patterns: "When text mentions 'sunset', the image usually has warm colors at the top." It learns to associate words with visual features.
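To make "statistical association" concrete, here is a toy sketch. It is not how a real diffusion model is trained; the pairs and the hand-written "visual features" are invented for illustration. It simply counts how often each caption word appears alongside a feature, which is the flavor of association the real model builds at a vastly larger scale.

from collections import defaultdict

# Made-up pairs: (visual feature, caption). A real dataset has millions of
# actual images, not hand-written feature labels.
pairs = [
    ("warm colors at top", "Sunset over the ocean with orange clouds"),
    ("warm colors at top", "Sunset behind desert dunes"),
    ("green field, blue sky", "A golden retriever playing fetch in a park"),
]

# Count how often each caption word co-occurs with each visual feature.
counts = defaultdict(lambda: defaultdict(int))
for feature, caption in pairs:
    for word in caption.lower().split():
        counts[word][feature] += 1

# After "training", the word "sunset" is strongly tied to warm colors.
print(dict(counts["sunset"]))  # {'warm colors at top': 2}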

Quick Win

You now understand training data: models learn from millions of examples, building statistical associations between text and visual patterns.