Imagine you’re asked to guess someone’s age from a photo.
If you can see their face up close, you’ll have useful clues – wrinkles around the eyes, graying hair, even the expression in their gaze. Their clothes or posture might offer more hints. With a full picture, you might confidently say they’re in their 50s. But you’d rarely claim something precise like 53. You know you can’t be certain.
Now imagine the picture is blurry or taken from far away. Suddenly, the task becomes harder. Maybe you can still tell if they’re a teenager or an older adult, but that’s about it. The less you can see, the less precise your guess. And if the photo is from another culture or era – with unfamiliar clothing or setting – your confidence drops further. Even the signs you normally rely on, like wrinkles or teeth, might mislead you. (Botox and cosmetic dentistry don’t help much with this kind of inference.)
Now let’s make the task even harder: instead of guessing how old the person is, I ask you to guess how long they have left to live. This isn’t as hopeless as it might seem – you might take your age estimate and subtract it from the average life expectancy, adjusting based on context. Someone in a hospital bed, for example, might not have long. Someone lounging in a Hamptons living room might have decades.
From a technical standpoint, both problems – estimating age or remaining lifespan – are prediction problems. You’re inferring an unknown quantity from patterns in the data: the pixels in the image. If you wanted to test how good your guesses are, you’d compare them to reality. Predicting “53” will almost always be off by at least a year or two. Saying “between 40 and 60” will be right more often, but less useful. Accuracy and precision always trade off.
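To make that trade-off concrete, here is a minimal sketch in Python – the ages and thresholds are entirely made up for illustration, not real data – comparing the two strategies: a precise point guess versus a wide interval.

```python
import random

random.seed(0)

# Hypothetical "true" ages we are trying to guess (uniform 20-70 for illustration).
true_ages = [random.randint(20, 70) for _ in range(10_000)]

# Strategy 1: a precise point guess, e.g. always say "45".
point_guess = 45
point_hits = sum(abs(age - point_guess) <= 1 for age in true_ages)

# Strategy 2: a wide interval, e.g. "somewhere between 40 and 60".
low, high = 40, 60
interval_hits = sum(low <= age <= high for age in true_ages)

print(f"point guess within a year: {point_hits / len(true_ages):.1%}")
print(f"interval guess correct:    {interval_hits / len(true_ages):.1%}")
# The interval is right far more often, but it says far less:
# precision and the chance of being right pull in opposite directions.
```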
How would you improve your guesses? You could imagine a Sherlock Holmes-style approach that combs the image's pixels more thoroughly for every available clue. You would, of course, have to aggregate those clues into a final prediction. What type of thinking would that entail? Logical or probabilistic reasoning? A biological understanding of aging? Knowledge of societal norms, culture, technology, or medicine?
It turns out there is one sure way of mastering this task: experience. If you played a game where you got feedback on each guess, you’d develop an intuition – a “feel”. In radiology, we call this gestalt: the learned, experience-driven sense that tells you when something in an image looks off, even before you can articulate why. This is essentially how neural networks – both biological and artificial – get good at what they do. They learn from repeated feedback, adjusting internal patterns until they can make better predictions.
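To make “learning from repeated feedback” concrete, here is a toy sketch in Python. The single “gray hair” feature, the hidden aging rule, and all the numbers are invented for illustration; real networks do the same kind of error-driven adjustment, just over millions of parameters.

```python
import random

random.seed(1)

def true_age(gray_hair):
    """Hidden rule the learner never sees directly: more gray hair, older person."""
    return 25 + 50 * gray_hair + random.gauss(0, 3)

weight, bias = 0.0, 0.0      # the "internal patterns" to be adjusted
learning_rate = 0.01

for _ in range(20_000):
    gray_hair = random.random()            # look at a new (imaginary) photo
    guess = weight * gray_hair + bias      # make a guess
    error = guess - true_age(gray_hair)    # feedback: how far off were we?
    # Nudge the parameters in the direction that shrinks the error.
    weight -= learning_rate * error * gray_hair
    bias -= learning_rate * error

print(f"learned rule: age = {weight:.1f} * gray_hair + {bias:.1f}")
# The guesses improve not because the model understands aging,
# but because repeated corrections shape its parameters into a good predictor.
```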
That principle – learning predictive patterns from data – lies at the heart of today’s AI revolution. When ChatGPT predicts the next word in a sentence, when DALL-E imagines a “Van Gogh-style” portrait, or when a self-driving car identifies a pedestrian, each is solving a prediction problem. The models don’t have an understanding of the underlying physical reality. They’re extraordinarily skilled at anticipating what comes next, or guessing the unknown, based on vast experience.
In the case of generating a “Van Gogh painting” of a new scene, you might object that there is no ground truth – Van Gogh never painted it. Yet after seeing hundreds of his paintings, one develops an intuitive feel for what a Van Gogh looks like. That intuition, in turn, can be used to train a prediction engine that generates Van Gogh-style paintings.
Let me spell this out clearly: Virtually all the remarkable progress in AI today comes from building better prediction engines. Larger models, more data, cleverer training. But intelligence is not just about prediction.
Life does not simply consist of a series of guessing games or chess matches. To be sure, being able to predict things – events, outcomes – precisely can be very helpful. But if you don’t understand the complex causal relationships and mechanisms involved (the biological processes underpinning aging or diseases like cancer, the physical systems that govern the weather, the social norms that shape human interactions), you will be just an observer. And observers cannot be intelligent.
True intelligence requires understanding – a model of how the world works, so you can reason about causes, effects, and goals. Prediction tells you what is likely; understanding tells you why, and what to do about it.
We are only at the beginning of creating systems that can act purposefully – systems that can explore, reason, and discover. To move beyond prediction, AI will need something akin to the scientific method: curiosity, experimentation, and causal understanding. No amount of data or compute alone will get us there. That next step – from intuition to understanding, from prediction to purpose – is where real intelligence begins.