“The meaning of a word is its use in the language.” — Ludwig Wittgenstein
But what if the language has no users who experience reality?
The Grounding Problem
When you say “cat,” the word connects to something real—furry creatures that purr and knock things off tables. You’ve seen cats, touched cats, heard cats meow at 3 AM demanding food. Your word “cat” is grounded in experience.
When an LLM outputs “cat,” what does the word connect to? The model has never seen a cat. It has never felt fur or heard purring. It has only seen the word “cat” appear in relation to other words—next to “pet” and “whiskers” and “meow.” Is there any genuine connection to actual cats in the world?
This is the symbol grounding problem, and it cuts to the heart of whether LLMs can have genuine understanding or produce genuine truth.
Three Theories of How Words Connect to Things
Philosophers have long debated how language refers to reality. Three major theories offer different answers—and different implications for AI.
Direct Reference
Philosophers like Kripke and Putnam argued that words refer to things through causal chains. The word “water” refers to H₂O because someone, long ago, first used “water” while pointing at the stuff. Others learned the word from them. Usage passed through a chain of speakers down to us. The reference was established by that original pointing and preserved through history.
For LLMs, this creates a problem: the causal chain seems broken. LLMs learned “water” from text, not from anyone pointing at actual water. Can reference survive transmission through text alone, or does something essential get lost?
Descriptivist Theory
Frege and Russell offered a different account: words refer through descriptions, through clusters of properties we associate with them. “Water” refers to whatever is colorless and liquid, falls as rain, fills oceans, and has the chemical formula H₂O.
This view might be more promising for LLMs. They learn extraordinarily rich descriptions. They know that water is wet and drinkable, that it is composed of hydrogen and oxygen, and that it boils at 100°C at sea level. If reference is just about having the right descriptions, LLMs seem well-positioned.
Use Theory
Wittgenstein argued that meaning is use. Words mean whatever role they play in our language games—in requests (“Get me some water”), descriptions (“The water is cold”), explanations (“Water boils at 100°C”), and countless other contexts.
LLMs master use. They deploy “water” correctly across an enormous range of contexts. They know when to mention water in a cooking recipe versus a chemistry explanation versus a survival guide. If meaning is use, LLMs seem to have it.
Harnad’s Challenge
Stevan Harnad posed the grounding problem in a vivid way. Imagine trying to learn Chinese from a Chinese-Chinese dictionary, one that defines every Chinese word using other Chinese words. You open it up and look up a word. The definition contains more Chinese words. You look those up too. More Chinese words. It’s definitions all the way down.
If you don’t know any Chinese to start with, this dictionary is useless. You’re trapped in a circle of symbols pointing to symbols. Without some way to break out of the symbolic circle—without some word whose meaning you grasp directly—you can never learn what any of it means.
For humans, the answer is sensory experience. We break out of the symbol circle through our bodies. “Red” means something because we’ve seen red things. “Pain” means something because we’ve felt pain. We ground our symbols in perception and action.
LLMs never break out. They process text, which is defined by text, which is defined by more text. Everything in their world is relational—this word appears near that word, in these contexts. Nothing is ever directly grounded in experience.
Or is it?
The Case Against Grounding
Several arguments suggest LLMs genuinely lack the connection to reality that human language has.
The Chinese Room argument, extended: Searle imagined someone following rules to manipulate Chinese symbols without understanding Chinese. LLMs scale this up to billions of tokens, but more processing doesn’t equal understanding. More tokens don’t equal grounding. More parameters don’t create genuine reference. Quantity cannot produce the qualitative shift from syntax to semantics.
The alien language thought experiment makes the point vivid: Imagine encountering an alien language with no overlap with human experience. From text alone, you might learn that “glorb” is something you do to “fizznats,” that it causes “ploomth,” and that excessive glorbing damages things. You’d learn relational structure perfectly. But would you know what any of it means?
LLMs are arguably in this position with respect to all language. They’ve learned all the relations—this word goes with that word, in these patterns. But they’ve never experienced any referent. They know how words relate to each other, not what words relate to.
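To make that purely relational picture concrete, here is a toy sketch of the distributional idea behind how LLMs encounter words. The corpus, window size, and counting scheme are invented for illustration; real models learn dense vectors from billions of tokens rather than raw co-occurrence counts, but the principle is the same: “cat” is characterized entirely by its neighbors.

```python
from collections import Counter

# Tiny invented corpus; real training data runs to billions of tokens.
corpus = [
    "the cat chased the mouse and purred",
    "my pet cat has long whiskers",
    "the cat gave a meow at 3 am",
]

def cooccurrence(word: str, window: int = 3) -> Counter:
    """Count which words appear within `window` positions of `word`."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok == word:
                neighbors = tokens[max(0, i - window): i + window + 1]
                counts.update(t for t in neighbors if t != word)
    return counts

# "cat" is represented purely by the company it keeps: "pet", "whiskers",
# and "meow" all appear, but nothing here ever touches an actual cat.
print(cooccurrence("cat"))
```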
The Case For Grounding
But there are counter-arguments worth taking seriously.
Perhaps grounding doesn’t require sensory experience. Perhaps what matters is having the right relational structure. Consider the human concept of “red”: it is the opposite of green, associated with danger and passion and heat, lighter than maroon and darker than pink, has a wavelength of around 700 nm, and looks like blood and fire and roses. LLMs learn exactly this relational structure from text. If the structure is identical, maybe the grounding is equivalent.
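One way to make “same relational structure” precise is to compare pairwise similarities. The sketch below uses invented 3-D vectors as stand-ins for a hypothetically grounded space and a text-only space; the point is only that structural agreement is measurable without either space ever touching anything red.

```python
import numpy as np

# Hypothetical embeddings for a few colour words: one set stands in for a
# "grounded" space, the other for a space learned from text alone.
grounded = {
    "red":    np.array([0.9, 0.1, 0.3]),
    "green":  np.array([0.1, 0.9, 0.3]),
    "maroon": np.array([0.8, 0.1, 0.1]),
    "pink":   np.array([0.9, 0.4, 0.6]),
}
text_only = {
    "red":    np.array([0.3, 0.9, 0.1]),   # different axes,
    "green":  np.array([0.3, 0.1, 0.9]),   # similar relations
    "maroon": np.array([0.1, 0.8, 0.1]),
    "pink":   np.array([0.6, 0.9, 0.4]),
}

def similarity_matrix(space: dict) -> np.ndarray:
    """Pairwise cosine similarities: the purely relational 'shape' of a space."""
    words = sorted(space)
    vecs = np.stack([space[w] / np.linalg.norm(space[w]) for w in words])
    return vecs @ vecs.T

# If the two matrices are close, the spaces share relational structure,
# even though neither vector set was produced by looking at anything red.
gap = np.abs(similarity_matrix(grounded) - similarity_matrix(text_only)).max()
print(f"max structural disagreement: {gap:.2f}")
```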
Perhaps grounding can be transmitted through testimony. LLMs learned from humans who are grounded. When someone writes “the sunset was a brilliant red, warm against my face,” they’re encoding their grounded experience into text. The LLM learns from this encoding. Perhaps it inherits grounding indirectly, the way I inherit knowledge about ancient Rome from historians who studied primary sources.
And increasingly, LLMs are multimodal. They process images and video alongside text. When an LLM has seen millions of pictures labeled “cat,” and learned to associate visual patterns with the word—hasn’t it “seen” cats in some sense? This isn’t full embodied experience, but it’s closer than pure text.
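As a concrete illustration of that weaker, image-level grounding, here is a minimal sketch using the Hugging Face transformers CLIP interface. The checkpoint is the publicly released openai/clip-vit-base-patch32; cat.jpg is a placeholder path for any local photo.

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Public CLIP checkpoint, trained to align images with their captions.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("cat.jpg")  # placeholder path: any local photo
captions = ["a photo of a cat", "a photo of a dog", "a glass of water"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)

# The word "cat" is now tied to visual patterns, not just to other words.
for caption, p in zip(captions, probs[0].tolist()):
    print(f"{p:.2f}  {caption}")
```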
The Puzzle of Truth
Here’s something curious: LLMs often output true statements. “Paris is the capital of France.” “Water is H₂O.” “The Earth orbits the Sun.” If they lack grounding, how can they produce truths about the world?
The correspondence theory says truth is about matching reality—“Snow is white” is true if and only if snow is white. This seems to require connection to the world. If LLMs have no connection to the world, how do their outputs correspond to it?
The coherence theory offers an alternative: truth is about coherence with other beliefs. LLMs excel at coherence. Their outputs hang together, maintain consistency, respect logical constraints. If truth is coherence, LLMs produce truth.
But hallucination suggests limits. When LLMs confidently assert that Neil Armstrong walked on Mars in 1969, they’re producing something that sounds right—famous astronaut, space milestone, plausible date—without tracking correspondence to reality. They’re optimizing for plausibility, not truth.
Then again, humans confabulate too. We generate false memories, make confident errors, fill in gaps with plausible nonsense. If human hallucination doesn’t undermine human reference, does AI hallucination undermine AI reference?
A Spectrum, Not a Binary
Perhaps reference comes in degrees rather than being all-or-nothing.
At one end sits full reference: a human pointing at a cat and saying “cat.” The word connects directly to a present object through a conscious act. At the other end is no reference at all: random noise, symbols with no systematic connection to anything.
In between sit various intermediate cases: a human describing a cat they’ve never seen, based on reliable testimony. Someone learning about distant historical events from books. A blind person understanding “red” through description and analogy. Statistical patterns that reliably correlate with real phenomena without anyone intending them to refer.
LLMs might occupy a position on this spectrum. They have something more than random noise—their outputs systematically correlate with facts about the world. They have something less than full grounded reference—they’ve never directly experienced what they describe. Maybe “statistical reference” is a useful concept: not nothing, but not robust human reference either.
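“Systematically correlate with facts about the world” is itself an empirical claim, and it can be checked. Below is a minimal sketch of such a check, where ask_model() is a hypothetical placeholder for whatever LLM client you use, and the hand-labelled fact set is obviously far too small for a real evaluation.

```python
# Hand-labelled (prompt, expected answer) pairs; a real evaluation would
# use thousands of items and more careful answer matching.
facts = [
    ("What is the capital of France?", "paris"),
    ("What is the chemical formula of water?", "h2o"),
    ("Which body does the Earth orbit?", "sun"),
]

def ask_model(prompt: str) -> str:
    """Placeholder for an actual LLM call (API client, local model, etc.)."""
    raise NotImplementedError

def factual_accuracy(ask=ask_model) -> float:
    """Fraction of answers containing the expected string: a crude measure
    of how strongly a model's outputs correlate with the world."""
    hits = sum(expected in ask(prompt).lower() for prompt, expected in facts)
    return hits / len(facts)
```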
Living With Uncertainty
The grounding problem remains philosophically unresolved. We don’t have clear criteria for deciding whether LLM symbols genuinely refer to anything.
What we can say with confidence: LLMs lack sensory grounding—they’ve never experienced what they describe. They have rich relational structure—they know how concepts interconnect. Their outputs often correspond to facts—but this might be correlation, not reference. Hallucination suggests limited truth-tracking—but humans hallucinate too.
Until we resolve these deep questions, the practical response is appropriate uncertainty. LLM outputs are useful patterns that might be truth, might be noise, and we often can’t tell which. Trust, but verify. Use, but validate. The tools are powerful, but we’re still learning what they are.
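In code, “trust, but verify” usually means refusing to treat a model’s answer as a fact until something outside the model confirms it. A minimal sketch of that pattern, with lookup_trusted_source() as a hypothetical stand-in for whatever database, retrieval step, or API you actually trust:

```python
def lookup_trusted_source(question: str) -> str | None:
    """Hypothetical: consult a database, document store, or API you trust."""
    return None  # stub: nothing is wired up yet, so nothing can be verified

def answer_with_verification(question: str, model_answer: str) -> dict:
    """Tag the model's answer with a verification status instead of
    passing it downstream as if it were a checked fact."""
    reference = lookup_trusted_source(question)
    if reference is None:
        return {"answer": model_answer, "status": "unverified"}
    verified = reference.lower() in model_answer.lower()
    return {
        "answer": model_answer,
        "reference": reference,
        "status": "verified" if verified else "contradicted",
    }
```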
This is part of the AI Truth and Justice series—16 modules exploring the philosophical foundations of artificial intelligence: epistemology, ethics, alignment, and consciousness.
From Theory to Practice
See how we address these challenges in production systems:
- CLI-First Architecture — Transparent access to external data sources
- Memory Systems — Building persistent context that agents can reliably access
- Squads Architecture — Organizing agents around verified domain knowledge
Get Intelligence Reports
How are enterprises handling AI reliability? Our Intelligence Reports cover implementation patterns, verification strategies, and real-world case studies.