28 Aug 2025 - tsp
Last update 28 Aug 2025
4 mins
You’ve probably heard the claim: “Large language models are nothing but fancy autocomplete - just predicting the next word.”
That phrase sounds dismissive, as if predicting words in a row were trivial. I admit I fell into that trap myself - for a long time I thought LLMs were just statistical gadgets churning out words like a Bayesian estimator, until I realized what actually makes them remarkable: what those words encode. Human language is not just chatter - it is the accumulated record of millennia of human thinking, compressed into symbols, grammar, and stories.
When a model trains on language, it trains not just on facts but on the patterns of thought humanity has written down - and recognizing this is what changed my own perspective.
Every scientific paper, novel, folk tale, or casual message reflects more than words - it reflects how humans perceive, connect, and reason. The structure of a proof, the rhythm of a poem, the shape of an argument, even the way a joke lands: these are all patterns in language.
Once training data is broad enough, it doesn’t just contain isolated facts. It encodes virtually the entire space of patterns humans have discovered and expressed.
So when an LLM learns to “predict the next word”, what it is really learning is to internalize and generalize those patterns. Keep in mind: it is not the individual facts or pieces of content, but the patterns behind them, that make these systems so powerful.
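To make “predicting the next word” a little more concrete, here is a minimal sketch in plain Python/NumPy - a toy example with a made-up five-token vocabulary and made-up scores, not the code of any real model. The network assigns a score (logit) to every token in its vocabulary, and a softmax turns those scores into a probability distribution from which the next token is chosen:

```python
import numpy as np

# Toy, hypothetical vocabulary and scores - a real model has tens of
# thousands of tokens and computes these scores with a deep network.
vocab = ["the", "cat", "sat", "mat", "dog"]
logits = np.array([0.2, 2.5, 0.1, 1.8, 0.7])  # raw scores for the next token

# Softmax turns raw scores into a probability distribution over the vocabulary.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

for token, p in zip(vocab, probs):
    print(f"{token:>4}: {p:.2f}")

# "Predicting the next word" means choosing from this distribution,
# e.g. greedily taking the most probable token:
print("most likely next token:", vocab[int(np.argmax(probs))])
```

Everything interesting happens in how those scores are computed from the context - but at the surface, this distribution over the vocabulary is all that “next word prediction” means.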
A key misunderstanding is to think LLMs just store and replay. They don't primarily work by memorization. If they did, they would fail on any prompt that never appeared verbatim in their training data. Instead, they build abstract representations of patterns across contexts.
That’s why they can generalize: they don’t just “know words” - they know how the patterns behind those words interact.
Another myth: “LLMs can’t be creative — they only remix existing data.”
But creativity itself is often just that: recombining patterns in novel ways. A “spark of intuition” in humans is usually an unexpected collision of concepts we already know.
LLMs do the same - at a scale no single human can match. And with a sprinkle of randomness in the sampling process - the step that decides which of the probable next tokens actually gets emitted - they even exhibit the same kind of unpredictability that fuels intuition and invention.
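As a sketch of where that randomness comes from, here is the same toy setup extended with temperature sampling - again made-up numbers, not any particular model's implementation. Dividing the logits by a temperature before the softmax sharpens or flattens the distribution, and drawing from it (instead of always taking the top token) makes each run come out slightly differently:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

def sample_next(logits, temperature=1.0):
    """Sample a token index from a temperature-scaled softmax distribution."""
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

vocab = ["the", "cat", "sat", "mat", "dog"]
logits = [0.2, 2.5, 0.1, 1.8, 0.7]  # hypothetical scores for the next token

# Low temperature: the model almost always picks the most probable token.
# Higher temperature: flatter distribution, more surprising choices.
for t in (0.2, 1.0, 1.5):
    picks = [vocab[sample_next(logits, t)] for _ in range(8)]
    print(f"temperature {t}: {' '.join(picks)}")
```

At low temperature the output is nearly deterministic; raising it trades predictability for exactly the kind of surprising combinations described above.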
One of the striking differences is scale. A person can only read a few thousand books in a lifetime. An LLM is trained across billions of documents. This means its internal web of patterns covers far more domains and combinations than any one human can access. That does not make it human - but it does mean it inhabits a pattern space larger than ours, and can surface connections we might never have made.
Dismissing LLMs as “just next word predictors” overlooks three truths: humans think in patterns; we express those patterns in language; and large language models absorb those expressions and generalize over the collective pattern-space of humanity.
So no - they’re not “just autocomplete” or word predictors. They are a new kind of engine for navigating, remixing, and extending the patterns of thought we’ve been laying down in text for thousands of years.
And that’s why they’re so cool and powerful.
The following papers are a very good starting point for getting an idea of what LLMs are, what they are capable of, and how they work.
Dipl.-Ing. Thomas Spielauer, Wien (webcomplains389t48957@tspi.at)
This webpage is also available via TOR at http://rh6v563nt2dnxd5h2vhhqkudmyvjaevgiv77c62xflas52d5omtkxuid.onion/