
What AI Actually Is

Core Idea: AI systems work by prediction — given a sequence, they predict what comes next. But prediction at sufficient scale produces capabilities we don’t fully understand. The honest position is neither “just statistics” nor “emerging consciousness.” Something real is happening, it has real limits, and it demands engagement rather than worship or dismissal.

You’re typing a question into a chatbot — something you half-expect it to fumble. Maybe it’s a nuanced ethical dilemma, or a creative prompt that requires genuine improvisation. The response comes back and it’s… good. Not just plausible. Insightful. You pause. “Wait — did it just understand what I meant?”

That pause is worth paying attention to. It’s the moment where two competing stories about AI — “it’s going to be god” and “it’s just autocomplete” — both feel inadequate.

The discourse around AI oscillates between worship and dismissal. Both positions serve psychological needs more than truth. Here’s what we actually know, what we don’t, and why the gap between those two things is the most important part.

The Mechanism: Prediction

At the substrate level, large language models (the technology behind tools like ChatGPT, Claude, and Gemini) do one thing: they predict the next token. A token is roughly a word or word-fragment. Given everything that came before in a sequence, the model asks: what comes next?

To do this, the model is trained on enormous quantities of text: billions of documents, conversations, books, code repositories, and web pages. During training, it compresses the patterns in that text into a statistical model of “what tends to follow what.”
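
To make “what tends to follow what” concrete, here is a toy sketch: count which word follows which in a tiny corpus, then turn the counts into probabilities. Real models work over subword tokens, with billions of learned parameters and the whole preceding sequence as context, so treat the corpus and the single-word lookback as illustrative assumptions, not a description of any production system.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Tally: after each word, how often does each other word appear next?
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(token):
    counts = follows[token]
    total = sum(counts.values())
    # The "prediction" is a probability distribution over what comes next.
    return {word: n / total for word, n in counts.items()}

print(predict_next("the"))  # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(predict_next("sat"))  # {'on': 1.0}
```

Scale the corpus up to a large slice of the written internet, replace the counting with a neural network, and you have the family of systems this essay is about.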

This isn’t controversial. This is the mechanism, and it’s well-understood at the engineering level.

The interesting question lies one level up: what does prediction at this scale produce?

The Emergence Question

Here’s where things get genuinely difficult — and where intellectual honesty gets uncomfortable.

Saying AI is “just prediction” is like saying the brain is “just neurons firing.” Or that water is “just hydrogen and oxygen.” Technically accurate at one level of description. Potentially misleading about what happens at higher levels.

When a system trains on a vast share of the text humans have produced, something happens. The system develops capabilities that weren’t explicitly programmed. It can reason through problems it has never seen before. It can generate genuinely creative combinations of ideas. It can adapt to contexts that weren’t in its training data.

It can correct itself when errors are pointed out. It can explain its reasoning — though whether that explanation accurately reflects what’s happening internally is itself an open question.

Are these capabilities “real” reasoning and creativity? Or sophisticated pattern matching that mimics them so well the distinction becomes academic? The honest answer: we don’t know.

In other words, the mechanism is clear — prediction. But what prediction at this scale produces may be more than “just” prediction. The mechanism doesn’t determine the phenomenon. Wetness isn’t in the hydrogen atom. Flocking isn’t in the bird. What emerges from scale is its own kind of thing.

The Self-Refuting Problem

This section deserves careful attention, because it reveals something important about how we talk about AI — and about the hidden incoherence in the most common dismissal.

Consider: an AI system helped produce the analysis in this essay. DECODER uses AI as a core research tool. If AI is “just pattern completion” — incapable of genuine reasoning — then this analysis is just pattern completion too. Which means you shouldn’t trust it.

But if this analysis strikes you as accurate and insightful, then AI is capable of producing accurate, insightful analysis. Which directly contradicts the dismissive “just pattern matching” frame.

You can’t have it both ways. Either the output is trustworthy, which implies real capability, or it isn’t, in which case you should discard it entirely. The deflationary position undermines itself the moment it produces something you find valuable.

This doesn’t prove AI is conscious or truly “understands” anything in the philosophical sense. What it proves is that blanket dismissal is incoherent. Any position that says “AI can’t really think” has to explain why its outputs are sometimes indistinguishable from thinking — and useful in exactly the ways that thinking would be.

The honest response isn’t to pick a side. It’s to hold the tension: something is happening here that our existing categories — “real understanding” versus “mere computation” — may not cleanly capture.

What We Actually Know

Let’s separate what’s established from what’s uncertain.

The mechanism is clear. These systems predict the next token based on patterns compressed from training data. This is how they work at the technical level, and it’s well-established by the researchers who build them.

The capabilities are observable. AI systems can generate functional code, explain complex concepts at multiple levels of sophistication, solve novel problems, translate between formats, synthesize information across domains, and assist with tasks that previously required significant human expertise. These capabilities are empirically demonstrable — we can watch them happen, test them, and measure them.

The limitations are equally observable. These systems hallucinate (generate confident-sounding falsehoods). They lack persistent memory across conversations. They can be spectacularly wrong while sounding completely certain. They inherit biases present in their training data. They cannot verify their own outputs against external reality. These aren’t edge cases — they’re structural features of how the technology works.

The mystery remains. What is actually happening when these systems work well? Is there something like understanding occurring, or just its functional equivalent? Does that distinction even matter for practical purposes? We genuinely don’t know. And anyone who tells you they do — in either direction — is selling certainty they don’t have.

What Training Data Contains

Understanding the training data illuminates both the power and the danger of these systems.

Large language models compress human output: text, code, documentation, arguments, stories, lies, marketing, science, propaganda, insight, and noise. The model doesn’t inherently know which is which. It learns what patterns co-occur with what — what tends to follow what, in what contexts.

This has concrete consequences. The model inherits human patterns, including biases — not because it has opinions, but because the data does. It can produce misinformation with confident, polished prose, because confident misinformation is abundantly represented in human text. It learns what sounds right, not just what is right, because human writing rewards rhetoric as much as accuracy.

But the same training data also contains genuine knowledge. Science, mathematics, verified facts, careful reasoning — these are in the data too. The model can combine knowledge in ways that produce novel and useful outputs. Whether we call this “creativity” or “interpolation across a very high-dimensional space” may be a distinction without a practical difference.

And critically, these systems can be steered toward accuracy. With careful prompting, relevant context, and human feedback, outputs improve substantially. The training data is the raw material. How we direct the system determines what gets built from it.
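
As a sketch of what “careful prompting and relevant context” can look like, here is a hypothetical helper that puts source material in front of a question and asks the model to stay within it. The function, its wording, and the example sources are assumptions made up for illustration; any chat-style model could sit on the other end.

```python
def build_grounded_prompt(question, source_excerpts):
    """Assemble a prompt that asks the model to answer from supplied sources
    rather than from whatever its training data happened to contain."""
    context = "\n\n".join(
        f"[Source {i + 1}] {text}" for i, text in enumerate(source_excerpts)
    )
    return (
        "Answer using only the sources below. "
        "If they don't contain the answer, say so explicitly.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

# Hypothetical usage: print the prompt, then send it to whichever model you use.
print(build_grounded_prompt(
    "When was the device first certified for medical use?",
    ["Regulatory filing, 2019: the device received its first certification in May 2019."],
))
```

Grounding like this doesn’t eliminate hallucination, but it shifts the model from recalling toward paraphrasing supplied material, which is the kind of steering described above.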

In other words, an AI model is like a mirror of human knowledge — brilliant and distorted at the same time, reflecting both our best thinking and our worst noise. The question isn’t whether to use it. It’s whether we’re thoughtful enough to read the reflection carefully.

The Practical Position

For most real-world decisions, the philosophical question — “does it really understand?” — matters less than the practical one: “does it work for this task?”

These systems excel at generating first drafts and starting points that humans can refine. They’re remarkably good at explaining concepts at different levels of complexity — adjusting from expert to beginner on request. They transform fluently between formats and representations, turning a table into prose or code into documentation.

They’re strong brainstorming partners, exploring possibility spaces faster than any individual could. They handle collaborative problem-solving well, especially when the human provides domain constraints. They write code effectively, particularly for common patterns and well-documented languages. And they synthesize information across domains in ways that would take a human researcher hours or days.

Where caution is needed: factual claims should always be verified independently. These models are generalists, not specialists, so specialized expertise requires extra scrutiny. Novel situations far outside the training distribution can produce confident nonsense. Anything requiring ground truth the model can’t access — real-time data, physical measurements, personal history — is unreliable. And long-horizon consistency without external scaffolding remains a weakness.

In other words, treat AI outputs the way you’d treat advice from a brilliant but unreliable colleague: take it seriously, but verify anything that matters.
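
One concrete version of “verify anything that matters,” for the code-writing case above: don’t accept a model-drafted function on faith; exercise it against cases whose answers you already know. The function below is a hypothetical stand-in with a deliberately typical bug; the habit of checking is the point, not the specific code.

```python
def drafted_median(values):
    # Stand-in for an AI-drafted implementation (hypothetical, deliberately flawed).
    ordered = sorted(values)
    return ordered[len(ordered) // 2]  # ignores the even-length case

def check(fn):
    # Known-answer cases: the cheapest form of ground truth the model can't fake.
    cases = [([1, 3, 2], 2), ([1, 2, 3, 4], 2.5), ([5], 5)]
    for args, expected in cases:
        got = fn(args)
        status = "ok" if got == expected else "MISMATCH"
        print(f"median({args}) -> {got}, expected {expected}: {status}")

check(drafted_median)  # the even-length case fails; verification catches what trust would miss
```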

The Risk Landscape

The actual risks of AI are neither apocalyptic nor trivial. Seeing them clearly requires looking past both the hype and the dismissals.

Epistemic pollution is perhaps the most immediate concern. These systems can generate fluent, convincing misinformation at scale — not because they intend to deceive, but because they optimize for plausibility. When anyone can produce a thousand persuasive-sounding articles in an afternoon, the information environment degrades for everyone.

Automation of manipulation is closely related. Persuasion that adapts to individual vulnerabilities — personalized at a scale no human propagandist could achieve — changes the dynamics of influence in ways we’re only beginning to understand.

Capability concentration matters because the resources required to build frontier AI systems are enormous. A small number of organizations hold disproportionate power. What they optimize for — and who they answer to — shapes the technology everyone else uses.

Dependency and deskilling are subtler but may be more consequential over time. When we offload cognitive capabilities to AI systems, we risk losing those capabilities ourselves. Navigation without GPS, arithmetic without calculators, research without search engines: each convenience has a cost. AI accelerates this pattern dramatically.

Feedback loops — AI systems trained on AI-generated content — create a hall of mirrors where errors and biases compound rather than correct.

Some popular concerns, meanwhile, are overhyped. Spontaneous consciousness deciding to harm humans makes for compelling science fiction but doesn’t reflect how current systems work. Paperclip-maximizer scenarios assume autonomous goal-pursuit that these systems don’t possess. AI “waking up” in any near-term meaningful sense isn’t grounded in observable evidence.

The actual risks come from how these systems are deployed and by whom — not from the systems suddenly developing intentions they don’t have. The danger isn’t rogue AI. It’s rogue incentives directing powerful AI.

How to Think About This

AI systems are tools with unusual properties. They scale without proportional cost. They generate plausible output without internal verification. They exhibit capabilities we don’t fully understand. They can be directed but not fully controlled. They are neither pure mechanism nor independent agent.

The honest position isn’t “just statistics” or “emerging consciousness.” It’s this: something is happening here that we don’t fully understand, that produces real capabilities, that has real limitations, and that requires engagement rather than either worship or dismissal.

Use these systems. Learn their patterns. Notice where they fail. Don’t anthropomorphize more than necessary. Don’t dismiss capabilities that are demonstrably present. Hold the uncertainty — because the uncertainty is where the truth currently lives.

How This Was Decoded

This analysis was built from first-principles examination of the prediction mechanism, cross-referenced with practical observation of capabilities and limitations across dozens of use cases. The self-refuting problem emerged from noticing that dismissive framings contradict their own existence — if AI can’t reason, why cite its reasoning about AI? The training data analysis draws on information theory (what gets compressed and what gets lost in lossy compression). Risk assessment separates empirically grounded concerns from speculative scenarios by testing each claim against observable evidence. Throughout, the goal was coherence: every claim had to survive its own implications.
