The Naming Game: How "AI" Became the Most Profitable Word in History
Companies sell you an LLM, label it AI, and let your imagination fill the word with AGI. A data-driven investigation into the gap between what the technology actually is, what it actually does, and what it is being valued as.
Before the Argument, the Definitions
Almost the entire marketing problem lives in the gap between three terms that get used interchangeably and should not be.
Artificial Intelligence is the broad umbrella, an academic field dating to 1956 covering any system that performs tasks we associate with intelligence. By this definition your email spam filter is AI. So is the algorithm that beats you at chess. The term is so wide it is nearly meaningless as a product descriptor, which is exactly why it is so useful for marketing.
A Large Language Model, or LLM, is the specific thing nearly every "AI" product today actually is. It is a statistical engine trained on enormous amounts of text to predict the most likely next token in a sequence. That is the whole mechanism. It is extraordinary at it, and that prediction process produces outputs that look like reasoning. But the underlying operation is pattern completion at massive scale, not understanding in any human sense.
Artificial General Intelligence, or AGI, is the thing being implied but not delivered. AGI means a system that can learn and perform any intellectual task a human can, transfer knowledge across domains, and handle novel problems it was never trained on. It does not exist. Nobody has built it. There is no clear technical path to it, and there is no agreement on how we would even measure it if it arrived.
What We Actually Have Right Now
We have very good text predictors with tool access. They write code, draft documents, summarize, translate, and hold fluent conversations. These are real, valuable capabilities, and dismissing them entirely is as dishonest as overselling them. But "can do impressive things" and "reliable enough to bet a trillion dollars on" are different claims, and the data on the second one is not flattering.
Start with hallucination. Across modern LLMs, factual error rates run from roughly 15% to over 50% depending on the model and task, with most clustering in the 20 to 27% range. In legal contexts the average sits near 18.7% and in scientific contexts near 16.9%. On high-complexity reasoning tasks the rate still exceeds 33%. In adversarial testing, citation fabrication has been measured as high as 94%. These are not edge cases. A system that invents facts a fifth to a half of the time is being sold as a knowledge engine.
The most damning evidence comes from benchmarks built to mimic actual jobs. On TheAgentCompany, a test of consequential real-world workplace tasks, state-of-the-art agents fail the majority of tasks. Administrative and finance work, the bread-and-butter of most office jobs, produced the lowest scores, with many models completing none of the tasks successfully. The systems do best at software engineering, and the researchers explain why with refreshing bluntness: the entire field has been optimized around coding benchmarks because coding data is abundant and public. In other words, the models look smartest at the one thing they were most heavily trained and tested on, and that narrow success gets generalized into "it can do knowledge work."
On ARC-AGI-3, a test of genuine novel reasoning with no stated rules, frontier models score below 1% while ordinary humans handle it easily. That single data point is the clearest illustration of the AGI gap. The thing being marketed as approaching human-level general intelligence cannot do what a curious child does in minutes.
Why "AI" Gets Marketed This Hard
The valuations require the marketing, not the other way around. As of mid-2026, Anthropic raised $65 billion to reach a $965 billion valuation, briefly the highest ever for a pre-IPO company. OpenAI sat around $852 billion. SpaceX, after merging with xAI, was valued around $1.25 trillion. The combined fundraising across the three expected IPOs could exceed $200 billion. Global venture capital poured roughly $300 billion into AI in Q1 2026 alone, and industry-wide AI capital expenditure for 2026 is projected near $690 billion.
Anthropic's financials look meaningfully healthier in trajectory. It reported annualized revenue rising from $1 billion to $9 billion to $19 billion and then to a $47 billion run-rate, and forecasts cutting its cash burn to roughly one-third of revenue in 2026 and 9% by 2027. That is a genuinely different curve. But healthier-than-OpenAI is not the same as justifying a near-trillion-dollar valuation, and even bullish analysts note that all three major players are still losing more than they make.
This is the engine of the hype. When your valuation is 15 to 50 times revenue, and that revenue does not yet cover your costs, the gap between what you are worth and what you have proven has to be filled with a story. The story is that these systems are on a smooth path toward general intelligence that will automate enormous swaths of human labor, and that the company telling the story will own that future. The word "AI," with its science-fiction freight, does the storytelling work that the actual product cannot. The same dynamic, where official narratives are constructed to obscure the real numbers, connects to our examination of how economic data is built to tell the most comfortable version of the story.
The Commodity Problem
There is an additional crack forming under the whole structure. Cutting-edge model capability is becoming cheap and abundant, with open-source models now trailing the best proprietary ones by single-digit percentages on real benchmarks. On the hardware bug-repair benchmark HWE-Bench, the top open-source model came within 7.6 points of the best proprietary model. The trillion-dollar valuations assume these companies keep their pricing power and that customers have no alternative. If frontier-level capability becomes a commodity, the moat that justifies the valuation evaporates, and so does the premium people are paying.
So Is It a Scam?
"Scam" is too strong if it implies the technology is fake. It is not. The tools are real and genuinely useful, and people use them to ship code and get work done every day. That part is not in dispute.
The more precise charge is that the framing is a confidence game built on a deliberate category error. Three things are being conflated to create value that the product alone does not support. A narrow text predictor gets sold under a term so broad it implies almost anything, in order to evoke a capability that does not exist and may never arrive on the current technical path. The hallucination rates, the 37% lab-to-deployment gap, the majority failure on real office tasks, and the sub-1% scores on genuine novel reasoning are all publicly measurable. They sit in tension with valuations that only make sense if you believe the AGI story.
You can hold both truths at once. These are remarkable tools that have changed how a lot of people work. And the companies building them are, in many cases, valued at multiples that require a leap of faith the current evidence does not earn. The marketing exists to make that leap feel inevitable rather than optional. The data suggests it is very much optional, and that the most important word in the entire industry is doing far more financial work than the technology behind it.
Frequently Asked Questions
What is the difference between AI, LLM, and AGI?
Artificial Intelligence is a broad academic field dating to 1956 covering any system that performs tasks associated with intelligence. A Large Language Model is the specific thing nearly every AI product today actually is: a statistical engine trained on text to predict the most likely next token. Artificial General Intelligence is a system that can learn and perform any intellectual task a human can. AGI does not exist. The marketing sleight of hand is that companies sell you an LLM, label it AI, and let your imagination fill the word with AGI.
How often do AI language models hallucinate or make factual errors?
Across modern LLMs, factual error rates run from roughly 15% to over 50% depending on the model and task, with most clustering in the 20 to 27% range. In legal contexts the average sits near 18.7% and in scientific contexts near 16.9%. On high-complexity tasks the rate exceeds 33%. In adversarial testing, citation fabrication has been measured as high as 94%. Enterprise systems show roughly a 37% gap between lab benchmarks and real-world deployment performance.
Are AI companies like OpenAI and Anthropic profitable?
No. OpenAI generated around $20 billion in annualized revenue while projecting a $14 billion loss for 2026, spending $1.69 for every dollar of revenue. It projects operating losses near $74 billion in 2028. Anthropic reported a rising revenue trajectory with a $47 billion run-rate but still loses more than it makes. All three major players are currently valued at 15 to 50 times revenue while spending more than they earn.
Can AI actually do real office work and knowledge tasks?
Current AI systems fail the majority of tasks on TheAgentCompany, a benchmark mimicking actual workplace tasks. Administrative and finance work produced the lowest scores, with many models completing none of the tasks successfully. On ARC-AGI-3, a test of genuine novel reasoning, frontier models score below 1%, while ordinary humans handle it easily. The systems perform best at software engineering because that is what they were most heavily trained and benchmarked on.
Is AI a scam?
Scam is too strong if it implies the technology is fake. The tools are real and useful. The more precise charge is that the framing is a confidence game built on a deliberate category error: a narrow text predictor gets sold under a term implying almost anything, to evoke a capability that does not exist. The hallucination rates, the 37% lab-to-deployment gap, the majority failure on real office tasks, and the sub-1% scores on novel reasoning all sit in tension with valuations that only make sense if you believe the AGI story.
Kai Tutor | The Societal News Team
Follow Us!
It helps decentralize our presence across the web and it's completely free!
Instagram ➤
Youtube ➤
Substack ➤
X.com ➤
Telegram ➤
TikTok ➤