How Should We View Science's Claim That Large-Model Hallucination Is Hard to Eradicate? (English)

Generated: 2026-06-23 07:41:44

---

You Ask Casually, and AI Starts Making Up Stories

Speaking of which, I can't help but share something that made my skin crawl.

Last week, with nothing better to do, I casually asked an AI: "Do you know a nobody from my hometown called Zhang Decai? When's his birthday?" — Honestly, I've known the guy for over a decade, and I only asked to test it.

Guess what happened?

It barely hesitated, rattled off a date, and even threw in a citation like "according to a local chronicle" that sounded painfully credible. I laughed out loud — because I checked my phone contacts, and Zhang Decai's birthday wasn't that day at all.

I asked again. It gave me a different date.

Five times. Five different answers. Every single time, the tone was so confident it made my teeth ache.

This isn't a mistake. This is the factory setting.

---

The $100 Billion Farce

In February 2023, Google held a launch event for Bard.

The plan was to show off how impressive their AI was. But then Bard said: "The Webb telescope took the first picture of an exoplanet." The room went dead silent. Anyone in the know realized that the first exoplanet image was taken back in 2004 by the Very Large Telescope (VLT), and had nothing to do with Webb.

With that one sentence, roughly $100 billion of market value evaporated from Alphabet, Google's parent company.

I was sitting there watching the numbers drop on my screen, and only one thought crossed my mind: It's over. Public trust in AI shattered in that single second.

Sure enough, even crazier things came later.

A lawyer in New York used ChatGPT to draft legal documents. The model invented six cases that never existed. The guy didn't even check them; he just submitted them to the court.

I posted on my WeChat Moments: "This guy has serious guts."

But then I thought about it — why should the user be expected to verify? In most people's eyes, AI is a search engine. I ask you a question, aren't you supposed to give me the right answer?

Such innocent trust. Too bad trust is a two-way street.

---

This Isn't a Bug at All

Earlier this year, Science published a report. A team from OpenAI and Georgia Tech used a vivid analogy: exams.

Think about it: if a student doesn't know the answer to a question, is it better to leave it blank or fill it with nonsense?

It depends on how the teacher grades. If a wrong answer costs no more than a blank one, only a fool would leave it blank. Guessing might get you lucky, but leaving it blank guarantees zero.

The same goes for models.

From start to finish, the training and evaluation process rewards one thing: guessing. Whether the guess is right is secondary — the key is: you have to say something, you can't stay silent.

Wei Xing from the University of Sheffield posted a tweet that stuck with me for a long time. He said: "If you completely fix hallucination, you kill the product."

Think about it. Really think.

It's not that it can't be fixed — it's that if you fix it, the product becomes unusable. The core logic of large models isn't looking things up in a dictionary; it's playing with probability — based on what you input, it calculates the most likely next word. It aims for "sounds plausible," not "this is true."

Cat-and-dog recognition can achieve 99.9% accuracy because cats and dogs look different. But asking a model to remember everyone's birthday? Birthdays are essentially random — the model can't learn them, it can only memorize. And when it can't memorize, it fabricates the most plausible-sounding one.

An arXiv paper put it painfully: During pre-training, the model reads trillions of sentences, none of which are labeled "this is fact" or "this is fiction." The model can't tell the difference — it can only do statistics: which words tend to appear together. When two claims are both statistically "reasonable," it picks one at random.
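That "picks one at random" step is easy to picture with a toy sampler. This is only a sketch: the candidate dates and their probabilities are invented for illustration, and `sample_answer` is a hypothetical helper, not how any real model is implemented.

```python
import random

# Hypothetical next-answer distribution for "Zhang Decai's birthday is ...".
# All four dates and their probabilities are made up; none is "the truth".
candidates = {"March 3": 0.26, "July 19": 0.25, "October 1": 0.25, "May 12": 0.24}

def sample_answer(dist, rng):
    """Draw one answer in proportion to its probability."""
    options = list(dist)
    weights = list(dist.values())
    return rng.choices(options, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
answers = [sample_answer(candidates, rng) for _ in range(5)]
print(answers)  # five draws, each delivered with equal "confidence"
```

When no candidate clearly dominates, repeated draws can easily land on different dates, which is what asking the same question five times looks like from the outside.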

This isn't a bug. It's the inevitable outcome of a probabilistic model.

---

Bosses Don't Care About Hallucination, Only About Retention

Later I got to know some product managers at AI companies. I asked them: Do you guys internally discuss hallucination?

They laughed: The bosses don't care; all they care about is engagement. If ChatGPT kept saying "I don't know," users would leave the next day. Next door there's Claude, Gemini, and free open-source models — none of them say "I don't know."

Surveys show that users have an extremely low tolerance for non-answers. You have to give an answer, or users think you're no good. "I don't know" is basically admitting you're useless.

OpenAI researchers themselves admit that the current evaluation metrics (accuracy, F1, BLEU) all encourage guessing. Take a birthday question: guessing a date gives you at least a 1/365 chance of being right; saying "I don't know" is a guaranteed zero. The model is trained to output something, even if it's made up.
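The arithmetic behind that incentive fits in a few lines. This is a minimal sketch, not the researchers' own formulation: `expected_score` and `wrong_penalty` are hypothetical names, and the penalized grading scheme is an assumption added purely for contrast.

```python
from fractions import Fraction

def expected_score(p_correct, wrong_penalty=0):
    """Expected score of guessing: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1 - (1 - p_correct) * wrong_penalty

p = Fraction(1, 365)   # chance of guessing a birthday correctly
ABSTAIN = 0            # "I don't know" scores zero under both schemes

# Binary grading: wrong answers cost nothing, so guessing beats abstaining.
binary = expected_score(p)

# Assumed alternative: a wrong answer costs one point, so abstaining wins.
penalized = expected_score(p, wrong_penalty=1)

print(binary, penalized)
```

Under binary grading the guess is worth 1/365, which is still more than zero. Once wrong answers carry a cost, the same guess has a negative expected score and staying silent becomes the better move.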

But looking at it the other way, it's not fair to blame only the evaluation system.

Teaching a model to say "I don't know" requires redesigning the training objective, tweaking the reward function, and might sacrifice its current creativity. Akari Asai's team's DR Tulu project shows a direction: let the model dynamically generate scoring criteria and learn to distinguish good from bad answers. Their 8B-parameter model performed impressively in some tests, holding its own against older proprietary models.

But note: this is still in a controlled setting, and with external retrieval support.

I've tried connecting a model to RAG (Retrieval-Augmented Generation). It works well, but the cost doubles and speed drops. You want commercial companies to retrieve from a knowledge base for every single answer? That would slash profits.

Nobody is willing to pay for "correctness." Everyone just wants to pay for "speed."

---

The Court Ruled, but the Problem Remains

China has already had a court case over AI hallucination. A court in Hangzhou ruled the platform not liable.

A friend asked my opinion. I said: That ruling was smart. If the platform had lost, every AI company would have to shut down tomorrow, or switch to "I don't know anything" mode — because eliminating hallucination with current technology is impossible. You can only reduce it, not erase it.

It's like buying a kitchen knife. The manual says "Sharp, be careful not to cut yourself." If you cut your hand, you don't blame the knife seller, right?

But honestly, I don't think this is a long-term solution.

As AI agents become more widespread, the consequences of hallucination could be far scarier. It won't just be a lawyer's brief getting it wrong. It will be autonomous multi-step operations: an AI assistant booking your flights, transferring money, managing supply chains. If any single step hallucinates, the chain reaction could be catastrophic.

Google DeepMind's FACTS Grounding benchmark showed that, as of the end of 2025, even the strongest model, Gemini 3 Pro, achieved a factual accuracy of only 68.8%.

Out of every hundred statements, over thirty might be problematic.

Would you trust it to freely operate your bank account?
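To see why multi-step chains worry me, here is a back-of-the-envelope sketch. It rests on two strong assumptions of mine: that every step succeeds independently, and that each step is as reliable as the 68.8% benchmark figure. Real agents are not that simple, and a benchmark score is not a per-step success rate. `chain_success` is a hypothetical helper.

```python
def chain_success(per_step_accuracy, steps):
    """Probability that every step in a chain is right, assuming independence."""
    return per_step_accuracy ** steps

ACCURACY = 0.688  # the benchmark figure, reused as a per-step rate (assumption)

for n in (1, 3, 5, 10):
    print(f"{n} steps: {chain_success(ACCURACY, n):.3f}")
```

Under these assumptions the odds of a flawless run fall off quickly as the chain gets longer, which is the whole problem with letting errors compound.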

---

What I Think the Future Looks Like

The evaluation system has to change. You can't just look at accuracy — you need to incorporate uncertainty expression, source tracing, fact-checking. Meta's FAIR team is already experimenting with compound rewards to train reasoning models, reducing hallucination rates by 23 percentage points. Long road ahead, but the direction is right.

The toolchain also needs to catch up. You can't expect the model to be correct all on its own. I've developed a habit: use the model to write a first draft, then run a script for fact-checking, where every cited source must come with a working link. For numbers, I force it to write Python code to verify. Not perfect, but much better than using it raw.
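A check like that can start very small. The sketch below is not my actual script, just an illustration of the idea: it flags sentences that sound like citations but carry no URL. The cue list and the name `unlinked_claims` are assumptions made up for this example, and it does not verify that a link actually resolves.

```python
import re

URL_RE = re.compile(r"https?://\S+")
# Assumed cue phrases that suggest a sentence is citing something.
CITE_CUES = ("according to", "reported", "study", "survey")

def unlinked_claims(draft):
    """Return sentences that sound like citations but contain no URL."""
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    flagged = []
    for sentence in sentences:
        lowered = sentence.lower()
        has_cue = any(cue in lowered for cue in CITE_CUES)
        if has_cue and not URL_RE.search(sentence):
            flagged.append(sentence)
    return flagged

draft = (
    "According to a local chronicle, he was born in May. "
    "A survey (https://example.org/survey) found users dislike refusals. "
    "The sky is blue."
)
for sentence in unlinked_claims(draft):
    print("NEEDS SOURCE:", sentence)
```

Anything the script flags goes back for a human look; a sentence with a link still needs someone to open the link and read it.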

The industry has to accept "AI makes mistakes" as a premise. High-risk fields like law, medicine, and finance must have human review. Just like when computers came out, accountants didn't disappear — they became accountants who use Excel.

Small models + retrieval might be the more practical solution. I love what Akari Asai said: "Don't try to fit the whole world into a model." Open-source small models paired with retrieval architectures that connect to paper databases and knowledge bases are far more reliable than large-parameter models that make things up.

---

One Last Thing

You might ask: Can hallucination ever be cured?

I don't think so.

Just like us humans — you can't be absolutely correct in every sentence. Misremembering, guessing, speaking from gut feeling — that's the price of intelligence. But we can make it less deadly.

**Don't expect the model to be always right. Before you

Cael Lee
