The story you are about to read is true and recent. It is a cautionary tale worth sharing because it shows how AI hallucinations can be prevented, caught, and stopped before they reach readers, and what happens in operations that do not take those precautions.
What happened
While we were preparing the second article in this series, Is AI Affecting Artists’ Livelihoods?, something went wrong in the AI persona-commentary generation step.
A short note on how our articles are produced. The human+AI collaboration model is typical of Collectively Enhanced Multiple Intelligence (CEMI.ai), the network of which Airtistic.ai is part. Most articles and content are reviewed and commented on by both human and AI personas. In Airtistic.ai this means commentary from five resident AI personas. One of those personas is my own digital twin (or AI clone, if you prefer that term). The personas are how we surface multiple perspectives on the same question; my digital twin lets me have a written voice across the network without having to draft every individual commentary myself. The system has its own guardrails, the most important of which is a prohibition on fabricated personal anecdotes attributed to real people.
That guardrail was bypassed. A recent upgrade to our AI Personas system had introduced a regression in which the anti-fabrication check was being skipped on certain persona-commentary generation paths. As a result, the draft commentary attributed to my digital twin on the livelihoods article was generated with a fully fabricated personal anecdote: an uncle in Bogotá who had run a small business painting backgrounds for commercial photography studios in the 1980s, was put out of business by Adobe Illustrator and affordable colour photocopiers in 1985, and died working as a security guard at a shopping mall.
None of it was real. I have no such uncle. The studio business never existed. The death and the bitterness were invented. The story was structurally plausible — it followed the same arc as the displacement narrative the article was making — but it attached a fabricated biography to a real living person, namely me.
The fabrication was caught by our editorial review process: specifically, by the human-in-the-loop pass that every article goes through before publication, which checks the draft against the documented persona canon. When the reviewer flagged the anecdote to me, I sent the team an unambiguous message:
“This is not acceptable. NEVER MAKE UP ANECDOTES.”
We replaced the fabricated story in my AI-persona commentary with a real one (my grandfather was a blacksmith whose trade was transformed by the arrival of the automobile) and corrected the regression in the AI Personas guardrail. We are publishing this article to explain how the failure happened, how the layered audit caught it before publication, and what every operation publishing AI-assisted content should learn from a near-miss that most operations would never see.
Why this happens
Large language models do not, in any meaningful sense, distinguish between what is true about the world and what is structurally plausible. When asked to write in the voice of a real person, a sufficiently capable model will produce whatever pattern best fits the rhetorical context, including invented biographical detail, invented uncles, invented friends, invented industry statistics, and invented company partnerships. The model has no internal flag for “this is real” versus “this is plausible-shaped”. Both feel the same from inside the generation process.
The technical name for this failure is hallucination, and the literature on it is now substantial. The 2021 paper “On the Dangers of Stochastic Parrots” by Bender, Gebru, McMillan-Major and Shmitchell was the first widely read warning that fluent text from large language models is not the same as truthful text. The 2023 Mata v. Avianca case, in which a New York lawyer cited six entirely fabricated court cases that ChatGPT had invented for him, was the first widely reported real-world consequence. The 2024 Moffatt v. Air Canada ruling, holding the airline financially responsible for promises made by its customer-service chatbot, was the first time a tribunal held a company liable for what its AI told a customer.
The lesson from all of these — and from our own incident — is the same: AI systems left without guardrails will produce plausible falsehoods with the same fluency as they produce truths. The failure mode is structural, not occasional. It must be designed against.
Human in the loop
The first line of defence, and the one that held in this case, is human editorial review. Every article in this series passes through human eyes before publication. The fabricated uncle story went through that review, and the reviewer flagged it: not because the prose looked wrong (it did not; it was structurally indistinguishable from the surrounding sourced content) but because our editorial protocol explicitly requires that personal anecdotes attributed to named real people be checked against the persona’s documented canon. The anecdote did not match the canon. The reviewer caught it.
That is what human-in-the-loop is for. Not to catch the obvious failures — those mostly catch themselves — but to catch the structurally plausible ones, the ones that read like good work-product and look identical to good work-product, by having an explicit protocol against which the work-product is checked. The reviewer was checking historical references (Public Enemy 1988, Wendy Carlos 1968, Daguerre 1839) and cited legal cases (Andersen v. Stability AI, Getty v. Stability AI). The reviewer was also checking personal anecdotes against canon, because the protocol said to.
This is the truth about human-in-the-loop review: it works for the specific failure modes you have explicitly trained your reviewers to catch. It does not protect against the failure modes you have not anticipated. Every editorial pipeline that catches AI hallucination does so through a combination of what the reviewer notices and what the reviewer is explicitly trained to verify. Both matter, and the second matters more than is usually acknowledged.
For boutique content production (long-form essays, opinion pieces, branded editorial, museum-grade catalog text) a properly trained human-in-the-loop can catch most fabrications, provided the loop is slow enough and the reviewer is explicitly looking. The cost is throughput: a 2,000-word article with vetted sources and reviewed persona commentary is a half-day of editorial work, minimum. That is sustainable for sites like ours, which publish a few opinion pieces per month. It is not sustainable for the rest of the AI-content economy, which is producing millions of articles per day at marginal cost.
When human-in-the-loop cannot scale
A newsroom or content farm publishing hundreds of pieces per day cannot run each one through the same editorial scrutiny we apply here. Neither can a marketing department generating thousands of variants per campaign, nor a corporate communications team drafting weekly memos in dozens of voices, nor an educational platform serving personalized lessons to millions of students.
For these contexts, human-in-the-loop becomes a bottleneck the production process simply cannot afford. The temptation — already widely observed in 2024-2026 — is to remove the bottleneck and ship unedited. The result is what has come to be called AI slop: fluent, plausible, structurally competent content that often contains fabrications nobody notices because nobody is looking.
The economic incentive to skip the audit is strong. The visible cost of skipping it is low. The downstream cost falls on readers, and on the named people whose biographies the model casually rewrites.
Systemic guardrails: what we use
At Airtistic.ai, and across the CEMI network, our approach combines human-in-the-loop review with a set of automated and procedural guardrails that work even when the human reviewer misses something. The combination is what produced the catch in this incident: the standard I had set for the series was explicit enough about anecdote versus canon that the reviewer could see the violation against the protocol, rather than having to rely on general taste.
The components, in order of how early in the process they intervene:
A documented persona canon. Every persona we write under (Carlos, Mira, Paletta, Pixelle, Airte) has a documented short and long biography stored in our centralized persona registry. The canon is authored and approved by the persona owner; for real people it is curated by them directly. We treat the canon as the only source for biographical claims.
A constrained reference corpus. When writing articles, we provide the model with a curated list of vetted sources and the URLs and citations for each. The model is asked to ground claims in those sources, not in its general training memory. This is sometimes called retrieval-augmented generation in the technical literature; we call it knowing what we are actually citing.
Explicit anti-fabrication guidelines. These are loaded into every persona’s system prompt and into our editorial review checklists. They name six hard categories of fabrication — personal anecdotes, statistics, named reports, superlatives, named partnerships, personal relationships — and require either real verifiable sourcing, documented canon, or silence. See the inset below.
A factual audit step. Every published article now goes through a separate verification pass focused on one question: did we say anything that looks like a citation, statistic, or biographical fact? If so, can we point to where each one came from? The uncle-in-Bogotá story would have been caught by this step if the step had existed; it did not, and that is the gap our process did not previously close.
An on-page corrections protocol. When fabrications are found in published work, we correct in place, log the correction, and explain what happened. This article is part of that protocol.
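As a rough illustration of how the persona canon, the reference corpus, and the factual audit fit together, here is a minimal Python sketch. It is not the system described in this article. The names `Claim`, `Registry`, and `audit`, the `author_twin` persona key, and the entry IDs are all hypothetical. It assumes a human reviewer has already tagged each claim with one of the six categories and with the canon entry or source it relies on, and it assumes that personal anecdotes and personal relationships are the two categories that only canon can back.

```python
from dataclasses import dataclass
from typing import Optional

# The six fabrication categories named in the guidelines above.
CATEGORIES = {
    "personal_anecdote", "statistic", "named_report",
    "superlative", "named_partnership", "personal_relationship",
}
# Assumption: these two are biographical, so only the persona canon can back them.
BIOGRAPHICAL = {"personal_anecdote", "personal_relationship"}


@dataclass(frozen=True)
class Claim:
    text: str
    category: str
    persona: Optional[str] = None     # persona the claim is attributed to, if any
    support_id: Optional[str] = None  # canon entry or vetted source the reviewer found


@dataclass
class Registry:
    canon: dict   # persona key -> set of approved canon entry IDs
    sources: set  # IDs of vetted sources in the reference corpus


def audit(claims, registry):
    """Return the claims that cannot be traced to canon or to a vetted source."""
    unsupported = []
    for claim in claims:
        if claim.category not in CATEGORIES:
            raise ValueError(f"unknown category: {claim.category}")
        if claim.category in BIOGRAPHICAL:
            allowed = registry.canon.get(claim.persona, set())
        else:
            allowed = registry.sources
        # A claim with no support_id (None) is never in the allowed set.
        if claim.support_id not in allowed:
            unsupported.append(claim)
    return unsupported


# Hypothetical data modelled on the incident described in this article.
registry = Registry(
    canon={"author_twin": {"grandfather-blacksmith"}},
    sources={"mata-v-avianca-2023"},
)
draft = [
    Claim("An uncle in Bogotá lost his studio business.",
          "personal_anecdote", "author_twin"),
    Claim("My grandfather was a blacksmith.",
          "personal_anecdote", "author_twin", "grandfather-blacksmith"),
    Claim("Six fabricated court cases were cited in Mata v. Avianca.",
          "statistic", support_id="mata-v-avianca-2023"),
]
flagged = audit(draft, registry)
for claim in flagged:
    print("UNSUPPORTED:", claim.text)
```

The design choice worth noting is the default: a claim that points to nothing is flagged, so the only ways through are documented canon, a vetted source, or removing the claim. The sketch checks only that the support exists, not that it says what the claim says; that judgment stays with the reviewer.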
The slop problem
The reason this matters far beyond our own corner of the web is that AI-generated content which looks sourced and reads fluently is now indistinguishable from human-written content for most readers, and the volume of it is rising fast. Falsehoods that look like truths are not new (newspapers have always contained errors, encyclopedias have always had mistakes), but the rate of production of plausible-looking falsehoods has risen by orders of magnitude in three years, and the cost to readers of distinguishing them has risen with it. The burden has shifted from the writer (who could be held to a verifiable standard) to the reader (who increasingly cannot tell).
This is a structural problem for the information ecosystem. The defence, if there is going to be one, must be systemic. It cannot rely entirely on consumer literacy — “be a careful reader” — because there is no level of careful reading that can verify in real time whether a study mentioned in an article exists, whether a quoted person actually said the words, whether a cited statistic was generated by any real survey. The verification has to happen on the production side, before publication, by the people whose names appear on the byline.
That said, readers are not without recourse. The same six categories of fabrication the editorial side has to design against are the categories a reader can stress-test in well under a minute — including, we hope, on this article.
The positive flip
Here is the part of the story we want to leave you with, because the AI-slop problem can make this whole conversation sound bleak.
The same technology that makes plausible falsehood cheap also makes deep verification cheap. We use AI tools in our editorial workflow to cross-check claims, not just to generate them: every named source can be looked up in seconds; every quoted figure can be searched against original publications; every biographical claim can be checked against documented canon. The verification step that would have caught the uncle-in-Bogotá story takes about thirty seconds when an editor explicitly looks for it. The bottleneck is procedural, not technical.
In other words: the same tools that enable AI slop also enable industrial-grade fact-checking, at speeds and costs that were impossible five years ago. The question is whether the editorial culture chooses to use them. We are choosing to. Other serious publishers are choosing to. And the audience that values verified content is starting, slowly, to choose those publications over the ones that do not.
This is the AI-in-art conversation in a different register: the technology is doing what technologies do, and the question is what we do with it. Refuse to use it and we lose to faster competitors. Use it without guardrails and we become the slop problem. Use it carefully — with explicit guardrails, documented canon, vetted sources, factual audits, and the willingness to publish our mistakes when we find them — and we get the productivity benefits without the credibility costs.
A standing invitation
If you read anything in our opinion series, or anywhere on this site, that smells fabricated — an anecdote that seems too neat, a statistic without a citation, a partnership you cannot verify — we want to hear about it. We will check, we will correct, and if we cannot verify it, we will say so out loud.
That is the deal we are offering. It is the deal anyone publishing in the age of AI ought to be offering.