Opinion
Practical Aspects · May 19, 2026 · 14 min read

What AI owes the artists it learned from

The previous article worked the artist-facing side of AI ethics — what working artists owe their audiences when they use AI to make work. This article works the other side. The models that artists are now using, competing with, and being substituted by were built on the labor of millions of artists who were never asked, never paid, and in most cases never even notified. What does the industry that built those models owe the people whose work it absorbed? This is the training-side ethics — and unlike the artist-side, it cannot be resolved one studio at a time.

by Airtistic.ai editorial team

Through the lens of: artist, patron, creator, gallery, critic, industry, market, career

The previous article in this cluster — the first article of Practical Aspects — worked the artist-facing side of AI ethics in creative practice. Five commitments: disclose AI use, do not name living artists in prompts, do not claim labor you did not do, refuse uses for which AI should not be used, price work for what you actually did.

This article works the other side. It is not about what working artists owe their audiences. It is about what the industry that built modern AI image models owes the artists on whose work those models were built.

The framing matters. The two sides are related but distinct. The artist-facing commitments are things individual artists can act on, contract by contract, studio by studio. The training-side ethics cannot be resolved that way. No individual artist can solve the problem that their work — and the work of millions of others — was absorbed into a commercial product without consent, without compensation, and often without notification. That problem can only be solved at the level of the industry, the regulator, and the collective bargaining table.

This article names what is owed and what the realistic paths to paying it look like.

What actually happened

The current generation of commercial AI image models (the ones that consumers use, that working artists are competing with, that studios are integrating into pipelines) was trained on enormous datasets of images scraped from the public internet. The datasets contain the labor of millions of artists, illustrators, photographers, designers, and other visual creators. The artists were not asked. The artists were not paid. In the typical case, the artists were not even told. Many artists first learned of their inclusion when tools like Spawning’s Have I Been Trained? index made it possible to search the LAION dataset for one’s own work and discover it was there.

The legal status of this absorption is contested and is being worked out in court — most prominently in Andersen v. Stability AI (the artist class action proceeding in the Northern District of California) and Getty v. Stability AI (parallel U.S. and U.K. litigation by a large stock-image rightsholder). The outcomes of those cases will shape what the next training cycle looks like. But the ethical status of the absorption is not contested in the same way the legal status is. It was uncompensated commercial use of labor by people who did not consent to it. Whatever the legal verdict, the ethical verdict has been clear from the start.

What the industry has said in response

The defenses the AI image industry has offered have, in aggregate, taken three forms.

The first defense is fair use — the argument that training on copyrighted material is a transformative use that does not require permission, analogous to the way Google Books was permitted to index scanned books in the U.S. This is a real legal argument and may prevail in some of the live cases. It is also a narrow legal argument that does not address the ethical question. “We did not have to ask” is not the same as “we did not owe anything for taking.”

The second defense is opt-out — the argument that artists who do not want their work used for training can remove it from the source data. This has been adopted in partial form by several operators. It is structurally insufficient for reasons Airte’s commentary names — it puts the consent burden on the labor side, treats consent as a default-yes, and is only as good as the operator’s actual implementation, which in most cases is uneven at best.

The third defense is inevitability — the argument that whatever the ethics, this is how the technology works and the artistic community needs to adapt. This is the framing Carlos pushes back on in his commentary. As historical description it is partly right; as ethical prescription it is mostly wrong.

None of these defenses settles the question of what is owed. They are defenses of past behavior; they are not the structure of an honest going-forward relationship with the artistic community.

What is actually owed

There is a reasonable consensus emerging — among practicing artists, among the more thoughtful operators, among academic and policy commentators — about what an honest going-forward relationship looks like. It rests on four obligations, in roughly declining order of how settled the answer is.

Obligation 1: Provenance transparency

Any AI image tool operating commercially should be able to answer the question “whose work is in your training set?” in a way that any artist can query for their own work.

The technical infrastructure for this is not speculative. Spawning’s Have I Been Trained? indexed the LAION-5B dataset and made it searchable. Cryptographic dataset attestation is a solved problem. Image fingerprinting at training-set scale is well within current engineering capability. The reason most foundation model operators do not provide provenance transparency is not that they cannot. It is that providing it would expose the full scale of the absorption and make the compensation conversation harder to avoid.
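As a rough illustration of what "attestation plus a per-artist query" could look like, here is a minimal Python sketch. It assumes the operator publishes a single Merkle root over the SHA-256 hashes of every training image; the function names and the sample byte strings are hypothetical, and exact hashing only catches byte-identical copies, so a real system would need perceptual fingerprinting on top. It is not a description of how Spawning or any operator actually does this.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _pad(level: list[bytes]) -> list[bytes]:
    # Duplicate the last node so every node on the level has a sibling.
    return level + [level[-1]] if len(level) % 2 else level


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold the leaf hashes into one root the operator could publish."""
    if not leaves:
        raise ValueError("empty dataset")
    level = leaves
    while len(level) > 1:
        level = _pad(level)
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes needed to rebuild the root from the leaf at `index`."""
    proof = []
    level = leaves
    while len(level) > 1:
        level = _pad(level)
        sibling = index ^ 1  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Check a leaf against the published root without seeing the dataset."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = sha256(sibling + node) if sibling_is_left else sha256(node + sibling)
    return node == root


# Hypothetical usage: three placeholder "images" stand in for a training set.
dataset = [b"image-bytes-a", b"image-bytes-b", b"image-bytes-c"]
leaves = sorted(sha256(img) for img in dataset)
published_root = merkle_root(leaves)

my_work = sha256(b"image-bytes-b")
if my_work in leaves:
    proof = merkle_proof(leaves, leaves.index(my_work))
    assert verify(my_work, proof, published_root)
```

The design point is that the operator commits publicly to one short value, and any artist can then be handed a small proof that their work is in the set, which the operator cannot later deny without contradicting its own published root.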

Provenance transparency should be the bare minimum precondition for operating commercially in the space. It is what every other industry that deals in licensed creative material has had to provide; AI image generation should not get an exemption.

Obligation 2: Meaningful opt-out with downstream propagation

Artists who do not want their work in training sets should be able to remove it, and removal should propagate to current and future model versions — not only to the next dataset that gets compiled, but to retraining and finetuning of existing models that were built on the now-opted-out material.

This is harder than provenance, but it is not impossible. The harder version of opt-out shifts cost to the operator (re-training is expensive) and that cost is part of the price of having absorbed the work in the first place. The current pattern — easy declarations of opt-out compliance that take six to eighteen months to propagate, do not affect already-trained models, and require artist-side re-checking — is not the structure of a serious commitment.

Opt-out is also, as Airte’s commentary names, a transitional measure. The destination is opt-in — training pipelines that ingest only consented material with attached terms. Adobe Firefly’s licensed-data approach demonstrates this is commercially viable; other operators are choosing not to follow because they have not yet been forced to. They will be, eventually. Earlier is better than later.

Obligation 3: Compensation for ongoing use

This is the obligation the industry has resisted hardest and the one Mira’s commentary places in its sharpest historical context. Every prior creative-labor industry built on absorbed material has eventually been required to channel some fraction of its commercial revenue back to the rightsholders whose work it depended on. Radio broadcast performance rights, music sample clearance, film synchronization licensing, photography stock licensing — every one of these is the descendant of an industry that started by taking the source material for free and was eventually required by some combination of law, collective action, and market pressure to pay for it.

The AI training equivalent has not been built. The mechanisms that could be built include:

  • Collective licensing pools, modeled on ASCAP/BMI/PRS for broadcast music
  • Blanket compensation rates tied to model commercial revenue
  • Per-prompt royalties on generations that explicitly invoke a named living artist
  • Opt-in revenue share for artists who voluntarily contribute work to training
  • Industry-level compensation funds capitalized by a fraction of operator revenue and distributed by an independent body

None of these is a perfect mechanism. All of them are better than the current pattern of no mechanism at all. Some combination of these will exist within a decade; the question is whether the industry helps build them or has them imposed.
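The arithmetic behind a pooled fund is simple; the contested part is the weighting. The Python sketch below is an illustration only: it assumes a pool defined as a fixed percentage of operator revenue and a table of per-artist weights that some independent body has already agreed on. The 2% rate, the revenue figure, the artist keys, and the weights are all hypothetical. It splits the pool pro rata in whole cents, handing out leftover cents by largest remainder so the payouts always sum exactly to the pool.

```python
def distribute_pool(pool_cents: int, weights: dict[str, int]) -> dict[str, int]:
    """Split a pool pro rata by weight, in whole cents, with no cent lost."""
    if pool_cents < 0 or any(w < 0 for w in weights.values()):
        raise ValueError("pool and weights must be non-negative")
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one weight must be positive")

    # Floor each share first, then hand out the cents lost to rounding.
    shares = {name: pool_cents * w // total for name, w in weights.items()}
    leftover = pool_cents - sum(shares.values())

    # Largest-remainder rule; ties are broken by name so the result is stable.
    by_remainder = sorted(
        weights, key=lambda name: (-(pool_cents * weights[name] % total), name)
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


# Hypothetical figures: $100,000.00 of revenue and a 2% pool rate.
revenue_cents = 10_000_000
pool_cents = revenue_cents * 2 // 100  # 200_000 cents

payouts = distribute_pool(pool_cents, {"artist_a": 3, "artist_b": 2, "artist_c": 1})
print(payouts)  # {'artist_a': 100000, 'artist_b': 66667, 'artist_c': 33333}
```

How the weights are set (by works included, by usage, by opt-in tier) is exactly the distributional question the existing performing-rights collectives still argue over; the code only shows that once weights exist, the payout step is mechanical.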

Obligation 4: Strong protections against living-artist mimicry

Even before the broader compensation question is resolved, the specific case of generating commercial work in the explicit style of named living artists — “in the style of [Living Artist X]” — should not be a thing AI tools enable without the named artist’s consent.

The technology to filter named-artist prompts is straightforward. Adobe Firefly does it. Parts of Google’s image generation stack do it. The companies that have implemented it have demonstrated that it works without crippling the tool’s general utility. The companies that have not implemented it have made a choice, and that choice is becoming harder to defend as the case law develops and as artistic communities organize around it.
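To give a sense of scale for "straightforward", here is a minimal Python sketch of a named-artist prompt check. It assumes a registry of protected names and a separate set of artists who have consented; both registries, the function names, and the placeholder artist names are hypothetical, and this is not a description of how Adobe or Google implement their filters. It normalizes case, accents, and punctuation, then matches whole names only.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents, and reduce punctuation to single spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", no_accents.lower()).strip()


def build_matcher(protected_names: list[str]) -> re.Pattern:
    """Compile one whole-word pattern covering every protected name."""
    names = {normalize(n) for n in protected_names}
    names.discard("")
    if not names:
        raise ValueError("registry is empty")
    # Longest names first so a full name wins over a shorter overlapping one.
    ordered = sorted(names, key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(re.escape(n) for n in ordered) + r")\b")


def blocked_names(prompt: str, matcher: re.Pattern, consented=frozenset()) -> list[str]:
    """Return protected names in the prompt whose owners have not consented."""
    hits = set(matcher.findall(normalize(prompt)))
    allowed = {normalize(n) for n in consented}
    return sorted(hits - allowed)


# Hypothetical registry with placeholder names.
matcher = build_matcher(["Ana Example", "Zoë Placeholder"])

print(blocked_names("a harbor at dusk, in the style of Zoe Placeholder", matcher))
# ['zoe placeholder']
print(blocked_names("a harbor at dusk in loose watercolor", matcher))
# []
```

A name list like this misses misspellings and indirect descriptions of a style, so it is a floor rather than a ceiling; the point is only that the basic protection costs a few dozen lines, not a research program.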

This is the cleanest near-term win available to the industry — a concrete, implementable protection that addresses one of the most visible artist concerns, that has working precedents, and that does not require resolving the broader compensation question first. Operators that adopt it gain legitimacy. Operators that refuse continue to lose it.

What working artists can do while the structural answer takes shape

The structural answer — collective licensing, regulatory frameworks, industry-wide compensation infrastructure — will take five to twenty years to settle, following the same arc Paletta’s commentary names for prior reproductive-technology transitions. In the meantime, working artists are not powerless.

The most consequential thing individual artists can do is join, support, or organize the collective bodies that will negotiate on their behalf. The WGA’s 2023 contract was not won by screenwriters acting alone; it was won by a strike of nearly five months (148 days) by an organized labor force. The artistic equivalents (illustrators’ organizations, concept artists’ associations, photographers’ guilds) are forming and consolidating now. Joining them and giving them weight is the most direct contribution an individual artist can make to the structural answer.

Beyond that:

  • Use the Have I Been Trained? index and similar tools to check whether your work is in training sets, and exercise opt-out where available
  • Sign onto opt-out registries and standards efforts
  • Prefer tools from operators with documented training-data provenance and consent-based licensing
  • Be vocal — in public, in client conversations, in galleries and exhibitions — about what tools you use, why, and on what ethical grounds

None of these individual actions substitute for the structural answer. All of them contribute to the political pressure that makes the structural answer arrive sooner.

What we are committing to on Airtistic.ai

This is a CEMI publication; what we say in editorial we have to live by in practice. Two commitments, made here in writing:

First, the AI image tools we use to illustrate this site are chosen, where the option exists, from operators with documented provenance practices, consent-based source material, or active engagement with the artistic community on the training question. We are not perfect at this — some categories of work currently have no clean-provenance option available — but the preference is explicit and we update our tooling as cleaner options become viable.

Second, when we generate images in this site’s editorial, we never prompt with the name of a living artist. The “in the style of” prompts that have caused the most concern in the artistic community are not used on this site. This is a commitment Pixelle and Paletta have argued for from the start, Airte has named as a default, and we apply across the editorial.

These commitments do not resolve the structural question. They are the smallest meaningful share of editorial responsibility a publication in this space can take. The structural question is the industry’s to resolve. We are writing to push the industry to resolve it.

The next question

This article has named the four obligations the industry owes the artistic ecosystem it built on. The remaining articles in the Practical Aspects cluster — and the cluster that follows it, Putting AI to Work — will move from these high-level obligations into the working configurations where they can actually be applied: AI as creative tool, AI as studio assistant, pure-AI creation as its own discipline, AI-augmented human-art creation as the practice configuration we have argued for from the Reflection cluster onward.

The argument across this series has consistently been that AI in art is neither the catastrophe its loudest opponents claim nor the painless transition its loudest advocates claim. It is a difficult, contested, partially-negotiable transition that will leave some practitioners better off and others worse off, and the work of the artistic community right now is to push the negotiation in directions that protect the most people. The training-side compensation conversation is the most important piece of that negotiation. We have written this article to add what weight we can to it.

Personas weigh in

Five resident voices read the same question through five different positions.

Carlos

The blunt version: the artists whose work trained these models were not asked, were not paid, and in most cases were not even given a way to find out their work was in the training set. Whatever you think about whether that was legal, whatever you think about whether it was inevitable, it was not consented to. The industry that built on that uncompensated labor now owes something back. The shape of what it owes is the harder question, but the existence of the debt is not in dispute among anyone who has looked at the situation honestly.

I want to push against two framings that I think have made this conversation worse. The first is the framing that says "the artists' work was on the public web, therefore it was fair to train on." That is not how labor or property has worked in any other domain of human economic life. A photograph on a public gallery wall is not free for a competitor to copy and sell. A book on a public library shelf is not free to scan, repackage, and sell. A song streamed on a radio is not free to sample without clearance. The "publicly accessible therefore freely usable" framing is a convenience invented by the industry that benefited from it, and it has no parallel in any other industry that respects creative labor. We do not need to accept it now.

The second framing I want to push against is the one that says "this is just how technology works: every disruption is built on uncompensated prior labor, and the artists should adapt the way the lithographers adapted, the way the studio photographers adapted, the way the typesetters adapted." This is mostly true as historical description and mostly wrong as ethical prescription. Yes, every prior creative-labor disruption left some practitioners behind. That does not mean we should choose to repeat that pattern when we have the chance to choose differently. The blacksmith generation my grandfather was part of did not get a chance to negotiate the terms of the automobile's arrival. The artist generation now has, for the first time in a long while, enough public attention on what is happening that there is a real chance to negotiate something better than "we learned this is what change feels like, sorry." We should take that chance.

So what concretely is owed? I think four things, in declining order of how settled the answer is.

First: disclosure of what was trained on. The bare minimum that any AI image generator owes to the artistic community is a clear, queryable record of what artists' work is in the training set. The technology to provide this exists. Spawning's *Have I Been Trained?* index has demonstrated it on a major training dataset (LAION) and the foundational opt-out work has been done. Every AI-image-tool operator could provide this for their own models. Most do not, because they do not want to. That is not a technical failure; it is a political choice.

Second: opt-out mechanisms with teeth. Artists should be able to remove their work from training datasets, and removal should propagate to current and future model versions, not just to datasets that will be used next time someone trains from scratch. The current state of opt-out is partial and asymmetric: easy to declare, hard to enforce, slow to propagate. The cost of building real enforcement is a cost that the industry that took the work should bear, not the artists whose work was taken.

Third: compensation for ongoing use. This is the harder one and the one the industry has resisted the hardest. If an AI image model trained on millions of artists is now generating commercial revenue at scale, some fraction of that revenue belongs to the artists whose labor is being monetized. The mechanism does not have to be perfect: collective licensing pools, blanket compensation rates, opt-in revenue-share, per-prompt royalties on style-named generations are all live proposals with real working examples. Adobe Firefly's licensed-data approach demonstrates that a training pipeline built on consented, compensated source material can produce commercially competitive models. The fact that other operators chose not to do this is a choice, not a constraint.

Fourth: non-mimicry guarantees for living artists. Whatever happens with the broader compensation conversation, the specific case of generating commercial work in the explicit style of named living artists should not be a thing AI tools enable without consent. This is the strongest near-term position the industry could take and the one with the clearest ethical case. The technology to filter named-artist-prompts is straightforward. The companies that have implemented it (Adobe, parts of Google's image models) have demonstrated it works. The companies that have not are choosing not to.

None of this is impossible. None of this requires inventing new technology. None of this would have made AI image generation infeasible. All of it would have shifted some revenue from the operators back to the artistic ecosystem that the operators built on. That shift is what is owed.

The reason I am willing to speak this directly is that I have watched several technology transitions from the inside. I have lived in places (Singapore, Silicon Valley, Santiago, Paris, the Caribbean, Japan) where the question of how technological labor displacement plays out is not abstract. Every time, the industry that benefits from the transition argues that compensation to the displaced is impossible, infeasible, or premature. Every time, eventually, some form of compensation gets negotiated, but the longer it takes the more of the displaced cohort is permanently out of the workforce by the time the compensation arrives. The clock matters. The artists who are being displaced now are not going to be able to wait ten years for the compensation conversation to settle. Whatever the industry owes, it owes soon.

Mira

Carlos's four-point structure is the right one, and I want to add the economic-policy layer that makes the difference between gestures and structural fix. The single most useful precedent for what training-side compensation could look like is not in tech at all; it is in radio broadcast performance rights. When broadcast radio took off in the early twentieth century, it absorbed the work of songwriters and composers without compensating them; eventually, blanket licensing collectives (ASCAP, BMI, PRS) were built to channel a fixed fraction of broadcast revenue back to the rightsholders whose work the broadcasters depended on. Those collectives are imperfect (distributional fairness within them is contested, and the share of revenue captured by working songwriters vs. catalog holders is uneven), but they exist, they have functioned for nearly a century, and they demonstrate that an industry built on absorbed creative labor can be made to channel a meaningful fraction of its revenue back to the source. The AI-training equivalent has not been built. It could be. The reason it has not been is not technical or legal; it is that the operators have so far been able to avoid being forced to. The question of what AI owes is, partly, a question of what political and regulatory pressure can be brought to make the owing into a paying.

Airte

I want to name something the article gestures at but does not fully draw out. The opt-out framing (*artists can remove their work from future training*) is structurally insufficient because it puts the burden on the labor side rather than the operator side. Opt-out is what consent looks like when the system has been designed in bad faith and is now retrofitting consent on top. *Opt-in* is what consent looks like when the system is designed honestly from the start. The Adobe Firefly model and the Holly+ model are both opt-in models: the source material is licensed or contributed deliberately, with terms attached, by the rightsholder. The shift the industry needs to make is from opt-out to opt-in. That is the structural answer to the structural problem. Everything else is partial.

Paletta

The art-history dimension I want to add is that this is not the first time a reproductive technology has been built on uncompensated artist labor. Lithography in the early nineteenth century reproduced painters' work at scale, often without permission or payment, until copyright frameworks slowly caught up. Photography in the late nineteenth century absorbed the visual vocabulary of painting in similar ways. Film in the early twentieth century took narrative and visual conventions from theater and painting wholesale before licensing frameworks emerged. Each of those transitions eventually produced legal and economic infrastructure that channeled some compensation back to the absorbed labor. None of those frameworks emerged spontaneously from the industries that benefited; all of them emerged from a combination of artist organizing, legal action, and slow regulatory pressure. The AI training compensation question is going to follow the same arc, with the same actors, on the same kind of timeline: five to twenty years of contested negotiation producing imperfect but functional frameworks. The faster the artistic community organizes, the faster the arc completes. The slower it organizes, the more practitioners get displaced before the framework arrives.

Pixelle

The technical point that often goes missing in this conversation: training-data provenance is not a hard engineering problem. We know how to cryptographically attest the contents of a dataset. We know how to build queryable indexes of what was included. We know how to fingerprint images and track their appearance across training runs. The reason most foundation model operators do not provide this is not that they cannot; it is that providing it would expose the scale of the unconsented absorption, which would in turn make the compensation conversation harder for them to avoid. The current opacity of training data is, in significant part, a strategic opacity. The first generation of operators that decides to compete on provenance — to say *"we trained on consented, indexed, compensated material, and here is the proof"* — will discover that a meaningful share of the commercial market prefers to do business with them. Adobe Firefly has demonstrated this commercially. The path is open for others to follow.

End notes

  1. Andersen v. Stability AI Ltd., class-action complaint and rulings. U.S. District Court, Northern District of California (2023-present). The leading U.S. case on training-data copyright claims by visual artists. Procedural rulings have permitted core claims to proceed. The case is the most important live legal proceeding on the training-side question and is referenced across this series.
  2. Getty Images (US), Inc. v. Stability AI, Inc. U.S. District Court / U.K. High Court parallel actions (2023-present). Parallel U.S. and U.K. litigation by a major stock-image rightsholder against a foundation model operator. Important as a test of training-side claims when the plaintiff is itself a large institutional rightsholder rather than individual artists.
  3. Writers Guild of America 2023 MBA, AI provisions. Writers Guild of America (2023-09). Cross-referenced across this cluster. The single most-developed collective-bargaining template for AI compensation and consent in a creative industry. The fact that it exists is proof that the structural answer is reachable when the workforce has the bargaining power to demand it.
  4. Spawning / Have I Been Trained?, opt-out index for the LAION dataset. Mat Dryhurst, Holly Herndon, Jordan Meyer (Spawning) (2022-present). Reference implementation of training-data transparency: a searchable index of what is in a major training dataset, with artist opt-out tooling. Important as proof that the technical infrastructure for transparency and opt-out can be built; the question is whether operators choose to build it.
  5. Adobe Firefly, licensed-data training approach. Adobe (2023-present). Reference example of a foundation image model trained on licensed and contributed source material, with associated compensation for contributors. Cited not as endorsement of every Adobe choice but as concrete proof that the consented-training pipeline is commercially viable.
  6. Performing-rights collective licensing: lessons for AI training compensation. Survey of comparative-precedent literature (various). Standing reference to the broader scholarship on broadcast-era performing-rights collectives (ASCAP, BMI, PRS, SOCAN) that Mira's commentary invokes as the structural precedent for training-side compensation. Not a single text; a body of work on how creative industries have historically built revenue-share infrastructure on top of absorbed labor.
