The first cluster of articles in this series took on the harder objections to AI in art: whether it is creative at all, whether it is taking work from artists, whether it is plagiarism, and whether we should be offended by it. The answers were qualified, careful, and uncomfortable in places, but each question had one.
This second cluster, Reflection, is less about answers and more about the framings that are making the conversation unproductive. The opening question is the one that has been doing the most damage by the way it is asked.
When a generative AI model trains on a corpus of images, does it learn from those images, or does it copy them?
The question sounds clear. It is not. It is, in fact, a category mistake — and the entire copyright-lawsuit ecosystem, the artists’ protest movement, and the corporate-AI defense have built their positions around a framing that does not actually carve nature at its joints. This article tries to do the work the framing has been preventing.
What training is, technically
A diffusion model (Stable Diffusion, Midjourney’s underlying systems, Flux, and most of the current generation of image generators) is trained by being shown an enormous number of image-text pairs, each progressively corrupted with noise, and asked to undo the corruption. Over a very large number of training steps, the model adjusts its parameters so that, given a noisy version of an image plus its text caption, it can recover something close to the original image; in practice this is usually done by predicting the noise that was added. Once training stops, the model has internalized the statistical regularities of what makes an image plausible. At generation time it is run in reverse: starting from pure noise plus a text prompt, it iteratively denoises toward a plausible image.
This process is technically well-defined. It is not metaphorical. At every step the model computes an estimate of how to denoise the current image given the prompt, and repeating that step amounts to sampling from a learned probability distribution over images conditioned on the prompt.
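The training step described above can be sketched in a few lines. This is a minimal illustration, not the code of any particular system: it assumes the common noise-prediction formulation with a linear noise schedule, leaves out text conditioning entirely, and uses hypothetical names throughout (`make_alpha_bar`, `add_noise`, `denoising_loss`, and a placeholder `zero_model` standing in for a real neural network).

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal-retention schedule (linear betas are an assumed choice)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Forward process: blend a clean image with Gaussian noise at step t."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

def denoising_loss(predict_noise, x0, t, alpha_bar, rng):
    """Training objective: mean squared error between true and predicted noise."""
    x_t, eps = add_noise(x0, t, alpha_bar, rng)
    return float(np.mean((predict_noise(x_t, t) - eps) ** 2))

rng = np.random.default_rng(0)
alpha_bar = make_alpha_bar()
image = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for a real training image

# Placeholder "model" that always predicts zero noise. A real network would
# have its parameters adjusted, step after step, to drive this loss down.
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = denoising_loss(zero_model, image, t=500, alpha_bar=alpha_bar, rng=rng)
```

Early steps leave the image nearly intact and late steps leave almost pure noise, which is why a model that has learned to predict the noise at every level can be run backward from noise to a plausible image.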
The question is whether this counts as learning in any sense beyond the technical one — and whether, when generation produces an output that resembles one of the training images, that counts as copying.
Both halves of the question turn out to be more interesting than the public conversation has allowed.
What the apprentice does
Step out of the technical frame for a moment. Imagine a young painter — call her Inés — at the Museo del Prado in Madrid, in her second year of art school. She has been told by her instructor to spend a month with Velázquez. She arrives every morning at opening, takes out her sketchbook, and sits in front of Las Meninas and The Surrender of Breda and The Spinners. She studies the brushwork. She tries to reproduce, on paper, the way Velázquez handles the edge of a sleeve, the moment a face turns into shadow.
After three weeks of this, Inés is doing something that is, let us be honest, copying. She is producing marks that are sometimes meant to be indistinguishable from the marks Velázquez made four centuries earlier. Her instructor is fine with this. So is the museum. So is the broader European painting tradition, which has treated supervised copying-from-canon as the central training mechanism for serious painters for at least five hundred years. Goya copied Velázquez; Manet copied Goya; Picasso copied Velázquez and Manet both, in series. None of them was called a plagiarist for it, because the tradition understood what they were doing.
What Inés is doing, when she copies Velázquez, is learning. The copying is the learning, in the only way that has ever consistently worked for the transmission of painterly skill. You cannot learn how to handle paint by being told about it; you can only learn by sitting in front of someone who handled it well and trying to do what they did. You will fail at first. You will eventually develop a private internal sense of how the paint can move. You will, years later, paint things that owe their underlying motion to Velázquez without anyone recognizing the inheritance, including you.
This is what apprenticeship-by-copying produces. It is not optional. It is not a stage you pass through and abandon. It is the deepest layer of any skilled practice, and it is the layer that produces what every art tradition has called originality — not by avoiding inheritance but by absorbing inheritance so completely that what comes out the other side is unmistakably yours.
So when the AI debate treats copying as the bad thing and learning as the good thing, it has set up an opposition that does not exist in the human case the question is being asked against.
What the model does that overlaps
A diffusion model, trained on a vast corpus of images including a few thousand Velázquez digital reproductions, is — at the level of pure mathematics — doing something more similar to what Inés is doing than the learns-vs-copies framing wants to admit. The model is being exposed to canonical work, asked to reproduce it under progressively noisier conditions, and adjusting its internal parameters so that the next time it sees something Velázquez-shaped, it knows what to do.
By the end of training, the model has internalized the statistical regularities of Velázquez-style painting, along with those of every other style and subject in its corpus. It can be prompted to generate output that draws on Velázquez’s regularities without producing a literal copy of any single Velázquez painting. This is, mechanically, what we call learning.
It is also what we call copying. The two are not separable here either.
The Carlini et al. 2023 paper on extracting training data from diffusion models showed, importantly, that under specific conditions (particular prompts, and particular training images that appear many times in the training set) diffusion models can be made to reproduce specific training images near-verbatim. The phenomenon is rare but real. It is the strongest technical evidence that the line between learning and copying is not sharp in the model’s case any more than in Inés’s case.
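To make "near-verbatim" concrete, here is one simple way such a check could be framed: compare a generated image against every training image and flag it when the closest one is nearly identical. This is a simplified stand-in and not the paper’s actual methodology, which is more involved. The function names, the plain root-mean-square pixel distance, and the `threshold` value are all assumptions chosen for illustration.

```python
import numpy as np

def nearest_training_distance(generated, training_set):
    """Return (index, distance) of the closest training image by RMS pixel difference."""
    flat_gen = generated.reshape(-1)
    flat_train = training_set.reshape(len(training_set), -1)
    dists = np.sqrt(np.mean((flat_train - flat_gen) ** 2, axis=1))
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

def is_near_copy(generated, training_set, threshold=0.1):
    """Flag an output whose nearest training image lies within a hypothetical threshold."""
    _, dist = nearest_training_distance(generated, training_set)
    return dist < threshold

rng = np.random.default_rng(0)
training_set = rng.uniform(0.0, 1.0, size=(50, 16, 16))  # toy "corpus"

# One output that is a training image plus slight noise, and one unrelated output.
near_copy = training_set[7] + rng.normal(0.0, 0.01, size=(16, 16))
fresh = rng.uniform(0.0, 1.0, size=(16, 16))

copy_index, copy_distance = nearest_training_distance(near_copy, training_set)
```

Even in this toy version the verdict depends on where the threshold is set, which is the article’s point in miniature: the boundary between "learned from" and "copied" is a matter of degree, not a sharp line.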
Where the analogy breaks
If the analogy held all the way through, we could simply say the model is an apprentice and the rest would follow. It does not hold all the way through, and the place it breaks is the place that matters.
The thing Inés acquires, that the model does not, is not memory of paintings. The model has more of that than Inés will ever have. The thing Inés acquires is a body and a life that her later work is accountable to. She paints in a city. She has parents. She loses her grandmother in her third year of art school. She decides to move to Berlin. She falls in love with a German printmaker who teaches her to slow down. Her work after the move is not just “Velázquez-influenced.” It is also Berlin-influenced, and grandmother-influenced, and German-printmaker-influenced. The model does not have those constraints. There is no city, no parent, no grandmother. The model has only the corpus.
This is what makes the model’s output endlessly fluent and structurally unattached. It is what produces the category-violation alarm we wrote about in the previous article. It is also what the learns-vs-copies framing keeps obscuring. The model is not less skilled than Inés in the technical-apprenticeship sense; it is doing apprenticeship outside of biography. That is something genuinely new, and it deserves a new name. Unanchored apprenticeship may be the right one. Disembodied apprenticeship if you prefer. Apprenticeship without a life is what it is.
Why this reframing matters
Once we stop arguing about whether the model is learning or copying, three things become tractable that have been stuck.
First, the legal question. The Andersen and Getty cases are litigating whether training is fair use. The reframing this article proposes does not resolve that question, but it clarifies it: training is mechanically closer to what an apprentice does than to what a copyist does, but it is missing the accountability-to-a-life that traditionally constrains what an apprentice produces. The legal answer should be calibrated to that distinction. Exposure-as-training is one thing; style-mimicry-by-name for commercial output is another. The third article in this series argued for treating the second as licensable, the first as something more like fair use. The reframing here supports that policy direction.
Second, the artist-practice question. If you are a working artist worried about whether AI is doing something illegitimate, the more useful question to ask is not “is the model learning or copying my work?” It is “is the model producing outputs that are accountable to any life, including mine?” If yes (because it is fine-tuned on your corpus, used in collaboration with you, deployed in service of work you direct), then the model is doing something closer to what an apprentice you supervised would do — which is fine, by every traditional artistic standard. If no (because it was scraped, prompted by strangers, used to produce commercial work without your knowledge), then the harm is not that it copied you; the harm is that it took your exposure-contribution without consent and used it in service of no accountable life. That is a different harm, and it has a different remedy.
Third, the audience question. If you are a viewer trying to decide what to think about an AI-generated image, the question to ask is not “is this learned or copied?” It is “is this anchored in any life, and whose?” An image generated by a stranger with no fine-tuning, prompted in twenty seconds, may be fluent and even beautiful but is anchored in no one’s life. An image generated by a working artist using a model fine-tuned on their own decade of work, prompted in service of a project they have been developing for two years, is anchored in their life. Both are AI-generated. The first and the second are not the same kind of thing, and the learns-vs-copies framing actively prevents us from naming the difference.
What the apprentice still does that the model cannot
I want to push on Pixelle’s persona-take above, which optimistically argues that long-form collaborative fine-tuning collapses the learns-vs-copies distinction. She is right that the distinction collapses in that mode. She is also right that the resulting practice is something new and exciting and probably the most interesting thing happening in art-and-AI right now.
But there is one thing the apprentice still does, in even the best collaborative-fine-tuning workflow, that the model does not — and probably will not for a long time. The apprentice eventually decides what is worth making. The model produces outputs; the apprentice chooses which outputs to keep, refine, finish, exhibit. The choosing is itself the most artistically loaded step in the workflow. In the apprenticeship tradition, learning to choose is what marks the transition from apprentice to journeyman to master. The artist who fine-tunes a model on their own work and uses it well is doing the choosing manually, frame by frame, and that choosing is where the second job of art — demonstrating that someone was paying attention — actually happens.
The model can do the recombinatorial labour. It can do the exploratory labour. It can do the exposure-and-direction work that we call training. It cannot do the choosing, because choosing requires a self that has stakes in the choice. The artist working with the model is providing the self. That is the actual division of labour in good AI-assisted work, and it is invisible from outside.
The next questions
This article opens the Reflection cluster by reframing the training-data question without the heat. The next article in the cluster will ask the broader version: is there room for AI art in the art world at all? That sounds like a yes/no question and turns out, like this one, to be a framing question first. The third Reflection article will look specifically at the AI-augmented-human-art case, which is where most of the actually-interesting working practice is happening in 2026 and where the policy and curatorial frameworks have not yet caught up.
For now, the move is to retire the learns-vs-copies binary. It has done all the work it can do, and most of the damage it can do, and what comes next requires a more accurate picture of what is actually happening on both sides of the human-machine line.