Beyond Imitation

Crafting Original AI Art

By Mike Taylor
July 16, 2024

The first AI image generation model I got to play around with was Midjourney v2 in summer 2022. A month earlier, OpenAI had launched DALL-E 2 in beta, and the results looked unbelievably magical. You could generate images in any art style simply by prompting an AI with the name of an artist.

I didn’t go to art school, and I didn’t really know that much about art, so one of the first prompts I tried was “Super Mario drinking a glass of beer.” The resulting image wasn’t anything Nintendo’s IP lawyers would get out of bed for, but exactly two years later, the version generated by Midjourney v6 is pixel-perfect.

The media and online commentators have discussed the legal and ethical implications of training on copyrighted material, but those cases are in the hands of the courts and governments, who will need to unpick that thorny issue. Whatever happens with copyright law for training, there’s a common practice in prompt engineering today that I’m absolutely sure will be banned by all major tools one day soon: using the names of copyrighted IP in prompts. For example, if I try the same prompt in ChatGPT, it refuses.

After some clever work to trick ChatGPT into revealing its system prompt (the instructions given to it by OpenAI, in addition to your prompt), we can see it has been told not to create images in the style of artists within the last 100 years: “You can name artists, creative professionals, or studios in prompts only if their latest work was created prior to 1912 (e.g., Van Gogh, Goya).” Copyright lasts only so long before a work enters the public domain, and it’s safe to assume an artist’s work is no longer protected by copyright if they died over 100 years ago.


Source: https://x.com/bryced8/status/1710140618641653924

Be careful when using a living artist’s name

As a coauthor of Prompt Engineering for Generative AI, published by O’Reilly in June 2024, I have had this topic on my mind. In editing, we went through every example in the book that referenced a living artist and swapped it out for something in the public domain. This is a higher standard than most prompt engineers hold themselves to today, but I expect it will soon become the norm.

When you invoke the name of an artist or protected IP franchise in order to copy their style for commercial gain, it’s hard to argue that you’re not violating copyright. It’s one thing to have an AI that was influenced by an artist in training, and it’s quite another to intentionally prompt the AI to copy that artist’s style precisely. Consider the case of Greg Rutkowski, a favorite among early AI adopters. His name was invoked thousands of times by AI artists looking for a fantasy aesthetic. If the makers of Magic: The Gathering or Dungeons & Dragons decided to add “in the style of greg rutkowski” to their prompts instead of hiring him for their next set of illustrations, he would have a clear claim for loss of income.

Source: https://thehustle.co/10-13-22-fantasy-artist

There has been growing awareness around this issue, with tools like Stable Diffusion providing opt-out mechanisms for artists who don’t want their works included. Newer AI tools have been more savvy about restricting what can go into a prompt: for example, Suno doesn’t allow you to reference the name of a band or musician. Instead, to make a Taylor Swift-style song for my four-year-old daughter, I had to prompt for “Contemporary country pop with elements of indie rock and a female singer.”

Unbundling and remixing the style of an artist

If using artists’ names in prompts is illegal or at least unethical, what’s the alternative? It may be time to go to art school! Rather than AI eliminating the artist’s role, I suspect artists who adopt AI will do far better than AI specialists like myself who don’t know art. For example, I recently listened to Isaacson’s biography of Da Vinci and learned about the technique of sfumato, the subtle blending of colors and tones. Now that I know that word, I can add it to my prompts when I’m trying to create depth and realistic human expressions. An actual artist would have known that already, as well as many other techniques and when it’s appropriate to use them.

If you read further down in ChatGPT’s system prompt, it describes a useful technique anyone can use to avoid ripping off an artist’s style:

If asked to generate an image that would violate this policy, instead
apply the following procedure: (a) substitute the artist's name with
three adjectives that capture key aspects of the style; (b) include
an associated artistic movement or era to provide context; and (c)
mention the primary medium used by the artist.

This is very close to a technique I use every day called unbundling, a term coined by Bakz T. Future: you ask ChatGPT to describe an artist’s style and use that description in your prompt instead of the artist’s name. This technique leads to more creative and original output because a list of stylistic elements leaves room for interpretation, rather than constraining the output to a specific artist.

Source: https://bakztfuture.substack.com/p/dall-e-2-unbundling
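The three-part substitution quoted from the system prompt (three adjectives, a movement or era, and a primary medium) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `UnbundledStyle` class, the `build_prompt` function, and the sample descriptors are hypothetical names and values invented for this example, not part of any tool's API or of the quoted system prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnbundledStyle:
    """A style described by its traits instead of an artist's name (hypothetical helper)."""
    adjectives: tuple[str, str, str]  # (a) three adjectives capturing the style
    movement: str                     # (b) associated artistic movement or era
    medium: str                       # (c) primary medium


def build_prompt(subject: str, style: UnbundledStyle) -> str:
    """Combine a subject with a name-free style description."""
    traits = ", ".join(style.adjectives)
    return f"{subject}, {traits}, {style.movement}, {style.medium}"


# Illustrative values only, chosen by hand for this example.
swirling_night = UnbundledStyle(
    adjectives=("swirling", "vivid", "expressive"),
    movement="Post-Impressionism",
    medium="oil on canvas",
)

prompt = build_prompt("a lighthouse on a cliff at night", swirling_night)
print(prompt)
```

Because the style lives in separate fields rather than in a single name, any one of them can be edited on its own before the prompt is built.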

The chances are that there are elements of the artist’s style that you don’t actually want to copy. When you have a description of an artist’s style, you can then more easily modify the description to get what you want. Perhaps you want red and yellow swirls instead of blue and green, or you want to see the sky in the daytime instead of at night. The more you deviate from Van Gogh’s original vision, the more the end result will be your own.

They say that to steal ideas from one person is plagiarism, but to steal from many is research. One surefire technique that I’ve found for increasing the originality of my prompts is to remix the styles of multiple artists. For example, you could merge the style of Van Gogh’s The Starry Night with elements of Salvador Dalí’s The Persistence of Memory.
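A remix of this kind can also be assembled programmatically. The sketch below assumes each style has already been unbundled into a list of descriptors; the `remix` function and both descriptor lists are hypothetical, written by hand for illustration rather than produced by any tool. It interleaves the descriptors so neither style dominates the start of the prompt, and drops duplicates.

```python
from itertools import zip_longest


def remix(*styles: list[str]) -> list[str]:
    """Interleave descriptors from several styles, dropping duplicates."""
    seen = set()
    merged = []
    for group in zip_longest(*styles):
        for descriptor in group:
            # zip_longest pads shorter lists with None; skip the padding.
            if descriptor is not None and descriptor.lower() not in seen:
                seen.add(descriptor.lower())
                merged.append(descriptor)
    return merged


# Hypothetical descriptors, written by hand for illustration.
starry = ["swirling brushstrokes", "thick impasto", "night sky", "oil on canvas"]
melting = ["melting clocks", "dreamlike desert landscape", "oil on canvas"]

descriptors = remix(starry, melting)
prompt = "a quiet harbor town, " + ", ".join(descriptors)
print(prompt)
```

Interleaving is a design choice, not a rule: many image models appear to weight earlier terms more heavily, so you may prefer to order the descriptors by hand.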

While using artists’ names in prompts is still allowed in most tools, it wouldn’t be too surprising if they’re banned in the near future. Even if the ethical considerations don’t motivate you, practical ones should. Getting good at unbundling and remixing now will put you at an advantage when this practice is one day banned from most major platforms. In the meantime, you get to benefit from more creative and interesting work, building more of a name for yourself in the industry. Steve Jobs may have said “great artists steal,” but T.S. Eliot, the original source of that quote, elaborates that you should “make it into something better, or at least something different.”

The same principle applies to text generation too

I don’t expect it to just be AI-generated images and music that will be affected. This will apply to text one day too. Role-play prompting is still an extremely common technique on the text-generation side, with people prompting an LLM to “Name this product in the style of Steve Jobs,” “Write a new scene for the TV show Friends,” or “Write this novel in the style of Hemingway.” It may be harder for LLM platforms to ban all writers and celebrities from prompts than it has been to do so with artists and musicians, but as AI progresses, this will be easier for them to do.

Despite the contribution from Meta’s Llama 3, there still isn’t a competitive open source model to rival GPT-4 like there is with Stable Diffusion XL in the image generation space. While OpenAI, Google, and Anthropic hold all the cards, your ability to use roleplay in your prompts is at risk of going away at any time. When that happens, you don’t want to suddenly have to rewrite all of your prompt templates to stop them failing! Having an unbundled and remixed style in your prompt instead of invoking a famous name makes your prompt future-proof, and maybe one day your lawyers will thank you.
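One way to prepare is to audit existing prompt templates for famous names before a platform forces the issue. The following is a minimal sketch, assuming a hand-maintained blocklist: `NAMED_REFERENCES`, `find_named_references`, and the sample templates are all hypothetical, and a real list would depend on each platform's rules.

```python
import re

# Hypothetical blocklist; a real one would depend on the platform's rules.
NAMED_REFERENCES = ("Hemingway", "Steve Jobs")


def find_named_references(
    template: str, names: tuple[str, ...] = NAMED_REFERENCES
) -> list[str]:
    """Return the blocklisted names that appear in a prompt template."""
    found = []
    for name in names:
        # Word boundaries avoid matching a name inside a longer word.
        if re.search(rf"\b{re.escape(name)}\b", template):
            found.append(name)
    return found


# Illustrative templates: one role-play prompt and one unbundled rewrite.
templates = {
    "product_name": "Name this product in the style of Steve Jobs: {product}",
    "product_name_unbundled": (
        "Name this product in a minimalist, confident, plain-spoken voice: {product}"
    ),
}

flagged = {key: find_named_references(text) for key, text in templates.items()}
print(flagged)
```

Templates that come back with an empty list are the ones that would keep working if named references were blocked.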
