Faced with rampant data exploitation by some artificial intelligence (AI) developers, artists are deliberately booby-trapping their creations to make them unusable, with the help of university researchers. Paloma McClain is an American illustrator. Several generative AI programs can already produce images in her style, even though the artist never gave her consent and will receive no financial benefit. “It bothered me,” explains the illustrator, based in Houston, Texas. “I’m not a well-known artist, but I was uncomfortable with the idea of my work being used to train” an AI model. To counter this, she ran her works through Glaze, a program that makes pixel-level changes to her illustrations, invisible to the human eye, in order to disrupt AI training.
After this processing, images generated by an AI trained on her work come out blurry, with scrambled faces, bearing no resemblance to the originals. “We’re trying to provide the technological tools to protect human creators from the abuse of generative AI models,” says Ben Zhao, a researcher at the University of Chicago whose team created Glaze.
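Glaze’s actual perturbations are computed adversarially against the feature extractors used by image models, and the article does not detail that method; plain random noise would not give the same protection. Purely as a hypothetical sketch of the general idea, a small, bounded pixel change that is invisible at normal viewing distance, a toy version might look like this (the file names and the EPSILON bound are illustrative assumptions):

```python
# Toy illustration only: Glaze computes style-targeted adversarial perturbations,
# whereas this sketch merely shows the general idea of a small, bounded pixel
# change that is hard to see. File names and EPSILON are assumptions.
import numpy as np
from PIL import Image

EPSILON = 4  # maximum change per colour channel (out of 255)

def cloak(path_in: str, path_out: str, seed: int = 0) -> None:
    """Add a bounded pseudo-random perturbation to every pixel of an image."""
    rng = np.random.default_rng(seed)
    img = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.int16)
    noise = rng.integers(-EPSILON, EPSILON + 1, size=img.shape, dtype=np.int16)
    cloaked = np.clip(img + noise, 0, 255).astype(np.uint8)
    Image.fromarray(cloaked).save(path_out)

# cloak("illustration.png", "illustration_cloaked.png")
```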
Alerted in November 2022, the computer science professor developed the software in barely four months, building on earlier work designed to disrupt facial recognition. “We worked at full speed because we knew the problem was serious,” says Ben Zhao. “A lot of people were suffering.”
Generative AI giants have struck agreements to secure the rights to use certain content, but the vast majority of the data, images, text and sound used to develop their models was collected without explicit consent. Since its launch, Glaze has been downloaded more than 1.6 million times, according to the researcher, whose team is preparing to release a new program called Nightshade.
Nightshade targets the plain-language queries (prompts) that users submit to a generative AI model to obtain a new image. Its goal is to derail the model so that it returns, for example, an image of a cat when a dog was requested.
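Nightshade’s real poisoning samples are optimized so that the image still looks like its caption to a human while shifting the model’s internal association toward another concept; the exact technique is not described in the article. As a deliberately naive illustration of concept-level poisoning in a scraped caption/image dataset (the TrainingPair type and all file names are hypothetical):

```python
# Naive illustration of data poisoning only: real Nightshade images remain
# visually consistent with their captions, unlike this crude swap.
from dataclasses import dataclass

@dataclass
class TrainingPair:
    caption: str     # the text a scraper associates with the image
    image_path: str  # path to the scraped image file

def poison(pairs: list[TrainingPair], target: str, decoy_image: str) -> list[TrainingPair]:
    """Point every pair whose caption mentions `target` at a decoy image."""
    return [TrainingPair(p.caption, decoy_image) if target in p.caption.lower() else p
            for p in pairs]

# A model trained on the poisoned pairs starts associating "dog" with cat
# pictures, so a prompt asking for a dog can come back as a cat.
dataset = [TrainingPair("a dog in the park", "dog_001.jpg"),
           TrainingPair("a red bicycle", "bike_007.jpg")]
poisoned = poison(dataset, target="dog", decoy_image="cat_042.jpg")
```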
Another initiative comes from the start-up Spawning, which has developed Kudurru, software that detects mass-scraping attempts on image platforms. The artist can then either block access to their work or send back an image other than the one requested, “which amounts to poisoning” the AI model under development and undermining its reliability, explains Jordan Meyer, co-founder of Spawning. More than a thousand websites have already joined the Kudurru network.
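Spawning has not published Kudurru’s detection logic here; a minimal, hypothetical sketch of the behaviour described above, flagging request bursts that look like bulk scraping and answering them with a decoy image, might look like this (the time window, threshold and file names are assumptions, not Kudurru’s actual parameters):

```python
# Hypothetical sketch of the scraper handling described in the article; the
# thresholds and decoy strategy are assumptions, not Kudurru's implementation.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 60
MAX_REQUESTS = 100         # more than this per window looks like bulk scraping
DECOY_IMAGE = "decoy.jpg"  # served instead of the artist's real file

_history: dict[str, deque] = defaultdict(deque)

def serve_image(client_ip: str, requested_file: str) -> str:
    """Return the real file for normal traffic, a decoy for suspected scrapers."""
    now = time.monotonic()
    hits = _history[client_ip]
    hits.append(now)
    while hits and now - hits[0] > WINDOW_SECONDS:
        hits.popleft()
    if len(hits) > MAX_REQUESTS:
        return DECOY_IMAGE  # "poison" the scraper's dataset
    return requested_file
```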
Spawning has also created Have I Been Trained? (haveibeentrained.com), a site that tracks whether images have been fed into an AI model and lets their owners block future unauthorized use. Beyond images, researchers at Washington University in St. Louis (Missouri) have turned to sound, developing AntiFake.
The software enriches a sound file with additional noise, imperceptible to the human ear, that makes a credible imitation of the human voice impossible, explains Zhiyuan Yu, a doctoral student behind the project. The program is aimed in particular at preventing “deepfakes”, hyperrealistic photo or video montages that use a person’s likeness, often a celebrity’s, to make them appear to do or say something.
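AntiFake’s perturbation is adversarially optimized against voice-synthesis models, and faint white noise alone would not defeat voice cloning. Purely as a hypothetical illustration of enriching an audio file with quiet additional noise (assuming 16-bit PCM WAV input; the amplitude and file names are made up):

```python
# Toy illustration only: AntiFake optimizes its perturbation against
# voice-cloning models, whereas this sketch just adds faint white noise.
# Assumes 16-bit PCM WAV input; amplitude and file names are illustrative.
import wave
import numpy as np

NOISE_AMPLITUDE = 30  # out of 32767 for 16-bit samples, i.e. very quiet

def perturb_wav(path_in: str, path_out: str, seed: int = 0) -> None:
    rng = np.random.default_rng(seed)
    with wave.open(path_in, "rb") as f:
        params = f.getparams()
        samples = np.frombuffer(f.readframes(params.nframes), dtype=np.int16)
    noise = rng.integers(-NOISE_AMPLITUDE, NOISE_AMPLITUDE + 1, size=samples.shape)
    noisy = np.clip(samples.astype(np.int32) + noise, -32768, 32767).astype(np.int16)
    with wave.open(path_out, "wb") as f:
        f.setparams(params)
        f.writeframes(noisy.tobytes())

# perturb_wav("voice.wav", "voice_protected.wav")
```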
The team, supervised by Professor Ning Zhang, was recently contacted by the producers of a successful podcast who wanted to protect it from misappropriation, according to Zhiyuan Yu. Although it has so far been used only for spoken language, AntiFake could also protect singers’ voices, the researcher believes; the software is freely accessible.
Ben Zhao’s unit has been approached by “several companies who want to use Nightshade to preserve their images and intellectual property,” says the Chicago academic, who is not opposed to even large companies using his program.
“The goal is for people to be able to protect their content, whether they are individual artists or companies with a lot of intellectual property,” argues Ben Zhao. For Spawning, the idea is not only to obstruct but also, in a second step, to “allow people to organize themselves to sell their data in exchange for payment,” says Jordan Meyer, who plans to launch such a platform in early 2024. “The best solution,” in his view, “would be a world in which all data used for AI is subject to consent and payment. We hope to push developers in this direction.”