A new generation of artificial intelligence (AI) models can produce “creative” images on demand based on a text prompt. The likes of Imagen, MidJourney, and DALL-E 2 are beginning to change the way creative content is made, with implications for copyright and intellectual property.
While the output of these models is often striking, it’s hard to know exactly how they produce their results. Last week researchers in the US made the intriguing claim that the DALL-E 2 model might have invented its own secret language to talk about objects.
DALLE-2 has a secret language.
“Apoploe vesrreaitais” means birds.
“Contarra ccetnxniams luryca tanniounons” means bugs or pests.
The prompt: “Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons” gives images of birds eating bugs.
A thread (1/n)🧵 pic.twitter.com/VzWfsCFnZo
— Giannis Daras (@giannis_daras) May 31, 2022
By prompting DALL-E 2 to create images containing text captions, then feeding the resulting (gibberish) captions back into the system, the researchers concluded that DALL-E 2 thinks “Vicootes” means “vegetables”, while “Wa ch zod rea” refers to “sea creatures that a whale might eat”.
These claims are fascinating and, if true, could have important security and interpretability implications for this kind of large AI model. So what exactly is going on?