A new generation of artificial intelligence (AI) models can produce “creative” images on demand based on a text prompt. The likes of Imagen, MidJourney, and DALL-E 2 are beginning to change the way creative content is made, with implications for copyright and intellectual property.
While the output of these models is often striking, it’s hard to know exactly how they produce their results. Last week researchers in the US made the intriguing claim that the DALL-E 2 model might have invented its own secret language to talk about objects.
DALLE-2 has a secret language.
“Apoploe vesrreaitais” means birds.
“Contarra ccetnxniams luryca tanniounons” means bugs or pests.
The prompt: “Apoploe vesrreaitais eating Contarra ccetnxniams luryca tanniounons” gives images of birds eating bugs.
A thread (1/n)🧵 pic.twitter.com/VzWfsCFnZo
— Giannis Daras (@giannis_daras) May 31, 2022
By prompting DALL-E 2 to create images containing text captions, then feeding the resulting (gibberish) captions back into the system, the researchers concluded DALL-E 2 thinks Vicootes means “vegetables”, while Wa ch zod rea refers to “sea creatures that a whale might eat”.
These claims are fascinating, and if true, could have important security and interpretability implications for this kind of large AI model. So what exactly is going on?
Does DALL-E 2 have a secret language?
DALL-E 2 probably does not have a “secret language”. It might be more accurate to say it has its own vocabulary – but even then we can’t know for sure.
First of all, at this stage it’s very hard to verify any claims about DALL-E 2 and other large AI models, because only a handful of researchers and creative practitioners have access to them. Any images that are publicly shared (on Twitter for example) should be taken with a fairly large grain of salt, because they have been “cherry-picked” by a human from among many output images generated by the AI.
Even those with access can only use these models in limited ways. For example, DALL-E 2 users can generate or modify images, but can’t (yet) interact with the AI system more deeply, for instance by modifying the behind-the-scenes code. This means “explainable AI” methods for understanding how these systems work can’t be applied, and systematically investigating their behaviour is challenging.
What’s going on then?
One possibility is that the “gibberish” phrases are related to words from non-English languages. For instance, Apoploe, which seems to create images of birds, is similar to the Latin Apodidae, which is the scientific name of a family of birds (the swifts).
This seems like a plausible explanation. After all, DALL-E 2 was trained on a very wide variety of data scraped from the internet, which included many non-English words.
Similar things have happened before: large natural language AI models have coincidentally learned to write computer code without deliberate training.
Is it all about the tokens?
One point in support of this theory is that AI language models don’t read text the way you and I do. Instead, they break input text up into “tokens” before processing it.
Different “tokenization” approaches have different results. Treating each word as a token seems like an intuitive approach, but causes trouble when identical tokens have different meanings (like how “match” means different things when you’re playing tennis and when you’re starting a fire).
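To make the word-level problem concrete, here is a minimal Python sketch of a toy tokenizer that gives every distinct word its own ID. It is an illustration only, not how DALL-E 2 tokenizes text; the function names and the two example sentences are invented for this sketch.

```python
def build_word_vocab(sentences):
    """Give each distinct lowercase word an integer ID, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def word_tokenize(sentence, vocab):
    """Turn a sentence into a list of word IDs."""
    return [vocab[word] for word in sentence.lower().split()]


# Hypothetical example sentences: "match" has a different meaning in each.
sentences = ["she won the tennis match", "he lit the match"]
vocab = build_word_vocab(sentences)

tennis_ids = word_tokenize(sentences[0], vocab)
fire_ids = word_tokenize(sentences[1], vocab)

print(tennis_ids)  # the last ID stands for "match"
print(fire_ids)    # the last ID is the same, despite the different meaning
```

Both sentences end in the same token ID, so a model working only from these IDs has to rely on the surrounding tokens to tell a tennis match from a match that starts a fire.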
On the other hand, treating each character as a token produces a smaller number of possible tokens, but each one conveys much less meaningful information.
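The trade-off can be seen in a small Python sketch that tokenizes the same text by word and by character. The corpus and helper names are made up for illustration, and the sketch assumes lowercase English text containing only letters and spaces.

```python
import string

# Hypothetical mini-corpus, lowercase letters and spaces only.
corpus = [
    "birds eating bugs",
    "a whale eating sea creatures",
    "vegetables on a plate",
]

# Under the assumption above, at most 27 character tokens can ever occur.
ALLOWED = set(string.ascii_lowercase + " ")


def char_tokenize(text):
    """One token per character, spaces included."""
    return list(text.lower())


def word_tokenize(text):
    """One token per whitespace-separated word."""
    return text.lower().split()


char_vocab = set()
word_vocab = set()
for sentence in corpus:
    char_vocab.update(char_tokenize(sentence))
    word_vocab.update(word_tokenize(sentence))

sample = corpus[0]
print(len(word_tokenize(sample)), "word tokens vs",
      len(char_tokenize(sample)), "character tokens")
print(len(word_vocab), "distinct words,", len(char_vocab), "distinct characters")
```

However much text is added, the character vocabulary here cannot grow beyond 27 symbols, while the word vocabulary gains an entry for every new word. The price is a much longer token sequence for the same sentence, with each token carrying little meaning on its own.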