The vast majority of computer vision research leads to technology that surveils human beings, a new preprint study that analyzed more than 20,000 computer vision papers and 11,000 patents spanning three decades has found. Crucially, the study found that computer vision papers often refer to human beings as “objects,” a convention that both obfuscates how common surveillance of humans is in the field, and objectifies humans by definition.
“The studies presented in this paper ultimately reveal that the field of computer vision is not merely a neutral pursuit of knowledge; it is a foundational layer for a paradigm of surveillance,” the study’s authors wrote. The study, which has not been peer-reviewed yet, describes what the researchers call “The Surveillance AI Pipeline,” which is also the title of the paper.
The study’s lead author, Pratyusha Ria Kalluri, told 404 Media on a call that she and her co-authors manually annotated 100 computer vision papers and 100 patents that cited those papers. Through this process, the authors found that 90 percent of the papers and patents extracted data about humans, and 68 percent reported that they specifically enable extracting data about human bodies and body parts. Only 1 percent of the papers and patents stated they target only non-humans.
To analyze the rest of the papers and patents spanning three decades, the study used an automated system that scanned the documents for a lexicon of surveillance words that the study’s authors came up with. This analysis found not only that the majority of computer vision research enables technology and leads to patents that surveil humans, but that this has accelerated over the years. “Comparing the 1990s to the 2010s, the number of papers with downstream surveillance patents increased more than five-fold,” the study’s authors wrote.
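For readers curious what a lexicon scan of this kind might look like, here is a minimal Python sketch. It is not the study's actual system: the word list, the function names, and the per-decade tally are all assumptions made for illustration, since the article does not describe the authors' lexicon or implementation in detail.

```python
import re
from collections import defaultdict

# Hypothetical lexicon for illustration only; the study's real word list is not given here.
SURVEILLANCE_LEXICON = ["surveillance", "monitoring", "tracking", "facial recognition"]

# One case-insensitive pattern with word boundaries, so "tracking" does not match "backtracking".
LEXICON_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(term) for term in SURVEILLANCE_LEXICON) + r")\b",
    re.IGNORECASE,
)


def matched_terms(text):
    """Return the sorted, lowercased set of lexicon terms found in the text."""
    return sorted({m.group(0).lower() for m in LEXICON_PATTERN.finditer(text)})


def flagged_share_by_decade(documents):
    """Take (year, text) pairs and return {decade: fraction of documents with a lexicon hit}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for year, text in documents:
        decade = (year // 10) * 10
        totals[decade] += 1
        if matched_terms(text):
            flagged[decade] += 1
    return {decade: flagged[decade] / totals[decade] for decade in sorted(totals)}
```

A simple keyword match like this can both overcount and undercount, which is presumably one reason the authors paired the automated scan with manual annotation of a smaller sample.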
“In addition to body part and facial recognition, these technologies were frequently aimed at mass analyzing datasets of humans in the midst of everyday movement and activity (shopping, walking down the street, sports events) for purposes such as security monitoring, people counting, action recognition, pedestrian detection, for instance in the context of automated cars, or unnamed purposes,” the authors wrote.
As co-author Luca Soldaini said on a call with 404 Media, even in the seemingly benign context of computer vision-enabled cameras on self-driving cars, which are ostensibly there to detect and prevent collisions with human beings, computer vision is often eventually used for surveillance.
“The way I see it is that even benign applications like that, because data that involves humans is collected by an automatic car, even if you’re doing this for object detection, you’re gonna have images of humans, of pedestrians, or people inside the car, in practice collecting data from folks without their consent,” Soldaini said.
Soldaini also pointed to instances in which this data was eventually used for surveillance, such as police requesting self-driving car footage as video evidence.