In a perfect world, your smartphone would automatically tag whatever it sees through the camera's field of view. This could be helpful for Google Glass, facial recognition systems, robotic cars and more.
Big, powerful computers can already do this with a technique called deep learning, which uses layers of neural networks that mimic how the human brain processes information. A Purdue University researcher is working to bring it to smartphones and mobile devices.
Eugenio Culurciello says, "When you give vision to machines, the sky's the limit." Culurciello is an associate professor in Purdue's Weldon School of Biomedical Engineering and the Department of Psychological Sciences.
To give you an idea of how complicated this is: two years ago, Google set out to have computers find cat videos on YouTube. It took 16,000 computer processors to do it. Slate's technology reporter Will Oremus explains it this way:
When an untutored computer looks at an image, all it sees are thousands of pixels of various colors. With practice and supervision, it can be trained to home in on certain features—say, those that tend to indicate the presence of a human face in a photo—and reliably identify them when they appear. But such training typically requires images that are labeled, so that the computer can tell whether it guessed right or wrong and refine its concept of a human face accordingly. That’s called supervised learning.
The problem is that most data in the real world doesn’t come in such neat categories. So in this study, the YouTube stills were unlabeled, and the computers weren’t told what they were supposed to be looking for. They had to teach themselves what parts of any given photo might be relevant based solely on patterns in the data. That’s called unsupervised learning.
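The distinction Oremus draws can be sketched in a few lines of Python using NumPy. This is an illustrative toy, not Google's system: a nearest-centroid classifier learns "face" vs. "non-face" from labeled points, while a few iterations of k-means find the same two groups in the identical data with no labels at all. The cluster positions and class names here are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated 2-D clusters standing in for image features:
# "faces" near (0, 0) and "non-faces" near (5, 5).
faces = rng.normal(loc=0.0, scale=0.5, size=(50, 2))
non_faces = rng.normal(loc=5.0, scale=0.5, size=(50, 2))
X = np.vstack([faces, non_faces])
y = np.array([0] * 50 + [1] * 50)  # labels: 0 = face, 1 = non-face

def classify(points, centroids):
    """Assign each point to its nearest centroid."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

# --- Supervised learning: the labels guide the model ---
# Compute each class's mean directly from the labeled examples.
supervised_centroids = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
train_accuracy = (classify(X, supervised_centroids) == y).mean()

# --- Unsupervised learning: no labels, structure comes from the data ---
# K-means with k=2, initialized from two data points (labels never used).
centroids = X[[0, -1]].copy()
for _ in range(10):
    assignments = classify(X, centroids)
    centroids = np.array([X[assignments == c].mean(axis=0) for c in (0, 1)])
```

Both approaches end up drawing the same boundary here, but only the supervised version can name the groups; the unsupervised one merely discovers that two groups exist.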
Culurciello and his team have developed software and hardware that allow a conventional smartphone processor to run deep-learning software. Culurciello likens it to the movie Her, in which the artificially intelligent Samantha sees the world as Joaquin Phoenix's character does.
He says, "Now we have an approach for potentially embedding this capability onto mobile devices, which could enable these devices to analyze videos or pictures the way you do now over the Internet." The research findings were presented at the Neural Information Processing Systems conference in December.