AI and AV: Further Away Than You Think

According to new research from MIT and IBM, computer vision algorithms have to get a lot better before they’re applied to real-world installations.

AV integrators love to throw around fun buzzwords that could one day become mainstream in the industry but haven’t quite become a must-have in new commercial projects.

If you’ve been to an AV conference, you’ve heard about how 5G, IoT and AI will become a big part of what the industry does as data collection will help inform these machines how they should interact with end users.

However, there’s a very good reason why AI isn’t quite mainstream in pro AV: it’s not ready, especially when it comes to computer vision and object recognition.

Researchers at Massachusetts Institute of Technology and IBM created a new object-recognition dataset called ObjectNet, which features photos taken by paid freelancers rather than photos from Flickr and social media platforms. The name is a play on ImageNet, the crowdsourced database of photos responsible for a lot of the progress in AI.

Rather than carefully staged photos, the photos analyzed in ObjectNet showed objects tipped on their sides, at odd angles and in cluttered rooms.

Leading object-detection models were tested on ObjectNet, and their accuracy fell to 50-55%, from a high of 97% on ImageNet, according to the researchers.

“We created this dataset to tell people the object-recognition problem continues to be a hard problem,” says Boris Katz, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and Center for Brains, Minds and Machines (CBMM). “We need better, smarter algorithms.”

In a controlled research setting, a deep learning system can learn to pick out a chair in a photo after training on thousands of examples, but even very large datasets can’t show each object in every possible orientation and setting.

“People feed these detectors huge amounts of data, but there are diminishing returns,” Katz said.

“You can’t view an object from every angle and in every context. Our hope is that this new dataset will result in robust computer vision without surprising failures in the real world.”

When AI is applied to real life, that can be a problem, especially in low-income areas.

According to VentureBeat, the research builds on a Facebook AI study from earlier this year that found computer vision for recognizing household objects works better in high-income households.

This obviously has implications in the real world, especially when it comes to smart buildings and smart cities.

We in the AV world like to talk about AI, but it’s clearly still a few years away.