
AI • IOT • NETWORK EDGE

Multisensory AI: Reduce Downtime and Boost Efficiency

A male engineer wearing a yellow hardhat looks over equipment in a factory.

When you’re waiting by the side of the road for the tow truck, isn’t that always the moment when you realize you’ve neglected your 75,000-mile tune-up and safety check? The “check oil” light and low tire-pressure alert can avert dangerous situations, but you can still end up in that frustrating and time-consuming breakdown. Now scale up that inconvenience and lost productivity to the size of a factory, where nonfunctioning machinery can result in hugely expensive downtime.

That’s where predictive maintenance comes in. Machine learning can analyze patterns in normal workflow and detect anomalies in time to prevent costly shutdowns; but what happens with a new piece of equipment, where AI has no existing data to learn from? Can some of the attributes that make humans good—if inefficient—at dealing with novel situations be harnessed for machine-based inspections?

Rustom Kanga, Co-Founder and CEO of AI-based video analytics provider iOmniscient, has some answers to these and other questions about the future of predictive maintenance. He talks about the limitations of traditional machine learning for predictive maintenance; when existing infrastructure can be part of the prediction solution, and when it can’t; and what in the world an e-Nose is (Video 1).

Video 1. Rustom Kanga, CEO of iOmniscient, discusses the impact of multisensory and intuitive AI on predictive maintenance. (Source: insight.tech)

What are the limitations to traditional predictive maintenance approaches?

Today when people talk of artificial intelligence, they normally equate it to deep learning and machine learning technologies. For example, if you want the AI to detect a dog, you get 50,000 images of dogs and label them: “This is a dog. That is a dog. That is a dog. That is a dog.” And once you’ve trained your system, the next time a dog comes along, it will know that it is a dog. That’s how deep learning works.

But if you haven’t trained your system on some particular or unique type of dog, then it may not recognize that animal. Then you have to retrain the system. And this retraining goes on and on and on—it can be a forever-training.
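To make that label-and-train workflow concrete, here is a minimal sketch using an off-the-shelf image classifier. The folder layout, model choice, and training settings are illustrative assumptions for this example, not a description of any particular product.

```python
# Minimal sketch of the "label lots of images, then train" workflow described
# above. Folder paths, the model, and hyperparameters are assumptions.
import torch
from torch import nn
from torchvision import datasets, models, transforms

# Images are expected in labeled folders, e.g. data/train/dog, data/train/cat.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder("data/train", transform=preprocess)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

# Start from a pretrained backbone and replace the final layer for our labels.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(5):                      # many passes over many labeled images
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()

# A class the model never saw during training (a new breed, a new machine)
# is exactly the gap that forces the retraining loop Kanga describes.
```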

The challenge with maintenance systems is that when you install some new equipment, you don’t have any history of how that equipment will break down or when it will break down: You don’t have any data for doing your deep learning. And so you need to be able to predict what’s going to happen without that data.

So what we do is autonomous, multisensory, AI-based analytics. Autonomous means there’s usually no human involvement, or very little human involvement. Multisensory refers to the fact that humans use their eyes, their ears, their nose to understand their environment, and we do the same. We do video analysis, we do sound analysis, we do smell analysis; and with that we understand what’s happening in the environment.

How does a multisensory AI approach address some of the challenges you mentioned?

We have developed a capability called intuitive AI. Artificial intelligence is all about emulating human intelligence, and humans don’t just use their memory function—which is essentially the thing that deep learning attempts to replicate. Humans also use their logic function. They have deductive logic, inductive logic; they use intuition and creative capabilities to make decisions about how the world works. It’s very different from the way you’d expect a machine learning system to work.

“Multisensory refers to the fact that humans use their eyes, their ears, their nose to understand their environment, and we do the same” – Rustom Kanga, @iOmniscient1 via @insightdottech #AI

What we as a company do is we use our abilities as humans to advise the system on what to look for, and then we use our multisensory capabilities to look for those symptoms. For instance, if a conveyor belt has been installed and we want to know when it might break down, what would we look for to predict that it’s not working well? We might listen to its sound: when it starts going “clang, clang, clang,” something is wrong with it. So we use our ability to see the object, to hear it, to smell it to tell us how it’s operating at any given time and whether it’s showing any of the symptoms that we’d expect it to show when it’s about to break down.
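As a loose illustration of the “listen for the clang” idea, the sketch below flags an audio clip when short, loud transients repeatedly stand out against the machine’s normal running noise. The sample rate, frame size, and thresholds are hypothetical values chosen for the example, not iOmniscient’s actual analytics.

```python
# Hedged sketch: flag a "clanging" conveyor belt by watching for repeated
# energy spikes in an audio signal. All numbers here are illustrative.
import numpy as np

def clang_alert(samples: np.ndarray, sample_rate: int = 16_000,
                frame_ms: int = 50, spike_ratio: float = 4.0,
                min_spikes: int = 5) -> bool:
    """Return True if short, loud transients dominate the recording."""
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = len(samples) // frame_len
    frames = samples[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))           # loudness per frame
    baseline = np.median(rms) + 1e-9                    # "normal" running noise
    spikes = int((rms > spike_ratio * baseline).sum())  # frames far above normal
    return spikes >= min_spikes

# Example: one second of quiet hum with a few sharp impacts injected.
rng = np.random.default_rng(0)
audio = 0.01 * rng.standard_normal(16_000)
audio[3000:3100] += 0.5
audio[7000:7100] += 0.5
audio[12000:12100] += 0.5
print(clang_alert(audio, min_spikes=3))  # True: the belt "sounds wrong"
```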

How do you train AI to do this, and to do it accurately?

We tell the system what a person would be likely to see. For instance, let’s say we’re looking at some equipment, and the most likely breakdown scenario is that it will rust. We then tell the system to look for rust or for changes in color. Then, if the system sees rust developing, it will tell us that there’s something wrong and it’s time to look at replacing or repairing the machine.
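A rough sketch of that rust-by-color-change idea follows: it compares the fraction of rust-colored pixels in a baseline photo of the equipment against the latest inspection image. The HSV color bounds, alert threshold, and file paths are illustrative guesses rather than a calibrated detector.

```python
# Rough sketch of "watch for rust-colored change." Color bounds, threshold,
# and image paths are assumptions for illustration only.
import cv2
import numpy as np

def rust_fraction(bgr_image: np.ndarray) -> float:
    """Fraction of pixels whose hue/saturation fall in a rust-like range."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    # Roughly orange-brown hues with enough saturation to rule out grey metal.
    rust_mask = cv2.inRange(hsv, (5, 80, 40), (25, 255, 200))
    return float(rust_mask.mean() / 255.0)

def maintenance_flag(reference: np.ndarray, current: np.ndarray,
                     growth_threshold: float = 0.05) -> bool:
    """Alert when rust coverage grows noticeably versus the baseline image."""
    return rust_fraction(current) - rust_fraction(reference) > growth_threshold

# Usage (file names are placeholders):
# baseline = cv2.imread("pump_day1.jpg")
# latest = cv2.imread("pump_today.jpg")
# if maintenance_flag(baseline, latest):
#     print("Color change detected: schedule an inspection.")
```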

And intuitive AI doesn’t require massive amounts of data. We can train our system with maybe 10 examples, or even fewer. And because it requires so little data, it doesn’t need massive amounts of computing; it doesn’t need GPUs. We work purely on standard Intel CPUs, and we can still achieve accuracy.
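One common way to classify from only a handful of examples, sketched below, is to represent each example as a feature embedding and assign new samples to the nearest per-class average. The toy data and names are invented; this is a generic few-shot baseline shown for context, not iOmniscient’s intuitive-AI method.

```python
# Few-shot baseline: average the handful of example embeddings per class and
# classify new samples by nearest centroid. Data here is synthetic.
import numpy as np

def class_centroids(embeddings: np.ndarray, labels: np.ndarray) -> dict:
    """Average the ~10 example embeddings available for each class."""
    return {c: embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def predict(embedding: np.ndarray, centroids: dict):
    """Assign the nearest class centroid."""
    return min(centroids, key=lambda c: np.linalg.norm(embedding - centroids[c]))

# Toy demo with made-up 4-D "embeddings" for two machine states.
rng = np.random.default_rng(1)
normal = rng.normal(0.0, 0.1, size=(10, 4))   # 10 "healthy" examples
worn = rng.normal(1.0, 0.1, size=(10, 4))     # 10 "worn belt" examples
emb = np.vstack([normal, worn])
lab = np.array(["healthy"] * 10 + ["worn"] * 10)

centroids = class_centroids(emb, lab)
print(predict(rng.normal(1.0, 0.1, size=4), centroids))  # -> "worn"
```

Because the heavy lifting is a handful of averages and distance comparisons, an approach like this runs comfortably on a CPU, which is consistent with the no-GPU point above.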

We recently implemented a system for a driverless train. The customer wanted to make sure that nobody could be injured by walking in front of the train. That really requires just a simple intrusion system. In fact, camera companies provide intrusion systems embedded into their cameras. And the railway company had done that—had bought some cameras from a very reputable company to do the intrusion detection.

The only problem was that they were getting something like 200 false alarms per camera per day, which made the whole system unusable. So they set the criterion that they wanted no more than one false alarm across the entire network. We were able to achieve that for them, and we’ve been providing the safety system for their trains for the last five years.

Do your solutions require the installation of new hardware and devices?

We can work with anybody’s cameras and anybody’s microphones; of course, the cameras do have to be able to see what you want them to see. Then we provide the intelligence. We can work with existing infrastructure for video and for sound.

Smell, however, is a unique capability. Nobody makes the type of smell sensors required to detect industrial smells, so we have built our own e-Nose to provide to our customers. It’s a unique device with six or so sensors in it. There are sensors on the market, of course, that can detect single molecules. If you want to detect carbon monoxide, for example, you can buy a sensor to do that. But most industrial chemicals are much more complex. Even a cup of coffee has something like 400 different molecules in it.
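Purely as a hypothetical sketch of how readings from a handful of gas-sensor channels might be fused into a “smell signature” and matched against known profiles, consider the following. The channel list, reference profiles, and distance threshold are invented for illustration and say nothing about how the e-Nose actually works internally.

```python
# Hypothetical sensor-fusion sketch: normalize a multi-channel reading and
# match it to the closest known smell profile. All values are invented.
import numpy as np

# One reading per sensor channel (hypothetical channel order).
CHANNELS = ["voc", "co", "nh3", "h2s", "ch4", "ethanol"]

KNOWN_SMELLS = {
    "normal_air":  np.array([0.05, 0.01, 0.01, 0.00, 0.02, 0.01]),
    "overheating": np.array([0.60, 0.30, 0.02, 0.01, 0.05, 0.10]),
    "waste_odor":  np.array([0.40, 0.02, 0.55, 0.50, 0.30, 0.05]),
}

def identify_smell(reading: np.ndarray, max_distance: float = 0.4) -> str:
    """Return the closest known profile, or 'unknown' if nothing is close."""
    scaled = reading / (np.linalg.norm(reading) + 1e-9)   # normalize intensity
    best, best_dist = "unknown", max_distance
    for name, profile in KNOWN_SMELLS.items():
        dist = np.linalg.norm(scaled - profile / (np.linalg.norm(profile) + 1e-9))
        if dist < best_dist:
            best, best_dist = name, dist
    return best

print(identify_smell(np.array([0.42, 0.03, 0.50, 0.48, 0.28, 0.04])))  # waste_odor
```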

Can you share any other use cases that demonstrate the iOmniscient solution in action?

I’ll give you one that demonstrates the real value of a system like this in terms of its speed. Because we are not labeling 50,000 objects, we can actually implement the system very quickly. We were invited into an airport to detect problems in their refuse rooms—the rooms under the airport where garbage from the airport itself and from the planes that land there is collected. This particular airport had 30 or 40 of them.

Sometimes, of course, garbage bags break and the bins overflow, and the airport wanted a way to make sure that those rooms were kept neat and tidy. So they decided to use artificial intelligence systems to do that. They invited something like eight companies to come in and do proofs of concept. They said, “Take four weeks to train your system, and then show us what you can do.”

After four weeks, nobody could do anything. So they said, “Take eight weeks.” Then they said, “Take twelve weeks.” And none of those companies could actually produce a system that had any level of accuracy, just because of the number of variables involved.

And then finally they found us, and they asked us, “Can you come and show us what you can do?” We sent in one of our engineers on a Tuesday afternoon, and on Thursday morning we were able to demonstrate the system with something like 100% accuracy. That is how fast the system can be implemented when you don’t have to go through 50,000 sets of data for training. You don’t need massive amounts of computing; you don’t need GPUs. And that’s the beauty of intuitive AI.

What is the value of the partnership with Intel and its technology?

We work exclusively with Intel and have been a partner with them for the last 23 years, with a very close and meaningful relationship. We can trust the equipment Intel generates; we understand how it works, and we know it will always work. It’s also backward compatible, which is important for us because customers buy products for the long term.

How has the idea of multisensory intuitive AI evolved at iOmniscient?

When we first started, there were a lot of people who used standard video analysis, video motion detection, and things like that to understand the environment. We developed technologies that worked in very difficult, crowded, and complex scenes, and that positioned us well in the market.

Today we can do much more than that. We do face recognition, number-plate recognition—which is all privacy protected. We do video-based, sound-based, and smell-based systems. The technology keeps evolving, and we try to stay at the forefront of that.

For instance, in the past, all such analytics required the sensor to be stationary: If you had a camera, it had to be stuck on a pole or a wall. But what happens when the camera itself is moving—if it’s a body-worn camera where the person is moving around or if it’s on a drone or on a robot that’s walking around? We have started evolving technologies that will work even on those sorts of moving cameras. We call it “wild AI.”

Another example is that we initially developed our smell technology for industrial applications—things like waste-management plants and airport toilets. But we have also discovered that we can use the same device to smell the breath of a person and diagnose early-stage lung cancer and breast cancer.

Now, that’s not a product we’ve released yet; we’re going through the clinical tests and clinical trials that one needs to go through to release it as a medical device. But that’s where the future is. It’s unpredictable. We wouldn’t have imagined 20 years ago that we’d be developing devices for cancer detection, but that’s where we are going.

Related Content

To learn more about multisensory AI, listen to Multisensory AI: The Future of Predictive Maintenance and read Multisensory AI Revolutionizes Real-Time Analytics. For the latest innovations from iOmniscient, follow them on X/Twitter at @iOmniscient1 and LinkedIn.

 

This article was edited by Erin Noble, copy editor.

About the Author

Christina Cardoza is an Editorial Director for insight.tech. Previously, she was the News Editor of the software development magazine SD Times and IT operations online publication ITOps Times. She received her bachelor’s degree in journalism from Stony Brook University, and has been writing about software development and technology throughout her entire career.
