Video Analytics and Big Data, Together at Last
What happens when you combine edge AI with metadata analysis in the cloud? The possibilities are endless. Previously unanswerable questions like “Are my factory workers staying safe?” or “How can I reduce losses in my retail chain?” suddenly come into focus.
Join us as we explain how to combine video analytics with virtually any data source to reveal new insights in this conversation with Chetan Gadgil from the Edge AI team at Intel, and Lerry Wilson from Splunk, a leader in Big Data analytics. We discuss:
- Why edge AI is needed for a wide range of industries and use cases
- How Big Data combines video and other data to reveal hidden patterns
- How the Intel + Splunk team-up helps developers quickly deploy complex systems
Transcript
Chetan: How many people across my entire chain of, let’s say, 7-Elevens that I have deployed the solution in—how many people on average entered without wearing masks between 5:00 p.m. and 7:00 p.m. on Saturdays? You can ask questions like that. OpenVINO is very good at providing the metadata, and Splunk can provide the time series, or the complex analytics part of it.
Kenton: That was Chetan Gadgil from Intel. And I’m your host, Kenton Williston, the editor-in-chief of insight.tech. A publication of the Intel® Internet of Things Solutions Alliance, insight.tech is your go-to destination for IoT design ideas, solutions, and trends.
Today’s show is all about the convergence between video analytics and big data. I’ll be talking with Chetan, as well as Lerry Wilson from Splunk. Did you know that Splunk was into video? Well, neither did I, so I can’t wait to learn more.
But before we get to that, a quick note that we ran into problems with Chetan’s audio connection, so you’ll hear a big drop in his audio quality halfway through the podcast. But you’ll want to stick around for his technical insights in the second half. With that, let’s get to it!
Lerry, welcome to the show. Really glad to have you here. Could you tell me a little bit about your role at Splunk?
Lerry: Kenton, thank you so much for the opportunity and the invitation. Yeah. I’ve been at Splunk for about four and a half years. I live in what’s called our Global Strategic Alliances Organization, and I specifically focus on what I like to call innovations. The innovation area is where we identify key technologies that can be brought into Splunk and make us more valuable in new markets. So, really excited to share what we’re doing with Intel.
Kenton: Excellent. Speaking of Intel, Chetan, could you tell me a little bit about your role there?
Chetan: Sure. Thanks, Kenton, for having me here. I’m Chetan Gadgil. I’m part of Intel’s Internet of Things Group, which we call IOTG for short. Within IOTG I am responsible for what is called “Edge AI at Scale.” Edge AI is basically our terminology for artificial intelligence applications at the Edge, which is not in the cloud—things like IoT use cases.
My job is to apply my technical expertise in deep learning, or artificial intelligence, and help our partners adopt our deep learning technologies into their commercial-grade offerings. Once that integration is done, I also work within Intel, as well as with the partners, to help them scale these solutions in different use cases and markets.
Kenton: Very interesting. I want to come back to a little bit of discussion about Edge AI, because, to me, it seems like not necessarily what Splunk is known for. But, first, I want to ask a bigger-picture question. When I think of Splunk, I think about big data expertise and not necessarily video analytics, which is the topic of our conversation today. So, Lerry, can you tell me how in the world did Splunk get together with Intel to tackle this new field?
Lerry: It’s actually a great question, Kenton, because when you talk to people about Splunk, they’re probably used to hearing that we do all types of data and big data—except video, or except imagery. That’s a true statement, because certainly Splunk is focused on machine data, human-readable data, and correlating that with a variety of other data sources. But we don’t look at data from a pixel-by-pixel perspective. How we got together with Intel is really driven by customers. What customers are realizing now in this digital age is that a huge part of their big data set is actually imagery.
Again, what Splunk is really good at is taking that metadata and the information around the GPS location—the time at which that image was captured or that video was shot—and we can correlate that and bring that into our system. Now, you add in the additional data that OpenVINO creates from Intel, where you’re now actually creating inferences, and you’re identifying the things and targets that are important to your business. That also becomes a digital signature that can move into Splunk. Now we’re just bringing the digital side of imagery into our environment, helping you correlate it. So we’re expanding big data to actually include video now.
Kenton: That’s really interesting. So what you’re saying is essentially—answering the question that I was going to ask about Edge AI, which is that Splunk is not so much interested in storing and processing video as much as it is the metadata. Of course, the way you get that rich metadata is through doing analytics at the Edge to ascertain what it is that’s in your field of view. Chetan, I think this would be a great thing for you to speak to a little bit more. What is Intel doing in that area, and what is the OpenVINO thing that we just heard about?
Chetan: Sure. So, I’ll answer the second part of the question first, which is OpenVINO. OpenVINO stands for Open Visual Inference and Neural Network Optimization toolkit. It’s quite a mouthful, but that’s what the acronym stands for. What that means becomes clear if you think about some of the advances in deep learning. As an example, in 2018 AI crossed the threshold of being able to identify human faces more accurately than humans themselves. So, an AI can tell the difference between two faces much more accurately than humans can.
What happens is that these types of AI algorithms—computer vision especially, but many others too—need a lot of processing. In order to do a lot of processing very fast, one solution is obviously to buy more and more powerful, expensive hardware, but that hardware is typically not very easy to deploy in the field or at the Edge.
OpenVINO actually is able to optimize complex models and just look at the essential parts, because many of these complex models are built by brute-force computing, and the entire model is not always required. So it is able to shrink the model down to its most essential parts and still retain the accuracy that customers require. Then we are able to run it on different types of Intel hardware very, very efficiently.
Everybody, of course, is very familiar with Intel CPUs. But within the CPUs there’s what we call the integrated GPU. There is another piece of hardware called the VPU, or Vision Processing Unit, from Movidius, a company we acquired a few years ago. And there are new classes of hardware, optimized for AI, that Intel is working on and releasing as we speak. Regardless of which target hardware your model needs to run on, OpenVINO provides you that single abstraction layer, so as a customer of Intel you do not have to rewrite your application. You just pick whatever hardware is the right thing for that particular job, and your software will continue to run with the best possible performance. That’s what OpenVINO does.
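To make that abstraction-layer idea concrete, here is a minimal sketch in plain Python. This is not the real OpenVINO API—the backend functions and device registry are hypothetical stand-ins—but it shows the pattern Chetan describes: the application calls one entry point, and only the device name changes.

```python
# Hypothetical sketch of a hardware-abstraction layer, in the spirit of
# what OpenVINO provides. NOT the real OpenVINO API: these backends are
# stand-in functions; a real toolkit dispatches to device-optimized kernels.

def _infer_on_cpu(frame):
    # Stand-in for CPU-optimized inference: here, just an average "score."
    return sum(frame) / len(frame)

def _infer_on_gpu(frame):
    # Stand-in for integrated-GPU inference: same result, different path.
    return sum(frame) / len(frame)

# One registry maps device names to implementations.
BACKENDS = {"CPU": _infer_on_cpu, "GPU": _infer_on_gpu}

def infer(frame, device="CPU"):
    """Single abstraction layer: the application never changes per device."""
    if device not in BACKENDS:
        raise ValueError(f"Unsupported device: {device}")
    return BACKENDS[device](frame)

frame = [1, 2, 3, 2]            # a toy stand-in for image data
print(infer(frame, "CPU"))      # 2.0
print(infer(frame, "GPU"))      # 2.0: same application code, new target
```

The point of the pattern is that moving from CPU to GPU (or a VPU) is a one-argument change, not a rewrite.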
The first part of your question was, how do I look at the relevance of being able to integrate Splunk, and what benefit that brings to the industry. Splunk is one of the most well-known, as well as the most powerful, tools that I have seen—especially the types of things it does: being able to ingest data from different sources, a lot of existing integrations, a vast ecosystem of experts who already know how to use and deploy it. It means that, from the customer’s point of view, they already have a starting point if Splunk is already there.
What OpenVINO gives us is the ability to convert different kinds of information that used to be opaque. Video data just used to be files on NVRs and things like that—you had to actually have people looking at these video recordings to know what was happening. But now you don’t need to do that. In real time, OpenVINO can turn what used to be opaque data into transparent metadata, which you can then actually analyze.
For example, you can now start asking questions to Splunk like, “Tell me how many people across my entire chain of, let’s say, 7-Elevens that I have deployed the solution in—how many people on average entered without wearing masks between 5:00 p.m. and 7:00 p.m. on Saturdays?” You can ask questions like that. OpenVINO is very good at providing the metadata, and Splunk can provide the time series, or the complex analytics part of it. They complement each other very, very well.
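Once the edge side has reduced video to metadata events, a question like that becomes a plain time-series aggregation. Here is a minimal sketch in Python; the event records and field names are made-up examples, and in practice Splunk’s search language would run this kind of query over ingested events:

```python
from datetime import datetime

# Hypothetical metadata events, as an edge AI pipeline might emit them:
# one record per person entering a store, with a mask classification.
events = [
    {"ts": "2020-11-07 17:15", "store": "store-001", "mask": False},  # Saturday
    {"ts": "2020-11-07 18:40", "store": "store-002", "mask": True},   # Saturday
    {"ts": "2020-11-09 17:30", "store": "store-001", "mask": False},  # Monday
    {"ts": "2020-11-14 17:05", "store": "store-003", "mask": False},  # Saturday
]

def avg_maskless_saturday(events):
    """Average maskless entries per Saturday, 5:00 p.m. to 7:00 p.m."""
    per_day = {}  # date -> maskless entries inside the time window
    for e in events:
        t = datetime.strptime(e["ts"], "%Y-%m-%d %H:%M")
        # weekday() == 5 is Saturday; keep hours 17:00-18:59.
        if t.weekday() == 5 and 17 <= t.hour < 19 and not e["mask"]:
            per_day[t.date()] = per_day.get(t.date(), 0) + 1
    return sum(per_day.values()) / len(per_day) if per_day else 0.0

print(avg_maskless_saturday(events))  # 1.0: one maskless entry on each of two Saturdays
```

The Monday event and the masked entry are filtered out, leaving one maskless entry on each of the two Saturdays, for an average of 1.0.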
Kenton: Yeah, that makes sense. In fact, you mentioned something that I wanted to bring up, which is, I think the use of video has been growing dramatically in the last two years because of these capabilities. You’re talking about the ability to deploy analytics in a much more cost-effective and practical way.
I think that the pandemic has really accelerated that because there are all these new use cases. Things like, are people wearing masks? Are people maintaining social distancing? Even more advanced things, like contact tracing. So I’m wondering where you see the biggest need for video analytics right now, and if you see that changing as a result of the pandemic?
Lerry: I’ll go ahead and start with that response. We’ve seen a huge acceleration of some key trends during the pandemic. The greatest one, of course, is how do you protect not only your employees and customers, but also the people who were manually sent out to take temperatures and do different types of things.
There’s this unbelievable connection in terms of how technology helps us now in the physical environment, which this solution plays really well into. There’s an abundance of use cases, and throughout the rest of the time Chetan and I will talk about some of them. But I think the most important thing to recognize is that video and imagery play an important part in an entire loop.
That first loop, like I said, is gathering data—collection of information, or monitoring and observing an environment—and then being able to use the inference models to find what you’re looking for very, very quickly and do the analysis with Splunk. But then, as you go through the process and come back out on the other end, we’re still in an environment where, before we as humans flip a switch or turn something back on, there’s always a visual confirmation that everything has been done correctly.
I think that’s another important role for the camera infrastructure that people are investing in: that final confirmation, where a human operator can say, “Okay, I see that things were done to assess this issue, and now I can comfortably mark this off and we can turn something back on, or continue with whatever process.” So it’s the capture part, the observability, the continuous piece—taking advantage of all the data they gather—and then, at the end, making that visual confirmation that the environment is safe to move forward.
Kenton: That’s really interesting, and honestly I’d like to dive into some of those applications. Like I said, I think there’s such a huge range of things that people are looking to do with video these days. I’m really interested to know where you’re seeing traction—both in terms of the specific applications, and the industries that are deploying them.
Lerry: So, from a Splunk perspective this fits really well into our core area around security. Again, with this pandemic, and even before so, there’s a huge need to marry the cyber and physical data points. So, imagery is a great way to help secure—be that surveillance engine, be that observer—and that information is absolutely critical. I think that’s a foundational use case that we share already with Intel, the OpenVINO environment, and many of our customers. That’s truly horizontal. Then I think you can look across verticals, and you can start to see where other things are going in.
So, a specific industry—such as retail or manufacturing—where you’re looking at something being built and trying to make sure that the quality control measures are being followed. The combination of seeing and visualizing that information and being able to measure it against KPIs that you have for those processes—that’s really the power of this joint combination. Chetan, you want to talk a little bit about some of the other vertical use cases you guys are involved in and focus on?
Chetan: Lerry mentioned the common use cases around security. At Intel, we have a similar point of view in terms of the market opportunity. We call it situational monitoring, and it cuts across multiple verticals. It is not just limited to computer vision. Audio is another big use case—being able to detect audio signals and then figure out what’s happening from the metadata that can be generated by classifying those audio signals. That’s another common use case.
In terms of situational monitoring, it really has applications in many, many industries. For example, in retail, loss prevention is a big use case—retail customers want to operate their stores with maximum transparency and efficiency. In manufacturing, it centers on worker movements. Worker safety is another common concern from customers, one they can solve very effectively with real-time computer vision.
Another category of use cases that we are prioritizing is what we call product inspection. Traditionally it was done visually—manual, visual inspection. But now with computer vision, instead of just taking a random sample of some products, you can pretty much analyze every single product with AI, because all it takes is more compute power. You don’t need humans to make those quality decisions.
Another cool thing about product inspection is that it doesn’t have to be limited to the visual spectrum. X-rays can be analyzed, and so on, in manufacturing. Another area where we are seeing big demand from customers is analysis of medical images: MRIs, CT scans, things like that. Even in these areas there are many types of analysis where computers are doing a much better job than experienced radiologists. It is not as mature as some of the other things, but it is definitely maturing, and demand is picking up.
Kenton: One thing I think is particularly interesting about all of these applications—and we’ve talked about this a little bit already—is the fact that, like you said, it’s more than just vision, in the sense of putting a camera that’s looking at the visual spectrum. It could be x-rays, but there also could be audio. There could be other things, like we mentioned at the top of the call, related to the positioning of things—GPS, or what have you.
So, I’m wondering, and I think this is probably a question for Lerry, if you could tell me a little bit more about how you are integrating these various data sources and drawing insights from a multimedia, if you will, data set?
Lerry: Absolutely. That’s what we get very, very excited about, Kenton. Again, we see these as very powerful data sources that, when combined with other types of data, will open up whole new insights and automation capabilities for our customers. What really makes Splunk unique in this area, as Chetan mentioned earlier, is our ability to quickly bring raw data streams in—any digital, time-stamped, human-readable data—and then search.
Splunk is really a search engine for machine data. So as you bring these other data sources in and start to query the data and information, you can quickly build visualizations or actions around the results of those searches, which then continuously stream, are made available, and can be pushed out to a wide variety of users. Everybody is looking at the same information to make the same decisions, and then you could even automate beyond that with different types of playbooks.
So there’s that unique capability to scale to incredible amounts of data, correlate it, and make it easier for analysts to investigate and understand. And now you add the capability that OpenVINO brings into the environment—“Hey, we’ve actually found the things you’re looking for.” Now you can run back from that point and figure out: how did we get to that fault, or that item, or that area of interest, and what can we do to improve or automate around it?
Kenton: Interesting. So, Chetan, this makes me want to come back to you. What I’m hearing from Lerry is in the big data side you’re correlating and combining all these different types of data, but I’d love to hear a little more about on the Edge side. I think you were talking earlier about how OpenVINO compresses things down in terms of making the AI algorithms manageable at the Edge.
I think a big part of what you’re talking about there is the idea—to get a little technical—that we’re talking about mostly pushing inferencing to the Edge. So, I think across the system holistically a lot of what’s happening here that makes this work is putting the right kind of intelligence in the right place, and the right kind of data in the right place. Would you agree with that?
Chetan: You’re talking about exactly the right types of issues that customers have. Number one, moving data around is very expensive, so it is always better to analyze anything that happens close to where it happens, rather than trying to move it halfway across the world and get it analyzed in the cloud.
One example—just take a standard store. The pharmacy that I go to probably has about 40 or 50 cameras in a small area—and that’s just one pharmacy. Imagine a moderate-sized city: you’ll have millions of cameras, and there is no cloud in the world that can take all those camera feeds, move the data to the cloud, and analyze everything. First of all, it’s going to be extremely expensive and, second of all, it’s not even going to work.
So, having the analysis happen right where the data is being produced—where the cameras are—is going to be a lot more efficient. That’s the power of being able to shrink down the models and provide more Edge-processing capabilities. We believe that over the next 10 to 15 years, as this technology matures, 80%–90% of AI inferencing will actually happen at or near the Edge, and not in the cloud.
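Rough arithmetic makes the scale argument concrete. The camera count echoes Chetan’s estimate; the bitrate and metadata sizes below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope comparison: streaming raw video to the cloud
# versus shipping only edge-generated metadata. Numbers are illustrative.
cameras = 1_000_000          # cameras in a moderate-sized city (assumption)
mbps_per_stream = 4          # a plausible bitrate for one 1080p H.264 feed

total_gbps = cameras * mbps_per_stream / 1000
print(f"Raw video: {total_gbps:,.0f} Gbps of sustained uplink")

# Edge inference instead: say ~1 KB of metadata per camera per second.
kb_per_event, events_per_sec = 1, 1
metadata_gbps = cameras * kb_per_event * events_per_sec * 8 / 1e6
print(f"Metadata only: {metadata_gbps:,.1f} Gbps")
```

Under these assumptions the raw feeds would need thousands of gigabits per second of sustained uplink, while the metadata stream is smaller by roughly three orders of magnitude.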
Kenton: That makes a lot of sense, particularly if I extend these considerations beyond the practicalities of moving data around to the privacy concerns related to it. Inherently, if you’re not transmitting information that contains any personally identifiable data, that helps protect privacy. But, of course, throughout 2020 people have started to have a lot of concerns about privacy, and about how public-facing video cameras are capturing their data. I wonder if you could speak a little to how Intel and Splunk are addressing these privacy concerns?
Chetan: Yeah, that’s a great question, actually. There are three things that happen when you process information close to the Edge, where the cameras are. Number one, you’re not moving data where it doesn’t need to be—for example, data doesn’t need to be copied to a cloud where it can be hacked. You’re not opening up more attack surface for data that is supposed to stay private. That’s number one.
The second thing is that, with AI inferencing, it’s better than humans watching what other humans are doing, because if an AI is looking at something, it is inherently private. Basically, an algorithm is looking at what’s happening—it’s not some human sitting in a security office observing other people and how they behave. That’s another area.
In addition to that, the way most of these algorithms work is on visual features. When people talk about facial recognition and use cases like that, the way it works is that the algorithm just classifies a set of features, and that set happens to match one person in that particular case. It’s not like the computer really knows who that person is—unless you explicitly map that person’s information to what the algorithm is doing. That’s the second protection that AI solutions bring in terms of privacy.
Third, it goes to the application itself—and this is where Lerry can add some of his context as well. The application still has to do its job, which is making sure that, of all the other bits and pieces of correlated information being managed, you keep and retain only the minimal required bits. That way, if there is an attack or any information is stolen, it does not compromise things that are not meant to be disclosed. Social security numbers, for example—at the Edge, nobody even stores those things, so there’s not even any opportunity for attackers to steal that information if you have trust in that Edge AI.
Lerry: Kenton, that’s a great question, and it’s something that Splunk takes very, very seriously. Just to give you some context: when COVID-19 hit, we talked to about 200 senior executives, and without a doubt the number-one issue regarding any type of data—whether imagery data, machine data, or IoT sensor data—was privacy. Splunk’s roots are in security, and we understand that.
First and foremost, you have to recognize these companies are working within different regional requirements in terms of how data gets handled and how they manage it. Then it’s up to them to really manage that. How Splunk helps them is by making sure they know that the data is theirs. They own the data. They are responsible for that data. We produce that data. In terms of exposing that data, we work with our customers very closely on who sees what at what point, because most of the time it is just data.
It’s not information about people—it’s not that information. It’s really just the data pieces that are going through. But at some point, if you’re responding to an incident based on the requirements that the customer has set in place and the protections that they’ve set in place, you want to be able to dive into that information and understand where the sources are and be able to manage it. In a physical environment that becomes a very, very sensitive issue that our customers are very, very attuned to.
Kenton: I’m really glad to hear from both sides these issues being taken so seriously. Like I said, this year has been a lot in the headlines, and I’m glad to see both Splunk and Intel really taking the right steps, I think, to address these concerns. The topic I’d like to close on is probably the most important topic, which are the dollars and cents.
As we’ve been discussing, the pandemic has driven a real acceleration of video use cases, and a real need for all kinds of new technologies and techniques to ensure the safety, security, and health of the public, employees, and everyone else. But, at the same time, the pandemic has also put a real crunch on a lot of organizations’ budgets. So, I wonder if you can talk to me about how these things can be deployed in a cost-effective manner, with a very quick ROI, in these budget-constrained times?
Lerry: In that same discussion we had with all of these senior executives, this was exactly the point they brought to our attention as well: “Look, we are very constrained. If we’re fortunate, we’ll break even on our budgets for a little while.” The important thing about what we’re doing together is recognizing that a lot of the investment in infrastructure has already been made. So we’re helping them, in many ways, make better use of the investments they’ve already put into camera infrastructure—many camera-rich environments. I don’t have exact numbers, but I think we can all agree there’s a significant amount of video that never gets looked at, because of the human cost associated with that.
If we can actually demonstrate and show them the capabilities that we’ve jointly produced, that could be a great way for them to leverage that previous investment. The most important thing is recognizing that your digital journey is indeed a journey. Leveraging the data and the infrastructure you already have is where our customers are going—being very smart about making these investments now, which are relatively small compared to the investments they’ve already made, and starting to correlate these different silos of data. Then they’ll be much more efficient, and ready to optimize and make further investments as they move forward.
Chetan: Yeah, I agree with that, Lerry. We’re also seeing the same thing: most customers are looking at leveraging their existing investments rather than putting in new ones. And how do you do that? With this type of integration—if you look at the install base of Splunk, it is vast; if you look at the install base of Intel hardware, it is also quite vast.
So we can get that benefit to customers by saying, “Okay, you don’t have to start by replacing or buying completely new infrastructure from day one.” You can start gradually by adding new capabilities to existing infrastructure. The way you do that is by taking OpenVINO as the foundational AI element, so you can leverage hardware to the best possible extent—leverage the software investments and the expertise that many of these customers already have with Splunk—and then integrate the two.
Then we show them how to create new solutions as the needs of their businesses evolve. As you pointed out, right now we’re in a stage where everything has slowed down—mostly, businesses are looking for the least amount of disruption required. But this is turning around. In 2021, as more businesses turn around, they’ll start to look at newer use cases and at creating new experiences for their customers. Because one thing the pandemic has changed is that we’ve realized new experiences are needed in many types of scenarios. So how do you do that? We are now pretty well set in showing them a path to getting there.
Kenton: Yeah, absolutely. So—love to end on a positive note. Let me just give you folks some opportunity if there’s anything we haven’t touched on already that you’d like to add to the conversation.
Chetan: On an ending note, one thing I could say here is that artificial intelligence is really a game-changing technology. What this opens up is a new set of opportunities. I mentioned computer vision as the most obvious one—or, we can say, the most visible one, by definition.
On the other hand, if you start looking at how these things are being combined, integrated, and applied in different ways, one use case I did not mention is what we call control optimization and autonomy—meaning autonomous mobile robots—being able to combine the control aspects, how something behaves, with what it observes. That’s pretty much what humans do.
Those types of integrations are becoming very, very necessary now, because then you can send robots into dangerous areas, or have them distribute medicines in a quarantined location. Use cases like that, which we never would have imagined a year ago, are now being talked about. That’s one big trend in how people are thinking about these applications.
The second thing is, how do you improve the ability of software or hardware manufacturers to create these types of applications? We have to break down the barriers. Instead of spending years on development cycles, can we bring it down to months, or weeks? What is the way to do that? What other development tools can we provide? That’s the second part—the opportunity for the industry.
The third part is, how do you do it very, very cost effectively? I think your point [inaudible] customers, they’ll always have budget constraints, because at the very least there’s an opportunity cost: if you’re spending that “X” here, you don’t have that “X” to spend somewhere else. So how do we provide a range of different options for these customers to deploy the solutions they need? Any solution should be able not only to scale up, but also to scale down to the requirements of the smallest customer.
Lerry: Chetan, that’s great. I think, in closing, the thing hopefully we conveyed today is the excitement and the opportunity around the technologies that we’ve combined. But how we get these into market—how do we make these use cases come alive? Intel and Splunk together are going to be relying heavily on a phenomenal ecosystem. Chetan referenced those manufacturers that will be able to provide boxes that are preconfigured to set these systems up into a customer environment.
It’s also really important that our systems integrators and the development community embrace OpenVINO and Splunk together, start to understand what is possible, and see the opportunity for them to play in this rapidly growing market and come into an area that has not been explored. We’re working very diligently—and are very excited—to bring in the entire ecosystem, from the developer side, the system integrator side, and the go-to-market side, to make sure that customers not only get this great technology, but also get it implemented correctly, get it operating efficiently, and realize its value as quickly as possible.
Kenton: That’s great. Well, that just leaves me to say thanks to both of you for joining. So, thank you, Lerry, so much for being with us today.
Lerry: Thank you very much, Kenton. Appreciate it.
Kenton: And, Chetan, much the same to you. Thanks so much for your really valuable insights.
Chetan: Yeah. Thanks, Kenton. Thanks for having me here.
Kenton: And thanks to our listeners for joining us. If you enjoyed listening, please support us by subscribing and rating us on your favorite podcast app. This has been the IoT Chat podcast. We’ll be back next time with more ideas from industry leaders at the forefront of IoT design.
The preceding transcript is provided to ensure accessibility and is intended to accurately capture an informal conversation. The transcript may contain improper uses of trademarked terms and as such should not be used for any other purposes. For more information, please see the Intel® trademark information.