Sramana Mitra: Let’s take one of your customers and double-click down. In that use case, what I’d like to understand is where is the traffic being intercepted, how is it being modelled, what parameters is it being modelled against, and what is the nature of the AI algorithm driving this kind of predictive modelling.
Amir Husain: Just zooming out, let me first tell you how we deal with customers and what we provide specifically. We have a product called Spark Secure. Spark Secure can be deployed either in the cloud or on-premise. It’s delivered through a hybrid model. Spark Secure can ingest many forms of data. One of those might be, for example, proxy logs or firewall logs.
It looks through that semi-structured information. It can also look at binaries. It can also read articles and textual descriptions of security threats on the web. All of that data is used and fused together by these algorithms to build models of what constitutes a threat. Not only is that threat detection capability delivered through this AI-fuelled structured and unstructured data analysis, but evidence is also collected. For example, say your log is going into Spark Secure and, somewhere in there, Spark Secure finds an anomalous pattern. That alone doesn’t mean that it’s a threat.
In typical systems, you have a fairly brutal and finite set of rules. If a rule trips or if a threshold matches, you get an alert. Ultimately, you have so many alerts that you can’t pay attention to all of them and you start to ignore them. This is where the AI algorithm that Spark Secure delivers really comes into play. What would a human security researcher do? They would start to look into it deeper.
You may not have that destination listed in a blacklist. That’s the whole nature of a zero-day threat. You don’t know about the signature. You don’t know whether a destination or source is good or bad. It is simply unknown. You start to research it. Spark Secure can automatically create natural language queries.
We also leverage Watson. In fact, we built the first security corpus on IBM Watson. I also served on the IBM Advisory Board for cognitive computing. Within Watson, we’ve created a massive corpus of security content. We’re also tied in with Bing and Google. When Spark Secure finds something anomalous, it starts to do its research like a human being would, using NLP techniques.
That’s a lot of content, a lot of articles, magazines, security publications, and discussion boards. It consumes all of that and returns a model for whether what was described in these pages constitutes a threat or not. It adds that to its evidence and it adds that to its confidence. It goes deeper. If this particular threat was a download of a binary, it can automatically obtain the binary.
We’ve built machine learning-based anti-virus capabilities that are not based on a signature or a fixed database. We’ve created an algorithm that has learned what malware seeks to do. It’s symptom-based. Hundreds of thousands of malware samples have been provided to these algorithms to train on. Now that machine learning anti-virus capability steps in and says, “I think this is bad. It doesn’t match a hard signature, but I think so because of all these reasons.”
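[Editor's illustration] To make the idea of symptom-based detection concrete, here is a minimal sketch of a classifier that learns from observed behaviours rather than from file signatures. It is not Spark Secure's algorithm, which is not described in this interview. The behaviour names, the six training samples, and the choice of a naive Bayes model are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical training data: each sample is a set of observed behaviours
# ("symptoms") plus a label. A real system would train on far more samples.
TRAINING = [
    ({"writes_to_system_dir", "disables_av", "contacts_unknown_host"}, "malware"),
    ({"encrypts_user_files", "contacts_unknown_host"}, "malware"),
    ({"disables_av", "modifies_registry_run_key"}, "malware"),
    ({"reads_config", "opens_window"}, "benign"),
    ({"reads_config", "contacts_unknown_host"}, "benign"),
    ({"opens_window", "writes_log"}, "benign"),
]


def train(samples):
    """Count how often each behaviour appears under each label."""
    label_counts = Counter(label for _, label in samples)
    feature_counts = {label: Counter() for label in label_counts}
    vocabulary = set()
    for behaviours, label in samples:
        feature_counts[label].update(behaviours)
        vocabulary |= behaviours
    return label_counts, feature_counts, vocabulary


def malware_probability(model, behaviours):
    """Bernoulli naive Bayes with add-one smoothing over known behaviours."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)
        for feature in vocabulary:
            p = (feature_counts[label][feature] + 1) / (n + 2)
            score += math.log(p if feature in behaviours else 1 - p)
        log_scores[label] = score
    # Normalise in a numerically stable way.
    peak = max(log_scores.values())
    weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
    return weights["malware"] / sum(weights.values())


model = train(TRAINING)
suspicious = malware_probability(model, {"disables_av", "contacts_unknown_host"})
harmless = malware_probability(model, {"reads_config", "opens_window"})
print(f"suspicious sample: {suspicious:.2f}")
print(f"harmless sample:   {harmless:.2f}")
```

Neither test sample matches a training sample exactly, yet the model still separates them, which is the property a signature database lacks.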
Now you have this graph-based analysis that started this whole pipeline, which looks really strange. You went to natural language processing to add evidence. Then you went to machine learning anti-virus, which adds even more data, all the while collecting evidence and presenting all of it to an incident response professional.
Not only does this set of diverse techniques reduce the number of false positives, but you’ve also made the decision easier for a human being. Now they can look at one page of evidence and say, “This is why it got triggered. This is what it looks like. This is what the world is saying about this threat. Here is Watson chiming in. Here is the evidence from machine learning anti-virus.” That evidence-based, continuously learning ensemble approach is used to enrich the analysis and do more of what a human being does. That’s at the heart of what makes us unique.
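[Editor's illustration] The evidence pipeline described above, in which each technique adds to a shared body of evidence and to an overall confidence, can be sketched roughly as follows. This is a simplified illustration and not SparkCognition's implementation. The class names, the example probabilities, and the log-odds combination rule (which assumes the sources are independent and starts from an even prior) are all assumptions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Evidence:
    source: str                # which technique produced this finding
    summary: str               # human-readable explanation for the analyst
    threat_probability: float  # that technique's own estimate, strictly in (0, 1)


@dataclass
class Incident:
    description: str
    evidence: list = field(default_factory=list)

    def add(self, source, summary, threat_probability):
        self.evidence.append(Evidence(source, summary, threat_probability))

    def confidence(self):
        """Combine estimates by summing log-odds (assumes independent sources)."""
        log_odds = sum(
            math.log(e.threat_probability / (1 - e.threat_probability))
            for e in self.evidence
        )
        return 1 / (1 + math.exp(-log_odds))

    def report(self):
        """One page of evidence for the incident response professional."""
        lines = [
            f"Incident: {self.description}",
            f"Combined confidence: {self.confidence():.2f}",
        ]
        lines += [
            f"- [{e.source}] {e.summary} (p={e.threat_probability:.2f})"
            for e in self.evidence
        ]
        return "\n".join(lines)


# Hypothetical example: three techniques weigh in on one anomaly.
incident = Incident("Unusual outbound connection from a workstation")
incident.add("log analysis", "Destination never seen before on this network", 0.6)
after_first = incident.confidence()
incident.add("NLP research", "Recent articles link the destination to a campaign", 0.7)
incident.add("ML anti-virus", "Downloaded binary shows malware-like behaviour", 0.9)
after_all = incident.confidence()
print(incident.report())
```

A single weak signal leaves the confidence close to that signal's own estimate, while agreeing signals from different techniques push it higher. A finding with a probability below 0.5 would pull the combined confidence back down, which is one way an ensemble can suppress false positives.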
This segment is part 2 in the series: Thought Leaders in Cyber Security: Amir Husain, CEO of SparkCognition