The Sad Story of SIEM Can Still Have a Happy End

Posted by Avi Chesla on May 1, 2018 3:04:00 PM

For those of you wondering what I brought back from RSA – other than more tote bags than I know what to do with – I actually returned with a strong idea of what was most, and least, interesting there. And the single trend that dominated them all.

First, we can dismiss the booth hype and the marketing slogans that screamed at us as we walked the floor. Everyone’s claiming they can save us from the same problems that bedeviled us last year – threats from IoT, cloud, ransomware.

This slogan combat pointed to a real issue in our industry. There is so much overlap in our cybersecurity “bubble” - with hundreds of solutions claiming to overcome the same issues – that it is no wonder so many buyers are frustrated and confused.

But one trend - a distinct move from last year - was pretty clear. 

The arena that we used to call security orchestration, automation and response (SOAR) has morphed into something new. Or something that providers want us to think is new.

Everyone had their own clever hook for it. RSA called it “Evolved SIEM”; IBM rolled out “adaptive orchestration”; some startups are using the term “Autonomous SOC/IT”; others, like Exabeam, are simply calling it NG SIEM (I assume NG stands for “next gen” and not “No Go”).

But one thing is undeniable: this evolution is a loud and plaintive call for someone to deliver on the unmet promise of SIEM. 

The original promise itself – more than 10 years old now - was seductive, for sure. SIEM came on the scene claiming it could create more value from existing security tools. The argument was that by consolidating and organizing the data those tools generate, and by helping to analyze it to understand whether an incident is emerging, SIEM could prioritize response more effectively.

Right idea. Bad execution. In fact, now we need SIEM’s original promise more than ever. But before I get into the future of SIEM, a look back into its evolution will illuminate the fundamental flaws in its design, and why it has failed so miserably:

  • SIEM began as a centralized logs repository and retention tool responsible for consolidating data and “normalizing” it for better visibility.
  • Immediately afterwards, the need for security alert rules arose, and SIEM vendors responded with a log correlation language that allowed users to customize alerts and flag possible incidents.
  • Then came a bigger change: the dot-com boom and an exploding dependency on the internet. Businesses became juicy targets and the frequency of attacks grew sky-high. In response, organizations deployed more and more security tools. The result? A big data problem which had a disastrous impact on SIEM’s effectiveness. The existential question became – “how many events per second (EPS) and how much data can the SIEM process?”
  • SIEM vendors responded by attempting to create more robust and scalable databases, and search engines to allow sorting and finding logs quickly. BUT the system was still based on the same old manual and static correlation rules.
  • In this new reality, the number of correlation rules required to cover all the attack patterns grew exponentially - based on the number of logs and constantly changing data.
  • Imagine – as many as tens of thousands of correlation rules which need to be constantly created and maintained to keep up with constantly changing attack patterns. Think of the sheer investment of expert man-hours needed!
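To see why this gets out of hand, consider what a single static correlation rule looks like. The following is a minimal Python sketch, not any vendor's rule language: the event tuples, the `brute_force_rule` name, the event type strings and the thresholds are all assumptions made for illustration.

```python
from collections import defaultdict


def brute_force_rule(events, threshold=3, window=60):
    """Flag a source IP when `threshold` failed logins are followed by a
    successful login within `window` seconds.

    events: iterable of (timestamp_seconds, source_ip, event_type) tuples
            (a hypothetical normalized log format).
    """
    failures = defaultdict(list)  # source_ip -> timestamps of failed logins
    alerts = []
    for ts, src, kind in sorted(events):
        if kind == "login_failed":
            failures[src].append(ts)
        elif kind == "login_success":
            recent = [t for t in failures[src] if ts - t <= window]
            if len(recent) >= threshold:
                alerts.append((src, ts))
            failures[src].clear()
    return alerts


events = [
    (0, "10.0.0.5", "login_failed"),
    (5, "10.0.0.5", "login_failed"),
    (9, "10.0.0.5", "login_failed"),
    (12, "10.0.0.5", "login_success"),
    (20, "10.0.0.7", "login_failed"),
    (25, "10.0.0.7", "login_success"),
]
print(brute_force_rule(events))

# The same attack reported by a tool that names the event differently:
renamed = [(ts, src, kind.replace("login_failed", "auth_failure"))
           for ts, src, kind in events]
print(brute_force_rule(renamed))
```

The first call flags 10.0.0.5; the second returns nothing, because the rule matches literal event names. Every new log source, wording change or attack variation needs another hand-written rule like this one.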

Most recently (in the past 3-4 years), the need for ever-faster response automation compelled SIEM vendors to develop and integrate with SOAR (Security Orchestration, Automation and Response) tools. They needed tools for automatic case management, investigation, and some mitigation/remediation. This led to industry shifts, such as the recent acquisition of Phantom by Splunk. It makes some sense. SOAR allows SIEM vendors to define workflow security rules (yes, more manual rules) that automate the basic repetitive SOC tasks - making them more efficient.
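A workflow rule of this kind can be pictured as a lookup from alert type to a fixed list of response steps. This is a rough Python sketch under my own assumptions: the `PLAYBOOKS` table, the step names and `run_playbook` are hypothetical and do not reflect any specific SOAR product.

```python
# Hypothetical playbook table: alert type -> ordered response steps.
PLAYBOOKS = {
    "brute_force": ["open_case", "lock_account", "notify_analyst"],
    "malware_detected": ["open_case", "isolate_host", "collect_forensics"],
}


def run_playbook(alert_type, target):
    """Return the actions a workflow rule would trigger for one alert."""
    steps = PLAYBOOKS.get(alert_type)
    if steps is None:
        # No workflow rule matches: the alert falls back to a human analyst.
        return [f"escalate_to_analyst({target})"]
    return [f"{step}({target})" for step in steps]


print(run_playbook("brute_force", "alice"))
print(run_playbook("unknown_sequence", "srv-1"))
```

Note that the playbook only fires for alert types someone has already written a correlation rule and a workflow for; anything else lands back on an analyst's desk.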

So where are we after all this?

SIEM vendors tried nobly but failed miserably. They attempted to adapt to the market’s changing needs by adding bigger and stronger log databases, with the ability to search and process logs faster to accommodate the big data problem. It’s like the military thinking the solution is bigger weapons when the world is moving to drones.  Here’s the one critical challenge they ignored:

The need to create and maintain the vast array of security correlation rules in order to detect new and unknown attack sequences … faster and faster than before.

The “Big Rules” Problem – Big and dynamic data followed by exponential growth of possible correlations:

In other words - what I saw at RSA was just old wine in a new bottle. The screaming “NG SIEM” booths offered more and more rules-based security systems.  Yes - exactly like the old generation IDS, AV etc., which means:

  • Reactive! New attack sequences are missed. They are fighting the attack patterns of the past.
  • Complex! Too many rules – typical large (and even medium-sized) organizations are burdened with thousands of security correlation rules that are very hard to maintain, and without that maintenance the entire system is rendered useless.
  • Very expensive! Requires massive ongoing investment, which results in a very high TCO.
  • Wrong fit! The system wasn’t designed for response automation, which is why SOAR came into the market in the first place. But SOAR relies on workflow rules (more rules) that are supposed to be triggered by the SIEM correlation rules. Without overcoming the log correlation rules problem, it cannot work effectively beyond automating very basic SOC steps.

Is there a way out of the mess? Indeed, there is. The “BIG-RULES” problem can be solved, but it’s not trivial. We need to start with the blunt recognition that:

  • SIEM vendors missed something pretty significant along the way.    
  • This big thing is data classification using NLP and machine learning (ML) technologies.
  • The good news is that this area has developed rapidly over the last few years.   

Why are NLP data classification algorithms fundamental for solving our problem? 

Because we are essentially dealing with a data complexity problem, and reducing that kind of complexity is exactly what NLP is built for.

A. What we have here is big data. Pure and simple. 
B. Cyber data changes constantly - new malware variants, intrusions, viruses, phishing, C2 and malware sites etc. – meaning that the content of logs and data feeds is constantly changing.
C. Taking (A) and (B) into account, there is a close to infinite number of possible correlations (that is, possible attacks), and it’s growing exponentially, given the explosion of data (see illustration above).

This is a complexity problem on steroids!  And NLP ML data classification is all about reducing complexity to a manageable and predictive level - allowing correlations to be identified without human involvement.

What makes this possible is that the content within cyber data messages can be described as natural language data. Machine learning and NLP algorithms can classify that data wherever it originates – logs, threat and intelligence feeds, research articles, CVE reports, attack signature databases and more.

This makes it possible to train NLP classifiers to read and understand the content, no matter what the source. Once the meaning of the terms used to describe security-related threats, research results, relevant vulnerabilities, attack vectors and so on is known, their appearances in new sentences will be understood without human involvement.

Once these classifiers understand the language, they can instantly categorize the data into a significantly smaller number of security behavior classes.

Each class represents an act or step inside an attack campaign; the potential threat itself; the attack vector and the associated risk – basically, emulating the research processes that security experts need to conduct.  A perfect example of where machines are better, faster and cheaper than their flesh and blood alternatives.
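To make the idea tangible, here is a deliberately tiny Python sketch of text classification into behavior classes. It uses a hand-rolled Naive Bayes model as a stand-in; it is not the technique any particular product uses, and the class names, training sentences and `TinyNaiveBayes` itself are assumptions invented for this example. A real system would train on far more data with far stronger models.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a message and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # class -> token counts
        self.class_counts = Counter()            # class -> training messages
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.class_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training messages from different kinds of sources.
TRAINING = [
    ("multiple failed ssh login attempts from remote host", "brute_force"),
    ("repeated password guessing detected on admin account", "brute_force"),
    ("outbound beacon to known command and control server", "c2_communication"),
    ("periodic dns queries to suspicious c2 domain", "c2_communication"),
    ("large upload of archived files to external storage", "data_exfiltration"),
    ("unusual volume of data transferred to unknown external host", "data_exfiltration"),
]

model = TinyNaiveBayes().fit(TRAINING)
for message in (
    "failed login attempts on admin account via ssh",
    "host beaconing to c2 server over dns",
    "sensitive files uploaded to external host",
):
    print(message, "->", model.predict(message))
```

None of the three test messages appears in the training data, yet each lands in the expected class. That is the point: differently worded messages from different sources collapse into a handful of behavior classes.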

Most importantly, this allows us to solve the problem of correlation, as the number of possible cause-and-effect connections between the classes is finite, steady, and much smaller than the number of connections between raw events.
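A rough back-of-the-envelope count shows the scale of the reduction. The figures below (10,000 distinct raw event signatures, 50 behavior classes) are assumptions chosen purely for illustration, and counting ordered sequences of distinct steps is only one simple way to approximate "possible correlations".

```python
import math


def candidate_sequences(distinct_items, steps):
    """Count ordered sequences of `steps` distinct items (permutations)."""
    return math.perm(distinct_items, steps)


raw_signatures = 10_000   # assumed number of distinct raw event signatures
behavior_classes = 50     # assumed number of stable behavior classes

for steps in (2, 3, 4):
    raw = candidate_sequences(raw_signatures, steps)
    reduced = candidate_sequences(behavior_classes, steps)
    print(f"{steps}-step sequences: raw={raw:,} classes={reduced:,}")
```

With these assumed numbers, three-step sequences drop from roughly a trillion candidates to 117,600. And while the raw signature count keeps growing as data changes, the class count stays put.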

The NLP Classifiers reduce the complexity:

Steady means that these classes don’t change frequently (think about our vocabulary - it doesn’t change that much because language is inherently flexible). 

After reducing the enormous amount of dynamic data into a smaller number of steady security classes (see illustration above) we can achieve the following:

  1. Correlation can be done automatically based on predictable and adaptive cause-and-effect connections between the classes – all without the need for any manual correlation rules, now and into the future.
  2. New and different sources of data can be seamlessly added and classified – as the classifiers are data source agnostic.
  3. Investigation and mitigation/remediation can be optimized and automated much more effectively - based on the detected attack intent.
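Here is a naive Python sketch of what correlation over classes, rather than over raw logs, might look like. Everything in it is an assumption for illustration: the `CAUSES` links are hand-written here only to keep the example short (the whole idea is that such links would be learned and adapted rather than maintained manually), and the class names and `correlate` function are hypothetical.

```python
# Hypothetical cause-and-effect links between behavior classes.
# Hand-written only for this sketch; not a claim about how they are derived.
CAUSES = {
    "reconnaissance": {"brute_force"},
    "brute_force": {"c2_communication"},
    "c2_communication": {"data_exfiltration"},
}


def correlate(classified_events):
    """Chain events on the same host whose classes follow a causal link.

    classified_events: iterable of (timestamp, host, behavior_class).
    Returns host -> chain of linked classes, for chains longer than one step.
    """
    chains = {}
    for _ts, host, cls in sorted(classified_events):
        chain = chains.setdefault(host, [])
        if chain and cls in CAUSES.get(chain[-1], ()):
            chain.append(cls)          # this class plausibly follows the last
        elif not chain and cls in CAUSES:
            chain.append(cls)          # start a chain with a possible cause
    return {host: chain for host, chain in chains.items() if len(chain) > 1}


classified = [
    (100, "srv-1", "brute_force"),
    (160, "srv-1", "c2_communication"),
    (400, "srv-1", "data_exfiltration"),
    (120, "srv-2", "brute_force"),
    (130, "srv-2", "data_exfiltration"),
]
print(correlate(classified))
```

Only srv-1 produces a chain, because its events follow the links step by step. Adding a new data source changes nothing in this logic, as long as its messages are classified into the same classes.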

In short, my trip to RSA highlighted the problem and presented few solutions. What we need today and tomorrow is:

A SIEM that utilizes AI and ML (NLP) for data classification in the right way. That means a SIEM that deeply understands the language of security, and thus doesn’t need an army of security experts to constantly create and maintain correlation rules.

Once we get this done – and we can - the BIG-RULES problem is solved!  

As a direct result of this, we also overcome all the other problems I outlined. We replace the SIEM that is reactive rather than proactive, that is ROI-negative, and that complicates rather than simplifies, with the one that should have been invented in the first place.

Sometimes it takes half a generation to fix the whole problem.

Topics: Artificial Intelligence