Martin Korec's article "Integration May Answer Questions in Machine Intelligence" has been published in the most recent edition of Cyber Defense Magazine's "Cyberwarnings Newsletter." A .pdf of the issue is available here. We have included the full article below.
Integration May Answer Questions in Machine Intelligence
You are probably familiar with the terms “Artificial Intelligence” and “Machine Learning,” i.e. the idea that computers can be taught to learn and then make predictions based on the data they are given. Artificial Intelligence and Machine Learning tools present huge opportunities in many areas, especially in cyber security. The UK government considers the technology to be the engine of the digital revolution. But some are skeptical. Gartner put Machine Learning (a subset of Artificial Intelligence) at the “Peak of Inflated Expectations” in its 2015 Hype Cycle. Simon Crosby of Bromium considers these tools to be a “pipe dream.”
What Are Artificial Intelligence and Machine Learning?
Machine Learning is a subset of Artificial Intelligence, and both address the capability of machines to be taught to make predictions based on “learned” data. Both are popular terms in marketing materials, and the two are often confused. Deloitte has decided that a better term is “Machine Intelligence,” describing it as “an umbrella term for a collection of advances representing a new cognitive era. We are talking here about a number of cognitive tools that have evolved rapidly in recent years: machine learning, deep learning, advanced cognitive analytics, robotics process automation, and bots, to name a few.” We’ll use Machine Intelligence here (partly because “Artificial Learning” didn’t work as well) to mean the use of data analytics and predictive tools in the network security context.
The Benefits of Machine Intelligence
The essential benefit of Machine Intelligence is that it can take truly massive amounts of data, analyze it in real time, and identify anomalous or malicious behaviors that are invisible to manual review or that would not be accurately identified through static detection rulesets (which are also a hassle to set up). Of course, the more data a Machine Intelligence solution has, the more effectively it can do its job. Some have claimed prediction can be improved by over 90%. If the solution has data only from NetFlow, its effectiveness is limited. If input data comes from every layer of the network, then it can identify anomalies at each layer, and in each device within each layer. This means the Machine Intelligence solution identifies behavior, like advanced persistent threats or insider attacks, that may be limited or very well hidden among massive volumes of network traffic, and that would be missed by a security team pre-programming logic in SIEM systems, even well thought-out logic (a limitation of SIEM systems), or working with an IDS ruleset alone.
Some Claim Machine Intelligence Has Drawbacks
Advanced analytics have been around for 20 years or more, so there must be something wrong with them, or we’d all be using them. Right? Naturally, as with anything created by humans, Machine Intelligence solutions can be defeated by other humans. However, there are several existing approaches, including classification algorithms, proven to successfully mimic security analyst behavior, which can be used in design and testing to avoid defeat by new threat samples. A second criticism of Machine Intelligence solutions is that they are not “plug and play,” i.e. that they need analyst time to filter out false positives and teach the system what is a threat and what isn’t. Failure to do so leads to excessive false positives and alert fatigue. Alert fatigue is a real problem: a recent article suggests that over half of security professionals are missing alerts they should address. However, MIT research indicates that human/Machine Intelligence collaboration is actually beneficial and can reduce false positives by close to 85%. Furthermore, while Machine Intelligence solutions may not be “plug and play,” their implementation time is much lower than that of SIEM systems (hours vs. months), and training the machine on false positives requires a very small time commitment (minutes a day).
Bringing Solutions Together
Is it possible to have the benefits of Machine Intelligence technology, but minimize the hassles? Is it possible to use Machine Intelligence in such a way that this technology is used for truly advanced analysis, reducing false positives and saving the security team’s time? Integrating several features and technology types into one solution mitigates several issues with Machine Intelligence technology, and creates a more efficient system. Specifically, integrating with IDS rules and network performance monitoring is an efficient means of improving network security by joining complementary features and data sets.
In such an integration, detection is more effective and false positives are reduced. Less time is required to train the system, and the information the system is “trained” on starts from a more accurate position.
Integration with an IDS ruleset specifically brings two benefits. The first is that the IDS, a list of existing rules and known signatures, helps the Machine Intelligence tools function more efficiently: it determines early in the data analysis that certain traffic matches known malicious code or patterns, which leaves more capacity for deeper analysis of events that do not trigger an IDS alert. The second is that this type of integration shows the Machine Intelligence tools what particular viruses, malware, trojans, etc. look like. This means that the predictive analysis tools have more, and more accurate, data upon which to build their analysis. This data is also available much more quickly than if the solution were completely self-educating, or assisted only by the security team.
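The two-stage idea described above can be illustrated with a minimal Python sketch. It is not taken from any particular product: the pattern list, the event fields ('payload', 'bytes'), and the simple z-score test are all assumptions chosen for illustration, and a real deployment would use an actual IDS ruleset and a far richer model.

```python
from statistics import mean, pstdev

# Hypothetical signature list standing in for an IDS ruleset (assumption).
KNOWN_BAD_PATTERNS = {"cmd.exe /c", "/etc/passwd"}


def matches_signature(payload: str) -> bool:
    """Return True if the payload contains any known-bad pattern."""
    return any(pattern in payload for pattern in KNOWN_BAD_PATTERNS)


def triage(events, z_threshold=2.0):
    """Split events into signature hits and statistical anomalies.

    Each event is a dict with 'payload' (str) and 'bytes' (int).
    Returns (signature_hits, anomalies).
    """
    signature_hits = [e for e in events if matches_signature(e["payload"])]
    remaining = [e for e in events if not matches_signature(e["payload"])]

    # Score only the traffic that the ruleset did not already explain.
    sizes = [e["bytes"] for e in remaining]
    if len(sizes) < 2:
        return signature_hits, []
    mu, sigma = mean(sizes), pstdev(sizes)
    if sigma == 0:
        return signature_hits, []
    anomalies = [
        e for e in remaining if abs(e["bytes"] - mu) / sigma > z_threshold
    ]
    return signature_hits, anomalies
```

In this sketch the signature hits are removed before any statistics are computed, so known-bad traffic does not distort the baseline. The same hits could also be kept as labeled examples of malicious traffic for later training.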
Adding a performance monitoring capability brings a similar benefit. The Machine Intelligence solution becomes better informed and more efficient because integrated traffic data helps it spot things like too many communication partners, services that haven’t been used before, exceptional network application delays, changed MAC addresses, or new devices or services in the network.
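As a rough illustration, several of these indicators can be derived by comparing current flow records against a baseline window. The Python sketch below is an assumption-laden example rather than a description of any real monitoring tool: the record fields ('src_ip', 'src_mac', 'dst_ip', 'dst_port'), the function name, and the max_partners threshold are all hypothetical, and application delay is left out for brevity.

```python
from collections import defaultdict


def find_indicators(baseline, current, max_partners=3):
    """Compare current flow records against a baseline window.

    Each record is a dict with 'src_ip', 'src_mac', 'dst_ip', 'dst_port'.
    Returns a sorted list of (indicator, detail) tuples.
    """
    known_macs = {r["src_ip"]: r["src_mac"] for r in baseline}
    known_services = {(r["dst_ip"], r["dst_port"]) for r in baseline}

    findings = set()
    partners = defaultdict(set)
    for r in current:
        partners[r["src_ip"]].add(r["dst_ip"])
        if r["src_ip"] not in known_macs:
            # Source address never seen in the baseline window.
            findings.add(("new_device", r["src_ip"]))
        elif known_macs[r["src_ip"]] != r["src_mac"]:
            # Same IP address, different hardware address than before.
            findings.add(("changed_mac", r["src_ip"]))
        if (r["dst_ip"], r["dst_port"]) not in known_services:
            findings.add(("new_service", f'{r["dst_ip"]}:{r["dst_port"]}'))

    # Flag sources talking to more distinct destinations than allowed.
    for src, dsts in partners.items():
        if len(dsts) > max_partners:
            findings.add(("too_many_partners", src))
    return sorted(findings)
```

Findings like these could be passed to the analysis engine as additional input features, so that it does not have to rediscover them from raw traffic on its own.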
Integration also benefits the security team, because integrated IDS data increases efficiency. Not only does the team spend less time training the system (see above), but it also gets more accurate results, which lowers the risk of alert fatigue. As a result, alerts that actually matter are less likely to be missed.
In summary, Machine Intelligence technology, despite what its detractors suggest, is here to stay. Though not all providers may be using its full capabilities, its potential is too great, and its benefits in terms of detection of advanced threats too tangible, for it to be given up. But it can be improved. An integrated approach, featuring several different types of input and analysis, helps to streamline Machine Intelligence data analysis, making it more effective and improving the functionality of the integrated tools. This means more effective and more efficient network security, and more family time for security analysts.