Machine learning in cyber security: It’s only just starting

Machine learning has become one of the biggest buzzwords in cyber security, with almost every maker of a product in this sector touting it as part of their detection capability.

A recent example turned up last week in a Microsoft blog post describing how, in May, Windows Defender spotted something suspicious in a small-scale email campaign purportedly from a landscaping business in Calgary. Microsoft’s machine learning systems stopped the mail, which asked targeted victims to review an attached PDF document.

It turned out the mail came from a spoofed address impersonating the landscaping business. More than that, it was a spear-phishing campaign aimed at around 80 people or firms.

When opened, the PDF presented itself as a so-called “secure document” that required the victim to click on a link to a malicious website with a sign-in screen asking for enterprise credentials.

“We stopped it from the very first encounter with the file with our client-side machine models and the cloud-based machine learning models, even though we had never seen the attack pattern before,” blog co-author Geoff McDonald, the Vancouver-based cloud machine learning architect for Windows Defender, said in an interview Thursday.

Microsoft didn’t identify the Calgary company, so it couldn’t be interviewed.

Typically in a spear-phishing campaign these days an attacker crafts a slightly different script within the malicious code for each target so traditional anti-virus/anti-malware software won’t catch it, McDonald said. It’s one of the ways attackers are getting better at obfuscating their work. “This makes it really difficult for researchers to write signatures against it” for their AV software. “However, these patterns are really good for machine learning models, because if you were to look at the code it would be highly suspicious … A machine learning model can look at it the way a human can and clearly identify that it is hiding malicious intent.”
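
To illustrate the point McDonald makes, here is a minimal sketch in Python of why a per-target tweak defeats an exact-match signature while the overall shape of the script stays recognizable. It is not Microsoft's model: the two features, the threshold rule standing in for a trained classifier, and the sample scripts are all assumptions made up for illustration.

```python
import hashlib
import re


def sha256_signature(script: str) -> str:
    """Exact-match signature: any change to the script changes the hash."""
    return hashlib.sha256(script.encode("utf-8")).hexdigest()


def obfuscation_features(script: str) -> dict:
    """Hypothetical structural features that a model could be trained on."""
    tokens = re.findall(r"[A-Za-z0-9+/=]+", script)
    return {
        # Length of the longest unbroken run of base64-style characters.
        "longest_token": max((len(t) for t in tokens), default=0),
        # 1 if the script decodes or executes strings at run time.
        "dynamic_eval": int(bool(re.search(r"\b(eval|unescape|fromCharCode)\b", script))),
    }


def looks_obfuscated(script: str) -> bool:
    """Hand-written rule standing in for a trained classifier (assumption)."""
    features = obfuscation_features(script)
    return features["dynamic_eval"] == 1 and features["longest_token"] >= 40


# Two made-up variants of one lure: same structure, different names and payload.
variant_a = 'var k="' + "QUJD" * 15 + '";eval(unescape(k));'
variant_b = 'var z="' + "WFla" * 15 + '";eval(unescape(z));'
benign = "var total = price * quantity;"

# A signature written for variant_a does not match variant_b.
signature_matches = sha256_signature(variant_a) == sha256_signature(variant_b)

# The structural check flags both variants and leaves the benign script alone.
verdicts = [looks_obfuscated(s) for s in (variant_a, variant_b, benign)]
```

In a real product the rule in `looks_obfuscated` would be replaced by a model trained on many labelled scripts; the sketch only shows why features that describe structure survive changes that break a hash.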

“Machine learning plays a really important role in modern protection against malware attacks,” said McDonald. “Conventional signature technology really isn’t enough to protect against attack these days because attackers are too sophisticated.”

There is a wide variety of machine learning models being created and deployed by cyber security companies. In the Calgary incident, for example, Microsoft’s script-based model did the detection. The company also has models for other threats, such as executable and post-breach detection.

Almost every security vendor already has incorporated or plans to incorporate specialized classes of machine-learning algorithms into their solutions to focus on specific problem domains, noted Forrester Research cyber security analyst Merritt Maxim. The technology is also being used to help develop an optimal remediation response based on assessing previous security incidents to ensure that the resulting response is quick, effective, and user-friendly.

In identity and access management, machine learning and artificial intelligence are being used to scan groups of users, their roles, entitlements, and actual access to assess whether users’ access is consistent with their job role. For risk-based authentication, machine learning algorithms, coupled with the ability to collect and use risk scoring for a wide variety of contextual attribute data (for example, clickstream analytics and GPS or location data from mobile devices) and information about real-time activity, enable security administrators to fine-tune such models in real time, he added.
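
As a rough sketch of what risk-based authentication can look like in code, the Python example below turns a few contextual attributes into a risk score and then into a login decision. Everything in it is hypothetical: the attribute names, the weights, the 500 km and 6 a.m. cut-offs, and the decision thresholds are illustrative assumptions, not any vendor's implementation. In practice the weights would be learned from historical logins rather than written by hand.

```python
import math
from dataclasses import dataclass


@dataclass
class LoginContext:
    known_device: bool             # has this user logged in from this device before?
    km_from_usual_location: float  # distance from the user's usual location
    hour_of_day: int               # 0-23, local time


# Hypothetical weights and bias; a real system would fit these to data.
WEIGHTS = {"new_device": 2.0, "far_away": 2.5, "odd_hour": 1.0}
BIAS = -3.0


def risk_score(ctx: LoginContext) -> float:
    """Logistic score between 0 and 1; higher means riskier."""
    z = BIAS
    if not ctx.known_device:
        z += WEIGHTS["new_device"]
    if ctx.km_from_usual_location > 500:
        z += WEIGHTS["far_away"]
    if ctx.hour_of_day < 6:
        z += WEIGHTS["odd_hour"]
    return 1.0 / (1.0 + math.exp(-z))


def decide(ctx: LoginContext, step_up: float = 0.5, deny: float = 0.9) -> str:
    """Map the score to an action; the thresholds are the tunable part."""
    score = risk_score(ctx)
    if score >= deny:
        return "deny"
    if score >= step_up:
        return "step_up"  # for example, ask for a second factor
    return "allow"


routine = decide(LoginContext(known_device=True, km_from_usual_location=3.0, hour_of_day=10))
unusual = decide(LoginContext(known_device=False, km_from_usual_location=900.0, hour_of_day=14))
alarming = decide(LoginContext(known_device=False, km_from_usual_location=900.0, hour_of_day=3))
```

The `step_up` and `deny` thresholds are the kind of knob an administrator could adjust as conditions change, which is one reading of the real-time fine-tuning Maxim describes.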

Still can’t look at processes

However, Gartner cyber security analytics analyst Avivah Litan cautioned that machine learning is still in its early years. The technology is restricted to looking at suspicious files or network behavior, she pointed out. Few machine learning models can be used to scrutinize processes such as what is running in memory, she said.

Machine learning has become a commodity technology for cyber security providers, she said. “It definitely helps, but you can’t only rely on machine learning [alone] because it doesn’t catch everything. And the machine learning for endpoint security is only catching files that are suspicious. If a bad guy logs into a machine directly and starts writing routines in PowerShell, machine learning isn’t going to see that.”

And, she added, vendors still have to deal with a lot of false positives that machine learning – like any defensive system – can generate.

“Certainly we have only just begun” to exploit the potential of machine learning, Litan said. Its more advanced brother, artificial intelligence, “is one of the most disruptive forces we’ll see in the next decade,” she added.

Howard Solomon
Currently a freelance writer, I'm the former editor of and Computing Canada. An IT journalist since 1997, I've written for several of ITWC's sister publications including and Computer Dealer News. Before that I was a staff reporter at the Calgary Herald and the Brampton (Ont.) Daily Times. I can be reached at hsolomon [@]
