Machine learning – effective in spotting malicious sites, CERT-RO says

Machine learning algorithms help Romania’s CERT (computer emergency response team) identify phishing and malicious websites to protect Romanian Internet users, Andrei Bozeanu, Security Consultant for CERT-RO, said at the 2016 DefCamp conference taking place in Bucharest.

The problem

CERT-RO focuses on monitoring the security of Romanian websites in the face of drive-by attacks. However, the high number of sites (372,000) and the proliferation of complex cyber-threats such as ransomware and phishing make the job increasingly difficult.

Furthermore, JavaScript-based malware is expertly dodging antivirus detection, using sophisticated obfuscation and evasion techniques. Additionally, the current solution for identifying malicious links – high-interaction client honeypots – is no longer effective.

“Client honeypots seem like a great solution, but they are time-consuming, detectable and easily beatable by malicious JavaScript files,” Bozeanu says. “If you were to analyze every page in a honey client, it would take up to 5.89 years for a single iteration.”

The solution

The CERT-RO team performed an experiment and analyzed 40 malicious JavaScript (JS) files found in the wild. They searched HTML files for suspicious JS, namely code that silently redirected users to an exploit site, was obfuscated, or executed selectively or after a delay. The team also looked for unusually large content.

After scrutinizing files and URLs, they compiled a set of heuristics and developed a system based on the Random Forests machine learning algorithm. Multiple agents analyzed every URL, along with its name and score, and sent the results to an anomaly detection engine.
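
To make the heuristics above more concrete, here is a minimal Python sketch of how such signals might be turned into numeric features for one page. It is not CERT-RO's code: the regular expressions, the `extract_features` function and the `LARGE_CONTENT_BYTES` threshold are all assumptions made for illustration, and real malicious scripts are far more varied than these patterns.

```python
import re

# Hypothetical threshold: the article does not say what counts as "unusually large".
LARGE_CONTENT_BYTES = 500_000

# Illustrative patterns only, chosen to mirror the heuristics named in the article.
SCRIPT_RE = re.compile(r"<script\b[^>]*>(.*?)</script>", re.I | re.S)
REDIRECT_RE = re.compile(r"\blocation(\.href)?\s*=|\blocation\.replace\s*\(", re.I)
OBFUSCATION_RE = re.compile(r"\beval\s*\(|\bunescape\s*\(|String\.fromCharCode|\batob\s*\(", re.I)
DELAY_RE = re.compile(r"\bset(Timeout|Interval)\s*\(", re.I)
SELECTIVE_RE = re.compile(r"navigator\.userAgent|document\.referrer", re.I)


def extract_features(html: str) -> dict:
    """Return one 0/1 flag per heuristic, based on the page's inline scripts."""
    scripts = "\n".join(SCRIPT_RE.findall(html))
    return {
        "redirect": int(bool(REDIRECT_RE.search(scripts))),      # silent redirect
        "obfuscated": int(bool(OBFUSCATION_RE.search(scripts))),  # common decoding calls
        "delayed": int(bool(DELAY_RE.search(scripts))),          # timer-based execution
        "selective": int(bool(SELECTIVE_RE.search(scripts))),    # visitor fingerprinting
        "large_content": int(len(html.encode("utf-8")) > LARGE_CONTENT_BYTES),
    }


if __name__ == "__main__":
    page = (
        "<html><body>"
        '<script>setTimeout(function () { window.location = "http://example.invalid/"; }, 3000);</script>'
        '<script>eval(unescape("%61%6c"));</script>'
        "</body></html>"
    )
    print(extract_features(page))
```

Flags like these are cheap to compute compared with loading every page in a honeypot, which is the trade-off the article describes. A production system would likely also need to handle external script files and far subtler obfuscation.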

“We chose this algorithm because it’s one of the most successful in terms of capturing nonlinearities and feature interactions,” Bozeanu added. “With this method, the same dataset can be fed into a Regressor and a Classifier, and its ease of use and high performance make it perfect for the task.”
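
For readers unfamiliar with the algorithm, the sketch below shows the core idea of a random forest in plain Python: many small decision trees, each trained on a bootstrap sample with a random subset of features per split, whose outputs are averaged. It is a deliberately simplified stand-in, not CERT-RO's system. The toy data, feature meanings and function names are invented for illustration, and real work would normally use a library implementation such as scikit-learn's `RandomForestClassifier` and `RandomForestRegressor`.

```python
import random

# Invented toy data: [obfuscation hits, redirects, script size in KB] per page.
ROWS = [
    [0, 0, 12], [0, 1, 30], [1, 0, 25], [0, 0, 8], [1, 1, 40],         # benign
    [4, 3, 300], [6, 2, 450], [5, 4, 380], [7, 3, 520], [4, 5, 610],   # malicious
]
LABELS = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Grow one tree; each split considers a random subset of the features."""
    if depth == max_depth or len(set(labels)) == 1:
        return sum(labels) / len(labels)  # leaf: fraction of malicious samples
    n_features = len(rows[0])
    candidates = rng.sample(range(n_features), max(1, int(n_features ** 0.5)))
    best = None
    for f in candidates:
        for t in sorted({r[f] for r in rows})[:-1]:  # skip max so both sides are non-empty
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:  # no candidate feature varies within this node
        return sum(labels) / len(labels)
    _, f, t = best
    li = [i for i, r in enumerate(rows) if r[f] <= t]
    ri = [i for i, r in enumerate(rows) if r[f] > t]
    return (
        f, t,
        build_tree([rows[i] for i in li], [labels[i] for i in li], rng, depth + 1, max_depth),
        build_tree([rows[i] for i in ri], [labels[i] for i in ri], rng, depth + 1, max_depth),
    )


def tree_score(node, row):
    """Walk one tree from the root to a leaf and return the leaf's score."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the data."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def predict_score(forest, row):
    """Regression-style output: mean leaf score across trees, between 0 and 1."""
    return sum(tree_score(tree, row) for tree in forest) / len(forest)


def predict_label(forest, row):
    """Classification-style output: threshold the averaged score at 0.5."""
    return int(predict_score(forest, row) >= 0.5)


if __name__ == "__main__":
    forest = fit_forest(ROWS, LABELS)
    print(predict_score(forest, [5, 3, 400]), predict_label(forest, [5, 3, 400]))
```

The two prediction functions loosely mirror the point in the quote: the same training data can yield either a continuous maliciousness score or a yes/no label, depending on how the trees' outputs are combined.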

Bozeanu said this was a test, but CERT-RO will continue improving detection scores with the help of machine learning technologies.

About the author

Alexandra GHEORGHE

Alexandra started writing about IT at the dawn of the decade - when an iPad was an eye-injury patch, we were minus Google+ and we all had Jobs. She has since wielded her background in PR and marketing communications to translate binary code to colorful stories that have been known to wear out readers' mouse scrolls. Alexandra is also a social media enthusiast who 'likes' only what she likes and LOLs only when she laughs out loud.