Carnegie Mellon University

May 30, 2019

Preventing exposure to malicious websites

By Daniel Tkacik

Meet Bob. Like many Americans, Bob likes watching sports on TV. Since the game he wants to watch isn’t being broadcast locally, Bob searches for a free online stream of the game. He finds one eventually, but only after clicking through several ad-heavy pages written in different languages.

Over the next few days, Bob loses access to his email and money goes missing from his bank account. It turns out that Bob visited a malicious website while searching for a free stream of the game. This website downloaded malware onto his computer, allowing attackers to steal the credentials for Bob’s accounts.

A new study out of CyLab aims to help prevent anyone from falling into the same trap as Bob.

“Most traditional security defenses are reactive, and warn users only after or at the time they’ve visited a malicious website,” says Mahmood Sharif, a Ph.D. student in Electrical and Computer Engineering. “We wanted to figure out: are there hints about a user’s behavior that could tell us when something bad is going to happen before it happens?”

Sharif recently presented the study at the ACM Conference on Computer and Communications Security in Toronto.

The team evaluated three months’ worth of web traffic generated by over 20,000 mobile device users in 2017. The data was collected with users’ consent, with help from collaborators at the research arm of KDDI, a large Japanese cellular provider.

In their analysis, a website was marked as “malicious” if it appeared on the Google Safe Browsing blacklist, a constantly updated list of unsafe websites and web resources, such as phishing or deceptive sites and sites that host malware.

“Out of all the users that we observed, about 11 percent were exposed to malicious websites,” Sharif says. “But out of the many browsing sessions, only 1 out of 1,000 sessions were exposed, on average.”

The researchers then combed through the data in search of behavioral differences between users who had been exposed versus users who hadn’t. They found, for example, that exposed users visited pages with more ads and browsed the web more at night than unexposed users.

Based on their findings, they identified three feature types that could help predict whether a user would be exposed: contextual features (e.g., number of links clicked, session length, time of day), past behavior (e.g., average links clicked per session, whether the user had been exposed before), and self-reported behavior collected via a survey (e.g., whether the user runs anti-virus software, whether the user has had previous online security incidents). The team tested the predictive system and found it can accurately predict exposure seconds before it occurs.
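To make the three feature types concrete, here is a minimal Python sketch of how they might be combined into a single risk score for a browsing session. It is an illustration only, not the researchers' system: the article does not describe their model, so the logistic scoring function, the `SessionFeatures` fields, the weights, the bias, and the definition of "night" (10 p.m. to 6 a.m.) are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class SessionFeatures:
    # Contextual features (describe the current session)
    links_clicked: int
    session_minutes: float
    hour_of_day: int  # 0-23, local time
    # Past behavior
    avg_links_per_session: float
    previously_exposed: bool
    # Self-reported behavior (survey answers)
    runs_antivirus: bool
    had_prior_incident: bool


def to_vector(f: SessionFeatures) -> List[float]:
    """Flatten the three feature groups into one numeric vector."""
    # Assumption: "night" means 22:00-05:59; the article gives no definition.
    is_night = 1.0 if (f.hour_of_day >= 22 or f.hour_of_day < 6) else 0.0
    return [
        float(f.links_clicked),
        f.session_minutes,
        is_night,
        f.avg_links_per_session,
        float(f.previously_exposed),
        float(f.runs_antivirus),
        float(f.had_prior_incident),
    ]


# Hypothetical weights and bias, chosen by hand for illustration.
# A real system would learn these from labeled browsing sessions.
WEIGHTS = [0.05, 0.01, 0.8, 0.03, 1.5, -0.5, 0.7]
BIAS = -6.0


def exposure_score(f: SessionFeatures) -> float:
    """Return a value in (0, 1); higher means exposure looks more likely."""
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, to_vector(f)))
    return 1.0 / (1.0 + math.exp(-z))
```

With these made-up weights, a long late-night session by a previously exposed user scores higher than a short daytime session by a user who runs anti-virus software. A score like this could be compared against a threshold to decide when to warn a user, though how the actual system makes that decision is not covered in the article.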

“Our system was even able to detect malicious webpages before they had been added to blacklists,” Sharif says. “Now we can use the predictions to proactively protect users, thus adding a complementary line of defense to the existing reactive defenses.”

Other authors on the study include Nicolas Christin, a professor in the Institute for Software Research and in Engineering and Public Policy, and KDDI researchers Jumpei Urakawa, Ayumu Kubota, and Akira Yamada.

