Alan Mislove on Measurement, Security, and Privacy

Alan Mislove, Rice CS alumnus and Northeastern CS associate professor

Rice University CS alumnus Alan Mislove is the Associate Dean and Director for Undergraduate Programs in the College of Computer and Information Science at Northeastern University.

 

“Math is sort of the family business,” said Alan Mislove, who had planned to follow in the footsteps of his father, a math professor at Tulane University.

While pursuing his undergraduate math degree at Rice, Mislove took computer science courses on the side. He then became a double major in math and CS, but remained focused on graduate programs in math.

Mislove said, “I persisted with that plan until the end of the fall semester of my senior year. I had just completed Peter Druschel’s class on peer-to-peer systems. The idea was very new, and file-sharing systems like Napster were all the rage.

“Structured peer-to-peer networks were something Peter was researching and I really, really liked it. Peter encouraged me to apply for a Ph.D. in CS and work with him. Despite getting into several math programs, I accepted Rice’s CS Ph.D. offer to stay and work with Peter.”

With the change in focus came more career options, but he quickly realized industry was not his path. He said, “One year I was hired as a summer intern and the day after I showed up for work, about a third of the engineers were fired. I became somewhat disillusioned with industry.”

But Mislove’s path through academia would also hold a surprising twist. Three years after he began his Ph.D. program at Rice, his mentor left the university.

Mislove said, “Peter left Rice to be the founding director of the new Max Planck Institute in Germany. Most of his postdocs and Ph.D. students moved with him. We were still Rice students, but we lived in Germany and spent the rest of our time working remotely from there.”

Druschel’s style of mentoring combined with his ease of building personal relationships across his research team greatly influenced Mislove’s decision to move to Germany. But he said the primary driver was his fascination with peer-to-peer systems and the opportunity to continue working on them with Druschel.

Mislove described the difference between traditional distributed systems and peer-to-peer systems. He said traditional distributed systems are typically built using a client-server architecture, in which a client (such as your web browser) connects to a server and asks it for service.

“It is easy to develop,” said Mislove. “You just have to make sure the server stays up and keeps serving up data. But that limits scalability. A server can get overloaded. It is a single point of failure that hackers easily target. For example, the Web is still largely built that way and we still have attacks.

“No one machine is special in peer-to-peer systems. Everyone is a client and everyone is a server. Anyone can fail or multiple machines can fail and the system keeps on working. That is a lot more complicated to build.”

He said when he began his research, peer-to-peer networks were just being developed and Druschel was exploring ways to make such systems reliable.

When a browser requests a file, it is either served up by the server or it doesn’t exist, prompting the recognizable 404 Not Found error message. But with thousands of devices in a peer-to-peer system, how could any one of the machines be confident that a file did not exist?
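One way structured overlays can approach this question is to assign every key to exactly one responsible node, so that a single node can answer "not found" with authority. The sketch below illustrates that idea with consistent hashing and successor assignment on an identifier ring. It is a simplified illustration, not the design used by Druschel's group: the `Overlay` class, the node names, and the 32-bit ring size are all hypothetical, and routing between nodes, node failures, and replication are left out.

```python
import hashlib
from bisect import bisect_left


def key_id(name: str) -> int:
    """Hash a name onto a 32-bit identifier ring (ring size is an illustrative choice)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Overlay:
    """Toy overlay: every node and every file name hashes onto the same ring."""

    def __init__(self, node_names):
        # Sorted (identifier, node name) pairs form the ring.
        self.ring = sorted((key_id(n), n) for n in node_names)
        # Each node keeps its own local store of files.
        self.store = {name: {} for name in node_names}

    def responsible_node(self, filename: str) -> str:
        """Return the first node at or after the key on the ring, wrapping around."""
        ids = [node_id for node_id, _ in self.ring]
        idx = bisect_left(ids, key_id(filename)) % len(self.ring)
        return self.ring[idx][1]

    def put(self, filename: str, data: bytes) -> None:
        self.store[self.responsible_node(filename)][filename] = data

    def get(self, filename: str):
        # Only the responsible node is consulted. If it has no entry,
        # the file is not in the system (in this failure-free sketch).
        return self.store[self.responsible_node(filename)].get(filename)


overlay = Overlay(["node-a", "node-b", "node-c", "node-d"])
overlay.put("song.mp3", b"data")
found = overlay.get("song.mp3")        # b"data"
missing = overlay.get("missing.txt")   # None
```

Because each file name maps to one node, a lookup never has to poll thousands of machines. A real system would also need to replicate data and handle nodes joining or leaving, which is where most of the complexity lies.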

“Napster was a file-sharing system, not really a peer-to-peer system, but it showed that scalability was possible. My first project during my graduate studies was building a peer-to-peer email system. I wanted to work on a system that people actually relied on, not just music sharing,” said Mislove.

So he focused on email, which people were already using as a critical part of their day-to-day work. He said, “I wanted to prove a reliable application could be built as a peer-to-peer system, and most of our research group used it for about six months. Peter was using it as his primary email system. That was a stressful time for me. Imagine what would happen if my adviser didn’t get his email!”

From peer-to-peer systems, Mislove moved on to social networks. He said he had originally signed up for a Facebook account in 2004 at Rice, when it was still restricted to certain universities. Later, Facebook opened up to the public and began growing in popularity.

For researchers, these early social networks provided an easily accessible and rich data source that was previously unavailable. “For the first time, you could view social processes at scale. Sociologists had been relying on manual ways of collecting data, or proxies like phone call records, to find out who was friends with whom. But these online networks were capturing that data and we viewed it as an opportunity to look at those at scale,” said Mislove.

“In our study, we crawled four large networks: YouTube, Flickr, LiveJournal, and Facebook. We could see who was friends with whom. We looked at group structure and found similarities even though the underlying structures or systems were different.”

As part of Druschel’s team, Mislove was one of the first researchers to focus on the structure of large-scale social networks. Since that time, he and other researchers have done much more in terms of measurement and learning how people become friends. Lately, Mislove’s own research has turned to security and privacy issues.

“In a paper we presented in February, we were looking at how the Facebook ad system has matured. It has become a powerful ad service because it allows advertisers to target users with specific attributes, which can be used for both good and malicious purposes,” said Mislove.

He explained that the same attribute selection process that helps a bar or restaurant target ads to ‘Bostonians in their 30s and following a favorite local band’ can easily be exploited in a discriminatory way, for example by excluding people of a certain ethnicity from seeing housing ads.

The Fair Housing Act makes it illegal to place housing ads that discriminate on the basis of ethnicity. When ProPublica recently discovered a real-world case of this, Facebook apologized for the unintended consequences of its ad tools and removed ethnicity-related attributes for housing ads.

In a follow-up paper, Mislove and his collaborators showed that advertisers on Facebook could still use a variety of other targeting features to exclude groups of people with sensitive attributes. But he is also concerned about a Facebook feature for custom audiences.

He said, “With custom audiences, I can literally upload email addresses, phone numbers, whatever I’ve got. Then Facebook matches the data against their user database and they let you advertise to just those people.

“Given recent political events, this is obviously a very powerful tool. How could it be abused? What if you uploaded PII (personally identifiable information) like voter records? A number of states disclose race in their voter records; many states also make criminal records freely available. The custom audience feature allows you to discriminate in such a way that it is difficult for Facebook to determine that you are discriminating.”
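The matching step Mislove describes can be sketched roughly as follows. This is a hypothetical illustration, not Facebook's actual implementation: the function names, the email-only matching, the normalization rules, and the use of SHA-256 hashing are all assumptions made for the example.

```python
import hashlib


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so equivalent addresses compare equal (assumed rule)."""
    return email.strip().lower()


def hash_identifier(value: str) -> str:
    """Hash an identifier so raw values need not be compared directly (assumed scheme)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def match_audience(uploaded_emails, platform_users):
    """Return the set of user ids whose email appears in the uploaded list.

    platform_users is a hypothetical dict mapping user id -> email address.
    """
    index = {
        hash_identifier(normalize_email(email)): user_id
        for user_id, email in platform_users.items()
    }
    hashed_upload = {hash_identifier(normalize_email(e)) for e in uploaded_emails}
    return {index[h] for h in hashed_upload if h in index}


users = {1: "alice@example.com", 2: "bob@example.com"}
audience = match_audience([" Alice@Example.com ", "nobody@example.com"], users)
# audience == {1}
```

Note that the platform in this sketch only ever sees a list of identifiers. Nothing in the upload reveals how the advertiser chose those people, which is why selection based on outside records such as voter files is hard to detect.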

His research team also explored the information leaks that occur inside the advertising interface. Mislove said they showed how the interface could be abused to infer user phone numbers and reported it to Facebook as a security issue.

“Facebook fixed the issue, but this highlights how careful we need to be with these powerful tools,” said Mislove.

As fascinating as his research into social networks and privacy can be, Mislove still prefers the time he spends with students.

He said, “I love working with students, teaching, and mentoring Ph.D. students. What I really enjoy is when a student drives a project. With our recent Facebook research, the student really took charge.

“He was so excited, it was infectious. Working on projects with students who also find them interesting – that’s the most rewarding part of my day.”