Apollo Breach and Dangers of Public Information

Matti Suominen

October 8, 2018 at 09:39

 

On Friday, October 5th, Wired broke the story of a data breach at a sales intelligence company called Apollo. The company collected data from various sources – many of them public – and connected it all together to create profiles of people. The purpose of its platform was to enable sales by helping salespeople target the right stakeholders with the right message.

This case is one in a long line of breaches at various sites that have exposed massive amounts of data to attackers. In this case, the company claims to have profiles for 200 million people in 10 million companies. Much of the information was public, but it wasn’t previously available in one database where all the necessary analysis work to combine it had already been done. There are also interesting aspects to this story that relate to integrations with various platforms, which may have ended up populating the database with information that had no place being shared.

Much has already been written about the event and no doubt more will be written over the coming days. The incident comes at a time when the recent Facebook breach, currently estimated to have affected closer to 100 million people, is only a couple of weeks old. We now live in a world where such events appear to be part of the weekly news cycle. If you are a reasonably active person on social media and use various digital services, odds are good that you’ve already been affected by several of these events. My own personal e-mail has been involved in two major breaches and I barely use it.

If you are concerned or just curious, you can use the following site (https://haveibeenpwned.com) to check if your e-mail was included in this breach (or in any of the many previous ones). If you find yourself there, maybe as part of one of the previous breaches (e.g. the LinkedIn breach a few years ago), now is a fine time to change your passwords.

 

[Image: Have I Been Pwned]
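If you would rather script the check than use the web form, the site also offers an API. The following is only a minimal sketch, assuming the v3 `breachedaccount` endpoint, which to my knowledge requires an API key sent in a `hibp-api-key` header along with a user agent, and answers with HTTP 404 when the address is not found in any breach. The function names and the user-agent string are my own and purely illustrative; check the current API documentation before relying on any of it.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

# Assumed endpoint of the Have I Been Pwned v3 API (needs an API key).
API_BASE = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_request(email, api_key):
    """Build the HTTP request for one e-mail address (no network access yet)."""
    url = API_BASE + urllib.parse.quote(email.strip(), safe="")
    headers = {
        "hibp-api-key": api_key,
        # Hypothetical user-agent string; the API expects one to be set.
        "user-agent": "breach-check-sketch",
    }
    return urllib.request.Request(url, headers=headers)


def breach_names(body):
    """Extract the breach names from the JSON array the API returns."""
    return [entry["Name"] for entry in json.loads(body)]


def check_email(email, api_key):
    """Return the names of known breaches containing this address."""
    try:
        with urllib.request.urlopen(build_request(email, api_key), timeout=10) as resp:
            return breach_names(resp.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        if err.code == 404:  # address not found in any known breach
            return []
        raise
```

Calling `check_email("you@example.com", "YOUR-API-KEY")` with a real key should return a list of breach names, or an empty list if the address is not known to the service.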

 

As stated, no doubt a lot will be written about this incident and the ones that will follow. Instead of focusing on the incident itself, let’s think for a moment about what really happened and why it’s a bad thing. Much of the information in the database was public. You may have added it to your LinkedIn profile to create your CV. Maybe you included some of it in your Facebook profile. Or maybe you used services that collect information about your work history, hobbies and so forth. Alone, these bits of information are not that interesting. I’m happy to tell you my e-mail address (it’s not a secret) and where I work (not a secret either). I would be happy to share my CV with you. However, as you start putting more and more of these pieces together, the combination may reveal more about me than I initially thought.
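To make the point concrete, here is a small sketch of how little work the combining step can take once each source uses the same identifier. It is not a description of how Apollo or anyone else actually does it; the function name, the field names and the idea of matching on a normalized e-mail address are all assumptions made for illustration.

```python
from collections import defaultdict


def build_profiles(*sources):
    """Merge records from several sources into one profile per e-mail address.

    Each source is a list of dicts that contain an "email" key plus any
    other fields. The first value seen for a field wins.
    """
    profiles = defaultdict(dict)
    for records in sources:
        for record in records:
            # Normalize the address so trivial differences still match.
            key = record["email"].strip().lower()
            for field, value in record.items():
                if field != "email":
                    profiles[key].setdefault(field, value)
    return dict(profiles)


# Entirely made-up example data from two hypothetical public sources.
professional_network = [
    {"email": "Jane.Doe@example.com", "employer": "Acme Oy", "title": "Architect"},
]
hobby_forum = [
    {"email": "jane.doe@example.com ", "hobby": "sailing", "city": "Helsinki"},
]

profiles = build_profiles(professional_network, hobby_forum)
print(profiles["jane.doe@example.com"])
```

Neither source says much on its own, but the merged profile already links an employer, a job title, a hobby and a city to one address.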

Thought experiment time. Imagine you get an e-mail that starts like this.

[Image: example of a generic scam e-mail]

 

That’s an obvious scam, right? It doesn’t take much to figure that out. But why is it so easy to tell? Because none of it makes sense in context. First, the sender doesn’t even know your gender, meaning that they sent this to a random person they don’t know. Second, you have no connection to Nigeria, you don’t know any princes there, and there is no reason they would contact you as if you were someone they trusted. The story falls apart immediately if you think about it.

As a side note: many scams are intentionally poorly written and implausible so that the scammer can filter out people who would be unlikely to fall for them. Don’t mistake it for poor execution on their part.

 

How do we make this scam better? By making it more relevant and believable. What if you knew much more about me and could easily adjust some facts to make the message more likely to be true? Imagine, for example, that you claimed to be a person from a company I used to work for. You could easily fabricate details about when we used to work together and how your need relates to my position there. You could name someone I knew who also worked there at the time. You could refer to the location where we worked. All of this is public information if you go through my LinkedIn profile, my contacts and my work history. It would make for a much better scam. Best of all, you could likely do most of this automatically – just create a few templates for the message and let the system fill in the blanks based on work history and other information about the person.

Let’s compare the previous e-mail to this one:

[Image: example of a targeted scam e-mail]

 

It’s instantly a better scam, one that tries to get information from me about IT systems I might know something about. Maybe I would reply with the names of people I know who could tell more about it. Maybe I would just talk a bit about what I remember. All this comes from taking some simple details that are public knowledge and using them to weave a good story that compels me to help an old buddy – a fake one, mind you – out of a tough situation.

The problem is, we can’t entirely predict how information about us can be combined to create new information that we would not want others to know. A few years ago, the Cambridge Analytica case regarding abuse of Facebook data made headlines because the company found innovative ways to exploit the data it had collected. It wasn’t obvious at first glance that by knowing all sorts of information about people, one could determine their political leaning and target the right people to win over votes. Most people would not mind sharing such basic information, but they might very well object to what ultimately happened.

It does not look like the pace of data breaches will be slowing down anytime soon. The problem is tough to solve, as we must trust every single company that holds our information to keep it safe. Worse yet, we must also trust companies that just went and collected our information from other sources to do the same. I wish I had a solution to offer that would magically make the problem go away. Unfortunately, there is no one simple thing you can implement by yourself. Different stakeholders, ranging from legislative bodies to companies and down to individual developers, are coming up with different ways to address the issue.

In the meantime, be safe out there and make sure you do your part by following security best practices. And really, go check if you were affected and change your passwords for good measure if so. Maybe also turn on multi-factor authentication (MFA) for the services you use while you are at it.
