Friday, August 4, 2017

What is Dark Data?

Consumers rarely take the necessary precautions when it comes to privacy and information security. A paper presented at DEF CON 2017 covered this trend in painstaking detail. The researchers have dubbed it “dark data,” and it can easily be used to cause people all kinds of harm. The trend is all the more disturbing considering how criminals operate to exploit dark data.

Dark Data Can Identify Internet Users With Ease

Most consumers take a lackluster approach to keeping their information safe, and that will cause all kinds of problems over time. Leaving digital breadcrumbs for third parties to exploit is never a good idea, yet few people appear to know what they can do to counter the problem. The research paper on dark data paints an interesting, if worrying, picture.

The researchers note that the vast majority of internet users leave a significant amount of data exposed to criminals who are waiting to abuse it. Collecting data about internet users has proven to be incredibly easy, and the problem appears to have only grown worse over time. Anything you view online can become a threat to your privacy, which is disheartening to think about.

The paper outlines how the team set up a bogus online marketing consultancy firm. The site looked quite professional, which goes to show how easily people are tricked into believing a website has only honest intentions. The team then looked for web analytics companies providing clickstream data. Such data includes the list of websites a user visits, the order in which they were visited, and the URLs of the pages accessed on each site. It is a powerful tool used by retailers and other service providers.
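To make the idea of clickstream data concrete, here is a minimal Python sketch of what such a record might look like. The field names (pseudonym, timestamp, url) and the helper function are assumptions made for illustration; the article does not describe the actual format any analytics company uses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Visit:
    """One page view in a clickstream (hypothetical layout)."""
    pseudonym: str   # opaque identifier, not a real name
    timestamp: int   # seconds since some reference point
    url: str         # full URL of the page that was accessed


def ordered_urls_by_user(visits):
    """Group visits by pseudonym and list each user's URLs in visiting order."""
    grouped = defaultdict(list)
    for visit in sorted(visits, key=lambda v: v.timestamp):
        grouped[visit.pseudonym].append(visit.url)
    return dict(grouped)
```

Even without a name attached, the ordered list of URLs per pseudonym is the raw material the rest of the article talks about.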

Signing up for a free web analytics trial can give malicious entities access to clickstream data for several million users. Some companies specialize in collecting user data, and they can be taken advantage of to a certain degree. Because no real names are included, the data should be harmless more often than not, but that is not necessarily the case. Anyone who knows what to look for can use such information to identify internet users.

Marketing companies and web analytics service providers tend to be very meticulous when it comes to gathering information. Not only do they note when people visit a particular website, but they also collect information about user behavior. For example, when someone searches for an item on your website, do they stick around or leave the platform? Even though no real name is attached to these behavioral profiles, people cannot easily change their online behavior. Given enough data points to build a profile, it simply becomes a matter of matching that behavior to an actual internet user.
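The matching step described above can be sketched in a few lines of Python. This is not the researchers' method, which the article does not detail; it is an illustrative assumption that compares the set of URLs in an anonymous profile with the links each known person has shared publicly, using Jaccard similarity. The function names and sample data are hypothetical.

```python
def jaccard(a, b):
    """Share of items the two collections have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_match(anonymous_urls, public_profiles):
    """Return (name, score) for the public profile closest to the anonymous one.

    public_profiles maps a name to the URLs that person has shared publicly.
    Returns None when no profile overlaps at all.
    """
    scored = [
        (jaccard(anonymous_urls, urls), name)
        for name, urls in public_profiles.items()
    ]
    if not scored:
        return None
    score, name = max(scored)
    return (name, score) if score > 0 else None
```

For example, an anonymous profile containing three URLs that all appear among the four links one person has shared would score 3/4 against that person and 0 against someone with no links in common.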

Indeed, a lot of information can be extracted from the URLs someone visits, even if no name is attached. According to the paper, close to 3% of internet users can be identified from clues contained in a URL. Contrary to what one might expect, the record of a visit to an analyzed website will often include metadata from the platform the user visited just before it. If that previous site happens to be a social media platform, the web analytics firm now sees your Twitter or Facebook handle as part of that URL.
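As a rough illustration of how a handle can leak through a URL, the following Python sketch pulls the first path segment out of a social media address. The host list and the set of reserved path names are assumptions chosen for the example and are far from complete; real profile URLs vary more than this.

```python
from urllib.parse import urlparse

# Hosts where the first path segment is often a username (illustrative only).
SOCIAL_HOSTS = {"twitter.com", "facebook.com"}

# Path segments that are site features, not usernames (incomplete by design).
RESERVED = {"home", "search", "settings", "login", "explore"}


def handle_from_url(url):
    """Return (host, handle) if the URL looks like a profile page, else None."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host not in SOCIAL_HOSTS:
        return None
    segments = [s for s in parts.path.split("/") if s]
    if not segments or segments[0].lower() in RESERVED:
        return None
    return host, segments[0]
```

Run over a clickstream, a check like this would flag every record whose URL exposes a candidate account name, which is the kind of clue the paper refers to.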

from The Merkle