By Meg Eastwood, Science and Engineering Reference Librarian
Big data wants your data – how do they get it? If you’re using a service that’s freely available on the web, YOU and your data are the product used to make money and stay in business. Librarians chose the theme “Big Data is Watching You” for this year’s Choose Privacy Week (May 1st – 7th, 2018) over five months ago, but the recent Facebook Cambridge Analytica story makes this year’s theme even more timely. In a nutshell, the Cambridge Analytica story began with a personality quiz on Facebook – if you took the quiz, you gave an app called “This Is Your Digital Life” permission to access your Facebook data. The app’s developer, Aleksandr Kogan, eventually provided that data to Cambridge Analytica, a political consulting firm. Cambridge Analytica had access to this data when they helped Trump’s 2016 Presidential campaign teams buy online ads targeted at specific voters.
While you could easily spend hours reading about the Facebook Cambridge Analytica story, Choose Privacy Week is all about letting you know that the modern world is full of privacy concerns. We’re not trying to scare you off the internet completely; we just want you to understand how your data gets collected and used behind the scenes.
Let’s start with Facebook – what data do they collect about you? One user downloaded their Facebook information and discovered records of calls and texts made with their Android phone. The personality quiz that led to the Facebook Cambridge Analytica scandal isn’t unique – the internet is full of tempting-looking quizzes that privacy experts warn could be created by identity thieves looking to steal your personal information, or by big data companies looking to mine your data.
Google may have access to even more information about you than Facebook does, thanks to their suite of products including Gmail, Google+, YouTube, Chrome, and more. In 2013, Google combined their privacy policies across these services, allowing Google to pool data on your activities across all these products. A judge ruled that Google was legally allowed to do this, because the judge believed that people understand that they’re the product when they sign up for free services. Google’s analytic bots essentially have access to every word you type in all of their services, as highlighted when a glitch caused some Google Docs users to be locked out of their documents when the scanning program mistakenly decided the documents violated Google’s terms of service. Additionally, Google has more trackers embedded in websites you visit than either Facebook or Twitter, according to a study by the Princeton Web Transparency & Accountability Project.
Google and Facebook aren’t just harvesting your data – they’re putting you to work for them. Facebook users tagging themselves and friends in photos helped Facebook build facial recognition software that is far more accurate than the software used by the FBI. Every Google search you type in helps Google refine their spell checker. While I’m not usually a fan of Mashable articles because of the overwhelming number of ads, this Mashable article gives a great overview of the work you do for social media companies.
The big data privacy problem extends beyond Google and Facebook – email in general is not secure. The story of Hillary Clinton’s hacked emails dominated the news during the 2016 Presidential election – my personal favorite stories on the topic revealed John Podesta’s tips on cooking risotto. The moral of the story is that personal information you send through work email could someday become public information (or, like those who worked for Enron, your emails could become part of one of the most studied text corpora in computer science history). If you decide to take a break from email and do some shopping instead, companies are hungrily waiting to harvest data on your shopping habits to help sell you more stuff through predictive analytics. If you decide to become a Luddite and get rid of your computer, don’t forget to wipe the hard drive multiple times to prevent your personal information from being retrieved after the computer leaves your possession. Computer forensics is an evolving field that can be used to solve crimes, detect forgeries, and more.
If you’re concerned about the fact that big data is watching you, I recommend you start with this eight-day data detox kit to understand your data footprint and what you can do to protect your personal information online. Find more resources like this kit on the Choose Privacy Week tools page. And follow the #ChoosePrivacy hashtag this week to learn more!
For more details on Facebook and Cambridge Analytica, The Atlantic provides a good three-paragraph summary, and The New York Times provides a longer overview. Cambridge Analytica defends itself in this press release, and Facebook wrote a blog post to explain what information they provide to advertisers. Wccf (Where Consumers Come First) tech provided an interesting breakdown of the Facebook blog post.