One of the distractions of the Cambridge Analytica story is to be found in its name. “Cambridge Analytica” just reeks of brainpower. It sounds like a spell plucked from Harry Potter: a phrase that, once uttered by Hermione Granger, would fill a library with a flurry of pages, mostly written in ancient unreadable script and describing rituals dark and mysterious.
Associate that name with computers and data manipulation, and you have a recipe for a story that sounds as frightening as it does impenetrable. We’re in that realm of science fiction where the machines rule us.
The reality is that Cambridge Analytica (say it enough times and it does begin to lose its magic) was doing something that we all do in our everyday lives – and probably doing it far less efficiently than even the most socially awkward human. It was making judgements about people based on fairly limited data and using algorithms that are far less subtle in the way they interpret that data. The only thing that makes the Cambridge Analytica story significant is that they were doing it on such a huge scale.
The voguish word for this kind of work is “big data”. It means massive sets of data that only computers could ever hope to process. In its simplest form, it’s like asking a single human to take all the telephone directories in the world and work out the most popular surname. That would be the work of one or more lifetimes and yet would be a trivial task for a computer presented with data in electronic form. (Incidentally, if computers had anything like human intelligence, they would do what most humans would do: intuit the answer from the knowledge that China is the most populous nation and therefore the world’s most popular surname is probably “Lee” or “Wang”.)
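To show how trivial the telephone-directory task is for a machine, here is a minimal sketch. The function name and the sample entries are invented for illustration, and it assumes the surname is the last word of each entry, which is not true of every naming convention.

```python
from collections import Counter


def most_common_surname(directories):
    """Return the (surname, count) pair seen most often, or None if there is no data."""
    counts = Counter()
    for directory in directories:
        for full_name in directory:
            parts = full_name.split()
            if parts:
                # Assumption: the surname is the final word of the entry.
                counts[parts[-1].title()] += 1
    return counts.most_common(1)[0] if counts else None


# Two tiny, made-up "directories" standing in for the world's phone books.
sample_directories = [
    ["Ann Smith", "Wei Wang", "Li Wang"],
    ["John Smith", "Fang Wang", "Maria Garcia"],
]

print(most_common_surname(sample_directories))
```

The same loop would work on billions of entries; only the running time changes, not the logic.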
What computers can do, however, is go so much further with that same limited data. They could begin to unlock more subtle patterns, such as the geographic distribution of names across the globe. Cross-referenced with economic data, the same computer could calculate a person’s wealth based simply on their name. In other words: from seemingly meaningless and “shallow” data, valuable information begins to accrue.
This is why the Cambridge Analytica story is less frightening than the version the media sometimes present, but it is also more concerning. There’s a lot of talk about a “breach” of Facebook data but that’s to overlook the greater danger, which is what happens if there wasn’t a breach. What happens if this isn’t a story about the dangers posed by computers having access to all our personal data? What happens if this is a story about the dangers posed by computers having access to quite limited public data and what they can do with that?
We should, therefore, forget about computers performing a deep trawl into the minutiae of a person’s life because it’s rare (and, indeed, unlikely) for any of us to put that kind of information online. Frankly, also, there’d be no “magic” about an influence campaign that had access to our innermost thoughts. Advertisers have long sought (and in some cases found) the keys that unlock our collective urge to spend, yet we are a long way from individualised ad campaigns based on our deepest and perhaps subconscious impulses.
To think about the Cambridge Analytica problem in terms of depth is wrong. The story is much larger and infinitely more difficult to resolve. If you take the most lenient reading of their work and assume Cambridge Analytica did nothing illegal, then we’re still left with a real problem about how we protect ourselves in an online world where every action we take publicly can be reverse engineered to work out who we are and what motivates us.
Take, for example, a simple task such as examining your publicly available data on Twitter. It would be relatively trivial to write an algorithm to examine everybody in the list of the people you follow. Simply checking how many of those include the letters “FBPE” (a widespread acronym meaning “Follow Back, Pro EU”) might be a good indicator of how you voted in the EU referendum.
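A rough sketch of that check might look like the following. It does not call any real Twitter API; the list of followed profiles, the "name" and "bio" fields, and the function name are all assumptions made for illustration.

```python
def fbpe_share(followed_profiles, marker="FBPE"):
    """Return the fraction of followed accounts whose name or bio contains the marker."""
    if not followed_profiles:
        return 0.0
    needle = marker.lower()
    hits = 0
    for profile in followed_profiles:
        # Hypothetical fields: a display name and a short biography.
        text = (profile.get("name", "") + " " + profile.get("bio", "")).lower()
        if needle in text:
            hits += 1
    return hits / len(followed_profiles)


# Made-up sample of accounts that one user follows.
sample_following = [
    {"name": "Alex #FBPE", "bio": "Coffee and trains"},
    {"name": "Sam", "bio": "Remainer. fbpe. Dog owner."},
    {"name": "Local News", "bio": "Headlines from your town"},
    {"name": "Jo", "bio": "Gardening"},
]

print(fbpe_share(sample_following))
```

A high share would be a hint rather than proof of how the user voted; it is exactly the kind of shallow signal that only becomes useful when collected across very many accounts.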
That’s only the start of the kind of work we could do analysing uses of language (often a good indicator of where we live, education, and lifestyle). With enough data (all publicly available) we could begin to follow the way that memes populate the internet, working out the streams of influence but also, in the process, understanding where and how to interrupt them. Now, of course, there’s always a chance that any one person’s data doesn’t help us make these calculations but that is where the Big Data comes in. We might not be able to make an accurate prediction for every single person but, do this for a million or (even better) a billion people and the statistics begin to work in our favour and patterns emerge.
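The way scale rescues an unreliable signal can be illustrated with the standard margin-of-error formula for an estimated proportion. This is a simplified sketch that assumes a simple random sample and a 95% confidence level (z = 1.96); real social media data is messier, and the function name is invented for illustration.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error for a proportion estimated from a simple random sample."""
    return z * math.sqrt(share * (1 - share) / sample_size)


# The uncertainty around an estimated 50% share shrinks as the sample grows.
for n in (100, 10_000, 1_000_000):
    print(n, round(margin_of_error(0.5, n), 4))
```

Each hundredfold increase in the number of people cuts the uncertainty by a factor of ten, which is why a guess that is useless for one person becomes a dependable pattern across a million.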
That is why the problem is not simply limited to Cambridge Analytica. Facebook itself is a leader in this kind of technology and, in an existential sense, it is the very reason the company gives away a free service in exchange for our personal data. “Big data” is being used across most industries where people seek to gain some competitive advantage.
More worrying yet would be if it is being used by governments. In 2016, the UK government brought in the Investigatory Powers Act (aka “The Snooper’s Charter”) that required web and phone companies to keep twelve months of web browsing data for every single user. At the time it was suggested that we have nothing to fear because the data being recorded didn’t go deep enough to reveal personal details. It was simply sites visited rather than, say, articles read. That’s reassuring so long as we think of it as part of the old world of small data. With Big Data there’s so much that can be gleaned from shallow data recorded broadly.
These are new paradigms with results that are as yet unknown. We should be unsurprised if, as suggested, they have already begun to shape us, and we should also be more wary of seemingly benign technologies that offer us free services in exchange for our personal data.