Vladimir Putin, the Russian President, is doing an 80s-inspired workout right before my very eyes. A red headband covers his spherical bald head, and his groin is tightly packed into a bright blue unitard. As he stretches and flexes to the song “Sweet Dreams (Are Made of This)”, it becomes increasingly apparent that the video – sadly – is synthetically produced. The clip was made by @1facerussia, a TikTok account that focuses solely on “deepfaking” Putin. Videos include everything from Putin dancing and playing tug of war with Boris Johnson to lightsabre duelling with Joe Biden.
The improvement in the quality and sophistication of these deepfakes means it won’t be long before we struggle to discern fact from fiction, authentic from synthetic, or Putin from Jane Fonda. Take the recent viral deepfakes of Tom Cruise, for example; they may just be the most realistic yet. The videos, posted by the TikTok account @deeptomcruise, show the actor practising a golf swing and performing a magic trick with a coin, and leave you dumbfounded by the likeness.
Although the Putin and Cruise videos seem innocuous, some deepfakes are made with malicious intent rather than for comedic effect, and they raise important questions about our perception of reality. What happens when deepfakes are used to distort political outcomes? What if they are used for fabricated revenge porn? Propaganda? Identity fraud? Or to extort companies or individuals? Sadly, the deepfakes going viral now will one day go further than superimposing celebrities’ faces; malicious actors will use them as a weapon to spread disinformation.
The proliferation of this counterfeit media will contribute to the “infopocalypse” – the point at which we struggle to tell what is real from what is not. But what exactly is a “deepfake”, why are deepfakes problematic, and what can be done about them?
A “deepfake” – a portmanteau of “deep learning” and “fake” – is a type of “synthetic media”: images, audio or video manipulated or generated by artificial intelligence (AI). A deepfake usually involves swapping one person’s face onto someone else’s body. One way to make deepfakes is with Generative Adversarial Networks (GANs). A GAN uses authentic footage as a training set and sets up a competition between two neural networks: a “generator” that produces synthetic images and a “discriminator” that tries to tell them apart from the real thing, each improving in response to the other. Through GANs, both audio and video can be synthesised to show realistic footage of humans speaking. In the case of the Putin deepfake, a GAN can study a thousand pictures of Vladimir Putin and produce a new photo that approximates them without being an exact copy of any one of them, yielding an entirely new image of the Russian President.
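To make that competition concrete, here is a minimal, illustrative sketch of a single GAN training round, written in PyTorch. Every detail – the network sizes, the image resolution, the learning rates – is invented for illustration; real deepfake tools such as DeepFaceLab are vastly more elaborate, but the adversarial core is the same.

```python
import torch
import torch.nn as nn

# Illustrative sizes only: a 100-d noise vector in, 64x64 greyscale images out.
NOISE_DIM, IMG_PIXELS = 100, 64 * 64

# The generator learns to turn random noise into plausible images.
generator = nn.Sequential(
    nn.Linear(NOISE_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_PIXELS), nn.Tanh(),  # pixel values in [-1, 1]
)

# The discriminator learns to score images: real (1) or fake (0).
discriminator = nn.Sequential(
    nn.Linear(IMG_PIXELS, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

def train_step(real_images: torch.Tensor) -> None:
    """One round of the two-network competition."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1. The discriminator practises telling authentic photos from fakes.
    fakes = generator(torch.randn(batch, NOISE_DIM))
    d_loss = (loss_fn(discriminator(real_images), real_labels)
              + loss_fn(discriminator(fakes.detach()), fake_labels))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2. The generator practises fooling the now slightly sharper critic.
    g_loss = loss_fn(discriminator(fakes), real_labels)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

Repeated over many thousands of such rounds, the generator’s output drifts from noise towards images the discriminator – and eventually a human viewer – can no longer reliably reject.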
These videos weren’t known as “deepfakes” until late 2017, when a Reddit user posting under the name “deepfakes” started using Google’s open-source deep-learning library, TensorFlow, to superimpose celebrities’ faces – including Gal Gadot and Scarlett Johansson – onto the bodies of women in pornographic films. Much of the code behind today’s DIY deepfakes descends from that original release.
The majority of these DIY deepfakes continue to be pornographic. The AI firm Deeptrace found 15,000 deepfake videos online in September 2019, of which a staggering 96 per cent were pornographic; in 99 per cent of those, the mapped faces were female celebrities placed on the bodies of porn performers. Targets are now not just Hollywood celebrities and Instagram influencers but also private individuals – and women are disproportionately affected. Last month, a British government-backed review said that sharing digitally altered “deepfake” pornographic images should be made a crime. If the UK moves forward with the ban, it will be the first country to do so, and will hopefully inspire the US and the EU to follow suit.
The problem is that deepfakes are becoming ever easier to make. Software tools are freely available on GitHub, including FakeApp, DFaker, faceswap-GAN and DeepFaceLab, and the app Zao lets users add their faces to the bodies of a list of TV and movie characters.
Achieving quality high enough to deceive remains time-consuming and costly, however. The video creator behind @deeptomcruise told The Verge that a tremendous amount of time and effort went into each deepfake: “You can’t just do it by pressing a button,” he said. “Each clip took weeks of work, using the open-source DeepFaceLab algorithm as well as established video editing tools.”
Yet at the current pace of AI advancement, it won’t be long before weeks of work become minutes. Nina Schick, author of Deepfakes: The Coming Infocalypse, has advised everyone from the former secretary general of NATO to US President Joe Biden. In her view, deepfakes will become ubiquitous: “Some experts believe that by the end of the decade, 90 per cent of video content will be synthetically generated. It’s not a question of if but when.” In her book, Schick stresses that the deepfake phenomenon is nascent, so it is crucial to distinguish the intent behind synthetic media: “Some synthetic media will be used for legitimate or commercial purposes and will have a positive impact on industries, from entertainment through to communication,” she says. “However, there will be media that is used negatively. That’s why when I refer to a ‘deepfake’, I mean the negative or malicious use of synthetic media.”
Schick goes on to say that we currently have a window of opportunity to set the norms and parameters around deepfakes. Her call to action rests on the fact that video remains the most effective communication medium: images carry far stronger persuasive power than text, yet citizens have comparatively weak defences against visual deception. “Until this moment, we have accepted that video is an extension of our perception. Visual information is so compelling, we are psychologically wired to want to believe what we see.”
In the past, only Hollywood-esque studios, full of experts and specialists, had the budget and resources to create these effects. It won’t be long before a TikTok user can do the same, producing counterfeit content indistinguishable from the real thing.
The dark side of this democratisation is that any malicious actor – from conspiracists and online trolls to hostile states – can become a purveyor of disinformation and create media of anyone saying or doing anything. They can hijack someone’s biometrics to manipulate events and information, and push ideologically entrenched communities even deeper into their echo chambers. They can also exploit the “Liar’s Dividend” to escape repercussions – after all, if anything can be faked, everything can be denied.
The concern for governments is that if you can project Scarlett Johansson’s face onto the body of a porn performer, it is only a matter of time before a malicious user creates a deepfake that damages an elected representative’s reputation. Take the digitally altered video of Nancy Pelosi, for example. In the clip, made by a “Trump superfan”, the Speaker of the US House of Representatives appeared to slur drunkenly through a speech. Trump reposted the clip on Twitter with the caption “PELOSI STAMMERS THROUGH NEWS CONFERENCE.” Although the counterfeit was quickly debunked, it had already been viewed millions of times. On the surface this may seem harmless, but ask yourself: what happens if a video like this were of a prospective candidate, released the day before an election?
These concerns are propelling a rise in detection countermeasures. Two programmes, Reality Defender and Deeptrace, work to keep deepfakes out of your life. Reality Defender hopes to “tag and bag” manipulated images and videos before they can do damage; Deeptrace works on an API that acts like a hybrid antivirus filter, pre-screening incoming media and diverting manipulations to a quarantine zone. Silicon Valley has started to wage war against deepfakes, with social media sites like Facebook and Twitter banning them from their networks, and corporations like Amazon and Microsoft developing application programming interfaces (APIs) to detect deepfake videos.
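Deeptrace’s actual system is proprietary, but the “antivirus filter” idea can be sketched in a few lines of Python. Everything below is a hypothetical stand-in – the threshold, the folder layout and the dummy detector are all invented – and the point is only the shape of the pipeline: score each incoming file, divert anything suspicious.

```python
from pathlib import Path
from typing import Callable

QUARANTINE = Path("quarantine")   # hypothetical quarantine zone
FAKE_THRESHOLD = 0.8              # illustrative; a real service would tune this

def pre_screen(media_path: Path, detector: Callable[[Path], float]) -> bool:
    """Run one file through a detector; quarantine suspected manipulations.

    `detector` stands in for a trained classifier returning the estimated
    probability that a clip has been manipulated. Returns True if the file
    passed screening, False if it was diverted to quarantine.
    """
    QUARANTINE.mkdir(exist_ok=True)
    if detector(media_path) >= FAKE_THRESHOLD:
        media_path.rename(QUARANTINE / media_path.name)
        return False
    return True

if __name__ == "__main__":
    sample = Path("clip.mp4")
    sample.touch()  # stand-in for an incoming upload
    # A dummy detector that scores everything as 97 per cent likely fake:
    passed = pre_screen(sample, lambda p: 0.97)
    print("clean" if passed else "quarantined")  # -> quarantined
```

In practice the stand-in lambda would be a trained classifier – and, as the researchers below point out, that is exactly where the trouble starts.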
Yet according to researchers at Sungkyunkwan University in Suwon, South Korea, even these APIs can be deceived by deepfake-generated videos. Schick believes there are three reasons why detecting deepfakes is a bigger challenge than one might think:
“Firstly, there are far more people interested in developing the generation side of deepfakes than the detection side. Secondly, there can never be a one-size-fits-all detector as the results are only based on the training dataset these companies have. Thirdly, the AI behind the deepfake is based on an adversarial training model; the generator gets cleverer and cleverer.”
It’s clear that as deepfakes grow in sophistication, a technical cat-and-mouse game between creators and detectors will ensue. Schick stresses that as well as pursuing piecemeal solutions, we should conceptualise deepfakes within a broader understanding of technological change: “We have to use this moment to reassess the paradigm change that is coming. We need to build the technical stuff, establish a framework of media provenance and embed authentication technology into the hardware,” she says. “But we should also focus on what we are going to do about digital education, our legal frameworks, what we do about identity fraud. It’s an emerging civil liberties issue.”
There is no doubt that the proliferation of deepfakes will play a part in eroding public trust and exacerbating social rifts. From porn through to politics, if we cannot agree on a basic universal truth such as what we see, there is no common source of trust or shared sense of reality. For authoritarian regimes, deepfakes are a dreamland – used to stoke populism and consolidate power. For liberal democracies, they are a nightmare – used to disrupt the democratic process and create a zero-trust society. Whichever system you live under, deepfakes will have consequences for all. Technological advancement continues to accelerate, and as Schick says, if undetectable deepfakes are a matter not of if but when, it’s about time society prepared itself.