The video industry is in the midst of a technological revolution, as the exploration and application of artificial intelligence, machine learning, and deep learning radically expand the possibilities for business practices.
New technologies, especially AI, are bringing many exciting advancements to our industry. However, while the positive potential of these capabilities is being realized, those with malicious intent are developing ways to use the same technologies for criminal or otherwise immoral purposes. Deepfake technology is no exception.
Deepfake, a term derived from the combination of “deep learning” and “fake”, refers to the use of deep learning to alter videos so that people appear to say or do things that never actually happened. Deepfakes entered the mainstream consciousness both because of striking applications of the technology and because of concerns about its misuse.
Deepfakes in Film Dubbing
An excellent example of the positive use of deepfakes is a video created to launch a petition to end malaria. The video, created by the Malaria Must Die campaign, features legendary soccer player David Beckham appearing to speak in nine different languages. The voice actually changes from language to language, at times even to a female voice, yet Beckham’s mouth seems to be perfectly in sync with the words. With the help of artificial intelligence (AI), deepfake technology was used to manipulate his facial movements, creating the visual illusion that he’s saying the words in each language.
The David Beckham video illustrates the potential for the application of deepfake technology in dubbing films. By manipulating an actor’s facial movements, it can be made to appear that they’re saying words that they didn’t actually say. This concept can be taken further by using AI to manipulate actors’ voices so that people who watch dubbed versions of a film will still hear the original actor’s voice.
Voice manipulation technologies already exist. For example, Lyrebird AI can clone a voice from just a one-minute recording. Pair that with deepfake technology, and you have a powerful dubbing tool.
Altering Video Transcripts
The dark side of deepfakes garnered public attention when videos appeared online showing celebrities saying and doing things they never said or did. Another example that received widespread attention is an altered video of President Barack Obama that was circulated by researchers at the University of Washington to illustrate the potential harm that could result from the use of deepfakes with ill intent.
However, while these concerns are valid, deepfake technology carries the potential for a range of positive applications in the world of broadcasting and entertainment. Studios can save massive amounts of money and time by using it to change the dialogue in a scene after it has been filmed, without the need for a reshoot.
Currently, film studios have to jump through hoops to reshoot a scene. Say the dialogue in the script has to be altered after the scene has been shot: the studio needs to assign a budget for reshoots, book the location, and bring back all the actors involved in the scene. And that is not all that reshoots involve; imagine how many people have to take part and how much that costs. Once deepfake technology is mature enough to be inexpensive, those costs will drop significantly. In that case, only the visual effects artists and the actors will need to be paid, with the actors’ pay depending on the rules that the Screen Actors Guild (SAG) decides on. No crew or location booking costs will be involved.
In 1963, President John F. Kennedy was on his way to deliver a speech in Dallas when he was assassinated. The beloved politician didn’t get to bring those words to the world that day, but thanks to modern technologies and innovative techniques, we can hear them now. To bring this idea to fruition, the team at CereProc analyzed the recordings of 831 JFK speeches to “build his voice” by separating the audio into 116,177 phonetic units. The already challenging task was made even more difficult by the fact that the recordings were made on various types of equipment at different times. CereProc used AI to create a very realistic audio version of the speech, derived entirely from data.
This is proof of the vast possibilities for using deepfake technology for educational purposes, such as creating new videos of historical figures telling their own stories to bring important tales to life. Last year, the Illinois Holocaust Museum and Education Center hosted a showcase in which holographic images of 15 Holocaust survivors were shown in rotation. Visitors had the chance to put their questions to the holograms of the survivors. The interviews with the survivors were recorded by a sphere of cameras, and each interview took 5 days to shoot.
With deepfake technology, the same can be done on a bigger scale. Historical figures can be brought back to life, and more interactive history classes can be created for schools. The practice already exists; deepfakes can take it to the next level.
Although deepfake technology can indeed be used with malicious intent, the broadcasting and entertainment industries are on the cusp of revolutionary advancements through the application of innovative techniques that utilize these incredible new tools in positive and exciting ways.
First published at Hackernoon.