President Joe Biden called for a ban on artificial intelligence (AI)-enabled voice impersonation during his State of the Union address, in a section of the speech that also touched on the promise and the perils of AI technology.
The president did not elaborate on the proposed ban or explain how it would be imposed; in fact, he mentioned “AI” only twice in the entire speech.
The concern is that impersonation campaigns can influence voters, who believe they are hearing accurate information from a trusted party figure when in reality the message is malicious. The ability of AI to convincingly impersonate voices and identities poses risks including fraud, market manipulation and electoral interference.
Recent events, such as the AI-generated robocall impersonating President Biden during the New Hampshire primary, underscore the urgency of addressing these risks. Beyond the realm of electoral interference or financial crimes, AI-based voice impersonation also intersects with entertainment.
There’s a growing trend of AI being used to replicate the voices of actors and public figures for various purposes, including dubbing movies and creating synthetic dialogue. While this technology offers new creative possibilities, it also raises ethical questions about consent, authenticity and the potential for misuse, particularly in areas like fake news or celebrity impersonations.
Jonathan Nelson, director of product management and data at Hiya, says a well-crafted AI call risks being more believable, and therefore more likely to mislead the recipient.
“We have already seen developments of this as static, obvious robocall recordings gave way to services able to create a rough fake conversation, reacting to the things the recipient says using a limited list of pre-recorded messages,” he explains. “But now, the entire conversation can be created, live and fully responsive to the recipient’s words.”
Nelson adds that this is only the risk posed by anonymous, generic synthetic voices.
“It’s a completely higher tier of risk when the voice used is specifically impersonating a known or trusted person,” he says, pointing to the recent campaign that impersonated the voice of President Biden.
Whether a ban on AI voice impersonation can feasibly be enforced, given the rapid advancement of AI technology and the difficulty of regulating its use, is another matter entirely.
“Sadly, not feasible at all, for several reasons,” Nelson admits. “Firstly, scammers don’t really care about regulations like these. They’re already breaking several laws by conducting their campaign of misinformation, so one additional law broken is irrelevant.”
Secondly, enforcement of laws such as those against robocalling often depends on the recipient knowing they have received a robocall and reporting it, or on law enforcement agencies or enforcement actors within the telephony industry recognizing the call as a robocall.
“As AI-generated voices and real-time conversations improve, it will become very difficult for these players to even realize the call was using AI or was considered a ‘robocall’, or at least not with enough confidence to pursue legal action,” Nelson says.
While AI-generated voices still contain enough subtle signals for sophisticated software to detect them, even at today’s level of sophistication those detection services can only reach a certain level of confidence, perhaps 80% or 90%, that a call is AI-generated.
“As AI voices improve, that confidence will start dropping,” he cautions.
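As a rough illustration of the confidence problem Nelson describes, the sketch below shows how a detector that outputs a probability score might decide what to do with a call. It is a hypothetical example, not Hiya’s actual system; the thresholds and function names are illustrative assumptions.

```python
# Hypothetical sketch of a confidence-threshold decision for an AI-voice detector.
# The detector itself is assumed to exist; only the thresholding logic is shown.

FLAG_THRESHOLD = 0.90     # confidence needed to flag the call as likely AI-generated
REVIEW_THRESHOLD = 0.80   # confidence needed to route the call to human review

def classify_call(ai_probability: float) -> str:
    """Map a detector's confidence score (0.0-1.0) to an action.

    As synthetic voices improve, scores cluster closer to 0.5 (chance),
    so fewer calls clear either threshold and more pass through untouched.
    """
    if ai_probability >= FLAG_THRESHOLD:
        return "flag_as_ai"
    if ai_probability >= REVIEW_THRESHOLD:
        return "send_to_review"
    return "allow"

# Example: an obvious clone might score 0.92 today, but a better clone
# scoring 0.65 would pass straight through this kind of filter.
print(classify_call(0.92))  # flag_as_ai
print(classify_call(0.65))  # allow
```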
Most analytics services that fight back against robocallers don’t use the audio of the call in their detection. Instead, they look at other characteristics like the caller’s history, the signature of the phone call itself and how recipients have been reacting to their calls to weed out the bad actors.
“Even when using AI voices, the way to actually create the phone call hasn’t changed, so analytics services remain strong in our ability to detect and stop these calls,” Nelson says.
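To make that metadata-based approach concrete, here is a minimal, hypothetical sketch of the kind of reputation scoring Nelson describes, where no call audio is analyzed at all. The field names, weights and threshold are illustrative assumptions, not Hiya’s actual model.

```python
# Hypothetical sketch of reputation scoring built on call metadata rather than audio.
# Field names, weights and cutoffs are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class CallMetadata:
    calls_last_hour: int           # burst volume from the calling number
    avg_call_duration_sec: float   # very short calls suggest recipients hanging up
    recipient_report_rate: float   # fraction of recipients who reported the number
    number_age_days: int           # newly provisioned numbers are higher risk

def spam_score(meta: CallMetadata) -> float:
    """Combine metadata signals into a 0-1 risk score. No audio is analyzed."""
    score = 0.0
    if meta.calls_last_hour > 100:
        score += 0.35
    if meta.avg_call_duration_sec < 10:
        score += 0.25
    score += min(meta.recipient_report_rate, 0.3)
    if meta.number_age_days < 7:
        score += 0.10
    return min(score, 1.0)

# A high-volume, short-duration, heavily reported, newly created number scores high
# regardless of whether the voice on the line is human or AI-generated.
suspicious = CallMetadata(calls_last_hour=500, avg_call_duration_sec=6,
                          recipient_report_rate=0.4, number_age_days=2)
print(spam_score(suspicious))  # 1.0
```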