Voice Your Concerns

May 21, 2024

Why does it feel so wrong to copy someone’s voice?

Last week, right after OpenAI had unveiled the next-generation talking ChatGPT, Sam Altman (in characteristic low-caps simplicity) tweeted: “her”.

It turns out that the parallel between the decade-old Spike Jonze flick and Altman’s invention wasn’t merely coincidental: Today, Scarlett Johansson, who voiced the AI assistant Samantha in the movie, revealed that OpenAI had contacted her during development, requesting she lend her voice to the software. She repeatedly refused.

It now appears that the company went ahead and created a voice eerily similar to hers—apparently using an unnamed, sound-alike voice actor.

The fallout was as swift as it was predictable: Scarlett Johansson threatened legal action and OpenAI, apparently surprised by the backlash, pulled the sound-alike voice from their product “out of respect” for the actress.

To be clear: What OpenAI allegedly did isn’t illegal. But not only is it wrong, it’s a particularly bad look for this company.

Because this isn’t just another case of a company knowingly circumventing a restriction, such as the many examples of large corporations commissioning sound-alike songs after artists declined to license their music (this happened to Tom Waits, Bette Midler, and, most recently, Beach House).

It is arguably worse: OpenAI is driving the AI hype and developing the most popular generative AI tool, one that’s already automating away creative jobs and slashing creative budgets.¹ Their entire business model is built on the imitation of creative work, and copying an actress’ voice (one way or another) speaks of a deep disrespect towards creatives in general.²

The decision, then, shows how OpenAI think they’re too clever and too innovative to be held back by the constraints of common decency. Tech companies pushing the envelope still follow the tired 2010s mantra that “it’s easier to ask for forgiveness than for permission”—a sign of deep arrogance that they’re rightly criticized for.

But there’s also something more profound to this story, because it otherwise wouldn’t have caused such a stir. After all, Silicon Valley companies misbehave all the time; even the scraping and automatic reproduction of art (and style) is—sadly—becoming normalized. Then why does it feel so viscerally wrong to copy a voice?

Perhaps it’s the unease about seeing a credible fake: In 2014’s Enemy, Jake Gyllenhaal recoils when he first spots his lookalike in a movie, and as a viewer you feel the scare in your bones. It’s uncanny and doesn’t feel right. (Naomi Klein does a great rundown of this effect in her book Doppelganger.)

But more likely, it’s the difference between an (external) work of art and a part of a person’s identity. For what is a human’s voice, their distinct pitch, tone, and intonation? A bodily property? A trait of personality? It certainly feels like an innate part of a person, and simply replicating it is a lot more disrespectful than copying anything they’ve made.

¹ Just a year ago, I still found the use of generative AI in advertising impressive and the aesthetics intriguing; these days I find it not just overused but borderline tasteless, not to mention problematic in how it undercuts creative work.
² Personally, I’ve never seen a product become so quickly embraced and so widely abused in human interactions, from writing emails to gathering feedback. I’ve myself been on the receiving end of utterly baffling emails from people who had outsourced the simplest responses to AI. But just because something can be done doesn’t mean it should be done, and there’s something about how “magical” AI feels that brings out the worst in some people.
