OpenAI is gradually rolling out a new advanced voice mode for ChatGPT to a select group of ChatGPT Plus subscribers. This feature, which was first showcased at its GPT-4o launch event in May, stirred up some controversy for sounding a bit too much like Scarlett Johansson. Because of this, the release was delayed to address safety concerns.
At the event, the new voice mode seemed to be a big step up from what ChatGPT currently offers. During the demo, OpenAI employees could interrupt the chatbot and ask it to tell stories in different ways, and it handled those interruptions smoothly.
This makes conversations with chatbots feel more human. In the past, they sounded clunky and robotic because of a short delay before each response.
One of the major criticisms during the event was how much the onstage voice, called “Sky,” sounded like Scarlett Johansson, who famously played an AI in the movie “Her.”
So OpenAI pulled the plug on the voice temporarily to sort out the situation. Although OpenAI's official line was that the voice was not trained on hers, it’s obvious that it was influenced by the voice in the movie.
I also think that during the pause in the release cycle, both sides had legal discussions about Scarlett getting compensated. This is just a theory, but I’d be surprised if there weren’t private meetings with lawyers.
It also raises the issue of intellectual property, and who would own the IP in this situation. There are legal precedents for “borrowing” someone’s style (see the Tom Waits case, for example) and getting compensated for use of their likeness.
But in this case, would Scarlett own the IP or would the movie studio? The studio presumably owns the copyright for the movie.
This voice was actually part of ChatGPT before the spring demo, but the company removed it after Johansson sent letters to OpenAI questioning how it was made.
Christianson clarified that ChatGPT’s new mode will only feature four preset voices created with voice actors. She emphasized, “We’ve ensured that ChatGPT can’t mimic other people’s voices, whether they’re individuals or public figures, and it will block anything that strays from these preset voices.”
- Interactive Storytelling: Creators can use the advanced voice mode to narrate stories with more natural and engaging voiceovers, making interactive fiction or audio dramas more immersive.
- Podcasting: Creators can use ChatGPT’s voice mode to generate podcast episodes, interview simulations, or voice different characters, adding variety without needing multiple voice actors.
- Video Content: For YouTube or social media videos, creators can use the voice mode to add narration, voice characters, or provide commentary, enhancing the quality and professionalism of their content.
- Voiceover for Tutorials: Creators producing educational content can use the voice mode for clear and articulate voiceovers in tutorials, making complex subjects easier to understand.
- Audiobooks: Authors or content creators can leverage the voice mode to produce audiobooks, offering listeners a more human-like and engaging reading experience.
- Customizable Virtual Assistants: Developers and creators can integrate the advanced voice mode into virtual assistants, making interactions feel more natural and user-friendly.
- Interactive Gaming Experiences: Game developers can use the voice mode to create dynamic, responsive NPC dialogue, enhancing player immersion with lifelike character interactions.
- Live Stream Narration: Creators can utilize the voice mode during live streams for real-time commentary, narration, or even to respond to chat in a more engaging and personable way.