Microsoft’s new AI can simulate anyone’s voice with 3 seconds of audio

An AI-generated image of a person's silhouette.

On Thursday, Microsoft researchers announced a new text-to-speech AI model called VALL-E that can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything—and do it in a way that attempts to preserve the speaker’s emotional tone.

Its creators speculate that VALL-E could be used for high-quality text-to-speech applications, for speech editing in which a recording of a person is altered by editing its text transcript (making them say something they originally didn’t), and for audio content creation when combined with other generative AI models like GPT-3.

Microsoft calls VALL-E a “neural codec language model,” and it builds on a technology called EnCodec, which Meta announced in October 2022. Unlike other text-to-speech methods that typically synthesize speech by manipulating waveforms, VALL-E generates discrete audio codec codes from text and acoustic prompts. It analyzes how a person sounds, uses EnCodec to break that information into discrete components (called “tokens”), and then uses what it “knows” from training data to predict how that voice would sound speaking phrases outside the three-second sample. Microsoft gives its own description of this process in the VALL-E paper.
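
To make the idea of discrete audio tokens more concrete, here is a toy sketch in Python. It is not EnCodec or VALL-E, and nothing in it comes from Microsoft's paper: the codebook values, the frame size, and the function names `tokenize` and `detokenize` are all hypothetical, chosen only for illustration. EnCodec learns its encoder and codebooks with a neural network, while this example assumes a tiny hand-written codebook and simply maps each short frame of samples to the index of the closest entry.

```python
# Toy illustration of turning audio samples into discrete tokens.
# Assumption: a hand-written codebook of 4 entries, each 4 samples long.
# A real neural codec learns its codebooks from data.

FRAME_SIZE = 4  # hypothetical frame length, in samples

CODEBOOK = [
    [0.0, 0.0, 0.0, 0.0],      # token 0: silence
    [1.0, 1.0, 1.0, 1.0],      # token 1: sustained positive level
    [-1.0, -1.0, -1.0, -1.0],  # token 2: sustained negative level
    [0.0, 1.0, 0.0, -1.0],     # token 3: one cycle of a fast oscillation
]


def squared_distance(a, b):
    """Sum of squared differences between two equal-length frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize(samples, codebook=CODEBOOK, frame_size=FRAME_SIZE):
    """Split samples into frames and return the nearest codebook index for each.

    Trailing samples that do not fill a whole frame are dropped.
    """
    tokens = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        best = min(
            range(len(codebook)),
            key=lambda i: squared_distance(frame, codebook[i]),
        )
        tokens.append(best)
    return tokens


def detokenize(tokens, codebook=CODEBOOK):
    """Rebuild an approximate waveform by concatenating codebook frames."""
    samples = []
    for token in tokens:
        samples.extend(codebook[token])
    return samples


if __name__ == "__main__":
    audio = [0.0, 0.0, 0.0, 0.0,
             0.9, 1.1, 1.0, 0.8,
             -1.0, -1.0, -0.9, -1.2]
    tokens = tokenize(audio)
    print(tokens)              # a short list of integers, one per frame
    print(detokenize(tokens))  # a lossy reconstruction of the input
```

Once audio is a sequence of integers like this, a language model can in principle be trained to predict the next token in the same way it would for text, which is the general idea behind describing VALL-E as a language model over codec codes.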
