
PlayAI clones voices on command

Back in 2016, Hammad Syed and Mahmoud Felfel, an ex-WhatsApp engineer, thought it’d be neat to build a text-to-speech Chrome extension for Medium articles. The extension, which could read any Medium story aloud, was featured on Product Hunt. A year later, it spawned an entire business.

“We saw a bigger opportunity in helping individuals and organizations create realistic audio content for their applications,” Syed told TechCrunch. “Without the need to build their own model, they could deploy human-quality speech experiences faster than ever before.”

Syed and Felfel’s company, PlayAI (formerly PlayHT), pitches itself as the “voice interface of AI.” Customers can choose from a number of predefined voices, or clone a voice, and use PlayAI’s API to integrate text-to-speech into their apps.

Toggles allow users to adjust the intonation, cadence, and tenor of voices.

PlayAI also offers a “playground” where users can upload a file to generate a read-aloud version and a dashboard for creating more-polished audio narrations and voiceovers. Recently, the company got into the “AI agents” game with tools that can be used to automate tasks such as answering customer calls at a business.

PlayAI’s agent feature, which builds automation tools around the company’s text-to-speech engine. Image Credits: PlayAI

One of PlayAI’s more interesting experiments is PlayNote, which transforms PDFs, videos, photos, songs, and other files into podcast-style shows, read-aloud summaries, one-on-one debates, and even children’s stories. Like Google’s NotebookLM, PlayNote generates a script from an uploaded file or URL and feeds it to a collection of AI models, which together craft the finished product.

I gave it a whirl, and the results weren’t half bad. PlayNote’s “podcast” setting produces clips more or less on par with NotebookLM’s in terms of quality, and the tool’s ability to ingest photos and videos makes for some fascinating creations. Given a picture of a chicken mole dish I had recently, PlayNote wrote a five-minute podcast script about it. Truly, we are living in the future.

Granted, the tool, like all AI tools, generates odd artifacts and hallucinations from time to time. And while PlayNote will do its best to adapt a file to the format you’ve chosen, don’t expect, say, a dry legal filing to make for the best source material. Case in point: the Musk v. OpenAI lawsuit framed as a bedtime story.

PlayNote’s podcast format is made possible by PlayAI’s latest model, PlayDialog, which Syed says can use the “context and history” of a conversation to generate speech that reflects the conversation flow. “Using a conversation’s historical context to control prosody, emotion, and pacing, PlayDialog delivers conversation with natural delivery and appropriate tone,” he continued.

