LEXI Voice, AI-Media’s AI-driven translation solution, provides broadcasters and streaming services with real-time, low-latency voice translation of live audio into multiple languages.
The solution leverages speech recognition and synthetic voice technology to isolate and interpret audio at source, preserving speaker changes, timing and tone.
As well as inserting text into the captioning workflow, the solution can deliver an additional audio feed for use in broadcast workflows, explained Russ Newton, head of strategic partnerships, speaking to TVBEurope at IBC2025.
“There are ways that we’re able to train the model on the speech-to-text side of things so that you get very high accuracy,” he added. “Accuracy is something that we manage very, very closely.”
Eliminating the need for human interpreters, LEXI Voice gives broadcasters and streaming services multilingual audio feeds that can be delivered to viewers around the world. The solution first creates a speech-to-text conversion of what the speaker is saying. It then adds translation as a second layer, with a latency of around two seconds.
“We can pull in one language and output up to a hundred different languages, assuming we have the right number of encoders in play,” said Newton. “This multi-tiered approach allows us to bring high levels of accuracy within the translation, beyond what others are able to do.”