It’s time to send smarter subtitles

By Frans de Jong, EBU

I have to say, it's not a topic that everybody wants to hear about. For most people, sending subtitles to improve content accessibility is not in the same category as, say, announcing the next Eurovision Song Contest winner. But it is important, and for those who rely on these services it is a vital part of their everyday information and entertainment. So why not improve them further?

There are multiple reasons, but typically it comes down to cost: creating high-quality subtitles is not free. While some tech giants may want you to believe that you can simply hold your phone next to the screen and let it turn the audio into text, the reality is that serious content producers still employ or hire professional subtitlers to transform sounds and spoken words into on-screen text. That part of the process is not new, but the formats used to create and distribute the subtitles are changing, opening up a business that used to rely on floppy disks and Teletext specifications to multi-platform repurposing and automatic quality improvers for (live) subtitles.

For the European Broadcasting Union (EBU), exchanging content is something very close to our heart, and it is not limited to the yearly Eurovision Song Contest. Roughly 25 years ago, the EBU introduced EBU Tech 3264: Subtitling Data Exchange Format, which specifies, literally down to the bit level, how to exchange subtitle files between production facilities using a floppy disc.

Like the Eurovision Song Contest, the subtitling data exchange format (.STL) became a hit. And over time, it attracted criticism too. The feature set was not flexible enough for new applications, such as higher frame rates or extended character sets. Despite this, .STL remains a popular choice, as reflected in the millions of files that exist in broadcasters' archives around the world. Its bit-level definition turned out to be both its strength and its weakness.

Transition to Timed Text

About 10 years ago, the EBU decided .STL could do with a follow-up, and it joined the W3C Timed Text (TTML) effort, which was working on a format to distribute text in a predictable way, both in terms of timing and presentation styles. Paradoxically, the timing of that work was not ideal: although it was technically a more than valid exercise, many of the EBU's Members were not at all interested in changing their subtitling practices.

There simply was no business justification for it. Subtitles kept flowing, and in many countries Teletext kept itself in the picture as well. The new features of 'Timed Text' appeared to be over the top for those Members.

The EBU addressed the feature-size concerns of TTML by deriving a much smaller subset: EBU Tech 3350: EBU-TT. Although the specification does not exclude its use for distribution, it focuses on the production side of things, like .STL. But whereas .STL originated in a world where production and distribution had a one-to-one mapping, EBU-TT exists in a different context. Nowadays the mapping is one-to-many, with content being distributed to a wide variety of platforms, and it is, perhaps unexpectedly, the streaming side of things that is driving the interest in new subtitling formats. That includes the EBU's easy-to-implement distribution profile: EBU Tech 3380: EBU-TT-D.

The new text-based formats allow for relatively straightforward and, above all, automatic transformations to different platforms, such as the open internet, HbbTV and vendor-specific devices like smartphones. It is this 'author once, publish many' principle that makes it attractive for users to switch to a 'smarter' subtitling format like EBU-TT in the coming years.
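To give a feel for what such an automatic transformation could look like, here is a deliberately simplified sketch in Python. The element names follow the core W3C TTML vocabulary that EBU-TT profiles, but the helper name and the idea of stripping a metadata block are illustrative only; the real EBU-TT and EBU-TT-D schemas are defined in Tech 3350 and Tech 3380, not here.

    # Illustrative only: a toy 'author once, publish many' step that turns a
    # production-oriented TTML-style document into a leaner one for players.
    import xml.etree.ElementTree as ET

    TT_NS = "http://www.w3.org/ns/ttml"   # core TTML namespace that EBU-TT profiles
    ET.register_namespace("", TT_NS)

    def strip_production_metadata(ttml_source: str) -> str:
        """Drop head/metadata (workflow notes a player never needs) and
        return the remaining timed text."""
        root = ET.fromstring(ttml_source)
        head = root.find(f"{{{TT_NS}}}head")
        if head is not None:
            for metadata in head.findall(f"{{{TT_NS}}}metadata"):
                head.remove(metadata)
        return ET.tostring(root, encoding="unicode")

    example = """<tt xmlns="http://www.w3.org/ns/ttml" xml:lang="en">
      <head><metadata><!-- workflow notes a player never needs --></metadata></head>
      <body><div>
        <p begin="00:00:01.000" end="00:00:03.500">It's time to send smarter subtitles.</p>
      </div></body>
    </tt>"""

    print(strip_production_metadata(example))

The same source document could equally be mapped to other targets, which is exactly the 'author once, publish many' point.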

Internationally in sync

As we've seen, it is not only in Europe, known for its dual use of subtitling for both accessibility and translation, that streaming subtitles are leading a transformation. The demanding FCC regulations in the United States have driven a similar interest across the Atlantic. With content and subtitling being global markets, it is no surprise that the EBU-TT family and several of the related specifications have ended up being very similar, derived as they are from a common source: TTML.

Subtitle regulation in Europe is probably toughest in the United Kingdom, where media regulator Ofcom requires broadcasters to provide up to 100% of their (live!) output with subtitles. This demand is often met using respeaking, a technique the BBC pioneered many years ago. Although quite good, speech recognition technology is not perfect, requiring broadcasters to strike a subtle balance between the amount, speed and quality (in terms of timing and word accuracy) of subtitles. While subtitle volumes are approaching or have reached 100%, the focus of regulators and audiences alike is shifting to quality.
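One common way to quantify the word-accuracy part of that balance is a word error rate: count the insertions, deletions and substitutions needed to turn the broadcast subtitles into a reference transcript, relative to the transcript's length. The Python sketch below is a generic illustration of that idea, not the metric any particular regulator prescribes.

    # Generic word error rate (WER): edit distance between the words of a
    # reference transcript and the words of the broadcast subtitles,
    # normalised by the reference length.
    def word_error_rate(reference: str, hypothesis: str) -> float:
        ref, hyp = reference.split(), hypothesis.split()
        # dist[i][j] = edits needed to turn ref[:i] into hyp[:j]
        dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            dist[i][0] = i                                # deletions
        for j in range(len(hyp) + 1):
            dist[0][j] = j                                # insertions
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                sub = 0 if ref[i - 1] == hyp[j - 1] else 1
                dist[i][j] = min(dist[i - 1][j] + 1,      # deletion
                                 dist[i][j - 1] + 1,      # insertion
                                 dist[i - 1][j - 1] + sub)
        return dist[len(ref)][len(hyp)] / max(len(ref), 1)

    # One substituted word out of seven: roughly 0.14
    print(word_error_rate("it is time to send smarter subtitles",
                          "it is time to send smart subtitles"))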

Live quality improvers 

After publishing both the core EBU-TT format and its distribution sibling, EBU-TT-D, the EBU decided to refocus on the production domain to find the missing piece of the live puzzle: how to contribute live subtitles from the author (or respeaker, or even automated playout of prepared subtitles) to the playout, encoding and distribution facilities.

In current set-ups this has typically been addressed by modifying .STL in ad-hoc, sometimes undocumented, ways or by embedding Teletext in the video data. This practice has worked, but it has not resulted in a very future-proof set-up, let alone one in which it is easy to achieve interoperability between vendors. The newest specification in the EBU-TT family is meant to change exactly this: EBU Tech 3370: EBU-TT Live was released for industry preview this summer.

The goal is to provide a standard, transport-agnostic way to deliver subtitles from their origination to the point where they are improved, encoded, distributed and/or archived. An interesting application of the new format will be the use of modular 'improvers' that can increase the quality of live subtitles on the fly. Think of speech correctors or timing re-aligners that transform the stream of subtitles at their input into an even better one at their output. The EBU-TT Live format supports such scenarios by allowing detailed timing parameters to be retained from the authoring station.
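As a purely hypothetical sketch of such an improver (plain Python; the actual document model, sequencing and carriage for live subtitles are what EBU Tech 3370 specifies, and none of the names below are taken from it), a minimal timing re-aligner could look like this:

    # Hypothetical 'improver' node: receives subtitle cues, enforces a minimum
    # display duration and removes overlaps, then passes the stream on.
    from dataclasses import dataclass

    @dataclass
    class Cue:
        begin: float   # seconds
        end: float     # seconds
        text: str

    def realign(cues: list[Cue], min_duration: float = 1.0) -> list[Cue]:
        """Return a re-timed copy of the incoming cue stream."""
        improved: list[Cue] = []
        for cue in sorted(cues, key=lambda c: c.begin):
            begin = cue.begin
            if improved and begin < improved[-1].end:
                begin = improved[-1].end              # don't overlap the previous cue
            end = max(cue.end, begin + min_duration)  # keep each cue readable
            improved.append(Cue(begin, end, cue.text))
        return improved

    stream = [Cue(0.0, 0.4, "Good evening."), Cue(0.3, 2.0, "Welcome to the show.")]
    for cue in realign(stream):
        print(f"{cue.begin:5.2f} -> {cue.end:5.2f}  {cue.text}")

In a real chain, several such nodes could be combined, each taking a stream of subtitle documents in and handing an improved stream on.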

The EBU group working on subtitling technology is jointly chaired by Andreas Tai (IRT) and Nigel Megitt (BBC), both of whom will provide an update on the state of play of subtitling in the industry at the EBU stand (10.F20).

Join us at the stand, or follow the group's work at tech.ebu.ch for more information.

The EBU-TT family in a nutshell

EBU Tech 3350: EBU-TT Part 1 

• Core subtitling format

• Focus: Production, archive and B2B exchange

• Simple profile of W3C TTML

• Includes additional broadcast features and workflow metadata

EBU Tech 3360: EBU-TT Part 2

• Guidance for converting STL to EBU-TT Part 1

• Focus: Transition from legacy

• Defines a grid similar to Teletext (40 × 32 chars)

EBU Tech 3380: EBU-TT-D

• Distribution version of EBU-TT Part 1

• Even simpler profile of TTML

• Focus: Online/OTT distribution

• Used in Germany as EBU-TT-D-Basic-DE and specified for DVB DASH, HbbTV 2 and Freeview Play

EBU Tech 3370: EBU-TT Live

• Specifies how to use EBU-TT in live operations

• Includes a complete system model (e.g. ‘nodes’)

• Focus: Live and broadcaster-side streaming

• Can also be used in mixed modes