With the announcement of the audio subsystem for the new ATSC 3.0 broadcast standard expected imminently, Linear Acoustic founder and The Telos Alliance CTO Tim Carroll recalls the major milestones along the way to David Davies, and explains why this ‘next generation audio’ standard will have been worth the wait.
Supporting the broadcast requirements of today while simultaneously accommodating the expectations of tomorrow – not least with regard to the enhanced visual experience of 4K/UHD and the rapid growth in mobile viewing behaviours – has been the not-insignificant challenge confronting the technical teams at work on the new ATSC 3.0 broadcast standard, which has now been in development for nearly three years.
But it is arguable that the task facing the audio sub-group has been the most acute of all, given that it has effectively been required to rethink the scope of broadcast audio for the future from the ground up. The end result, it is expected, will be a standard that delivers interactive audio functionality, enabling high spatial resolution in sound source localisation in azimuth, elevation and distance, and facilitating an enhanced sense of sound envelopment throughout the listening area. There will also be extensive ‘personalisation’ features – for example, enhanced control of dialogue, mixing of assistive audio services and special commentary – as well as support for the normalisation of content loudness and contouring of dynamic range.
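The loudness normalisation mentioned here is, in essence, a matter of measuring a programme’s integrated loudness and applying a fixed gain towards a target level. A minimal sketch, assuming the loudness has already been measured in LUFS (the function name is illustrative and the −24 LUFS default mirrors the familiar ATSC A/85 target; neither is taken from any ATSC 3.0 document):

```python
# Hypothetical sketch of loudness normalisation: compute the static gain
# (in dB) needed to bring a programme measured at `measured_lufs` up or
# down to a broadcast target level. Illustrative only.
def normalisation_gain_db(measured_lufs: float, target_lufs: float = -24.0) -> float:
    """Return the gain in dB that moves the programme to the target loudness."""
    return target_lufs - measured_lufs
```

So a programme measured at −18 LUFS would need −6 dB of gain to sit at the −24 LUFS target.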
An integral part of the entire process has been determining a suitable audio subsystem to deliver this new functionality, and during 2015 proposed systems from Dolby Laboratories and the MPEG-H Audio Alliance (comprising Fraunhofer, Qualcomm and Technicolor) underwent extensive testing. An announcement was expected before the end of 2015, but as Tim Carroll – The Telos Alliance CTO, Linear Acoustic founder and member of the committee deciding the ATSC 3.0 audio standard – asserts, now it really is imminent.
When would you say that work on the audio part of ATSC 3.0 truly began in earnest?
It was really a few years back when the ATSC decided that the new standard would not be backwards-compatible. It was a pretty radical decision, but the hope was that it would simplify things greatly further down the chain. So this was effectively a clean sheet of paper from the codecs to the transport stream to the RF – everything.
I have been involved with the ATSC for a long time, and in some ways this process harked back to the early days of ATSC 1.0. The analogy I would use is that if you have never driven a car it is very difficult to be asked what you look for in a car! So a lot of the early stages of ATSC 3.0 focused on talking to broadcasters and other stakeholders about what they knew and thought was possible.
When you add in the idea of object-based audio and personalisation, then clearly you have to have a long period of discussion. Not only are you thinking about what broadcasters need today, you have to consider seriously what their requirements will be several years into the future.
Would it be fair to say that it is only now that many people are becoming aware of quite how significant a change ‘next generation audio’ technologies will represent?
I think so. I remember hearing about object audio for the first time from Simon Tuff at the BBC some years ago, and thinking ‘have you lost your mind? 5.1 channels is hard enough!’ It will be a paradigm shift, but just as importantly the new standard represents a fresh approach to the way we work today.
So it is no longer just a static downmix, where a 5.1 mix is automatically folded into stereo; now a two-channel version is rendered, and the listening differences are tremendous – the rendered version is so much more pleasing to listen to than a straight downmix. Immersive is a treat for the ears, but it is the basic stuff that is so much better.
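For context, the static downmix Carroll contrasts with rendering is typically a fixed coefficient matrix of the kind defined in ITU-R BS.775, with the centre and surround channels attenuated by roughly 3 dB before being summed into the front pair. A minimal sketch (the function name is illustrative; the coefficients are the conventional ones, not taken from any ATSC 3.0 document):

```python
# Illustrative static 5.1-to-stereo downmix using the conventional
# ITU-R BS.775 coefficients: centre and surrounds are attenuated by
# ~3 dB (multiplied by 1/sqrt(2)) before summing into left/right.
import math

ATT = 1.0 / math.sqrt(2.0)  # about 0.7071, a -3 dB attenuation

def downmix_51_to_stereo(fl, fr, c, lfe, ls, rs):
    """Fold one 5.1 sample frame down to a (left, right) stereo pair.

    The LFE channel is commonly discarded in a stereo downmix, as here.
    """
    left = fl + ATT * c + ATT * ls
    right = fr + ATT * c + ATT * rs
    return left, right
```

This fixed fold-down is applied identically to every programme; a renderer, by contrast, can take object positions and metadata into account when producing the two-channel output.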
Of course we have more efficient codecs at our disposal, but it is all the features that surround them that make the greatest difference. Getting that balance right took time, and maybe took everyone by surprise a little bit.
There has been extensive public showcasing of the various proposed systems over the last 12 months. How important has this been to the process, and when should we now expect a decision to be announced?
It is a significant aspect not only in terms of showing people what is coming, but also to highlight the fact that there is a difference between the lab environment and something that is standardised. It might be a fantastic development in its own right, but there has to be a dialogue about how it is implemented practically and without breaking the bank.
I can say that by the end of 2015 we were all pretty sure about the decision regarding the audio subsystem, and I do now expect that to be made public in time for NAB 2016.
NAB attendees would be advised to check out the ATSC Pavilion in the South Hall, but what should show visitors expect to see in particular from The Telos Alliance companies?
With the arrival of the AES67 audio-over-IP networking interoperability standard we have the opportunity to carry hundreds and hundreds of channels and still have them be perfectly synchronised to video. It is not necessary for them to be part of the IP video stream, so at NAB this year you will see a number of products from us that support this idea of separating audio from the core processing.
We expect these to be very popular as experience tells us that video people will do anything they can to shovel audio off to anywhere or anyone else! And of course the emergence of object-based audio isn’t exactly going to help matters on that front.