Lip-Synced Live Captions, Thanks To ENCO


ENCO is heading to NAB Show New York with a new capability in its enCaption4 automated captioning system. At the conference, which starts tomorrow (10/16), the company will unveil a new video delay feature that enables lip-sync-grade caption synchronization even when transcribing live feeds.

The software-defined enCaption4 platform enables broadcasters and content producers to add closed or open captions to live and pre-recorded content. Its latest enhancement helps customers overcome an industry-wide challenge in the captioning of live programming. While captions can be aligned with the corresponding speech during post-production of file-based content, the nature of live captioning has inherently precluded such precise synchronization. Speech-to-text processing of a word or phrase cannot begin until after it has been spoken, and taking the context of surrounding words into account for greater transcription accuracy adds to this latency.

ENCO says enCaption4’s newest capability effectively synchronizes live captions with the spoken words: the system can now delay the associated video and audio by a user-configurable duration to provide lip-sync-like alignment. “Two to four seconds of video delay is generally sufficient to provide the desired temporal precision, but by setting a longer delay, customers can choose to expand the audio analysis window to further enhance enCaption4’s renowned speech-to-text accuracy,” ENCO says.
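To make the idea concrete, here is a minimal Python sketch of how a delay buffer can line captions up with speech. It is an illustration only, not ENCO's implementation: the class and function names (`DelayedCaptioner`, `simulate`), the one-frame-per-second feed, and the 1.5-second transcription latency are all assumptions chosen for the example. Frames are held for a configurable delay, and each frame is stamped on release with whatever caption covers the moment it was captured.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Caption:
    start: float  # time the words began, in feed seconds
    end: float    # time the words ended
    text: str


@dataclass
class Frame:
    pts: float         # capture time of the frame in the live feed
    caption: str = ""  # caption attached when the frame is released


class DelayedCaptioner:
    """Hypothetical delay buffer: holds frames for delay_s seconds."""

    def __init__(self, delay_s: float) -> None:
        self.delay_s = delay_s
        self.frames: deque[Frame] = deque()
        self.captions: list[Caption] = []

    def push_caption(self, caption: Caption) -> None:
        # Called when the speech-to-text engine finishes a phrase,
        # which is always some time after the phrase was spoken.
        self.captions.append(caption)

    def push_frame(self, pts: float) -> list[Frame]:
        # Buffer the new live frame, then release every frame that
        # has now been held for at least delay_s seconds.
        self.frames.append(Frame(pts))
        released = []
        while self.frames and pts - self.frames[0].pts >= self.delay_s:
            frame = self.frames.popleft()
            frame.caption = self._caption_at(frame.pts)
            released.append(frame)
        return released

    def _caption_at(self, pts: float) -> str:
        # Match by the time the words were spoken, not when text arrived.
        for c in self.captions:
            if c.start <= pts < c.end:
                return c.text
        return ""


def simulate(delay_s: float) -> dict[float, str]:
    """Run a 10-second feed at one frame per second (assumed values)."""
    captioner = DelayedCaptioner(delay_s)
    # Words spoken from 2.0 s to 3.0 s; transcript is ready at 4.5 s.
    pending = [(4.5, Caption(2.0, 3.0, "hello"))]
    output = {}
    for t in range(10):
        now = float(t)
        while pending and pending[0][0] <= now:
            captioner.push_caption(pending.pop(0)[1])
        for frame in captioner.push_frame(now):
            output[frame.pts] = frame.caption
    return output


if __name__ == "__main__":
    print(simulate(3.0)[2.0])        # delay covers the transcription latency
    print(repr(simulate(1.0)[2.0]))  # delay too short: frame leaves uncaptioned
```

With a three-second delay, the frame captured at 2.0 seconds is still in the buffer when its transcript arrives, so it goes out with the caption attached. With a one-second delay, the same frame has already been released, which is the misalignment the delay feature is meant to remove.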


“Broadcasters have long considered lip-sync-like caption synchronization for live content as the ‘holy grail’ of closed captioning, particularly for programming such as newscasts and sports,” ENCO President Ken Frommert said. “Now, customers can bring in a live feed and get an exceptionally well-synchronized, captioned version back out, all through a single system.”

The integrated video delay functionality is a key element of ENCO’s automated captioning and can be applied to a wide array of enCaption4 output options.

The video delay can also be used to align open captions that are overlaid atop web-destined and NDI output streams.

Other recent enCaption4 features on display at NAB Show New York will include an enhanced scheduling interface; a Web API for third-party integration; the ability to detect changes between multiple speakers even within a single mixed feed; and further improvements to the system’s outstanding accuracy and speed.