Microsoft is working on a number of new features and additions for its recently launched, business-focused Azure Video Analytics suite.
Speaking at TV Connect yesterday, Microsoft’s managing director of worldwide media and cable, Tony Emerson, said that Microsoft’s future roadmap includes insights search, face recognition, logo detection and enhanced audio processing.
As of Q1 of this year, Microsoft offers speech-to-text processing in languages including English, Spanish, German, French, Italian and Chinese.
Emerson said that more languages will be supported in the future, and that speaker detection, speech translation and live speech-to-text functionality are also in the pipeline.
“[Azure Media] Indexer today understands eight different languages including Arabic, Mandarin, most of the European languages and shortly Japanese and it takes about two to three months for a new language to come online,” said Emerson.
The Azure offering can currently also deliver video tagging, video thumbnails and emotion detection.
Emerson said that emotion detection could be deployed in set-top boxes to determine whether viewers are, for example, interested, paying attention, surprised, happy or disgusted by the content they are watching.