Microsoft's AI-based video metadata extraction service now generally available


Microsoft Video Indexer is a cloud service that extracts visual and speech metadata from your videos, which you can use to build enhanced search experiences in your existing apps. Microsoft first announced the public preview of Video Indexer at the Build developer conference last year. At IBC 2018 last week, the company announced that the service is now generally available (GA). Along with the GA announcement, Microsoft introduced the following new capabilities:

  • An emotion recognition model that detects emotional moments in video and audio assets based on speech content and voice tonality.
  • A topic inferencing model built to understand the high-level topics of video or audio files based on spoken words and visual cues. Topics in this model are sourced from the IPTC taxonomy, among others, to align with industry standards.
  • An enhanced celebrity recognition model that now covers one million faces, based on commonly requested data sources such as IMDb, Wikipedia, and top LinkedIn influencers.

Learn more about this announcement here.
