Why wait for Gemini Ultra when you can use GPT-4 Turbo with Vision via Azure OpenAI service

Home » Microsoft

2 min. read

Updated on December 12, 2024

by Pradeep Viswav

updated on December 12, 2024

Share this article

Improve this guide

Readers help support MSpoweruser. We may get a commission if you buy through our links.

OpenAI’s GPT-4 Turbo with Vision is a large multimodal model (LMM) that can analyze images and provide textual responses to questions about them. This advanced multimodal AI model includes all the capabilities of GPT-4 Turbo while adding the ability to process and analyze image inputs.

Today, Microsoft announced that GPT-4 Turbo with Vision is now available via Azure OpenAI Service. Existing Azure OpenAI Service customers in Australia East, Sweden Central, Switzerland North, and West US Azure regions can now access the GPT-4 Turbo with Vision service.

Along with the availability of GPT-4 Turbo with Vision, Microsoft is announcing following improvements to Azure AI services enabling advanced functionalities.

Optical Character Recognition (OCR): Extracts text from images, integrating it with the user’s prompt and image to enrich the context.
Object grounding: Enhances text responses from GPT-4 Turbo with Vision by identifying and outlining key objects within images.
Video prompts: Allows GPT-4 Turbo with Vision to answer questions using the most relevant frames from a video based on the user’s prompt.
Azure OpenAI Service on your data with images: By combining GPT-4 Turbo with Vision, Azure AI Search, and Azure AI Vision, images can now be added with text data, utilizing vector search to develop a solution that connects with user’s data, enabling an improved chat experience.

GPT-4 Turbo with Vision on Azure OpenAI service will be charged based on the number of input and output tokens. Find the details below.

Model	Input	Output
GPT-4 Turbo with Vision¹	$0.01 per 1000 tokens	$0.03 per 1000 tokens
+ Enhanced add-on features for OCR	$1.50 per 1000 transactions
+ Enhanced add-on features for Object Grounding	$1.50 per 1000 transactions
+ Enhanced add-on feature for “Add your Image” Image Embedding	$0.10 per 1000 transactions
+ Enhanced add-on feature for Video prompts integrating Video Retrieval	$0.05 per minute for indexing$0.25 per 1000 transactions²

Early this week, Microsoft Research team revealed that OpenAI’s GPT-4 model can beat Google Gemini Ultra when new prompting techniques are used. So, if you are waiting for Gemini Ultra, you should definitely give GPT-4 Turbo with Vision a try.

Pradeep Viswav

Software and Services Expert

Pradeep is a Computer Science and Engineering Graduate. He was also a Microsoft Student Partner. He is currently working in a leading IT company.

User forum

0 messages

Sort by:

Leave a Reply Cancel reply