Why wait for Gemini Ultra when you can use GPT-4 Turbo with Vision via Azure OpenAI service

Reading time icon 2 min. read


Readers help support MSpoweruser. We may get a commission if you buy through our links. Tooltip Icon

Read our disclosure page to find out how can you help MSPoweruser sustain the editorial team Read more

OpenAI GPT-4 Turbo vs Gemini Ultra

OpenAI’s GPT-4 Turbo with Vision is a large multimodal model (LMM) that can analyze images and provide textual responses to questions about them. This advanced multimodal AI model includes all the capabilities of GPT-4 Turbo while adding the ability to process and analyze image inputs. 

Today, Microsoft announced that GPT-4 Turbo with Vision is now available via Azure OpenAI Service. Existing Azure OpenAI Service customers in Australia East, Sweden Central, Switzerland North, and West US Azure regions can now access the GPT-4 Turbo with Vision service.

Along with the availability of GPT-4 Turbo with Vision, Microsoft is announcing following improvements to Azure AI services enabling advanced functionalities.

  • Optical Character Recognition (OCR): Extracts text from images, integrating it with the user’s prompt and image to enrich the context. 
  • Object grounding: Enhances text responses from GPT-4 Turbo with Vision by identifying and outlining key objects within images. 
  • Video prompts: Allows GPT-4 Turbo with Vision to answer questions using the most relevant frames from a video based on the user’s prompt. 
  • Azure OpenAI Service on your data with images: By combining GPT-4 Turbo with Vision, Azure AI Search, and Azure AI Vision, images can now be added with text data, utilizing vector search to develop a solution that connects with user’s data, enabling an improved chat experience.

GPT-4 Turbo with Vision on Azure OpenAI service will be charged based on the number of input and output tokens. Find the details below.

ModelInput Output 
GPT-4 Turbo with Vision1$0.01 per 1000 tokens$0.03 per 1000 tokens
+ Enhanced add-on features for OCR$1.50 per 1000 transactions
+ Enhanced add-on features for Object Grounding$1.50 per 1000 transactions
+ Enhanced add-on feature for “Add your Image” Image Embedding$0.10 per 1000 transactions
+ Enhanced add-on feature for Video prompts integrating Video Retrieval$0.05 per minute for indexing$0.25 per 1000 transactions2

Early this week, Microsoft Research team revealed that OpenAI’s GPT-4 model can beat Google Gemini Ultra when new prompting techniques are used. So, if you are waiting for Gemini Ultra, you should definitely give GPT-4 Turbo with Vision a try.

User forum

0 messages