Microsoft hints at upgrading Bing Chat's multimodal Visual Search

It's "economically feasible," says Microsoft's spokesperson.




Key notes

  • Microsoft wants to upgrade the Visual Search feature on Bing Chat.
  • The upgrade was hinted at in an X (formerly Twitter) exchange.
  • In the future, the chatbot will be able to “remember” images in the conversation a lot better.

Microsoft has reportedly teased its plan to upgrade the Visual Search feature in its popular AI-powered chatbot, Bing Chat.

Visual Search in Chat, launched last summer alongside Bing Chat Enterprise, is powered by OpenAI’s GPT-4 Vision model, which lets you upload an image of almost anything and ask the chatbot questions about it.
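For readers curious what such an image-plus-question request looks like under the hood, here is a minimal sketch in the style of OpenAI's chat-completions message format, where text and `image_url` content parts are combined in a single user message. The model name and image URL are illustrative assumptions, and the payload is only constructed, not sent:

```python
# Sketch of a GPT-4 Vision-style request payload, assuming OpenAI's
# chat-completions message format with "image_url" content parts.
# The model identifier and URL below are placeholders, not confirmed values.
def build_vision_request(question: str, image_url: str) -> dict:
    """Build a chat payload that pairs a text question with an image."""
    return {
        "model": "gpt-4-vision-preview",  # assumed model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": 300,
    }

payload = build_vision_request(
    "What landmark is shown in this photo?",
    "https://example.com/photo.jpg",
)
```

The user-complaint below follows from this design: each request carries its own image part, so unless the service remembers earlier turns, the image must be attached again every time.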

However, some users have expressed frustration with the current functionality: because the chatbot cannot access earlier parts of a conversation about an image, they have to re-upload the same image for each new image-recognition request.

Microsoft’s Mikhail Parakhin, responsible for user experience on Bing Chat, has acknowledged the need for an improvement to the chatbot’s image retrieval capabilities. 

In an X exchange with a user, Parakhin stated that the enhancement is “economically feasible” and that the team will prioritize it once capacity frees up.

The Redmond-based tech giant also said that they would implement the latest GPT-4 Turbo for Bing, but “still need to iron out a few kinks.”

GPT-4 Turbo has a much larger context window than its predecessors: 128,000 tokens, compared with 8,192 tokens for the base GPT-4 model (32,768 for the GPT-4-32K variant) and 4,096 tokens for GPT-3.5 Turbo (16,385 for its 16K variant).
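The practical effect of those numbers can be shown with back-of-the-envelope arithmetic. The sketch below uses the context-window sizes above and an assumed reply budget to check whether a long multimodal conversation still fits; the token counts in the usage lines are illustrative, not measured:

```python
# Rough sketch of why a larger context window matters.
# Window sizes match the figures cited in the article; the conversation
# length and reply budget are illustrative assumptions.
CONTEXT_WINDOWS = {
    "gpt-3.5-turbo": 4_096,
    "gpt-4": 8_192,
    "gpt-4-turbo": 128_000,
}

def fits_in_context(model: str, conversation_tokens: int,
                    reply_budget: int = 500) -> bool:
    """True if the conversation plus room for a reply fits the model's window."""
    return conversation_tokens + reply_budget <= CONTEXT_WINDOWS[model]

# A long image-heavy chat of ~20,000 tokens overflows GPT-4's window
# but fits comfortably within GPT-4 Turbo's.
print(fits_in_context("gpt-4", 20_000))        # False
print(fits_in_context("gpt-4-turbo", 20_000))  # True
```

That extra headroom is what would let the chatbot keep earlier image context in the conversation rather than dropping it.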