Microsoft announces Phi-3-vision, a new multimodal SLM for on-device AI scenarios
Key notes
- Phi-3-vision is a 4.2B parameter model that supports general visual reasoning tasks and chart/graph/table reasoning
At Build 2024 today, Microsoft expanded its Phi-3 family of small language models with the new Phi-3-vision. Phi-3-vision is a 4.2B parameter model that supports general visual reasoning tasks as well as chart, graph, and table reasoning. The model accepts both images and text as input and outputs text responses.
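For readers who want to try the multimodal input described above, the sketch below shows one way such a model can be called locally. It is a minimal illustration only, assuming the model is published on Hugging Face as microsoft/Phi-3-vision-128k-instruct and loads through the transformers library with trust_remote_code=True; the model ID, image placeholder token, and prompt format are assumptions for illustration, not details stated in this announcement.

```python
# Minimal sketch: image + text in, text out.
# Assumptions: model hosted as "microsoft/Phi-3-vision-128k-instruct" on Hugging Face
# and usable via transformers' AutoProcessor / AutoModelForCausalLM with remote code.
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3-vision-128k-instruct"  # assumed model identifier
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto", trust_remote_code=True
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# Combine an image (e.g. a chart) with a text question in one prompt.
image = Image.open("chart.png")
messages = [{"role": "user", "content": "<|image_1|>\nWhat trend does this chart show?"}]
prompt = processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)

# The model's response is text only, regardless of the input modalities.
output_ids = model.generate(**inputs, max_new_tokens=200)
answer = processor.decode(
    output_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(answer)
```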
Microsoft today also announced the general availability of Phi-3-mini in Azure AI's Models-as-a-Service (MaaS) offering. Phi-3 models are gaining momentum because they are cost-effective and optimized for on-device, edge, offline inference, and latency-bound AI scenarios.
In addition to the Phi-3 news, Microsoft announced new features across its APIs to enable multimodal experiences. Azure AI Speech now offers speech analytics and universal translation. Azure AI Search now comes with significantly increased storage and up to a 12X increase in vector index size at no additional cost, enabling large RAG workloads at scale.