Microsoft will bring new Phi-4-multimodal to Copilot+ PCs

Both models are now available for developers

Published on February 27, 2025

by Rafly Gilang

Key notes

Microsoft launched two new AI models, Phi-4-multimodal and Phi-4-mini, with Phi-4-multimodal coming to Copilot+ PCs.
Phi-4-multimodal handles text, images, and speech, and Microsoft says it outperforms models like Google’s Gemini 2.0 Flash in selected benchmarks.
Copilot+ PCs use AI locally for faster, more private tasks.

Microsoft has added two new AI models to its Phi-4 family of small language models: Phi-4-mini and Phi-4-multimodal. The Redmond company also says that it will integrate the latter into Copilot+ PCs.

The 5.6-billion-parameter Phi-4-multimodal can process text, images, and speech simultaneously. It’s designed to be efficient, handling tasks like speech recognition and visual understanding while using less energy than larger models.

“Copilot+ PCs will build upon Phi-4-multimodal’s capabilities, delivering the power of Microsoft’s advanced SLMs without the energy drain. This integration will enhance productivity, creativity, and education-focused experiences, becoming a standard part of our developer platform,” says Weizhu Chen, Microsoft’s VP for generative AI.

Copilot+ PCs use AI locally for some tasks, meaning the AI runs directly on the device instead of relying on the cloud. This helps with privacy and speed. For example, AI features in Microsoft apps like Word and Outlook, or even the controversial all-knowing Recall, can work without needing an internet connection.

The Redmond tech giant also says that Phi-4-multimodal outperforms its competitors, including Google’s new Gemini 2.0 Flash, which powers the Gemini chatbot, in certain cherry-picked benchmarks.

Phi-4-mini, by contrast, has 3.8 billion parameters and is designed for text-based tasks such as reasoning, math, coding, and instruction-following. It can process sequences of up to 128,000 tokens.
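
To put the parameter counts in perspective, here is a rough back-of-the-envelope sketch of how much memory the model weights alone might occupy. The parameter counts come from the figures above; the precisions (fp16, int8, int4) and the helper name `weight_memory_gb` are illustrative assumptions, not formats Microsoft has confirmed for Copilot+ PCs. The estimate ignores activations, context cache, and runtime overhead.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts are from the article; the precisions below are
# illustrative assumptions, not confirmed deployment formats.
MODELS = {"Phi-4-multimodal": 5.6e9, "Phi-4-mini": 3.8e9}
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params, bytes_per_param):
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        for precision, size in BYTES_PER_PARAM.items():
            gb = weight_memory_gb(params, size)
            print(f"{name} @ {precision}: ~{gb:.1f} GB")
```

Under these assumptions, the weights of either model would fit in the memory of a typical laptop, which helps explain why models of this size are candidates for running on the device rather than in the cloud.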

Both models are available to developers through platforms like Azure AI Foundry and Hugging Face.

Rafly Gilang

Tech Reporter

Rafly is a reporter with years of journalistic experience covering technology, business, society, and culture. He currently reports on Microsoft-related products, tech, and AI for MSPowerUser.
