Microsoft to ship Phi Silica small language model (SLM) as part of Windows to power GenAI apps

Microsoft launched yet another small language model (SLM).


Key notes

  • A lot of small models, like Apple’s OpenELM and Microsoft’s Phi-3, have been launched.
  • Now, the Redmond company launched Phi Silica for generative AI applications.
  • It will soon ship as a part of the Windows operating system.

Microsoft has launched plenty of AI models in recent months. Some are designed for smaller tasks, hence called small language models (SLMs). After launching the “cost-effective” Phi-3 model that outperforms its competitors, the company has now added a new member to the family: Phi Silica.

At the recent AI-centered Microsoft Build 2024 event, the Redmond tech giant said that the Phi Silica model is “custom-built for the NPUs in Copilot+ PCs.” In other words, this model, built from the Phi-3 series, will ship with future versions of Windows to power generative AI applications.

Microsoft says that the Phi Silica model will be even more cost-effective and power-friendly. Token generation reuses the NPU’s KV cache and runs on the CPU, producing roughly 27 tokens per second.

The company then boasts, “With full NPU offload of prompt processing, the first token latency is at 650 tokens/second – and only costs about 1.5 Watts of power while leaving your CPU and GPU free for other computations.”
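Taken together, the quoted figures allow a rough back-of-envelope latency estimate. The sketch below is only an illustration based on the numbers above, assuming the 650 tokens/second prompt-processing rate and 27 tokens/second generation rate hold steadily:

```python
# Rough latency estimate from the quoted Phi Silica figures.
# Assumed rates (taken from the article, not an official formula):
PROMPT_TOKENS_PER_S = 650   # quoted NPU prompt-processing rate
GEN_TOKENS_PER_S = 27       # quoted CPU token-generation rate

def estimated_latency_s(prompt_tokens: int, output_tokens: int) -> float:
    """Time to first token plus time to generate the full response."""
    time_to_first_token = prompt_tokens / PROMPT_TOKENS_PER_S
    generation_time = output_tokens / GEN_TOKENS_PER_S
    return time_to_first_token + generation_time

# Example: a 1,300-token prompt would take ~2 s to process on the NPU,
# and a 100-token reply ~3.7 s to generate on the CPU.
print(round(estimated_latency_s(1300, 100), 1))  # → 5.7
```

Under these assumed rates, the NPU handles the bulk prompt work while the CPU streams out the reply, which is consistent with Microsoft’s claim that the GPU stays free for other computations.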

A lot of small AI models have arrived recently. Apple, Microsoft’s number-one competitor, has also launched OpenELM in sizes ranging from 270 million to 3 billion parameters.

The Phi-3 family, on the other hand, first arrived in three variants: phi-3-mini (3.8B parameters), phi-3-small (7B), and phi-3-medium (14B). The mini version was trained using Nvidia’s AI-friendly H100 GPUs.

“Phi-3 models significantly outperform language models of the same and larger sizes on key benchmarks … Phi-3-mini does better than models twice its size, and Phi-3-small and Phi-3-medium outperform much larger models, including GPT-3.5T,” Microsoft said in the initial announcement.