How does Apple's OpenELM open-source model compare to Microsoft's Phi-3, parameters-wise?
Coincidence?
Key notes
- Apple released OpenELM on HuggingFace with eight variants.
- The models come in four sizes: 270 million, 450 million, 1.1 billion, and 3 billion parameters.
- Microsoft’s Phi-3 model, on the other hand, includes versions with 3.8 billion, 7 billion, and 14 billion parameters.
Shortly after Microsoft launched the Phi-3 family, a set of small, open-source models designed for lighter use, Apple joined the train. The iPhone maker has (quietly) launched OpenELM, its latest open-source AI model.
OpenELM, short for Open-source Efficient Language Models, comes in eight variants: four pre-trained and four instruction-tuned. Apple’s researchers said that the model uses a layer-wise scaling strategy to efficiently distribute parameters within each layer of the transformer model. The models are available on HuggingFace.
“For example, with a parameter budget of approximately one billion parameters, OpenELM exhibits a 2.36% improvement in accuracy compared to OLMo while requiring 2× fewer pre-training tokens,” the documentation reads.
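To make the layer-wise scaling idea more concrete, here is a minimal Python sketch. It assumes the scaling is a simple linear interpolation of attention-head count and feed-forward width from the first layer to the last. The function name, the `alpha` and `beta` ranges, and the dimensions are illustrative assumptions, not Apple’s published configuration.

```python
def layerwise_scaling(num_layers, d_model, head_dim,
                      alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return a (num_heads, ffn_dim) pair for each transformer layer.

    Hypothetical sketch: alpha scales the number of attention heads and
    beta scales the feed-forward width, both interpolated linearly from
    the first layer to the last. All default values are made up.
    """
    configs = []
    for i in range(num_layers):
        # Position of this layer in [0, 1]; a single layer uses the minimum.
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        num_heads = max(1, round(a * d_model / head_dim))
        ffn_dim = round(b * d_model)
        configs.append((num_heads, ffn_dim))
    return configs


if __name__ == "__main__":
    for layer, (heads, ffn) in enumerate(layerwise_scaling(4, 1280, 64)):
        print(f"layer {layer}: {heads} heads, FFN width {ffn}")
```

With these assumed settings, early layers get fewer heads and a narrower feed-forward block than later layers, instead of every layer having the same size as in a uniform transformer.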
As for sizes, the models come in four parameter counts: 270 million, 450 million, 1.1 billion, and 3 billion. And while parameter count is not always the best yardstick, it is usually the starting point for comparing AI models.
To be frank, OpenELM isn’t as impressive (parameters-wise) as other open-source models: Llama 3, which powers Meta AI, tops out at 70 billion parameters, and Mistral AI, which is backed by Microsoft, launched its Mixtral 8x22B model with about 141 billion total parameters.
Phi-3-mini, the smallest version of Microsoft’s Phi-3 model, has 3.8 billion parameters and was trained for a week on Nvidia’s H100 GPUs. In comparison, the small version has 7 billion parameters, and the medium version has 14 billion.