Microsoft has applied for a trademark for DeepSpeed, its optimisation library for training large-scale AI models on modest hardware.
The trademark application, filed on 8 December 2020, refers to “providing temporary use of online non-downloadable computer software for artificial intelligence processing and deep learning,” suggesting Microsoft may be thinking of making DeepSpeed available as a cloud service.
DeepSpeed, released in February this year, is a Python library that allows deep learning models with a trillion parameters, more than five times as many as the world’s current largest model, to be trained using only 800 Nvidia V100 graphics cards. Without DeepSpeed, the same task would require 4,000 Nvidia A100s, which are up to 2.5 times faster than the V100, running for 100 days.
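To put those figures in perspective, here is a rough back-of-envelope sketch in Python. It assumes the "current largest model" is GPT-3 at 175 billion parameters (the article does not name it), and it treats the "up to 2.5 times faster" figure as a flat multiplier, so the hardware comparison is an upper bound rather than a measurement. The function names are illustrative only and are not part of DeepSpeed.

```python
# Back-of-envelope arithmetic on the figures quoted in the article.
# Assumption: the "current largest model" is GPT-3 with 175 billion parameters.
TRILLION_PARAMS = 1_000_000_000_000
GPT3_PARAMS = 175_000_000_000


def size_ratio(target_params: int, baseline_params: int) -> float:
    """How many times larger the target model is than the baseline."""
    return target_params / baseline_params


def v100_equivalents(a100_count: int, speedup: float = 2.5) -> float:
    """Convert a number of A100s into rough V100 equivalents.

    Assumption: every A100 is worth `speedup` V100s. The article says
    "up to 2.5 times faster", so this is a best case for the A100.
    """
    return a100_count * speedup


ratio = size_ratio(TRILLION_PARAMS, GPT3_PARAMS)
equivalents = v100_equivalents(4000)

print(f"Trillion-parameter model vs GPT-3: {ratio:.1f}x")         # 5.7x
print(f"4,000 A100s in V100 equivalents: {equivalents:.0f}")      # 10000
print(f"Relative to 800 V100s: {equivalents / 800:.1f}x")         # 12.5x
```

On these assumptions, the non-DeepSpeed setup amounts to roughly 12.5 times the hardware. The article does not say how long the 800 V100s would need, so this compares hardware counts only, not total compute.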
The tool is particularly useful now that GPT-3 has shown that models continue to improve as the neural network and dataset grow. That trend threatened to push the latest innovations out of the reach of smaller teams with fewer resources, but DeepSpeed lets these teams compete with much larger services on comparatively modest hardware. A move to cloud processing could make large-model AI training even more accessible.