Microsoft patents method for shrinking Deep Neural Networks to bring on-device AI


Microsoft’s scientists have patented a method to shrink Deep Neural Networks (DNNs), the technology driving many of the recent AI advances, so that they can move from servers to the handset in your pocket.

A typical large neural network, such as one used to recognize speech, can have up to 30 million parameters, making good voice recognition on your phone prohibitively demanding in terms of RAM and processing power.

Microsoft’s method of simplifying such large neural networks can shrink them by up to 20 times, meaning good voice recognition could now run on your phone or laptop, which could mean more intelligent and responsive devices.

The details of the patent are somewhat over my head, but include:

The technologies described herein generally relate to converting a neural network system having a relatively large footprint (e.g., larger storage size) into a neural network system having a relatively smaller footprint (e.g., smaller storage size) such that the smaller-footprint neural network system may be more easily utilized by one or more resource-constrained devices. Aspects disclosed herein relate to reducing the storage size of a large-footprint DNN having one or more matrices. The one or more matrices of the large-footprint DNN store numerical values that are used in evaluating features of an audio signal. Evaluation of these features using the numerical values in the matrices allows the large-footprint DNN to determine a probability that the audio signal corresponds to a particular utterance, word, phrase, and/or sentence.
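To make the "matrices evaluating audio features" idea concrete, here is a minimal numpy sketch of a single network layer scoring an audio frame. Everything here is illustrative, not from the patent: the sizes, the random weights, and the softmax step are my own assumptions about how such a matrix produces a probability per candidate word.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 40 acoustic features per frame, 5 candidate words.
n_features, n_words = 40, 5
W = rng.standard_normal((n_words, n_features))  # a "large-footprint" weight matrix
features = rng.standard_normal(n_features)      # features from one audio frame

# The numerical values in W evaluate the features; a softmax turns the
# resulting scores into a probability for each candidate word.
scores = W @ features
probs = np.exp(scores - scores.max())
probs /= probs.sum()
```

A real speech model stacks many such layers, which is why the matrices add up to tens of millions of parameters.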

Aspects of this disclosure relate to conversion techniques that, when applied to one or more matrices of a large-footprint DNN, result in a smaller matrix size. One conversion technique includes analyzing vectors of a large-footprint DNN matrix to identify portions of the vectors (e.g., sub-vectors) that have similar numerical properties. Sub-vectors with similar numerical properties are grouped. An approximation (or codeword) may be determined for each group. Codewords are then indexed into a codebook, which contains the addresses of the codewords. In aspects of the technology, after the codebook is obtained, the codewords can be fine-tuned using a variety of neural network training techniques. Using the codebook to index to the appropriate codeword corresponding to the groups of sub-vectors, a small-footprint DNN matrix can be formed.
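The sub-vector/codebook scheme described above resembles what is more commonly known as product quantization. Below is a minimal sketch of that general idea in numpy, assuming plain k-means to group the sub-vectors; the matrix sizes, sub-vector length, and codebook size are all illustrative choices, not values from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 64))   # a "large-footprint" weight matrix

sub_dim, n_codewords = 4, 16

# 1. Split the matrix rows into sub-vectors of length sub_dim.
subs = W.reshape(-1, sub_dim)

# 2. Group sub-vectors with similar numerical properties via plain
#    k-means (Lloyd iterations); each group's centroid is its codeword.
codebook = subs[rng.choice(len(subs), n_codewords, replace=False)].copy()
for _ in range(20):
    dists = ((subs[:, None, :] - codebook[None]) ** 2).sum(-1)
    assign = dists.argmin(1)
    for k in range(n_codewords):
        members = subs[assign == k]
        if len(members):
            codebook[k] = members.mean(0)

# 3. The small-footprint matrix is just the codebook plus one small
#    index per sub-vector; reconstruct it by codebook lookup.
W_small = codebook[assign].reshape(W.shape)

orig_bytes = W.size * 4                              # float32 weights
small_bytes = codebook.size * 4 + assign.size // 2   # 4-bit indices
ratio = orig_bytes / small_bytes
```

Even this toy setup compresses the matrix by well over 20x, which is consistent with the shrinkage figure quoted above; the patent additionally fine-tunes the codewords with further training, which this sketch omits.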

On-device AI is not just attractive for speed reasons, but also for its privacy implications: users can take advantage of AI advances without having to share all their information with Microsoft and other cloud providers. Apple has favoured this approach, though to the detriment of Siri’s performance and intelligence. Microsoft’s technology may be a way to have the best of both worlds.

Read more about the approach at the US PTO here.

