Multi-Stage Machine Learning Model Synthesis for Efficient Inference

Information

  • Patent Application
    20230297852
  • Publication Number
    20230297852
  • Date Filed
    July 29, 2021
  • Date Published
    September 21, 2023
Abstract
Example implementations of the present disclosure combine efficient model design and dynamic inference. A standalone lightweight model avoids unnecessary computation on easy examples, and the information it extracts also guides the synthesis of a specialist network from the basis models. Extensive experiments on ImageNet show that a proposed example BasisNet is particularly effective for image classification: a BasisNet-MV3 achieves 80.3% top-1 accuracy with 290 M MAdds without early termination.
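The two-stage flow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function names `combine_kernels` and `cascade_predict`, the 0.9 confidence threshold, and the representation of a kernel as a flat list of floats are all hypothetical choices made for the example.

```python
from typing import Any, Callable, List, Sequence, Tuple

Kernel = List[float]  # assumption: a kernel flattened to a list; real kernels are tensors


def combine_kernels(basis_kernels: Sequence[Kernel], weights: Sequence[float]) -> Kernel:
    """Linearly combine one kernel per basis model using the combination values."""
    if len(basis_kernels) != len(weights):
        raise ValueError("need exactly one combination value per basis model")
    size = len(basis_kernels[0])
    return [sum(w * k[i] for w, k in zip(weights, basis_kernels)) for i in range(size)]


def cascade_predict(
    x: Any,
    lightweight: Callable[[Any], Tuple[Any, float, Sequence[float]]],
    basis_kernels: Sequence[Kernel],
    run_specialist: Callable[[Any, Kernel], Any],
    threshold: float = 0.9,  # hypothetical confidence criterion
) -> Tuple[Any, str]:
    """Return the lightweight model's prediction if confident, else a specialist's."""
    # Stage 1: the lightweight model yields a prediction, a confidence score,
    # and one combination value per basis model.
    label, confidence, weights = lightweight(x)
    if confidence >= threshold:
        return label, "early-exit"  # easy example: skip the heavier computation
    # Stage 2: synthesize a specialist from the basis models and run it.
    kernel = combine_kernels(basis_kernels, weights)
    return run_specialist(x, kernel), "specialist"
```

Here `lightweight` and `run_specialist` stand in for the machine-learned prediction model and the synthesized combined model; in practice each basis model would contribute many kernels, combined layer by layer.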
Claims
  • 1. A computing system with improved machine learning inference efficiency, the system comprising: one or more processors; and one or more non-transitory computer-readable media that store: a machine-learned prediction model configured to receive an input and to process the input to generate both an initial prediction and a plurality of combination values respectively for a plurality of machine-learned basis models; the plurality of machine-learned basis models; and instructions that when executed by the one or more processors cause the computing system to perform operations, the operations comprising: obtaining the input; processing the input with the machine-learned prediction model to generate the initial prediction and the plurality of combination values; determining whether the initial prediction satisfies one or more confidence criteria; when the initial prediction satisfies the one or more confidence criteria: providing the initial prediction as an output; and when the initial prediction does not satisfy the one or more confidence criteria: synthesizing, based at least in part on the plurality of combination values, a combined model from the plurality of machine-learned basis models; processing the input with the combined model to generate a final prediction; and providing the final prediction as the output.
  • 2. The computing system of claim 1, wherein processing the input with the machine-learned prediction model consumes relatively fewer computational resources than processing the input with the combined model.
  • 3. The computing system of claim 1, wherein determining whether the initial prediction satisfies one or more confidence criteria comprises comparing a confidence score generated by the machine-learned prediction model for the initial prediction to one or more threshold confidence scores.
  • 4. The computing system of claim 1, wherein the plurality of machine-learned basis models comprise a plurality of expert models that were respectively trained on a plurality of different training datasets.
  • 5. The computing system of claim 1, wherein the machine-learned prediction model is a standalone model independent of the plurality of machine-learned basis models.
  • 6. The computing system of claim 1, wherein: each of the plurality of machine-learned basis models comprises a plurality of layers; and for each of the plurality of machine-learned basis models, the machine-learned prediction model is configured to predict a plurality of layer values respectively for the plurality of layers.
  • 7. The computing system of claim 1, wherein: each of the plurality of machine-learned basis models comprises one or more kernels; and synthesizing, based at least in part on the plurality of combination values, the combined model from the plurality of machine-learned basis models comprises linearly combining the kernels of the plurality of machine-learned basis models according to the plurality of combination values.
  • 8. The computing system of claim 1, wherein: processing the input with the machine-learned prediction model comprises running the machine-learned prediction model on a central processing unit; and processing the input with the combined model comprises running the combined model on one or more hardware accelerator units.
  • 9. The computing system of claim 1, wherein the input comprises an image and the output comprises a classification of the image into one or more classes.
  • 10. A computer-implemented method to train machine-learned models, the method comprising: obtaining, by a computing system comprising one or more computing devices, a training input; processing, by the computing system, the training input with a machine-learned prediction model to generate a plurality of combination values respectively for a plurality of machine-learned basis models and, optionally, an initial prediction; synthesizing, by the computing system and based at least in part on the plurality of combination values, a combined model from the plurality of machine-learned basis models; processing, by the computing system, the training input with the combined model to generate a final prediction; evaluating, by the computing system, a loss term that compares the final prediction to a ground truth output associated with the training input; and modifying, by the computing system and based at least in part on the loss term, one or more parameters of one or both of: the machine-learned prediction model; or one or more of the machine-learned basis models.
  • 11. The computer-implemented method of claim 10, wherein said modifying comprises modifying parameters of both the machine-learned prediction model and one or more of the machine-learned basis models.
  • 12. The computer-implemented method of claim 10, wherein the method further comprises: evaluating, by the computing system, a second loss term that compares the initial prediction to the ground truth output associated with the training input; and modifying, by the computing system and based at least in part on the second loss term, one or more parameters of the machine-learned prediction model.
  • 13. The computer-implemented method of claim 10, wherein synthesizing, by the computing system and based at least in part on the plurality of combination values, the combined model from the plurality of machine-learned basis models comprises: determining, by the computing system and based on the plurality of combination values, an unevenly combined model from the plurality of machine-learned basis models; and mixing, by the computing system and based on a mixing hyperparameter, the unevenly combined model with an equally combined model to produce the combined model.
  • 14. The computer-implemented method of claim 13, wherein: the method is performed for a plurality of iterations; and the mixing hyperparameter decays over the plurality of iterations to provide increased relative influence to the unevenly combined model.
  • 15. The computer-implemented method of claim 10, wherein the method further comprises: randomly eliminating one or more of the plurality of basis models.
  • 16. The computer-implemented method of claim 10, wherein the plurality of machine-learned basis models share parameters in some but not all of their layers.
  • 17. The computer-implemented method of claim 10, wherein said modifying comprises: backpropagating, by the computing system, the loss term through the combined model; and after backpropagating, by the computing system, the loss term through the combined model, continuing, by the computing system, to backpropagate the loss term through the machine-learned prediction model.
  • 18. A computing system with multi-stage model synthesis, comprising one or more processors; and one or more non-transitory computer-readable media that store: a machine-learned prediction model configured to receive an input and to process the input to generate a plurality of combination values respectively for a plurality of machine-learned basis models; the plurality of machine-learned basis models; and instructions that when executed by the one or more processors cause the computing system to perform operations, the operations comprising: obtaining the input; processing the input with the machine-learned prediction model to generate the plurality of combination values; synthesizing, based at least in part on the plurality of combination values, a combined model from the plurality of machine-learned basis models; processing the input with the combined model to generate a final prediction; and providing the final prediction as an output.
  • 19. (canceled)
  • 20. The computing system according to claim 18, wherein the input comprises an image and the output comprises a classification of the image into one or more classes.
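The training-time mixing recited in claims 13 and 14 can be illustrated with a short sketch. It assumes the combined model is a linear combination of basis kernels (as in claim 7), so that mixing the unevenly combined model with the equally combined model is equivalent to mixing their combination values. The function names and the linear decay schedule are hypothetical; the claims only require that the mixing hyperparameter decay over iterations.

```python
from typing import List, Sequence


def mixing_epsilon(step: int, total_steps: int, start: float = 1.0) -> float:
    """Hypothetical schedule: decay the mixing hyperparameter linearly from `start` to 0."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    progress = min(max(step / total_steps, 0.0), 1.0)  # clamp to [0, 1]
    return start * (1.0 - progress)


def mix_combination_values(predicted: Sequence[float], epsilon: float) -> List[float]:
    """Blend predicted (uneven) combination values with equal weights 1/N.

    epsilon = 1 gives the equally combined model; epsilon = 0 gives the
    unevenly combined model chosen by the prediction model.
    """
    if not predicted:
        raise ValueError("need at least one basis model")
    uniform = 1.0 / len(predicted)
    return [epsilon * uniform + (1.0 - epsilon) * p for p in predicted]
```

Under this schedule, early iterations train all basis models nearly equally, and later iterations hand increasing influence to the predicted combination values.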
PCT Information
Filing Document     Filing Date   Country   Kind
PCT/US2021/043655   7/29/2021     WO
Provisional Applications (1)
Number     Date       Country
63057904   Jul 2020   US