METHOD AND DEVICE FOR PROVIDING COMPRESSION AND TRANSMISSION OF TRAINING PARAMETERS IN DISTRIBUTED PROCESSING ENVIRONMENT

Information

  • Patent Application
  • Publication Number
    20230297833
  • Date Filed
    April 24, 2023
  • Date Published
    September 21, 2023
Abstract
Disclosed herein are a method and apparatus for compressing the learning parameters used to train a deep-learning model and for transmitting the compressed parameters in a distributed processing environment. Multiple electronic devices in the distributed processing system perform training of a neural network, and the parameters are updated as training proceeds. Each electronic device may share its updated parameters with the other electronic devices. To share a parameter efficiently, the residual of the parameter, rather than the parameter itself, is provided to the other electronic devices. When the residual of a parameter is received, the other electronic devices update the parameter using that residual.
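The residual-sharing scheme summarized in the abstract can be sketched in a few lines: the sender transmits only the difference between its updated parameters and the shared baseline, and each receiver adds that difference back. This is an illustrative sketch, not code from the application; the function names and example values are assumptions.

```python
import numpy as np

def make_residuals(updated_params, base_params):
    """Sender side: residual = updated parameter - current (shared) parameter."""
    return [u - b for u, b in zip(updated_params, base_params)]

def apply_residuals(base_params, residuals):
    """Receiver side: add each residual to the corresponding parameter."""
    return [b + r for b, r in zip(base_params, residuals)]

# One layer's parameters before and after a local training step
# (illustrative values only).
base = [np.array([[0.10, -0.20], [0.30, 0.40]])]
updated = [np.array([[0.12, -0.25], [0.28, 0.41]])]

residuals = make_residuals(updated, base)    # what gets encoded and transmitted
restored = apply_residuals(base, residuals)  # receiver reconstructs the update
assert all(np.allclose(r, u) for r, u in zip(restored, updated))
```

Because residuals of trained parameters are typically small and sparse, they compress better than the raw parameter values, which is the premise of the encoding steps described in the claims.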
Description
Claims
  • 1. A method for updating multiple parameters, comprising: receiving first information for updating the multiple parameters; and updating the multiple parameters using the first information for updating the multiple parameters, wherein residuals of the multiple parameters are generated based on the first information for updating the multiple parameters, the residuals of the multiple parameters are added to the multiple parameters, respectively, the residuals of the multiple parameters are acquired by performing decoding on second information for the residuals of the multiple parameters included in the first information for updating the multiple parameters, and the multiple parameters are updated using the residuals of the multiple parameters.
  • 2. The method of claim 1, wherein the multiple parameters are deep-learning parameters that configure one layer of a deep-learning model.
  • 3. The method of claim 1, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine a context of the block.
  • 4. The method of claim 1, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine motion information of the block.
  • 5. The method of claim 1, wherein, in order to perform the decoding, one or more of entropy decoding, scanning, dequantization, and inverse-transform of a block are used.
  • 6. The method of claim 1, wherein: scanned information is generated based on the first information for updating the multiple parameters, and the scanned information includes scanned quantized gradients.
  • 7. A method for providing information for generating updated multiple parameters, comprising: generating first information for the updated multiple parameters; and generating a bitstream comprising the first information, wherein the first information for the updated multiple parameters is generated based on residuals of multiple parameters, the residuals of the multiple parameters are differences between the updated multiple parameters and the multiple parameters, the first information for the updated multiple parameters includes second information for the residuals of the multiple parameters, the second information is generated by performing encoding on the residuals of the multiple parameters, and the residuals of the multiple parameters are information used to generate the updated multiple parameters based on the multiple parameters.
  • 8. The method of claim 7, wherein the multiple parameters are deep-learning parameters that configure one layer of a deep-learning model.
  • 9. The method of claim 7, wherein, when the encoding is performed, a method for encoding a block of an image is used, and the multiple parameters represent a context of the block.
  • 10. The method of claim 7, wherein, when the encoding is performed, a method for encoding a block of an image is used, and the multiple parameters are used to determine motion information of the block.
  • 11. The method of claim 8, wherein, in order to perform the encoding, one or more of entropy encoding, scanning, quantization, and transform of a block are used.
  • 12. The method of claim 8, wherein: the first information for the updated multiple parameters represents scanned information, and the scanned information includes scanned quantized gradients.
  • 13. A non-transitory computer-readable recording medium storing the bitstream generated by the method of claim 7.
  • 14. A non-transitory computer-readable recording medium storing a bitstream, the bitstream comprising: first information for updating multiple parameters, wherein the multiple parameters are updated using the first information for updating the multiple parameters, residuals of the multiple parameters are generated based on the first information for updating the multiple parameters, the residuals of the multiple parameters are added to the multiple parameters, respectively, the residuals of the multiple parameters are acquired by performing decoding on second information for the residuals of the multiple parameters included in the first information for updating the multiple parameters, and the multiple parameters are updated using the residuals of the multiple parameters.
  • 15. The non-transitory computer-readable recording medium of claim 14, wherein the multiple parameters are deep-learning parameters that configure one layer of a deep-learning model.
  • 16. The non-transitory computer-readable recording medium of claim 14, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine a context of the block.
  • 17. The non-transitory computer-readable recording medium of claim 14, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine motion information of the block.
  • 18. The non-transitory computer-readable recording medium of claim 14, wherein, in order to perform the decoding, one or more of entropy decoding, scanning, dequantization, and inverse-transform of a block are used.
  • 19. The non-transitory computer-readable recording medium of claim 14, wherein: scanned information is generated based on the first information for updating the multiple parameters, and the scanned information includes scanned quantized gradients.
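The claims describe the residual codec in terms of image-block coding tools: quantization, scanning of a block, and entropy coding (with dequantization and inverse scanning on the decoder side). The following sketch shows quantization plus a zigzag (anti-diagonal) scan and their inverses; the step size, scan order, and function names are illustrative assumptions, and the entropy-coding stage is omitted.

```python
import numpy as np

STEP = 0.01  # illustrative quantization step; not specified in the claims

def zigzag_indices(n):
    """Anti-diagonal (zigzag) scan order for an n-by-n block."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()  # alternate traversal direction per diagonal
        order.extend(diag)
    return order

def encode_residual_block(residual):
    """Quantize the residual block, then scan it into a 1-D sequence."""
    q = np.rint(residual / STEP).astype(np.int64)
    return [int(q[i, j]) for i, j in zigzag_indices(q.shape[0])]

def decode_residual_block(scanned, n):
    """Inverse-scan the quantized sequence, then dequantize it."""
    q = np.zeros((n, n), dtype=np.int64)
    for value, (i, j) in zip(scanned, zigzag_indices(n)):
        q[i, j] = value
    return q * STEP

residual = np.array([[0.02, -0.01], [0.00, 0.03]])
scanned = encode_residual_block(residual)   # would feed an entropy coder next
restored = decode_residual_block(scanned, 2)
assert np.allclose(restored, residual, atol=STEP / 2)
```

The quantization step bounds the reconstruction error of each residual by half the step size, trading fidelity of the parameter update for a shorter bitstream.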
Priority Claims (2)
Number Date Country Kind
10-2017-0172827 Dec 2017 KR national
10-2018-0160774 Dec 2018 KR national
Continuations (1)
Number Date Country
Parent 16772557 Jun 2020 US
Child 18306075 US