METHOD AND DEVICE FOR PROVIDING COMPRESSION AND TRANSMISSION OF TRAINING PARAMETERS IN DISTRIBUTED PROCESSING ENVIRONMENT

Information

  • Patent Application
  • Publication Number
    20230297833
  • Date Filed
    April 24, 2023
  • Date Published
    September 21, 2023
Abstract
Disclosed herein are a method and apparatus for compressing the learning parameters used to train a deep-learning model and for transmitting the compressed parameters in a distributed processing environment. Multiple electronic devices in the distributed processing system perform training of a neural network, and the parameters are updated as training proceeds. Each electronic device may share its updated parameters with the other electronic devices. To share a parameter efficiently, the residual of the parameter, rather than the parameter itself, is provided to the other electronic devices. When the residual of a parameter is received, the other electronic devices update the parameter using that residual.
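The residual-sharing scheme summarized in the abstract can be sketched in a few lines: the sender transmits only the difference between its updated parameters and the shared baseline, and each receiver adds that difference back. This is an illustrative sketch, not code from the application; the function names and example values are assumptions.

```python
import numpy as np

def make_residuals(updated_params, base_params):
    """Sender side: residual = updated parameter - current (shared) parameter."""
    return [u - b for u, b in zip(updated_params, base_params)]

def apply_residuals(base_params, residuals):
    """Receiver side: add each residual to the corresponding parameter."""
    return [b + r for b, r in zip(base_params, residuals)]

# One layer's parameters before and after a local training step
# (illustrative values only).
base = [np.array([[0.10, -0.20], [0.30, 0.40]])]
updated = [np.array([[0.12, -0.25], [0.28, 0.41]])]

residuals = make_residuals(updated, base)    # what gets encoded and transmitted
restored = apply_residuals(base, residuals)  # receiver reconstructs the update
assert all(np.allclose(r, u) for r, u in zip(restored, updated))
```

Because residuals of trained parameters are typically small and sparse, they compress better than the raw parameter values, which is the premise of the encoding steps described in the claims.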
Description
Claims
  • 1. A method for updating multiple parameters, comprising: receiving first information for updating the multiple parameters; and updating the multiple parameters using the first information for updating the multiple parameters, wherein residuals of the multiple parameters are generated based on the first information for updating the multiple parameters, the residuals of the multiple parameters are added to the multiple parameters, respectively, the residuals of the multiple parameters are acquired by performing decoding on second information for the residuals of the multiple parameters included in the first information for updating the multiple parameters, and the multiple parameters are updated using the residuals of the multiple parameters.
  • 2. The method of claim 1, wherein the multiple parameters are deep-learning parameters that configure one layer of a deep-learning model.
  • 3. The method of claim 1, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine a context of the block.
  • 4. The method of claim 1, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine motion information of the block.
  • 5. The method of claim 1, wherein, in order to perform the decoding, one or more of entropy decoding, scanning, dequantization, and inverse-transform of a block are used.
  • 6. The method of claim 1, wherein: scanned information is generated based on the first information for updating the multiple parameters, and the scanned information includes scanned quantized gradients.
  • 7. A method for providing information for generating updated multiple parameters, comprising: generating first information for the updated multiple parameters; and generating a bitstream comprising the first information, wherein the first information for the updated multiple parameters is generated based on residuals of multiple parameters, the residuals of the multiple parameters are differences between the updated multiple parameters and the multiple parameters, the first information for the updated multiple parameters includes second information for the residuals of the multiple parameters, the second information is generated by performing encoding on the residuals of the multiple parameters, and the residuals of the multiple parameters are information used to generate the updated multiple parameters based on the multiple parameters.
  • 8. The method of claim 7, wherein the multiple parameters are deep-learning parameters that configure one layer of a deep-learning model.
  • 9. The method of claim 7, wherein, when the encoding is performed, a method for encoding a block of an image is used, and the multiple parameters represent a context of the block.
  • 10. The method of claim 7, wherein, when the encoding is performed, a method for encoding a block of an image is used, and the multiple parameters are used to determine motion information of the block.
  • 11. The method of claim 8, wherein, in order to perform the encoding, one or more of entropy encoding, scanning, quantization, and transform of a block are used.
  • 12. The method of claim 8, wherein: the first information for the updated multiple parameters represents scanned information, and the scanned information includes scanned quantized gradients.
  • 13. A non-transitory computer-readable recording medium storing the bitstream generated by the method of claim 7.
  • 14. A non-transitory computer-readable recording medium storing a bitstream, the bitstream comprising: first information for updating multiple parameters, wherein the multiple parameters are updated using the first information for updating the multiple parameters, residuals of the multiple parameters are generated based on the first information for updating the multiple parameters, the residuals of the multiple parameters are added to the multiple parameters, respectively, the residuals of the multiple parameters are acquired by performing decoding on second information for the residuals of the multiple parameters included in the first information for updating the multiple parameters, and the multiple parameters are updated using the residuals of the multiple parameters.
  • 15. The non-transitory computer-readable recording medium of claim 14, wherein the multiple parameters are deep-learning parameters that configure one layer of a deep-learning model.
  • 16. The non-transitory computer-readable recording medium of claim 14, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine a context of the block.
  • 17. The non-transitory computer-readable recording medium of claim 14, wherein, when the decoding is performed, a method for decoding a block of an image is used, and the multiple parameters are used to determine motion information of the block.
  • 18. The non-transitory computer-readable recording medium of claim 14, wherein, in order to perform the decoding, one or more of entropy decoding, scanning, dequantization, and inverse-transform of a block are used.
  • 19. The non-transitory computer-readable recording medium of claim 14, wherein: scanned information is generated based on the first information for updating the multiple parameters, and the scanned information includes scanned quantized gradients.
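The claims describe the residual codec in terms of image-block coding tools: quantization, scanning of a block, and entropy coding (with dequantization and inverse scanning on the decoder side). The following sketch shows quantization plus a zigzag (anti-diagonal) scan and their inverses; the step size, scan order, and function names are illustrative assumptions, and the entropy-coding stage is omitted.

```python
import numpy as np

STEP = 0.01  # illustrative quantization step; not specified in the claims

def zigzag_indices(n):
    """Anti-diagonal (zigzag) scan order for an n-by-n block."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()  # alternate traversal direction per diagonal
        order.extend(diag)
    return order

def encode_residual_block(residual):
    """Quantize the residual block, then scan it into a 1-D sequence."""
    q = np.rint(residual / STEP).astype(np.int64)
    return [int(q[i, j]) for i, j in zigzag_indices(q.shape[0])]

def decode_residual_block(scanned, n):
    """Inverse-scan the quantized sequence, then dequantize it."""
    q = np.zeros((n, n), dtype=np.int64)
    for value, (i, j) in zip(scanned, zigzag_indices(n)):
        q[i, j] = value
    return q * STEP

residual = np.array([[0.02, -0.01], [0.00, 0.03]])
scanned = encode_residual_block(residual)   # would feed an entropy coder next
restored = decode_residual_block(scanned, 2)
assert np.allclose(restored, residual, atol=STEP / 2)
```

The quantization step bounds the reconstruction error of each residual by half the step size, trading fidelity of the parameter update for a shorter bitstream.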
Priority Claims (2)
Number Date Country Kind
10-2017-0172827 Dec 2017 KR national
10-2018-0160774 Dec 2018 KR national
Continuations (1)
Number Date Country
Parent 16772557 Jun 2020 US
Child 18306075 US