This application is based upon and claims the priority of Chinese Patent Application No. 201910810199.7, filed on Aug. 29, 2019, the entire contents of which are incorporated herein by reference.
Embodiments of the present disclosure generally relate to neural networks, and more particularly, to a method and a device for subnetwork sampling and a method and device for building a hypernetwork topology.
Neural networks have been applied widely in various fields. In some fields such as neural architecture search (NAS), a method that generates an independent neural network for each search and acquires an evaluation index by training suffers from low evaluation efficiency, which may greatly restrict the speed of a search algorithm.
According to a first aspect of the embodiments of the present disclosure, a method for subnetwork sampling is provided, which may be applied to a hypernetwork topology, the hypernetwork topology including n layers, each layer including at least two substructures, each substructure including a batch normalization (BN) module in one-to-one correspondence with a substructure of a closest upper layer, n>0 and n being a positive integer. The method includes: a substructure A(N) of an N-th layer is selected, 1<N≤n; a selected substructure A(N-1) of an (N−1)-th layer is determined; a BN module C(B) in one-to-one correspondence with the substructure A(N-1) is determined from the substructure A(N); and the substructure A(N) is added into a subnetwork through the BN module C(B).
According to a second aspect of the embodiments of the present disclosure, a method for building a hypernetwork topology is provided. The method includes: an n-layer structure is built, n>0 and n being a positive integer; m substructures are arranged in each layer, m>0; for each layer, an N-th layer, of a second layer to an n-th layer, m BN modules are arranged in each substructure; and for each layer, the N-th layer, of the second layer to the n-th layer, a one-to-one correspondence is established between each BN module and a substructure of an (N−1)-th layer.
According to a third aspect of the embodiments of the present disclosure, a device for subnetwork sampling is provided, which may be applied to subnetwork sampling in a hypernetwork topology, the hypernetwork topology including n layers, each layer including at least two substructures, each substructure including a BN module in one-to-one correspondence with a substructure of a closest upper layer, n>0 and n being a positive integer. The device includes: a memory configured to store an instruction; and a processor configured to execute the instruction stored in the memory to: select a substructure A(N) of an N-th layer, 1<N≤n; determine a selected substructure A(N-1) of an (N−1)-th layer; determine, from the substructure A(N), a BN module C(B) in one-to-one correspondence with the substructure A(N-1); and add the substructure A(N) into a subnetwork through the BN module C(B).
According to a fourth aspect of the embodiments of the present disclosure, a device is provided, which may include a memory configured to store an instruction; and a processor configured to execute the instruction stored in the memory to: build an n-layer structure, n>0 and n being a positive integer; arrange m substructures in each layer, m>0; for each layer, an N-th layer, of a second layer to an n-th layer, arrange m batch normalization (BN) modules in each substructure; and for each layer, the N-th layer, of the second layer to the n-th layer, establish a one-to-one correspondence between each BN module and a substructure of an (N−1)-th layer.
It is to be understood that the above general descriptions and detailed descriptions below are only exemplary and explanatory and not intended to limit the embodiments of the present disclosure.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. The following description refers to the accompanying drawings in which the same numbers in different drawings represent the same or similar elements unless otherwise represented. The implementations set forth in the following description of exemplary embodiments do not represent all implementations consistent with the present disclosure. Instead, they are merely examples of apparatuses and methods consistent with aspects related to the present disclosure as recited in the appended claims.
Terms used in the present disclosure are only adopted for the purpose of describing specific embodiments and not intended to limit the present disclosure. “A/an” and “the” in a singular form in the present disclosure and the appended claims are also intended to include a plural form, and a plural form is intended to include a singular form too, unless other meanings are clearly denoted throughout the present disclosure. It is also to be understood that term “and/or” used in the present disclosure refers to and includes any or all possible combinations of one or more of the associated listed items.
It is to be understood that, although terms “first”, “second”, “third” and the like may be adopted to describe various information in the present disclosure, the information should not be limited to these terms. These terms are only adopted to distinguish the information of the same type. For example, without departing from the scope of the present disclosure, first information may also be called second information and, similarly, second information may also be called first information.
In some NAS methods, a hypernetwork including all search network structure spaces is trained, all substructures in the hypernetwork may share parameters when different subnetworks are constructed, and after the hypernetwork is trained to a certain extent, subnetwork sampling and index evaluation may be performed without retraining the subnetworks. A substructure may usually include a plurality of batch normalization (BN) modules, and each BN module may perform batch normalization on characteristics output by its closest upper layer to overcome training difficulties caused by changes in data distribution of an intermediate layer. A BN module may be an essential part of a neural network.
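As a concrete illustration (a minimal sketch assuming PyTorch; the tensor shape and variable names are hypothetical), a BN module normalizes, per channel, the characteristics output by a substructure of the closest upper layer:

```python
import torch
import torch.nn as nn

# Characteristics output by a substructure of the closest upper layer:
# a batch of 8 samples with 16 channels at 32x32 spatial resolution.
features = torch.randn(8, 16, 32, 32)

# A BN module normalizes each channel using mini-batch statistics and
# learned per-channel scale/shift parameters, stabilizing the data
# distribution seen by the next substructure during training.
bn = nn.BatchNorm2d(num_features=16)
normalized = bn(features)  # same shape, normalized per channel
```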
In some examples, each layer of a hypernetwork may include multiple selectable substructures, and in a subnetwork path in a hypernetwork topology, each substructure of each layer may include a single BN module connected with every substructure of its closest upper layer. In such case, the BN module of each substructure may be required to learn a BN parameter for the output characteristic of every substructure of its closest upper layer. When the output characteristics of different substructures of the closest upper layer are greatly different, the BN module of the present layer may not reach an ideal learning state. Therefore, after a subnetwork is sampled, training the subnetwork may need to be continued to ensure that a parameter of each BN module in the subnetwork may reach the ideal learning state for the subnetwork path (i.e., for the substructure of the closest upper layer connected with the BN module). In such case, additional training time may be required by the sampled subnetwork. In other words, in such an existing hypernetwork topology, after a subnetwork is sampled, the subnetwork further needs to be trained, which requires additional time and operation cost.
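To make the problem concrete, the hypothetical sketch below (assuming PyTorch; the class name and the convolution standing in for the substructure's operation are illustrative) shows such a substructure with one shared BN module: whichever upper-layer substructure produces the input, the same BN statistics and affine parameters are applied, so a single module must fit several potentially very different feature distributions.

```python
import torch.nn as nn

class SharedBNSubstructure(nn.Module):
    """Substructure with one BN module shared across all possible
    upper-layer inputs (the configuration described above)."""

    def __init__(self, channels):
        super().__init__()
        # A single BN module, regardless of which upper-layer
        # substructure produced the incoming characteristics.
        self.bn = nn.BatchNorm2d(channels)
        self.op = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        # x may come from any substructure of the closest upper layer,
        # but the same BN parameters normalize all of them.
        return self.op(self.bn(x))
```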
Embodiments of the present disclosure provide methods for subnetwork sampling.
Referring to the accompanying drawings, a method S10 for subnetwork sampling may include the following operations.
In S11, a substructure A(N) of an N-th layer is selected, 1<N≤n.
When a subnetwork is sampled, a substructure may need to be selected from each layer of the hypernetwork topology and subjected to a connection operation. In the N-th layer of the hypernetwork topology, the substructure A(N) may be selected according to a requirement.
In S12, a selected substructure A(N-1) of an (N−1)-th layer is determined.
The selected substructure A(N-1) of the (N−1)-th layer, i.e., the closest upper layer, may be determined as a basis for selection of a BN module from the N-th layer.
In S13, a BN module C(B) in one-to-one correspondence with the substructure A(N-1) is determined from the substructure A(N).
The corresponding BN module in the selected substructure of the present layer may be determined according to the determined selected substructure of the closest upper layer, thereby ensuring personalized normalization over characteristics output by different substructures of the closest upper layer, so that a more accurate result is obtained at the substructure of the present layer.
In S14, the substructure A(N) is added into a subnetwork through the BN module C(B).
The substructure A(N) of the present layer may be added into the subnetwork through the BN module C(B) in one-to-one correspondence with the selected substructure of the closest upper layer, thereby ensuring that data from different sources is normalized by different, dedicated BN parameters, i.e., personalized processing of different sources.
According to the method of the embodiment, a substructure may be selected in each layer together with the BN module in that substructure, thereby ensuring that a sampled subnetwork requires no additional training time and can be used directly.
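As a minimal sketch of operations S11 to S14 (assuming the hypernetwork is represented as a list of layers, each a list of substructure objects exposing a hypothetical `bn_modules` sequence indexed by upper-layer substructure position; all names are illustrative):

```python
import random

def sample_subnetwork(supernet):
    """Sample one substructure per layer (operations S11 to S14).

    `supernet` is a list of layers; each layer is a list of substructure
    objects; each substructure in layers 2..n is assumed to hold one BN
    module per substructure of the closest upper layer, indexed by that
    substructure's position.
    """
    path = []
    # First layer: select a substructure A(1); it has no closest upper
    # layer, so no BN module selection is needed.
    prev_choice = random.randrange(len(supernet[0]))
    path.append((0, prev_choice, None))
    for layer_idx in range(1, len(supernet)):
        # S11: select a substructure A(N) of the N-th layer.
        choice = random.randrange(len(supernet[layer_idx]))
        substructure = supernet[layer_idx][choice]
        # S12/S13: the substructure selected at the (N-1)-th layer
        # determines which BN module C(B) to use in A(N).
        bn_module = substructure.bn_modules[prev_choice]
        # S14: add A(N) into the subnetwork through C(B).
        path.append((layer_idx, choice, bn_module))
        prev_choice = choice
    return path
```

The key point is that the BN module is indexed by the choice made at the closest upper layer, so no BN parameter ever has to fit more than one input distribution.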
In an example, the operation that the substructure A(N) is added into the subnetwork through the BN module C(B) may include that: the BN module C(B) is connected with the substructure A(N-1) to add the substructure A(N) into the subnetwork. The BN module may be connected with the substructure of the closest upper layer and may receive a characteristic output by the substructure of the closest upper layer, thereby adding the substructure A(N) of the present layer into the subnetwork.
In an example, the substructure may further include an output module configured to output a characteristic; and the operation that the BN module C(B) is connected with the substructure A(N-1) may further include that: the BN module C(B) is connected with the output module of the substructure A(N-1). Each substructure may be configured to receive data, perform data processing and output characteristic data, and the characteristic data may be output through the output module of the substructure, so that the BN module may be connected with the output module of the substructure of the closest upper layer to receive the characteristic data and perform normalization processing. Therefore, efficiency of the subnetwork is improved.
Through the above embodiment, the substructures may adopt corresponding BN modules for different substructures of their respective closest upper layers, so that independent processing may be carried out for characteristics output by different substructures, a good training effect can be ensured, and a sampled subnetwork requires no additional training time.
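The per-source selection can also be expressed inside a substructure's forward pass. The following is a hedged sketch (assuming PyTorch; the convolution stands in for an arbitrary substructure operation) of a substructure holding one BN module per upper-layer substructure:

```python
import torch.nn as nn

class DynamicBNSubstructure(nn.Module):
    """Substructure holding one BN module per substructure of the
    closest upper layer (names and the convolution are illustrative)."""

    def __init__(self, channels, num_upper_substructures):
        super().__init__()
        # One BN module per upper-layer substructure, in one-to-one
        # correspondence with its position in the layer above.
        self.bn_modules = nn.ModuleList(
            [nn.BatchNorm2d(channels) for _ in range(num_upper_substructures)]
        )
        self.op = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x, upper_idx):
        # Normalize x with the BN module corresponding to the upper-layer
        # substructure that produced it, then apply this substructure's
        # own operation and output the result.
        return self.op(self.bn_modules[upper_idx](x))
```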
An embodiment of the present disclosure also provides a method S20 for building a hypernetwork topology. As shown in the accompanying drawings, the method may include the following operations.
In S21, an n-layer structure is built, n>0 and n being a positive integer.
An exemplary constructed hypernetwork topology is shown in the accompanying drawings.
In S22, m substructures are arranged in each layer, m>0.
Multiple substructures configured for characteristic extraction may be arranged in each layer, and parameters may be shared among the substructures.
In S23, for each of the second layer to the n-th layer, m BN modules are arranged in each substructure.
In each of the second layer to the n-th layer, each substructure may need to receive characteristics output by the m substructures of the closest upper layer, and the characteristics output by each substructure of the closest upper layer may be different, so that m BN modules (matching the number of substructures of the closest upper layer) may also be arranged in each substructure of the second layer to the n-th layer. The BN modules may be configured to perform BN on the output characteristics of the closest upper layer to overcome the training difficulties caused by changes in data distribution of intermediate layers.
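For reference, each BN module applies the standard batch normalization transform to an incoming characteristic x:

```latex
\hat{x} = \frac{x - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^{2} + \epsilon}},
\qquad y = \gamma \hat{x} + \beta
```

where \mu_{\mathcal{B}} and \sigma_{\mathcal{B}}^{2} are the mean and variance over the mini-batch, \epsilon is a small constant for numerical stability, and \gamma and \beta are learned scale and shift parameters. Because each of the m BN modules keeps its own \gamma, \beta and running statistics, each module only ever has to model the output distribution of the one upper-layer substructure it corresponds to.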
In S24, for each layer, an N-th layer, of the second layer to the n-th layer, a one-to-one correspondence between each BN module and a substructure of the (N−1)-th layer is established.
After the m BN modules are arranged, a one-to-one correspondence between each BN module and a substructure of a closest upper layer may be established to ensure there is one BN module corresponding to each substructure of the closest upper layer. In subsequent training or subnetwork sampling, a corresponding BN module may be determined according to a selected substructure of a closest upper layer, through which data normalization processing is performed in a substructure of a present layer.
For the hypernetwork topology constructed by the method S20 for building a hypernetwork topology, training efficiency may be improved, and a sampled subnetwork may be directly used without additional training.
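A minimal sketch of operations S21 to S24 (assuming PyTorch and reusing the hypothetical DynamicBNSubstructure class sketched earlier; the treatment of the first layer, which has no closest upper layer, is an assumption):

```python
import torch.nn as nn

def build_supernet(n, m, channels):
    """Build an n-layer hypernetwork with m substructures per layer
    (S21, S22); each substructure of layers 2..n holds m BN modules,
    one per substructure of the closest upper layer (S23, S24)."""
    layers = nn.ModuleList()
    # First layer: no closest upper layer, so a single BN module per
    # substructure is assumed here.
    layers.append(nn.ModuleList(
        [DynamicBNSubstructure(channels, num_upper_substructures=1)
         for _ in range(m)]
    ))
    # Layers 2..n: m BN modules per substructure, in one-to-one
    # correspondence with the m substructures of the layer above.
    for _ in range(1, n):
        layers.append(nn.ModuleList(
            [DynamicBNSubstructure(channels, num_upper_substructures=m)
             for _ in range(m)]
        ))
    return layers
```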
In an example, the selection unit 110 may be further configured to select a substructure A(1) of a first layer.
In an example, the connection unit 140 may be further configured to connect the BN module C(B) with the substructure A(N-1) to add the substructure A(N) into the subnetwork.
In an example, each substructure may further include an output module configured to output a characteristic; and the connection unit 140 may be further configured to connect the BN module C(B) with the output module of the substructure A(N-1).
With respect to the device 100 for subnetwork sampling in the above embodiment, the specific manners for performing operations for individual units therein have been described in detail in the embodiment regarding the method, which will not be repeated herein.
With respect to the device 200 for building a hypernetwork topology in the above embodiment, the specific manners for performing operations for individual units therein have been described in detail in the embodiment regarding the method, which will not be repeated herein.
The processing component 302 is typically configured to control overall operations of the device 300, such as the operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 302 may include one or more processors 320 to execute instructions to perform all or part of the operations in the abovementioned method. Moreover, the processing component 302 may include one or more modules which facilitate interaction between the processing component 302 and other components. For instance, the processing component 302 may include a multimedia module to facilitate interaction between the multimedia component 308 and the processing component 302.
The memory 304 is configured to store various types of data to support the operation of the device 300. As examples, such data may be instructions for any application programs or methods operated on the device 300, contact data, phonebook data, messages, pictures, video, etc. The memory 304 may be implemented by any type of volatile or non-volatile memory devices, or a combination thereof, such as a Static Random Access Memory (SRAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), an Erasable Programmable Read-Only Memory (EPROM), a Programmable Read-Only Memory (PROM), a Read-Only Memory (ROM), a magnetic memory, a flash memory, and a magnetic or optical disk.
The power component 306 is configured to provide power for various components of the device 300. The power component 306 may include a power management system, one or more power supplies, and other components associated with generation, management and distribution of power for the device 300.
The multimedia component 308 may include a screen for providing an output interface between the device 300 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes the TP, the screen may be implemented as a touch screen to receive an input signal from the user. The TP includes one or more touch sensors to sense touches, swipes and gestures on the TP. The touch sensors may not only sense a boundary of a touch or swipe action but also detect a duration and pressure associated with the touch or swipe action. In some embodiments, the multimedia component 308 may include a front camera and/or a rear camera. The front camera and/or the rear camera may receive external multimedia data when the device 300 is in an operation mode, such as a photographing mode or a video mode. Each of the front camera and the rear camera may be a fixed optical lens system or have focusing and optical zooming capabilities.
The audio component 310 is configured to output and/or input an audio signal. For example, the audio component 310 includes a Microphone (MIC), and the MIC is configured to receive an external audio signal when the device 300 is in the operation mode, such as a call mode, a recording mode and a voice recognition mode. The received audio signal may further be stored in the memory 304 or sent through the communication component 316. In some embodiments, the audio component 310 further includes a speaker configured to output the audio signal.
The I/O interface 312 is configured to provide an interface between the processing component 302 and a peripheral interface module, and the peripheral interface module may be a keyboard, a click wheel, a button and the like. The button may include, but is not limited to, a home button, a volume button, a starting button and a locking button.
The sensor component 314 may include one or more sensors configured to provide status assessment in various aspects for the device 300. For instance, the sensor component 314 may detect an on/off status of the device 300 and relative positioning of components, such as a display and small keyboard of the device 300, and the sensor component 314 may further detect a change in a position of the device 300 or a component of the device 300, presence or absence of contact between the user and the device 300, orientation or acceleration/deceleration of the device 300 and a change in temperature of the device 300. The sensor component 314 may include a proximity sensor configured to detect presence of an object nearby without any physical contact. The sensor component 314 may also include a light sensor, such as a Complementary Metal Oxide Semiconductor (CMOS) or Charge Coupled Device (CCD) image sensor, configured for use in an imaging APP. In some embodiments, the sensor component 314 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 316 is configured to facilitate wired or wireless communication between the device 300 and other equipment. The device 300 may access a communication-standard-based wireless network, such as a Wireless Fidelity (WiFi) network, a 4th-Generation (4G) or 5th-Generation (5G) network or a combination thereof. In an exemplary embodiment, the communication component 316 receives a broadcast signal or broadcast associated information from an external broadcast management system through a broadcast channel. In an exemplary embodiment, the communication component 316 further includes a Near Field Communication (NFC) module to facilitate short-range communication. In an exemplary embodiment, the communication component 316 may be implemented based on a Radio Frequency Identification (RFID) technology, an Infrared Data Association (IrDA) technology, an Ultra-WideBand (UWB) technology, a Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 300 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components, and is configured to execute the abovementioned method.
In an exemplary embodiment, there is also provided a computer-readable storage medium including an instruction, such as the memory 304 including an instruction, and the instruction may be executed by the processor 320 of the device 300 to implement the abovementioned method. For example, the computer-readable storage medium may be a ROM, a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disc, an optical data storage device and the like.
The device 400 may further include a power component 426 configured to execute power management of the device 400, a wired or wireless network interface 450 configured to connect the device 400 to a network, and an I/O interface 458. The device 400 may be operated based on an operating system stored in the memory 442, for example, Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.
In the embodiments of the present disclosure, a dynamic BN module, i.e., a BN module in one-to-one correspondence with a closest upper layer, may be arranged in a substructure of a hypernetwork topology, and when a subnetwork is sampled, the corresponding BN module may be used according to the selected substructure of a closest upper layer, so that the sampled subnetwork requires no additional training time and may be directly used. Therefore, efficiency is improved.
Other implementation solutions of the present disclosure will be apparent to those skilled in the art based on the specification and implementation of the embodiments of the present disclosure. This application is intended to cover any variations, uses, or adaptations of the embodiments of the present disclosure following the general principles thereof and including such departures from the embodiments of the present disclosure as come within known or customary practice in the art. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the embodiments of the present disclosure as indicated by the following claims.
It will be appreciated that the embodiments of the present disclosure are not limited to the exact construction that has been described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from the scope thereof. It is intended that the scope of the embodiments of the present disclosure only be limited by the appended claims.