This application is based on and claims priority under 35 U.S.C. § 119 to Korean Patent Application No. 10-2021-0179966, filed on Dec. 15, 2021, in the Korean Intellectual Property Office, the disclosure of which is incorporated by reference herein in its entirety.
The present disclosure relates to neural network devices and electronic systems including the same.
Interest in neuromorphic processors that perform neural network operations has increased. For example, research on a neuromorphic processor including a neuron circuit and a synaptic circuit has been conducted. Such neuromorphic processors can be used in neural network devices for driving various neural networks such as convolutional neural networks (CNN), recurrent neural networks (RNN), feedforward neural networks (FNN), etc., and can be used in fields including data classification, image recognition, autonomous control, speech recognition, etc.
Provided are neural network devices and electronic systems including the same. The technical objectives to be achieved by the present disclosure are not limited to the technical objectives as described above or below, and other technical objectives may be inferred from the following embodiments.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments of the disclosure.
According to an aspect of an embodiment, a neural network device includes a plurality of word lines extending in a first direction, a plurality of bit lines extending in a second direction intersecting the first direction, and a plurality of memory cells at points where the plurality of word lines and the plurality of bit lines intersect, wherein each of the plurality of memory cells comprises at least two ferroelectric memories connected in parallel along a corresponding word line of the plurality of word lines.
According to another aspect of an embodiment, an electronic system includes the neural network device described above, a non-transitory memory, and a processor configured to control a function of the neural network device by executing programs stored in the memory, wherein the neural network device performs a neural network operation based on input data received from the processor and generates an information signal corresponding to the input data based on a result of the neural network operation.
The above and other aspects, features, and advantages of certain embodiments of the disclosure will be more apparent from the following description taken in conjunction with the accompanying drawings, in which:
Reference will now be made in detail to some embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In this regard, the present embodiments may have different forms and should not be construed as being limited to the descriptions set forth herein. Accordingly, the embodiments are merely described below, by referring to the figures, to explain aspects. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
The terms used in the embodiments are, as far as possible, general terms that are currently in wide use, selected in consideration of the functions in the embodiments. However, the terms may vary depending on the intention of those skilled in the art, precedent, the emergence of new technologies, etc. In addition, in specific cases, some terms are arbitrarily selected, and in those cases the meaning of the terms will be described in detail in the description of the corresponding embodiment. Therefore, the terms used in the present embodiments are not simple labels but should be defined based on their meaning and the contents of the present embodiments.
In the descriptions of the embodiments, when a part is described as being connected to another part, this includes not only the case where it is directly connected but also the case where it is electrically connected with another component therebetween. It will be understood that when a portion includes a component, this does not exclude other components; rather, other components may further be included, unless otherwise stated.
The terms “consisting of” or “comprising” used herein should not be construed as necessarily including all of the various components and steps described in the specification; rather, some components or steps may not be included, and/or additional components or steps may further be included.
It will be understood that although the terms “first,” “second,” etc., may be used herein to describe various components, these components should not be limited by these terms. The terms are only used to distinguish one component from another.
The description of the following embodiments should not be construed as limiting the scope of the rights, and what those skilled in the art can easily infer should be construed as belonging to the scope of the embodiments. Hereinafter, embodiments for illustration only will be described in detail with reference to the accompanying drawings.
Referring to
In the convolution layer, a first feature map FM1 may correspond to an input feature map, and a second feature map FM2 may correspond to an output feature map. A feature map may represent a data set in which various features of input data are expressed. The first and second feature maps FM1 and FM2 may be high-dimensional matrices of two or more dimensions and may have respective activation parameters. The feature maps FM1 and FM2 may correspond to three-dimensional feature maps (for example, the feature maps FM1 and FM2 may have a width W (also referred to as a column), a height (also referred to as a row), and a depth C). In this case, the depth C may correspond to the number of channels in the corresponding feature map.
In the convolution layer, a convolution operation may be performed on the first feature map FM1 and a weight map WM. As a result, the second feature map FM2 may be generated. The weight map WM may filter the first feature map FM1 and may also be referred to as a weight filter and/or weight kernel. In an example, the depth of the weight map WM (e.g., the number of channels of the weight map WM) may be the same as the depth of the first feature map FM1 (e.g., the number of channels of the first feature map FM1). The weight map WM may be shifted to traverse the first feature map FM1 in a sliding-window manner. During each shift, each of the weights included in the weight map WM may be multiplied by the corresponding feature values in the area overlapping the first feature map FM1, and the products may be added together. As the first feature map FM1 and the weight map WM are convoluted, one channel of the second feature map FM2 may be generated.
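The sliding-window convolution described above can be sketched as follows. This is a minimal single-channel illustration with toy values (stride 1, no padding), not the device's actual implementation:

```python
import numpy as np

def conv2d_single_channel(fm1, wm):
    """Slide the weight map WM over the input feature map FM1.

    At each shift, the overlapping region of FM1 is multiplied
    element-wise by the weights and summed, producing one value of the
    output feature map FM2 (one channel, stride 1, no padding)."""
    h, w = fm1.shape
    kh, kw = wm.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(fm1[r:r + kh, c:c + kw] * wm)
    return out

fm1 = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 input feature map
wm = np.ones((2, 2))                            # toy 2x2 weight map
fm2 = conv2d_single_channel(fm1, wm)            # 3x3 output feature map
```

Each output value is one position of the sliding window; a multi-channel convolution would additionally sum these results over the input channels.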
Although one weight map WM is indicated in
Referring to
In some example embodiments, the neural network 2 may be a DNN and/or n-layers neural networks including two or more hidden layers, as described above. For example, as shown in
Each of the layers included in the neural network 2 may include a plurality of channels. The plurality of channels may correspond to a plurality of artificial nodes, known as neurons, processing elements (PEs), units, and/or similar terms. For example, as shown in
The channels included in each of the layers of the neural network 2 may be connected to one another to process data. For example, one channel may receive data from other channels to perform an arithmetic operation and/or to output the result of the arithmetic operation to other channels.
An input and an output of each of the channels may be referred to as an input activation and an output activation, respectively. For example, an activation may be an output of one channel and/or a parameter corresponding to an input of channels included in the next layer. Also, each of the channels may determine its own activation based on the activations received from channels included in the previous layer and the weights. A weight, a parameter used to calculate the output activation at each channel, may be a value allocated to a connection relationship between the channels.
Each of the channels may be processed by a computational unit and/or processing element that receives an input and outputs an output activation, and the input and output of each of the channels may be mapped. For example, when σ is an activation function, wjki is a weight from a k-th channel included in an (i−1)-th layer to a j-th channel included in an i-th layer, bji is a bias of the j-th channel included in the i-th layer, and aji is an activation of the j-th channel included in the i-th layer, the activation aji may be calculated using Equation 1 below.

aji=σ(Σk(wjki×aki-1)+bji)  [Equation 1]
As shown in
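Equation 1 can be illustrated with a short sketch. The sigmoid used for σ and the toy weights, biases, and activations below are assumptions for illustration only:

```python
import numpy as np

def layer_activations(prev_acts, weights, biases):
    """Compute a_j^i = sigma(sum_k(w_jk^i * a_k^(i-1)) + b_j^i) for
    every channel j of layer i, using a sigmoid as sigma (assumed)."""
    z = weights @ prev_acts + biases   # weighted sum plus bias per channel
    return 1.0 / (1.0 + np.exp(-z))    # activation function sigma

prev_acts = np.array([0.5, -0.2])      # activations of layer i-1 (toy values)
weights = np.array([[0.1, 0.8],        # w_jk: row j, column k (toy values)
                    [-0.4, 0.3]])
biases = np.array([0.0, 0.1])
acts = layer_activations(prev_acts, weights, biases)
```

Each row of the weight matrix holds the weights feeding one channel of layer i, so the matrix-vector product performs the sum over k in Equation 1 for all channels at once.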
As described above, in the neural network 2, numerous data sets may be exchanged between a plurality of interconnected channels, and an arithmetic operation process may be performed while passing through layers. In this computation process, numerous multiply-accumulate (MAC) operations may be performed, and numerous memory access operations need to be performed together for loading, at an appropriate time, the activations and weights on which the MAC operations are to be performed.
Also, in a general digital computer, a computational unit and memory are separated from each other, and a Von Neumann structure including a common data bus for data transmission between two separated blocks may be used. Thus, in the process of implementing the neural network 2 in which data movement and operations are continuously repeated, a lot of time may be required for data transmission, and excessive power may be consumed. In order to solve these problems, an in-memory computing circuit has been proposed as an architecture that integrates the memory and the computational unit for performing the MAC operation.
Hereinafter, the in-memory computing circuit will be described in more detail with reference to
Referring to
The analog crossbar array 30 may include a plurality of word lines 310, a plurality of bit lines 320, and a plurality of memory cells 330. In at least one example, when the in-memory computing circuit 3 is used to implement a neuromorphic processor, the plurality of word lines 310 may correspond to lines for receiving an input from a presynaptic neuron circuit, and the plurality of bit lines 320 may correspond to lines for transmitting an output to postsynaptic neurons. In addition, the plurality of memory cells 330 may correspond to synapse circuits for storing information about connection intensity between the presynaptic neuron circuit and the postsynaptic neurons.
The plurality of word lines 310 may be used to receive input data. For example, when the number of the plurality of word lines 310 is N (where N represents a natural number), voltages V1, V2, . . . , and VN, corresponding to input activations, may be applied to N word lines. The plurality of bit lines 320 may intersect the plurality of word lines 310. For example, when the number of the plurality of bit lines 320 is M (where M represents a natural number), the plurality of bit lines 320 and the plurality of word lines 310 may intersect one another at intersecting points of N×M.
Also, the plurality of memory cells 330 may be arranged at the intersecting points of the plurality of word lines 310 and the plurality of bit lines 320. Each of the plurality of memory cells 330 may be implemented with non-volatile memory for storing weights. However, embodiments are not limited thereto, and each of the plurality of memory cells 330 may be volatile memory.
In the example shown in
The ADC 50 may convert the result (e.g., the current sum I1, . . . , and IM) of the analog MAC operation output from the analog crossbar array 30 into a digital signal. The result of the MAC operation converted into the digital signal may be output from the ADC 50 and may be used in a procedure of a subsequent neural network operation.
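The analog MAC and ADC stages described above can be modeled numerically as a sketch. By Ohm's and Kirchhoff's laws, the current on each bit line is the voltage-conductance dot product over the word lines; the conductance values, 4-bit resolution, and full-scale current below are hypothetical:

```python
import numpy as np

def crossbar_mac(voltages, conductances):
    """Analog MAC on an N x M crossbar: the current summed on bit
    line m is I_m = sum_n(V_n * G_nm)."""
    return voltages @ conductances           # shape (M,)

def adc(currents, n_bits=4, full_scale=1.0):
    """Quantize the analog bit-line currents into n_bits digital codes
    (resolution and full-scale current are hypothetical)."""
    levels = 2 ** n_bits - 1
    codes = np.round(np.clip(currents / full_scale, 0, 1) * levels)
    return codes.astype(int)

v = np.array([0.2, 0.5, 0.1])    # input activations as word-line voltages (N=3)
g = np.array([[0.3, 0.6],        # weights as cell conductances (N x M, M=2)
              [0.2, 0.1],
              [0.9, 0.4]])
i_out = crossbar_mac(v, g)       # analog current sums per bit line
digital = adc(i_out)             # MAC result converted to digital codes
```

The digital codes would then feed the subsequent neural network operation, as described above.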
In
In the neural network device according to the present disclosure, a ferroelectric memory is used to implement the memory cell, while the resistance (or conductance) of the memory cell may be driven to change linearly with respect to the voltage applied to the memory cell. Hereinafter, a neural network device according to the present disclosure will be described in detail with reference to
Referring to
The plurality of word lines WL1, WL2, . . . , and WLi may extend in a first direction, and the plurality of bit lines BL1, BL2, . . . , and BLj may extend in a second direction intersecting the first direction. In
The plurality of memory cells 40 may be disposed at points where the plurality of word lines WL1, WL2, . . . , and WLi and the plurality of bit lines BL1, BL2, . . . , and BLj intersect. Because the number of the plurality of word lines WL1, WL2, . . . , and WLi is i and the number of the plurality of bit lines BL1, BL2, . . . , and BLj is j, the number of the plurality of memory cells 40 may be i×j. Each of the plurality of memory cells 40 may include at least two ferroelectric memories connected in parallel along a word line corresponding to each of the plurality of memory cells 40.
For example, as shown in
Also, each of the plurality of bit lines BL1, BL2, . . . , and BLj may include sub-bit lines respectively connected to ferroelectric memories included in one memory cell among memory cells connected to each bit line. For example, a bit line may include an appropriate number of sub-bit lines such that the number of sub-bit lines corresponds to the number of ferroelectric memories. In the example illustrated in
For example, in
Each of the ferroelectric memories 410, 420, 430, and 440 may include at least one of a capacitor having a metal-ferroelectric-metal structure, a ferroelectric tunnel junction (FTJ) element, and a ferroelectric field-effect transistor (FeFET), but embodiments are not necessarily limited thereto. The ferroelectric memories 410, 420, 430, and 440 may all include the same type of element and/or may include different types of elements.
Each of the plurality of memory cells 40 may further include a selection element for selectively accessing the ferroelectric memories 410, 420, 430, and 440 included in each of the plurality of memory cells 40. Thus, at least some of the plurality of memory cells 40 may be selected, and at least some of the ferroelectric memories included in the selected memory cells may be selected. For example, a memory cell arranged at an intersecting point of the word line WL2 and the bit line BL2 among the plurality of memory cells 40 may be selected by the selection element, and an access to the ferroelectric memories included in the memory cell may be allowed.
For example, each of the plurality of memory cells 40 may include a transistor, a threshold switch, and/or the like, which is connected in series to each of the ferroelectric memories 410, 420, 430, and 440. However, embodiments are not limited thereto, and each of the plurality of memory cells 40 may include any element that enables selective access to the ferroelectric memories 410, 420, 430, and 440.
When the selection element is a transistor, a word line may be connected to a gate terminal of the transistor, and a ferroelectric memory connected to the transistor may be selected by a voltage applied through the word line. However, the example embodiments are not limited thereto, and a separate control line may also be connected to the gate terminal of the transistor. The transistor may be a metal oxide semiconductor (MOS) transistor, a bipolar transistor, and/or the like, but the example embodiments are not limited thereto.
The threshold switch is an element allowing the flow of a current only when a difference between the voltages applied to both of its ends is equal to or greater than a threshold value, and it may perform an operation similar to that of the transistor. For example, the threshold switch may control an access to a ferroelectric memory connected to the threshold switch based on a difference between a voltage applied to an input terminal of a word line and a voltage applied to an output terminal of a bit line.
Each of the ferroelectric memories 410, 420, 430, and 440 has a conductance (or resistance). The conductance may correspond to a weight indicating connection intensity between a presynaptic neuron circuit connected to an input terminal of the word line and a postsynaptic neuron circuit connected to an output terminal of the bit line. For example, the ferroelectric memories 410, 420, 430, and 440 included in the memory cell arranged at an intersecting point of the word line WL2 and the bit line BL2 may have conductances corresponding to weights w221, w222, w223, and w224, respectively.
A synthesis conductance of the ferroelectric memories 410, 420, 430, and 440 included in each of the plurality of memory cells 40 may correspond to a weight stored in each of the plurality of memory cells 40. For example, a synthesis conductance W22 (=w221+w222+w223+w224) of the ferroelectric memories 410, 420, 430, and 440 included in the memory cell 40 arranged at the intersecting point of the word line WL2 and the bit line BL2 may correspond to the weight stored in the corresponding memory cell 40.
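Because conductances of parallel-connected elements simply add, the synthesis conductance is the sum of the per-memory conductances. A minimal sketch with hypothetical values:

```python
def synthesis_conductance(cell_conductances):
    """Conductances of ferroelectric memories connected in parallel
    add, so the cell weight is W = sum_k(w_k)."""
    return sum(cell_conductances)

# e.g. the four memories at the WL2/BL2 intersection (values hypothetical)
w22 = [0.10, 0.25, 0.40, 0.15]
W22 = synthesis_conductance(w22)  # weight stored in that memory cell
```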
The weight stored in each of the plurality of memory cells 40 may be linearly updated. Linear updating of the weight stored in each of the plurality of memory cells 40 means that the conductance of each of the plurality of memory cells 40 is changed linearly in proportion to the magnitude of the voltage applied to each of the plurality of memory cells 40. For example, the linear change may provide more precise and/or accurate updates to the memory cells compared to memory cells including a single ferroelectric memory, which have nonlinear state-change characteristics. The reason the memory cell 40 included in the neural network device 4 according to the present disclosure may have linear state-change characteristics will be described in further detail with reference to
Referring to
The neural network device 4 may apply different voltages to the ferroelectric memories 410, 420, 430, and 440 included in each of the plurality of memory cells 40 in the process of updating the weight stored in each of the plurality of memory cells 40 so as to train the neural network implemented by the neural network device 4.
For example, as shown in
In this way, different voltages may be applied to the ferroelectric memories 410, 420, 430, and 440 connected in parallel. In the neural network device 4 according to the present disclosure, the ferroelectric memories 410, 420, 430, and 440 connected in parallel may constitute one memory cell 40, and different voltages may be applied to the ferroelectric memories 410, 420, 430, and 440 during training (programming) so that linearity of the weight stored in the memory cell 40 may be secured. The reasons for securing the linearity of the weight by the above-described training method will be described in detail with reference to
Referring to
Conductance changes in the ferroelectric memory corresponding to the section a of the voltage-current characteristic curve 60 may be represented by a potentiation and depression (PD) curve 610, and conductance changes in the ferroelectric memory corresponding to a section b of the voltage-current characteristic curve 60 may be represented by a PD curve 620, and conductance changes in the ferroelectric memory corresponding to a section c of the voltage-current characteristic curve 60 may be represented by a PD curve 630. All of the PD curve 610, the PD curve 620, and the PD curve 630 may have nonlinear characteristics with respect to voltage.
Consider, however, a case where a ferroelectric memory following the PD curve 620 and a ferroelectric memory following the PD curve 630 are connected in parallel. When the two ferroelectric memories are identical elements, the fact that one follows the PD curve 620 corresponding to the section b while the other follows the PD curve 630 corresponding to the section c means that different voltages were applied to the two ferroelectric memories during training. The synthesis conductance of the two ferroelectric memories connected in parallel may correspond to a PD curve 625 that results from normalizing and combining the PD curve 620 and the PD curve 630. Comparing the PD curve 610 with the PD curve 625, the PD curve 625 may be more linear with respect to voltage changes.
As described above, even if each ferroelectric memory has nonlinear state-change characteristics, when a plurality of ferroelectric memories are connected to each other in parallel to configure one memory cell, and state changes are induced by applying different voltages to the parallel-connected ferroelectric memories, a synthesis conductance having linear state-change characteristics may be obtained. Because the synthesis conductance corresponds to the weight of the memory cell, the weight of the memory cell may be linearly updated.
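The linearization effect can be sketched numerically. The exponential PD model below and its parameters are assumptions, chosen only to illustrate how combining two nonlinear curves biased into different operating sections (one concave, one convex, the latter modeled here by mirroring) yields a more linear synthesis curve:

```python
import numpy as np

def pd_curve(pulse_idx, alpha):
    """Nonlinear PD model: normalized conductance after each pulse,
    saturating exponentially (hypothetical, not measured device data)."""
    return (1 - np.exp(-alpha * pulse_idx)) / (1 - np.exp(-alpha * pulse_idx[-1]))

pulses = np.arange(1, 13)                    # 12 programming pulses
g_b = pd_curve(pulses, alpha=0.5)            # memory in a concave section (like curve 620)
g_c = 1 - pd_curve(pulses, alpha=0.5)[::-1]  # memory in a convex section (like curve 630)
g_cell = (g_b + g_c) / 2                     # normalized synthesis conductance (like curve 625)

ideal = pulses / pulses[-1]                  # perfectly linear update
# The combined curve deviates less from the ideal straight line
# than the individual concave curve does.
assert np.abs(g_cell - ideal).max() < np.abs(g_b - ideal).max()
```

The individual nonlinearities partially cancel when the conductances are summed, which is the mechanism the parallel-connected memory cell exploits.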
Referring back to
Referring to
A difference (e.g., ΔV of
However, the example embodiments are not necessarily limited thereto, and the different voltages applied to the ferroelectric memories included in one memory cell may be determined by any appropriate method, provided that the nonlinear state characteristics of each ferroelectric memory are offset in the process of synthesizing conductances by the parallel connection. The different voltages applied to the ferroelectric memories included in one memory cell may be set differently according to the number of ferroelectric memories included in one memory cell. Also, the different voltages applied to the ferroelectric memories included in one memory cell may be set differently according to a minimum voltage and/or step size of a voltage (e.g., Vtrain of
Also, in
Hereinafter, an experimental design that may show that the nonlinear state-change characteristics of each ferroelectric memory may be offset in a process of synthesizing conductances by parallel connection will be described with reference to
A voltage (e.g., Vtrain of
In particular, a pulse whose amplitude increases from Vpi in steps of a voltage interval Vps may be applied during the first through sixth cycles, and a pulse whose absolute value increases from −Vdi in steps of a voltage interval Vds may be applied during the seventh through twelfth cycles. Vpi may be set greater than a minimum voltage for potentiation ferroelectric switching in the ferroelectric memory, and Vdi (e.g., the absolute value of −Vdi) may be set greater than a minimum voltage for depression ferroelectric switching in the ferroelectric memory.
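The 12-cycle pulse scheme described above can be sketched as follows; the specific values of Vpi, Vps, Vdi, and Vds are hypothetical:

```python
def training_pulses(vpi, vps, vdi, vds, n_cycles=12):
    """Generate the pulse train described above: potentiation pulses
    growing from Vpi in steps of Vps (first half of the cycles), then
    depression pulses growing in magnitude from -Vdi in steps of Vds
    (second half of the cycles)."""
    half = n_cycles // 2
    potentiation = [vpi + k * vps for k in range(half)]
    depression = [-(vdi + k * vds) for k in range(half)]
    return potentiation + depression

# Hypothetical values; Vds > Vps because depression requires more energy.
pulses = training_pulses(vpi=1.0, vps=0.1, vdi=1.2, vds=0.15)
```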
Because the ferroelectric memory has an asymmetric material structure, more energy may be required when electric polarization is depressed than when it is potentiated. Thus, Vds, the voltage interval during depression, may be greater than Vps, the voltage interval during potentiation. However, embodiments are not necessarily limited thereto, and Vds and Vps may be properly set according to the experiment method. Also, an example in which the number of pulses is 12 is shown in
When a voltage corresponding to
Referring to
In this way, the memory cell according to the present disclosure has linear characteristics with respect to voltage changes, and thus the synthesis conductance of the memory cell may be more precisely controlled. Further, as the synthesis conductance of the memory cell is precisely controlled, the number of distinct steps of the weight corresponding to the synthesis conductance of the memory cell may be increased, and multi-value characteristics may be enhanced. When the multi-value characteristics are enhanced, a more elaborate neural network may be implemented.
Referring to
When the inference is performed using the trained neural network, the neural network device 4 may connect output terminals of sub-bit lines so that currents flowing through each of the sub-bit lines are summed and output. For example, as shown in
Because the plurality of memory cells 40 have conductances corresponding to weights and voltages V1, V2, . . . , and Vi input through the plurality of word lines WL1, WL2, . . . , and WLi correspond to input activations, summation currents I1, I2, . . . , and Ij output through a plurality of bit lines BL1, BL2, . . . , and BLj may correspond to the result of a neural network operation (e.g., a multiply-accumulate (MAC) operation) performed in an analog manner.
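The inference-time current summation can be sketched as follows, with hypothetical voltages and per-memory conductances. Tying the sub-bit-line output terminals together corresponds to summing over the parallel memories of each cell before the word-line summation:

```python
import numpy as np

def bitline_current(voltages, sub_conductances):
    """During inference the sub-bit-line outputs are tied together, so
    the current of bit line j sums over word lines i and over the
    parallel ferroelectric memories k of each cell:
    I_j = sum_i(V_i * sum_k(w_ijk)) = sum_i(V_i * W_ij)."""
    cell_weights = sub_conductances.sum(axis=-1)  # W_ij = sum_k(w_ijk)
    return voltages @ cell_weights                # I_j per bit line

v = np.array([0.3, 0.7])               # word-line voltages (i = 2, toy values)
# conductances w_ijk: 2 word lines x 1 bit line x 4 parallel memories
w = np.array([[[0.1, 0.2, 0.3, 0.4]],
              [[0.2, 0.2, 0.2, 0.2]]])
i_out = bitline_current(v, w)          # summation current per bit line
```

The result corresponds to the analog MAC operation described above, with the parallel memories of each cell contributing their synthesis conductance as the stored weight.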
Also, the neural network device 4 may include a switching circuit (not shown) that controls output terminals of the sub-bit lines to be connected to each other or to be connected to DC voltages. For example, the neural network device 4 may control the switching circuit so that the output terminals of the sub-bit lines are connected to the DC voltages during training, as shown in
Referring to
The electronic system 11 may include a processor 1110, random access memory (RAM) 1120, a neural network device 1130, a memory 1140, a sensor module 1150, and a communication module 1160. The electronic system 11 may further include an input/output module, a security module, a power control device, and/or the like. Some of hardware configurations of the electronic system 11 may be mounted on at least one semiconductor chip.
The processor 1110 may control the overall operation of the electronic system 11. The processor 1110 may include a single processor core (Single Core) and/or a plurality of processor cores (Multi-Core). The processor 1110 may process and/or execute programs and/or data stored in the memory 1140. In some embodiments, the processor 1110 may execute the programs stored in the memory 1140, thereby controlling the function of the neural network device 1130. The processor 1110 may be implemented with (and/or include) at least one of a central processing unit (CPU), a graphics processing unit (GPU), an application processor (AP), and/or the like.
The RAM 1120 may be a non-transitory computer readable storage and may temporarily store programs, data, and/or instructions. For example, the programs and/or data stored in the memory 1140 may be temporarily stored in the RAM 1120 according to control and/or booting code of the processor 1110. The RAM 1120 may be implemented with memory such as dynamic RAM (DRAM) or static RAM (SRAM).
The neural network device 1130 may perform an operation of the neural network based on the received input data and may generate an information signal based on the result of performing. The neural network may include CNN, RNN, FNN, Deep Belief Networks, Restricted Boltzmann Machines, and/or the like but is not limited thereto. The neural network device 1130 may be (or include) a hardware accelerator dedicated to the neural network and/or a device including the same. The neural network device 1130 may perform an operation of reading and/or writing in addition to the operation of the neural network.
The neural network device 1130 may correspond to the neural network device 4 described with reference to
The information signal may include one of a variety of recognition signals, such as speech recognition signals, object recognition signals, image recognition signals, biometric information recognition signals, and the like. For example, the neural network device 1130 may receive frame data included in a bitstream as input data and may generate recognition signals for an object included in an image represented by the frame data from the frame data. However, embodiments are not limited thereto, and the neural network device 1130 may receive various types of input data according to the type and/or function of the electronic device on which the electronic system 11 is mounted and may generate recognition signals according to the input data.
The memory 1140 is a storage place for storing data and may store an operating system (OS), various types of programs, and various types of data. In an embodiment, the memory 1140 may store intermediate results generated during the operation execution of the neural network device 1130.
The memory 1140 may be a non-transitory computer readable storage and may be, e.g., a DRAM. However, embodiments are not limited thereto. The memory 1140 may include at least one of volatile memory and nonvolatile memory. The nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), flash memory, phase-change RAM (PRAM), magnetic RAM (MRAM), resistive RAM (RRAM), ferroelectric RAM (FRAM), and/or the like. The volatile memory may include a DRAM, an SRAM, a synchronous DRAM (SDRAM), a PRAM, an MRAM, a RRAM, a ferroelectric RAM (FeRAM), and/or the like. In an embodiment, the memory 1140 may include at least one of a hard disk drive (HDD), a solid-state drive (SSD), a compact flash (CF), secure digital (SD), micro secure digital (Micro-SD), mini secure digital (Mini-SD), and a memory stick.
The sensor module 1150 may collect information around the electronic device on which the electronic system 11 is mounted. The sensor module 1150 may sense and/or receive a signal (e.g., an image signal, a sound signal, a magnetic signal, a bio-signal, a touch signal, and/or the like) from an outside of the electronic device and may convert the sensed and/or received signal into data. To this end, the sensor module 1150 may include at least one of various types of sensing devices, such as a sensing device, e.g., a microphone, an image capturing device, an image sensor, a light detection and ranging (LIDAR) sensor, an ultrasonic sensor, an infrared ray sensor, a bio sensor, and a touch sensor.
In at least one embodiment, the sensor module 1150 may provide the converted data as input data to the neural network device 1130. For example, the sensor module 1150 may include an image sensor, may generate a video stream by capturing an external environment of the electronic device and may provide a consecutive data frame of the video stream to the neural network device 1130 as input data in order. However, embodiments are not limited thereto, and the sensor module 1150 may provide various types of data to the neural network device 1130.
The communication module 1160 may include wired and/or wireless interfaces that communicate with an external device by transmitting (Tx) and/or receiving (Rx) signals. For example, the communication module 1160 may include a local area network (LAN), a wireless local area network (WLAN) such as wireless fidelity (Wi-Fi), a wireless personal area network (WPAN) such as Bluetooth, a wireless universal serial bus (USB), Zigbee, near field communication (NFC), radio-frequency identification (RFID), power line communication (PLC), a communication interface connectable to a mobile cellular network (such as 3rd generation (3G), 4th generation (4G), 5th generation (5G), long term evolution (LTE), and/or the like), etc.
It should be understood that embodiments described herein should be considered in a descriptive sense only and not for purposes of limitation.
Descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in other embodiments. While one or more embodiments have been described with reference to the figures, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope as defined by the following claims.