This application is based upon and claims the benefit of priority from Japanese Patent Application No. 2021-102877, filed on Jun. 22, 2021; the entire contents of which are incorporated herein by reference.
Embodiments described herein relate generally to a semiconductor integrated circuit and an arithmetic logic operation system.
Semiconductor integrated circuits can execute a sum-of-product operation that adds a plurality of results obtained by multiplying each of a plurality of input values by a weight. In this event, the sum-of-product operation is desired to be efficient.
In general, according to one embodiment, there is provided a semiconductor integrated circuit including a plurality of storage devices, a plurality of multiplication circuits, one or more capacitive devices, and an adder circuit. The plurality of storage devices are arranged in a form of a plurality of rows. Each of the storage devices is configured to store a bit position value of a weight of multiple bits. The plurality of multiplication circuits are arranged in a form of a plurality of rows. The plurality of multiplication circuits are configured to multiply a plurality of input voltages by the weight of multiple bits to generate a plurality of multiplication results. The one or more capacitive devices are configured to accumulate charges corresponding to the plurality of multiplication results. The adder circuit is configured to generate an output voltage corresponding to the total value of the charges accumulated in the one or more capacitive devices. The plurality of input voltages have different amplitudes. Each of the input voltages is associated with a corresponding bit position of the weight.
Exemplary embodiments of an arithmetic logic operation system will be explained below in detail with reference to the accompanying drawings. The present invention is not limited to the following embodiments.
An arithmetic logic operation system according to a first embodiment can be used, for example, in an artificial intelligence (AI) accelerator to execute a part of the processing of a neural network. The AI accelerator executes a sum-of-product operation that multiplies a multi-bit input by a multi-bit weight and adds the results of the multiplication, for learning a task or making an inference. The arithmetic logic operation system needs to perform such an arithmetic logic operation in parallel at high speed and with efficiency. In one example, the arithmetic logic operation system performs the sum-of-product operation in parallel for a plurality of neurons in a given layer in a neural network (parallel operation). The parallel operation includes a multiplication operation between a multi-bit input and a multi-bit weight with arbitrary bit precision.
The parallel operation may be implemented in hardware by a semiconductor integrated circuit of a configuration in which a memory circuit and an arithmetic logic operation circuit are arranged integrally or physically close to each other. The semiconductor integrated circuit has a weight that is set fixedly rather than dynamically for multiplication. Such configuration of the semiconductor integrated circuit includes a digital-based configuration, an analog current-based configuration, an analog charge-based configuration, and the like.
In the digital-based configuration, the calculation is executed in a memory device by performing the arithmetic logic operation corresponding to a weight bit for an input bit in a bit-wise logical operation circuit arranged in the memory. The analog current-based configuration applies a voltage to each row of the array of variable resistive devices (or elements) with programmed weights and adds current values using Kirchhoff's law.
On the other hand, the analog charge-based configuration performs the arithmetic logic operation for a logical product of multiple bits of the input and multiple bits of the weight in parallel on a bit-by-bit basis and causes a voltage corresponding to a result obtained by performing the arithmetic logic operation for the multiple bits to be held in a plurality of capacitive devices (or elements) of a capacitive array. Then, it sequentially reads out the charges of the plurality of capacitive devices in the capacitive array and performs analog-to-digital (A/D) conversion to generate a digital signal, while performing the addition of digital signal by repeating the bit shift and adding the digital signals.
The analog charge-based configuration is superior to the digital-based configuration in that the arithmetic logic operation is faster when performed with medium bit precision. The analog charge-based configuration is superior to the analog current-based configuration in that the influence of variation between devices is easily reduced.
However, in the analog charge-based configuration, the bit shift operation for the addition of signals is performed for the number of times proportional to the number of bits of at least one of an input and a weight. Thus, the increase in the number of bits of at least one of an input and a weight is more likely to cause a decrease in the parallel operation efficiency. The processing time required for the parallel operation is more likely to increase depending on the bit precision of at least one of an input and a weight.
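As a rough illustration of why this cost scales with bit precision, the following Python sketch models the digital shift-and-add step of such a configuration. The function name `shift_and_add` and the representation of the digitized per-bit results as a list of integers (MSB first) are assumptions made for illustration and are not taken from the embodiment.

```python
def shift_and_add(partial_results):
    """Combine per-bit digital results (MSB first) by repeated shift-and-add.

    Hypothetical model: one shift-and-add step is counted per bit position,
    so the step count grows in proportion to the number of bits.
    """
    accumulator = 0
    steps = 0
    for value in partial_results:
        accumulator = (accumulator << 1) + value  # shift by one bit, then add
        steps += 1
    return accumulator, steps


total, steps = shift_and_add([1, 0, 1])
print(total, steps)
```

In this simplified model, doubling the number of bit positions doubles the number of shift-and-add steps, which is the dependence on bit precision described above.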
Thus, according to the present embodiment, the semiconductor integrated circuit multiplies a plurality of input voltages having a positive signal amplitude by a weight of multiple bits representing a positive integer and generates an output voltage corresponding to an addition result based on the charges corresponding to a plurality of multiplication results. This makes it possible to improve the parallel operation efficiency regardless of the bit precision of any one of input or weight, intending to obtain fixed processing time of parallel operation.
The semiconductor integrated circuit converts each bit of the digital input data into a plurality of input voltages by digital-to-analog (D/A) conversion or the like. The semiconductor integrated circuit multiplies a plurality of input voltages by a weight of multiple bits. The semiconductor integrated circuit generates an output voltage corresponding to the addition result on the basis of the plurality of multiplication results. In one example, the output voltage can be a voltage corresponding to the charge obtained by accumulating a plurality of multiplication results in parallel in a plurality of capacitive devices and collectively redistributing the charges of the plurality of capacitive devices so that the accumulated voltages are averaged. Thus, it is possible to obtain the addition result without performing the bit shift and to improve the efficiency of parallel operation regardless of the bit precision of any one of input or weight, obtaining fixed processing time of parallel operation.
An arithmetic logic operation system 100 including a semiconductor integrated circuit 1 can be configured as illustrated in
The arithmetic logic operation system 100 multiplies the input data Din=(dm-1, dm-2, . . . , d1, d0) having m bits d and a weight vector W=(bn-1, bn-2, . . . , b1, b0) having n bits b and outputs data Dout to be output according to the multiplication results, where n is any integer greater than or equal to 1. The number of bits n of the weight vector W can be equal to the number of bits m of the input data Din. It should be noted that this is not important for this embodiment.
In the arithmetic logic operation system 100, the input circuit 2 converts the digital domain input data Din=(dm−1, dm−2, . . . , d1, d0) into an analog domain input vector X=(VX, VX/2, . . . , VX/2^(n−2), VX/2^(n−1)) by D/A conversion or the like and inputs the result to the semiconductor integrated circuit 1. In other words, the input vector X includes n input voltages VX, VX/2, . . . , VX/2^(n−2), VX/2^(n−1). The semiconductor integrated circuit 1 has the preset weight vector W=(bn−1, bn−2, . . . , b1, b0). Each element of the weight vector W is a digital signal having one bit. The n-bit pattern “bn−1, bn−2, . . . , b1, b0” of the weight vector W represents a positive integer. In the following description, the weight vector W is also simply referred to as a weight W. The n input voltages included in the input vector X correspond to n bits of the weight W. Each input voltage included in the input vector X is an analog signal having a positive signal amplitude corresponding to the position of the corresponding bit of the weight W. A global circuit 10 generates global signals MULT, PRE, and SUM and supplies the signals to an arithmetic logic operation circuit 20.
The semiconductor integrated circuit 1 performs the arithmetic logic operation for the scalar product between the input vector X=(VX, VX/2, . . . , VX/2^(n−2), VX/2^(n−1)) and the weight vector W=(bn−1, bn−2, . . . , b1, b0) on the basis of the global signals MULT, PRE, and SUM and inputs an output voltage Y=X·W=VY as the arithmetic logic operation result to the output circuit 3. The output voltage Y=VY is an analog signal. The output circuit 3 converts the output voltage VY into data Dout to be output by A/D conversion or the like and outputs the converted result. The output data Dout is a digital signal including one or more bits.
The semiconductor integrated circuit 1 can be configured as illustrated in
The semiconductor integrated circuit 1 has the global circuit 10 and the arithmetic logic operation circuit 20. The arithmetic logic operation circuit 20 receives n input voltages VX, . . . , VX/2^(n−1) at a plurality of input nodes 20a_(n−1), . . . , 20a_0, respectively. The n input voltages VX, . . . , VX/2^(n−1) are included in the input vector X. The input voltages VX, . . . , VX/2^(n−1) are applied to the respective corresponding input nodes 20a_(n−1), . . . , 20a_0. Here, n is the bit precision of the weight. The global circuit 10 generates global signals MULT, PRE, and SUM and supplies the signals to the arithmetic logic operation circuit 20. The arithmetic logic operation circuit 20 performs the arithmetic logic operation corresponding to the global signals MULT, PRE, and SUM and outputs the output voltage VY as the arithmetic logic operation result from the output node 20b.
The arithmetic logic operation circuit 20 includes n storage devices 21_(n−1) to 21_0, n multiplication circuits 22_(n−1) to 22_0, n capacitive devices 23_(n−1) to 23_0, and an adder circuit 24, as illustrated in
The n storage devices 21_(n−1) to 21_0 representing an n-bit weight are arranged in the form of n row lines (i.e., row lines (n−1) to 0). The n storage devices 21_(n−1) to 21_0 store the n-bit weight W. The n storage devices 21_(n−1) to 21_0 correspond to the bits bn−1, . . . , b0 included in the weight W, respectively. Each storage device 21 stores and outputs the value of the corresponding bit of the weight W. The storage device 21_(n−1) stores the value of bit bn−1 and outputs a voltage corresponding to the stored value of bit bn−1. The storage device 21_0 stores the value of bit b0 and outputs a voltage corresponding to the stored value of bit b0. The n-bit pattern “bn−1, bn−2, . . . , b1, b0” included in the weight W represents a positive integer.
The n multiplication circuits 22_(n−1) to 22_0 are arranged in the form of n row lines (i.e., row lines (n−1) to 0). The n multiplication circuits 22_(n−1) to 22_0 are connected in parallel to the n input nodes 20a_(n−1) to 20a_0 and the output node 20b via n intermediate nodes 20c_(n−1) to 20c_0, respectively. Each multiplication circuit 22 has a first end connected to the input node 20a and a second end connected to the intermediate node 20c. Each multiplication circuit 22 has a first control terminal that receives the global signal MULT and a second control terminal that receives the output of the storage device 21.
The n multiplication circuits 22_(n−1) to 22_0 equivalently multiply n input voltages by the n-bit weight W to generate n multiplication results. The n multiplication circuits 22_(n−1) to 22_0 correspond to n input voltages and to n bits included in the weight W. Each multiplication circuit 22 equivalently multiplies the corresponding analog signal by the corresponding bit of the weight W to generate a multiplication result. The multiplication circuit 22_(n−1) equivalently multiplies the input signal VX by the value of bit bn−1 to generate a multiplication result (VX)×bn−1. The multiplication circuit 22_0 equivalently multiplies the input signal VX/2^(n−1) by the value of bit b0 to generate a multiplication result (VX/2^(n−1))×b0.
Each multiplication circuit 22 includes a series connection of a switch SW1 and a switch SW2 between the input node 20a and the intermediate node 20c. The switch SW1 turns ON or OFF depending on the global signal MULT. The switch SW2 remains ON or OFF depending on the bit value of the weight W. The switches SW1 and SW2 are examples of a switching device, and each only needs to be constituted by, for example, a transistor.
The switch SW1 of each of the n multiplication circuits 22_(n−1) to 22_0 turns ON when an active level global signal MULT is received from the global circuit 10 at its control terminal and electrically connects the input node 20a and the first end of the switch SW2. The switch SW1 of each of the n multiplication circuits 22_(n−1) to 22_0 turns OFF when a non-active level global signal MULT is received from the global circuit 10 at its control terminal and electrically cuts off the input node 20a and the first end of the switch SW2.
The n multiplication circuits 22_(n−1) to 22_0 respectively correspond to the n storage devices 21_(n−1) to 21_0. The switches SW2 of the n multiplication circuits 22_(n−1) to 22_0 remain ON or OFF depending on the bit value of the weight W output from the respectively corresponding storage devices 21.
The switch SW2 of the multiplication circuit 22_(n−1) remains ON in the case where the value of the bit bn−1 of the weight W output from the storage device 21_(n−1) is 1. In this case, when the switch SW1 turns ON, the input node 20a_(n−1) and the intermediate node 20c_(n−1) are electrically connected.
The switch SW2 of the multiplication circuit 22_(n−1) remains OFF in the case where the value of the bit bn−1 of the weight W output from the storage device 21_(n−1) is 0. In this case, even if the switch SW1 turns ON, the input node 20a_(n−1) and the intermediate node 20c_(n−1) remain electrically cut off.
The switch SW2 of the multiplication circuit 22_0 remains ON in the case where the value of the bit b0 of the weight W output from the storage device 21_0 is 1. In this case, when the switch SW1 turns ON, the input node 20a_0 and the intermediate node 20c_0 are electrically connected.
The switch SW2 of the multiplication circuit 22_0 remains OFF in the case where the value of the bit b0 of the weight W output from the storage device 21_0 is 0. In this case, even if the switch SW1 turns ON, the input node 20a_0 and the intermediate node 20c_0 remain electrically cut off.
The n capacitive devices 23_(n−1) to 23_0 are arranged in the form of n row lines (i.e., row lines (n−1) to 0). The n capacitive devices 23_(n−1) to 23_0 respectively correspond to the n storage devices 21_(n−1) to 21_0 and the n multiplication circuits 22_(n−1) to 22_0. The n capacitive devices 23_(n−1) to 23_0 are connected in parallel between the n intermediate nodes 20c_(n−1) to 20c_0 and a reference node 20d, respectively.
The capacitive devices 23 can have a capacitance value C equal to each other. Each of the capacitive devices 23 has a first end connected to the intermediate node 20c and a second end connected to the reference node 20d. The first end of each capacitive device 23 is connected to the corresponding multiplication circuit 22 via the intermediate node 20c. The reference node 20d is supplied with a reference voltage (e.g., the ground voltage of the arithmetic logic operation circuit 20). The reference node 20d can be shared for the n capacitive devices 23_(n−1) to 23_0.
Each of the multiplication circuits 22 changes whether or not to supply the first end of the capacitive device 23 with the input signal depending on the bit value of the weight W. Thus, equivalently, the n multiplication circuits 22_(n−1) to 22_0 multiply the n input voltages by the weight W of n bits.
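The switch behavior described above can be sketched in Python as follows. This is a minimal behavioral model under stated assumptions, not the circuit itself: the function name `cell_voltage` and the parameter `v_hold` (the voltage already held on the capacitive device) are hypothetical, and settling time and switch resistance are ignored.

```python
def cell_voltage(v_in, mult_active, weight_bit, v_hold=0.0):
    """Voltage at the first end of one capacitive device after the MULT phase.

    SW1 follows the global signal MULT and SW2 follows the stored weight bit.
    The input voltage reaches the capacitive device only when both series
    switches are ON; otherwise the held voltage is unchanged.
    """
    sw1_on = bool(mult_active)
    sw2_on = weight_bit == 1
    return v_in if (sw1_on and sw2_on) else v_hold


print(cell_voltage(0.8, mult_active=True, weight_bit=1))
print(cell_voltage(0.8, mult_active=True, weight_bit=0))
```

With a held voltage of 0 V, the result equals the input voltage multiplied by the weight bit, which is the sense in which the series switches perform the multiplication equivalently.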
The adder circuit 24 is arranged between the n multiplication circuits 22_(n−1) to 22_0 and the n capacitive devices 23_(n−1) to 23_0. The adder circuit 24 redistributes the charges accumulated in the n capacitive devices 23_(n−1) to 23_0 depending on the capacitance ratio and generates the output voltage VY. The output voltage VY corresponds to the addition result obtained by adding n multiplication results. The n capacitive devices 23_(n−1) to 23_0 have capacitance values C equal to each other, so the output voltage VY corresponds to a voltage obtained by averaging the n voltages as n multiplication results. The output voltage VY can be converted into the addition result by multiplying the value by the number of capacitive devices (i.e., n).
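The averaging follows from charge conservation when the capacitive devices are connected together. The Python sketch below illustrates this under the assumption of ideal capacitors with no parasitic capacitance or leakage; the function name `redistribute` is hypothetical.

```python
def redistribute(voltages, capacitances):
    """Common voltage after ideal charge sharing among connected capacitors.

    The total charge sum(C_i * V_i) is conserved and spread over the total
    capacitance sum(C_i).
    """
    total_charge = sum(c * v for c, v in zip(capacitances, voltages))
    return total_charge / sum(capacitances)


held = [1.0, 0.5, 0.0]      # held voltages of three rows
equal_caps = [2.0, 2.0, 2.0]  # equal capacitance values C
v_y = redistribute(held, equal_caps)
print(v_y, v_y * len(held))
```

With equal capacitance values, the shared voltage is the arithmetic mean of the held voltages, so multiplying it by the number of capacitive devices recovers their sum.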
The adder circuit 24 has a switch SW3 between a common node 20e and at least one of the intermediate nodes 20c_(n−1) to 20c_0.
The adder circuit 24 has n switches SW4_(n−1) to SW4_0 between the output node 20b and the n intermediate nodes 20c_(n−1) to 20c_0. The n intermediate nodes 20c_(n−1) to 20c_0 are respectively connected to the first ends of the n capacitive devices 23_(n−1) to 23_0. The n switches SW4_(n−1) to SW4_0, when receiving an active level global signal SUM from the global circuit 10 at their control terminals, turn ON and electrically connect the first ends of the n capacitive devices 23_(n−1) to 23_0 to the output node 20b. The n switches SW4_(n−1) to SW4_0, when receiving a non-active level global signal SUM from the global circuit 10 at their control terminals, turn OFF and electrically cut off the first ends of the n capacitive devices 23_(n−1) to 23_0 from the output node 20b. The n switches SW4_(n−1) to SW4_0 are examples of a switching device, and each only needs to be constituted by, for example, a transistor.
Moreover, the configuration including the storage device 21, the multiplication circuit 22, the capacitive device 23, and the switch SW4 in each row will be referred to as a unit cell UC. The unit cell UC_(n−1) in the row line (n−1) includes the storage device 21_(n−1), the multiplication circuit 22_(n−1), the capacitive device 23_(n−1), and the switch SW4_(n−1). The unit cell UC_0 in the row line 0 includes the storage device 21_0, the multiplication circuit 22_0, the capacitive device 23_0, and the switch SW4_0.
The configuration illustrated in
(1) The bit values of the weight W are allocated from the most significant bit (MSB) to the least significant bit (LSB), and each bit value is stored into the storage device 21. The global signals PRE and SUM are set to the active level, and the global signal MULT is set to the non-active level. Accordingly, the charges of the capacitive devices 23_(n−1) to 23_0 are discharged to the common node 20e, and the accumulated voltages of the capacitive devices 23_(n−1) to 23_0 are reset to VCOM.
(2) The global signal MULT is set to the active level, and the global signals PRE and SUM are set to the non-active level. Accordingly, in the n unit cells UC_(n−1) to UC_0, the weight W is equivalently multiplied by the input voltage, and the charge corresponding to the multiplication result is accumulated in the first end of each of the capacitive devices 23_(n−1) to 23_0. The multiplication and the charge accumulation in the n unit cells UC_(n−1) to UC_0 are performed in parallel with each other. Thus, the accumulated voltage of each capacitive device 23_i becomes VC,i (where i=(n−1) to 0).
(3) The global signal SUM is set to the active level, and the global signals PRE and MULT are set to the non-active level. Accordingly, the charges accumulated in the first ends of the n capacitive devices 23_(n−1) to 23_0 are redistributed depending on the capacitance ratio. The n capacitive devices 23_(n−1) to 23_0 have capacitance values C equal to each other, so the accumulated voltage of each capacitive device 23 becomes an averaged voltage.
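The three cycles above can be sketched as a small behavioral simulation in Python. This is an illustrative model only: the function name `run_three_cycles`, the dictionary encoding of the global signals, and the assumption of ideal, equal capacitive devices are not taken from the embodiment.

```python
def run_three_cycles(inputs, weight_bits, vcom=0.0):
    """Step through cycles (1) to (3) for equal capacitors, MSB row first."""
    phases = [
        {"PRE": 1, "SUM": 1, "MULT": 0},  # cycle (1): reset to VCOM
        {"PRE": 0, "SUM": 0, "MULT": 1},  # cycle (2): multiply and accumulate
        {"PRE": 0, "SUM": 1, "MULT": 0},  # cycle (3): redistribute charges
    ]
    held = [None] * len(inputs)  # held voltages are undefined before reset
    for signals in phases:
        if signals["PRE"] and signals["SUM"]:
            held = [vcom] * len(inputs)
        elif signals["MULT"]:
            # A row takes its input voltage only when its weight bit is 1.
            held = [v if b == 1 else h
                    for v, b, h in zip(inputs, weight_bits, held)]
        elif signals["SUM"]:
            average = sum(held) / len(held)
            held = [average] * len(held)
    return held[0]


print(run_three_cycles([1.0, 0.5, 0.25], [1, 0, 1]))
```

Because every row is updated in the same multiply phase and all rows are averaged in a single sum phase, the number of phases in this model does not depend on the number of rows n.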
The semiconductor integrated circuit 1 can operate as illustrated in
Immediately before timing t1, the global circuit 10 maintains the global signal PRE, the global signal SUM, and the global signal MULT at respective L levels. The semiconductor integrated circuit 1 is set to the common voltage VCOM=0 V (≈ ground potential).
In one example, in the case where n=3 and the weight W=(b2, b1, b0)=(1, 1, 1), the semiconductor integrated circuit 1 stores bit values 1, 1, and 1 in three storage devices 21_2, 21_1, and 21_0, respectively. The multi-bit pattern “111” with the weight W represents a positive integer corresponding to “7” in decimal. The three storage devices 21_2, 21_1, and 21_0 all output an H level that is a logical value corresponding to the bit value 1. Accordingly, all the switches SW2 of three multiplication circuits 22_2, 22_1, and 22_0 remain ON.
The input circuit 2 generates a voltage VX from the 3-bit input data Din=(d2, d1, d0), for example, when m=3, using any method, converts the voltage VX to the input vector X=(VX, VX/2, VX/2^2), and inputs the converted result to the semiconductor integrated circuit 1. The three input voltages VX, VX/2, and VX/2^2 correspond to the multiple bits b2, b1, and b0 included in the weight W, respectively. The input signals VX, VX/2, and VX/2^2 have positive amplitudes depending on the positions of the respective corresponding bits.
In the row line i, an input signal VX,i given in Formula 1 below is applied to the input node 20a_i. The row line i corresponds to bit bi of the weight W. Here, i represents the row line number, also represents the bit position of the weight W, and can have values ranging from n−1 (n is an integer of 2 or more) to 0. In Formula 1, n−1 corresponds to the MSB in multiple bits of the weight W, and 0 corresponds to the LSB.
In one example, when n=3, in the arithmetic logic operation circuit 20, the input voltage VX is applied to the input node 20a_2 in the row line 2, the input voltage VX/2 is applied to the input node 20a_1 in the row line 1, and the input voltage VX/2^2 is applied to the input node 20a_0 in the row line 0.
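The row-to-voltage mapping in this example is consistent with VX,i = VX/2^(n−1−i). That expression is inferred here from the input vector X=(VX, VX/2, . . . , VX/2^(n−1)) and should be read as an assumption about the form of Formula 1. A minimal Python sketch, with the hypothetical function name `row_input_voltage`:

```python
def row_input_voltage(vx, i, n):
    """Input voltage applied to row line i (i = n-1 is the MSB, i = 0 the LSB).

    Assumes the binary-weighted form VX / 2**(n - 1 - i) inferred from the
    input vector described in the text.
    """
    if not 0 <= i < n:
        raise ValueError("row index i must satisfy 0 <= i < n")
    return vx / 2 ** (n - 1 - i)


for row in (2, 1, 0):
    print(row, row_input_voltage(1.0, row, 3))
```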
The operation of Cycle (1) is performed during the period from timings t1 to t2. At timing t1, the global circuit 10 shifts the global signal PRE and the global signal SUM to the H level and maintains them there while maintaining the global signal MULT at the L level. Accordingly, the common voltage VCOM=0 V is applied to the first ends of the three capacitive devices 23_2 to 23_0, and the holding voltages VC,2 to VC,0 of the respective capacitive devices 23_2 to 23_0 are reset to 0 V. In other words, the charges accumulated in the capacitive devices 23_2 to 23_0 are reset. At timing t2, the global circuit 10 shifts the global signal PRE and the global signal SUM to the L level. Accordingly, the reset of the holding voltages VC,2 to VC,0 of the respective capacitive devices 23_2 to 23_0 is completed.
The operation of Cycle (2) is performed during the period from timings t3 to t4. At timing t3, the global circuit 10 shifts the global signal MULT to the H level and maintains it there while maintaining the global signal PRE and the global signal SUM at the L level. Accordingly, the multiplication circuit 22_i multiplies the input voltage VX,i by the bit value bi of the weight W. In other words, in the row line i, the switch SW1 of the multiplication circuit 22_i remains ON. In the case where the bit value bi of the weight W stored in the storage device 21_i is 1, the switch SW2 remains ON, and the input voltage VX,i is applied to the first end of the capacitive device 23_i. In the case where the bit value bi of the weight W stored in the storage device 21_i is 0, the switch SW2 remains OFF, and the input voltage VX,i is not applied to the first end of the capacitive device 23_i. Accordingly, in the case where the bit value bi of the weight W is 1, the holding voltage of the capacitive device 23_i gradually increases from 0 V to VX,i. In the case where the bit value bi of the weight W is 0, the holding voltage of the capacitive device 23_i remains 0 V. In other words, a charge Qi corresponding to the result obtained by multiplying the input voltage VX,i by the bit value bi of the weight W in the multiplication circuit 22_i is accumulated in the capacitive device 23_i. The charge Qi is given as Formula 2 below.
In one example, in the case where n=3 and the weight W=(b2, b1, b0)=(1, 1, 1), the input signal VX is applied to the first end of the capacitive device 23_2 depending on the bit value b2=1 of the weight W in the row line 2, and the holding voltage VC,2 of the capacitive device 23_2 gradually increases from 0 V to VX. Accordingly, the charge Q2=C×VX is accumulated in the capacitive device 23_2. In the row line 1, the input signal VX/2 is applied to the first end of the capacitive device 23_1 depending on the bit value b1=1 of the weight W, and the holding voltage VC,1 of the capacitive device 23_1 gradually increases from 0 V to VX/2. Accordingly, the charge Q1=C×VX/2 is accumulated in the capacitive device 23_1. In the row line 0, the input signal VX/2^2 is applied to the first end of the capacitive device 23_0 depending on the bit value b0=1 of the weight W, and the holding voltage VC,0 of the capacitive device 23_0 gradually increases from 0 V to VX/2^2. Accordingly, the charge Q0=C×VX/2^2 is accumulated in the capacitive device 23_0.
At timing t4, the global circuit 10 shifts the global signal MULT to the L level. Accordingly, the accumulation of charges in the capacitive devices 23_2 to 23_0 is completed, and the holding voltages VC,0 to VC,2 remain.
The operation of Cycle (3) is performed during the period from timings t5 to t6. At timing t5, the global circuit 10 shifts the global signal SUM to the H level and maintains it there while maintaining the global signal PRE and the global signal MULT at the L level. Accordingly, the adder circuit 24 redistributes the charges accumulated in the n capacitive devices 23_(n−1) to 23_0 depending on their respective capacitance ratios. The capacitance values of the n capacitive devices 23_(n−1) to 23_0 are equal, so the charges of the n capacitive devices 23_(n−1) to 23_0 are redistributed so that the accumulated voltages are averaged. Accordingly, the adder circuit 24 generates the output voltage VY and supplies it to the output node 20b. The output voltage VY corresponds to a voltage obtained by averaging the plurality of voltages VC,0 to VC,n−1 obtained as the results of the plurality of multiplications. The output voltage VY can be converted into an addition result by multiplying it by the number of capacitive devices 23 (i.e., n). The output voltage VY is given by Formula 3.
In one example, in the case where n=3 and the weight W=(b2, b1, b0)=(1, 1, 1), the adder circuit 24 redistributes the charges accumulated in the three capacitive devices 23_2 to 23_0 and generates the output voltage VY=(VC,0+VC,1+VC,2)/3={VX+(VX/2)+(VX/2^2)}/3.
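This example can be checked numerically. The Python sketch below uses exact fractions and assumes ideal, equal capacitive devices and the binary-weighted input voltages VX/2^(n−1−i) inferred from the input vector; the function name `output_voltage` is hypothetical.

```python
from fractions import Fraction


def output_voltage(vx, weight_bits):
    """Averaged output voltage VY for weight bits listed MSB first."""
    n = len(weight_bits)
    held = [
        Fraction(vx) * bit / 2 ** (n - 1 - row)  # held voltage of row line `row`
        for row, bit in zip(range(n - 1, -1, -1), weight_bits)
    ]
    return sum(held) / n


n = 3
vx = Fraction(1)
vy = output_voltage(vx, (1, 1, 1))
print(vy)                          # VY as a fraction of VX
print(vy * n * 2 ** (n - 1) / vx)  # rescaled to recover the weight integer
```

For VX=1 the sketch gives VY=7/12, and rescaling by n×2^(n−1)/VX returns 7, the positive integer represented by the bit pattern “111”.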
As illustrated in
As described above, according to the first embodiment, the semiconductor integrated circuit 1 multiplies the plurality of input voltages having a positive signal amplitude by the weight of multiple bits representing a positive integer and accumulates charges corresponding to the plurality of multiplication results in parallel in the plurality of capacitive devices. Then, the charges of the plurality of capacitive devices are collectively redistributed in such a way that the accumulated voltages are averaged, and the averaged accumulated voltage is generated as an output voltage corresponding to the addition result. Thus, it is possible to obtain the addition result without performing the bit shift and to obtain fixed processing time of parallel operation regardless of the bit precision of any one of input or weight.
Moreover, as illustrated in
An arithmetic logic operation system 200 according to a second embodiment is now described. The configurations different from the first embodiment are hereinafter mainly described.
Although the first embodiment illustrates the configuration of the semiconductor integrated circuit in the case where multiple bits of the weight represent a positive integer, the second embodiment illustrates a semiconductor integrated circuit of a case where multiple bits of the weight represent a positive or negative integer.
The arithmetic logic operation system 200 has a semiconductor integrated circuit 201 instead of the semiconductor integrated circuit 1 (see
The n input voltages being input from the input circuit 2 to the semiconductor integrated circuit 201 respectively correspond to n bits bn−1, bn-2, . . . , b1, b0 of the weight W. Each input voltage has a positive or negative signal amplitude corresponding to the position of the individually corresponding bit when the common voltage VCOM is set to 0 V. The input signal received from the input circuit 2 by the input node 20a_(n−1) in the row line (n−1) is given as Formula 4 below. The input signal received from the input circuit 2 by the input node 20a_i in the row line i is given as Formula 5 below. In Formula 5, i=(n−2) to 0.
The arithmetic logic operation circuit 220 performs the arithmetic logic operation for a scalar product Y=X·W of an input vector X=(VCOM−VX, VCOM+VX/2, . . . , VCOM+VX/2^(n−1)) and a weight vector W=(bn−1, bn−2, . . . , b1, b0), where the n-bit pattern “bn−1, bn−2, . . . , b1, b0” of the weight W can represent a negative integer.
For example, in the case where the parameter VX has a positive value and the n-bit patterns of the weight W represent a negative integer, the semiconductor integrated circuit 201 can operate as illustrated in
Immediately before timing t11, the global circuit 10 maintains the global signals PRE, SUM, and MULT at the L level. The semiconductor integrated circuit 201 sets the potential of the common voltage VCOM to an intermediate potential between the ground potential and the power supply potential of the semiconductor integrated circuit 201. In one example, the semiconductor integrated circuit 201 sets the common voltage to VCOM=Vdd/2. Here, Vdd is the power supply potential. The common voltage VCOM has a potential of Vdd/2, but has 0 V (reference value) concerning the signal amplitude of the input vector X. The input voltage has a positive signal amplitude Vin−Vdd/2 if its potential Vin is higher than Vdd/2, and has a negative signal amplitude Vin−Vdd/2 if the potential Vin is lower than Vdd/2. It should be noted that |Vin| should always be smaller than Vdd/2.
In one example, in the case where n=3 and the weight W=(b2, b1, b0)=(1, 1, 1) , the weight W represents a negative integer corresponding to “−1” in decimal.
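The value “−1” is consistent with reading the bit pattern as a two's complement number, in which the MSB carries a negative weight. The source does not name the encoding, so the two's complement reading is an assumption, as is the function name `twos_complement_value` in the Python sketch below.

```python
def twos_complement_value(bits):
    """Signed integer for a bit tuple listed MSB first (two's complement)."""
    n = len(bits)
    value = -bits[0] * 2 ** (n - 1)  # the MSB has negative weight
    for k, bit in enumerate(bits[1:]):
        value += bit * 2 ** (n - 2 - k)  # remaining bits have positive weight
    return value


print(twos_complement_value((1, 1, 1)))
```

For (1, 1, 1) this evaluates −4+2+1=−1, which matches the negative amplitude assigned to the MSB input voltage and the positive amplitudes assigned to the other input voltages.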
During the period from timings t11 to t12, an operation different from Cycle (1) of the first embodiment is performed as described in the following points depending on the common voltage VCOM being Vdd/2. At timing t11, the global circuit 10 shifts the global signals PRE and SUM to the H level and maintains them there while maintaining the global signal MULT at the L level. Accordingly, the holding voltages VC,2 to VC,0 of the capacitive devices 23_2 to 23_0 are reset to VCOM=Vdd/2 (=signal amplitude of 0 V).
During the period from timings t13 to t14, an operation different from Cycle (2) of the first embodiment is performed as described in the following points depending on the common voltage VCOM being Vdd/2. At timing t13, the global circuit 10 shifts and maintains the global signal MULT to the H level while maintaining the global signal PRE and the global signal SUM at the L level. Accordingly, in the case where the bit value bi=1 of the weight W, the holding voltage of the capacitive device 23_i gradually varies from VCOM =Vdd/2 to VX,i. In the case where the parameter VX has a positive value, the input voltage VX,n−1 corresponding to bn−1 of the MSB of the weight W has a negative signal amplitude, and the holding voltage of the capacitive device 23_(n−1) decreases from VCOM to VX,n−1 gradually. In the case where the other input voltage VX,i (where i=(n−2) to 0) has a positive signal amplitude, the holding voltage of the capacitive device 23_i gradually increases from VCOM to VX,i. In the case where the bit value bi=0 of the weight W, the holding voltage of the capacitive device 23_i remains VCOM. In other words, the charge accumulated in the capacitive device 23_(n−1) in the row line (n−1) is given as in Formula 6 below, and the charge accumulated in the capacitive device 23_i in the row line i is given as in Formula 7 below. In Formula 7, i=(n−2) to 0.
In one example, in the case where n=3 and the weight W=(b2, b1, b0)=(1, 1, 1), the input signal VCOM−VX is applied to the first end of the capacitive device 23_2 depending on the bit value b2=1 of the weight W in the row line 2, and the holding voltage VC,2 of the capacitive device 23_2 gradually decreases from VCOM to VCOM−VX. Accordingly, the charge Q2=C×(VCOM−VX) is accumulated in the capacitive device 23_2. In the row line 1, the input signal VCOM+VX/2 is applied to the first end of the capacitive device 23_1 depending on the bit value b1=1 of the weight W, and the holding voltage VC,1 of the capacitive device 23_1 gradually increases from VCOM to VCOM+VX/2. Accordingly, the charge Q1=C×(VCOM+VX/2) is accumulated in the capacitive device 23_1. In the row line 0, the input signal VCOM+VX/2^2 is applied to the first end of the capacitive device 23_0 depending on the bit value b0=1 of the weight W, and the holding voltage VC,0 of the capacitive device 23_0 gradually increases from VCOM to VCOM+VX/2^2. Accordingly, the charge Q0=C×(VCOM+VX/2^2) is accumulated in the capacitive device 23_0.
During the period from timings t15 to t16, an operation similar to Cycle (3) of the first embodiment is performed. The adder circuit 24 collectively redistributes the charges accumulated in the n capacitive devices 23_(n−1) to 23_0 in such a way that the accumulated voltages are averaged, generates the output voltage VY as given in Formula 8 and supplies the output node 20b with the generated voltage.
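The three cycles above can be checked numerically. The following is a minimal sketch, assuming ideal switches and n capacitive devices of equal capacitance C, so that charge redistribution simply averages the holding voltages. The function names are hypothetical and are not part of the embodiment. Under these assumptions the signal amplitude of the output, VY−VCOM, works out to W×VX/(n×2^(n−1)), where W is the two's complement value of the weight.

```python
def held_voltages(bits, vx, vcom):
    """Holding voltages after the multiply cycle.

    bits: weight bits, MSB first (b_{n-1}, ..., b_0), two's complement.
    vx:   parameter VX (may be positive or negative).
    vcom: common voltage VCOM (Vdd/2 in the second embodiment).
    """
    voltages = []
    for pos, bit in enumerate(bits):
        # The MSB row (pos 0) receives the amplitude -VX;
        # every other row receives +VX / 2**pos.
        amplitude = -vx if pos == 0 else vx / 2 ** pos
        # A stored 0 leaves the capacitive device at VCOM.
        voltages.append(vcom + amplitude if bit else vcom)
    return voltages


def output_voltage(bits, vx, vcom):
    """Assumed ideal charge sharing: equal capacitances average the voltages."""
    held = held_voltages(bits, vx, vcom)
    return sum(held) / len(held)


def twos_complement_value(bits):
    """Integer represented by the bit pattern (MSB first)."""
    n = len(bits)
    return -bits[0] * 2 ** (n - 1) + sum(
        b * 2 ** (n - 1 - pos) for pos, b in enumerate(bits) if pos > 0
    )


if __name__ == "__main__":
    vcom, vx, bits = 0.5, 0.4, (1, 1, 1)  # W = -1 for n = 3
    n = len(bits)
    predicted = vcom + twos_complement_value(bits) * vx / (n * 2 ** (n - 1))
    print(output_voltage(bits, vx, vcom), predicted)
```

The same functions cover a negative VX: passing vx=-0.4 flips the sign of every amplitude, so the MSB row rises above VCOM and the other rows fall below it.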
Moreover, the parameter VX that defines each element of the input vector X can have values ranging from −VCOM to +VCOM. The parameter VX can also have a negative value.
For example, in the case where the parameter VX has a negative value and the multi-bit pattern of the weight W represents a negative integer, the semiconductor integrated circuit 201 operates as illustrated in
As illustrated in
During the period from timings t23 to t24, an operation different from Cycle (2) of the first embodiment is performed in the following points because the common voltage VCOM is Vdd/2. At timing t23, the global circuit 10 shifts the global signal MULT to the H level and maintains it there while maintaining the global signal PRE and the global signal SUM at the L level. Accordingly, in the case where the bit value bi of the weight W is 1, the holding voltage VC,i of the capacitive device 23_i gradually varies from VCOM to VX,i. In the case where the parameter VX has a negative value, VX=−|VX|. The input voltage VX,n−1 corresponding to the MSB bn−1 of the weight W has a positive signal amplitude, and the holding voltage of the capacitive device 23_(n−1) gradually increases from VCOM to VX,n−1. Each of the other input voltages VX,i (where i=(n−2) to 0) has a negative signal amplitude, and the holding voltage VC,i of the capacitive device 23_i gradually decreases from VCOM to VX,i. In the case where the bit value bi of the weight W is 0, the holding voltage of the capacitive device 23_i remains VCOM (a signal amplitude of 0 V).
In one example, in the case where n=3 and the weight W=(b2, b1, b0)=(1, 1, 1), the input signal VCOM−VX is transferred to the first end of the capacitive device 23_2 depending on the bit value b2=1 of the weight W in the row line 2, and the holding voltage VC,2 of the capacitive device 23_2 gradually increases from VCOM to VCOM−VX=VCOM+|VX|. Accordingly, the charge Q2=C×(VCOM+|VX|) is accumulated in the capacitive device 23_2. In the row line 1, the input signal VCOM+VX/2 is transferred to the first end of the capacitive device 23_1 depending on the bit value b1=1 of the weight W, and the holding voltage VC,1 of the capacitive device 23_1 gradually decreases from VCOM to VCOM+VX/2=VCOM−|VX|/2. Accordingly, the charge Q1=C×(VCOM−|VX|/2) is accumulated in the capacitive device 23_1. In the row line 0, the input signal VCOM+VX/2^2 is transferred to the first end of the capacitive device 23_0 depending on the bit value b0=1 of the weight W, and the holding voltage VC,0 of the capacitive device 23_0 gradually decreases from VCOM to VCOM+VX/2^2=VCOM−|VX|/2^2. Accordingly, the charge Q0=C×(VCOM−|VX|/2^2) is accumulated in the capacitive device 23_0.
During the period from timings t25 to t26, an operation similar to Cycle (3) of the first embodiment is performed. The adder circuit 24 collectively redistributes the charges accumulated in the n capacitive devices 23_(n−1) to 23_0 in such a way that the accumulated voltages are averaged, generates the output voltage VY as in Formula 8 and supplies the output node 20b with the generated voltage in a similar manner as the operation in
As described above, according to the second embodiment, the semiconductor integrated circuit 201 multiplies a plurality of input voltages having a positive or negative signal amplitude by a multi-bit weight representing a positive or negative integer in two's complement representation. The semiconductor integrated circuit 201 redistributes the charges corresponding to the plurality of multiplication results to generate an output voltage corresponding to the addition result. This also makes it possible to obtain the addition result without performing a bit shift, so that the processing time of the parallel operation is fixed and its efficiency is improved regardless of the bit precision of either the input or the weight.
An arithmetic logic operation system 300 according to a third embodiment is now described. The configurations different from the first embodiment and the second embodiment are hereinafter mainly described.
The first embodiment illustrates the configuration for setting a fixed time in the case of performing the parallel operation for each bit when the multi-bit multiplication is divided for each bit of the weight. However, the third embodiment extends the configuration of the first embodiment to a configuration for achieving the fixed time for the parallel operation of vectors and matrices.
The arithmetic logic operation system 300 has a semiconductor integrated circuit 301 instead of the semiconductor integrated circuit 1. The semiconductor integrated circuit 301 can be configured as illustrated in
The arithmetic logic operation circuit 320 further includes a plurality of word lines WL(N-1, n-1) to WL(0, 0) and a plurality of bit lines BL0 to BL(n−1). A plurality of unit cells UC(N-1, M-1, n-1) to UC(0, 0, 0) is arranged at positions where the plurality of word lines WL and the plurality of bit lines BL intersect. The word lines WL extend in the column direction and are arranged in the row direction. The bit lines BL extend in the row direction and are arranged in the column direction. The unit cells UC are similar to the unit cell UC of the first embodiment.
Moreover, the arithmetic logic operation circuit 20 is expanded to N rows × M columns; accordingly, a driver DV1 as a part of the input circuit 2 (see
The arithmetic logic operation circuit 320 performs the arithmetic logic operation for a scalar product of an input vector X′ expressed in Formula 9 and a weight matrix W′ expressed in Formula 10. A plurality of output voltages VY,j (j=M−1 to 0) included in the output vector Y′=X′W′ expressed in Formula 11 as the arithmetic logic operation result is input to the output circuit 3.
In one example, in the case where the bit pattern of the weight W represents a positive integer, each element vector Xi (i=N−1 to 0) included in the input vector X′ is similar to the input vector X in the first embodiment and includes n input voltages VX,i, VX,i/2, . . . , VX,i/2^(n−2), VX,i/2^(n−1). The input vector X′ includes N×n input voltages, each of which has a positive signal amplitude. Each element vector Wi,j (i=N−1 to 0, j=M−1 to 0) included in the weight matrix W′ is similar to the weight vector W in the first embodiment and includes n bits bn−1, bn−2, . . . , b1, b0. The point that n bits bn−1, bn−2, . . . , b1, b0 are stored in the n storage devices 21_(n−1) to 21_0 (see
In other words, in the arithmetic logic operation circuit 20 of each column, the multiplication circuit 22 (see
As described above, according to the third embodiment, for the parallel operation of vectors and matrices, a plurality of input voltages having a positive signal amplitude is multiplied by a multi-bit weight representing a positive integer, and the charges corresponding to the plurality of multiplication results are redistributed for each column to generate a plurality of output voltages corresponding to the addition results of a plurality of columns. This also makes it possible to obtain the addition result without performing a bit shift and to obtain a fixed processing time for the parallel operation regardless of the bit precision of either the input or the weight.
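The per-column operation can be sketched as follows. This is an illustration only: Formulas 9 to 11 are not reproduced here, so the normalization is an assumption, namely that all N×n capacitive devices on a column have equal capacitance and share charge ideally, and that voltages are expressed as signal amplitudes relative to the reset level. The function name column_outputs is hypothetical.

```python
def column_outputs(x, w_bits):
    """Signal amplitude of each column output VY,j.

    x:      N input parameters VX,i (positive amplitudes).
    w_bits: w_bits[i][j] is the n-bit pattern of Wi,j, MSB first, unsigned.
    """
    n_rows = len(x)
    n_cols = len(w_bits[0])
    n_bits = len(w_bits[0][0])
    outputs = []
    for j in range(n_cols):
        total = 0.0
        for i in range(n_rows):
            for pos, bit in enumerate(w_bits[i][j]):
                # Bit position pos (0 = MSB) sees the input voltage VX,i / 2**pos.
                total += bit * x[i] / 2 ** pos
        # Assumed: the N*n equal capacitors on column j average their voltages.
        outputs.append(total / (n_rows * n_bits))
    return outputs


if __name__ == "__main__":
    x = [0.8, 0.4]                      # N = 2 inputs
    w_bits = [[(1, 0), (1, 1)],         # W0,0 = 2, W0,1 = 3
              [(0, 1), (0, 0)]]         # W1,0 = 1, W1,1 = 0
    print(column_outputs(x, w_bits))
```

Under the stated assumption each output equals the sum over i of VX,i×Wi,j, scaled by 1/(N×n×2^(n−1)), and every column is evaluated in the same three cycles regardless of n.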
Moreover, although not illustrated, in the arithmetic logic operation circuit 320, the arithmetic logic operation circuit 220 according to the second embodiment can be expanded to N rows × M columns and arranged as a plurality of arithmetic logic operation circuits 220(N−1, M−1) to 220(0, 0). Each arithmetic logic operation circuit 220 is basically similar to the arithmetic logic operation circuit 220 of the second embodiment. In this case, the input vector X′ includes N×n input voltages, each with a positive or negative signal amplitude. In addition, the n-bit pattern included in each element vector Wi,j represents a positive or negative integer in two's complement representation.
That is, for the parallel operation of vectors and matrices, a plurality of input voltages having a positive or negative signal amplitude is multiplied by a multi-bit weight representing a positive or negative integer in two's complement representation, and the charges corresponding to the plurality of multiplication results are redistributed for each column to generate a plurality of output voltages corresponding to the addition results of a plurality of columns. This also makes it possible to obtain the addition result without performing a bit shift and to obtain a fixed processing time for the parallel operation regardless of the bit precision of either the input or the weight.
An arithmetic logic operation system 100 according to a fourth embodiment is now described. The configurations different from the first embodiment to the third embodiment are hereinafter mainly described.
Although the first to third embodiments do not describe the specific configuration of the unit cell UC, the fourth embodiment exemplifies the specific configuration of the unit cell UC.
The unit cell UC is arranged at a position where the word line WL and the bit line BL intersect, as illustrated in
A line L11 is arranged at a position where the word line WL and the bit line BL intersect. The line L11 has a first end connected to the word line WL and a second end connected to the bit line BL. The multiplication circuit 22_i in the unit cell UC_i is inserted into the line L11. The series connection of the switch SW1 and the switch SW2 in the multiplication circuit 22_i is inserted into the line L11. The capacitive device 23_i in the unit cell UC_i has a first end connected to the intermediate node 20c on the line L11 and a second end connected to the reference node 20d. The reference node 20d is supplied with a reference voltage (e.g., a ground voltage). The first end of the capacitive device 23_i is connected to the bit line BL via the intermediate node 20c. On the bit line BL, a switch SW4_i is electrically connected between the intermediate node 20c and the output node 20b.
The switch SW1 is controlled to be ON or OFF depending on the global signal MULT from the global circuit 10. The switch SW4_i is controlled to be ON or OFF depending on the global signal SUM from the global circuit 10.
The storage device 21_i in the unit cell UC_i has an output node connected to the control terminal of the switch SW2. The storage device 21_i stores one bit of the weight W and outputs a voltage corresponding to the value of the stored bit to the control terminal of the switch SW2. The switch SW2 remains ON or OFF depending on the voltage output from the storage device 21_i.
The storage device 21_i may be a volatile memory cell (e.g., an SRAM memory cell) or a non-volatile memory cell (e.g., a flash memory cell, a memory cell in which variable resistive devices are connected in two stages with opposite polarities, or the like). In view of the ease of programming and the low power consumption during operation, it is desirable that the storage device 21_i include an SRAM memory cell as illustrated in
The storage device 21_i illustrated in
The transfer transistor T1 is connected between the storage node Nt of the storage device 21_i and a weighting bit line WBL. The transfer transistor T2 is connected between the flip-flop inverting storage node Nc and a weight inverting bit line WBLB. The transfer transistors T1 and T2 turn ON when their gates are supplied with an active level control signal via a weighting word line WWL from the corresponding driver.
In one example, when the transfer transistors T1 and T2 turn ON while the weighting bit line WBL remains at the H level and the weight inverting bit line WBLB remains at the L level, the H level is held in the storage node Nt and the L level is held in the inverting storage node Nc. In other words, the bit value “1” is written in the storage device 21_i. The inverter INV1 and the inverter INV2 operate in a complementary manner, the H level is held in the storage node Nt, and the L level is held in the inverting storage node Nc even after the transfer transistors T1 and T2 turn OFF. In other words, the storage device 21_i holds the bit value “1” and outputs the H level voltage corresponding to the bit value “1” to the control terminal of the switch SW2.
When the transfer transistors T1 and T2 turn ON while the weighting bit line WBL remains at the L level and the weight inverting bit line WBLB remains at the H level, the L level is held in the storage node Nt and the H level is held in the inverting storage node Nc. In other words, the bit value “0” is written in the storage device 21_i. The inverter INV1 and the inverter INV2 operate in a complementary manner, the L level is held in the storage node Nt, and the H level is held in the inverting storage node Nc even after the transfer transistors T1 and T2 turn OFF. In other words, the storage device 21_i holds the bit value “0” and outputs the L level voltage corresponding to the bit value “0” to the control terminal of the switch SW2.
In the unit cell UC_i, the switches SW1, SW2, and SW4_i can be constituted of transistors T7, T8, and T9, respectively. Each of the transistors T7, T8, and T9 can be an NMOS transistor. The transistor T7 has a source connected to the word line WL, a drain connected to the transistor T8, and a gate that receives the global signal MULT. The transistor T8 has a source connected to the transistor T7, a drain connected to the bit line BL via the intermediate node 20c, and a gate that receives a voltage output from the storage node Nt of the storage device 21_i. The transistor T9 has a source that can be connected to the output node 20b, a drain that can be connected to the common voltage VCOM via the switch SW3 (see
Moreover, in the case where the storage device 21_i is a volatile memory cell (SRAM memory cell), the backup power can be supplied from a power storage device (not illustrated) to the power node VDD of the storage device 21_i. Accordingly, the bit value of the weight W can be non-volatilely stored in the storage device 21_i even when the power of the semiconductor integrated circuit 1 is turned OFF. The power storage device is, for example, a secondary battery.
As described above, according to the fourth embodiment, in the unit cell UC_i, the storage device 21_i is constituted of a 6T SRAM cell, and the switches SW1, SW2, and SW4 are constituted of transistors, so it is possible to implement the unit cell UC_i in the 9T1C configuration (SRAMx).
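The switch-level behavior of the unit cell can be sketched as a small behavioral model. This is a simplified illustration, not a circuit simulation: it assumes ideal switches, instantaneous settling, and equal capacitances, and the class and function names are hypothetical.

```python
class UnitCell:
    """Behavioral model of one 9T1C unit cell (hypothetical helper)."""

    def __init__(self):
        self.nt = 0       # storage node Nt; the inverting node Nc is its complement
        self.v_cap = 0.0  # holding voltage of the capacitive device 23_i

    def write(self, wwl, wbl, wblb):
        # T1 and T2 conduct only while the weighting word line WWL is active;
        # WBL and WBLB are driven to complementary levels.
        if wwl and wbl != wblb:
            self.nt = 1 if wbl else 0

    def multiply(self, mult, v_wl):
        # SW1 (T7, gated by MULT) and SW2 (T8, gated by Nt) are in series, so
        # the word-line voltage reaches the capacitor only when both are ON.
        if mult and self.nt:
            self.v_cap = v_wl


def precharge(cells, vcom):
    """PRE and SUM at the H level: every capacitor is reset to VCOM."""
    for cell in cells:
        cell.v_cap = vcom


def share(cells):
    """SUM at the H level: the capacitors on the bit line share charge.

    With equal capacitances (an assumption) the result is the average voltage.
    """
    average = sum(cell.v_cap for cell in cells) / len(cells)
    for cell in cells:
        cell.v_cap = average
    return average


if __name__ == "__main__":
    cells = [UnitCell(), UnitCell()]
    cells[0].write(wwl=1, wbl=1, wblb=0)   # store bit 1
    cells[1].write(wwl=1, wbl=0, wblb=1)   # store bit 0
    precharge(cells, vcom=0.5)
    for cell, v_wl in zip(cells, (0.9, 0.7)):
        cell.multiply(mult=1, v_wl=v_wl)
    print(share(cells))
```

The cell holding bit 0 never connects to its word line, so it contributes only the reset level VCOM to the shared result.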
Moreover, the unit cell UC_i can be configured as illustrated in
The transfer gate TG1 includes transistors T7 and T10 to which their respective sources and drains are commonly connected. The transistors T7 and T10 can be constituted of an NMOS transistor and a PMOS transistor, respectively. The transistor T10 has a source connected to the word line WL, a drain connected to the transfer gate TG2, and a gate receiving the global signal MULT−.
The transfer gate TG2 includes transistors T8 and T11 to which their respective sources and drains are commonly connected. The transistors T8 and T11 can be constituted of an NMOS transistor and a PMOS transistor, respectively. The transistor T11 has a source connected to the transfer gate TG1, a drain connected to the bit line BL via the intermediate node 20c, and a gate that receives a voltage output from the inverting storage node Nc of the storage device 21_i.
The transfer gate TG3 includes transistors T9 and T12 to which their respective sources and drains are commonly connected. The transistors T9 and T12 can be constituted of an NMOS transistor and a PMOS transistor, respectively. The transistor T12 has a source that can be connected to the output node 20b, a drain that can be connected to the common voltage VCOM via the switch SW3 (see
The unit cell UC_i illustrated in
An arithmetic logic operation system 100 according to a fifth embodiment is now described. The configurations different from the first embodiment to the fourth embodiment are hereinafter mainly described.
Although the first embodiment does not describe in detail the configuration of the input circuit of the semiconductor integrated circuit, the fifth embodiment illustrates the specific configuration of the input circuit of the semiconductor integrated circuit.
In the arithmetic logic operation system 100 illustrated in
Each unit configuration 30 can be configured as illustrated in
The D/A converter DA1 converts digital signals Din,i into the analog signals VX,i. A row control signal isMSB indicates whether the bit bi of the weight W of the corresponding row is the MSB. The row control signal isMSB is set to the active level (e.g., 1) when the bit bi of the weight W of the corresponding row is the MSB and is set to the non-active level (e.g., 0) when the bit bi of the weight W of the corresponding row is not the MSB.
The switch SW11 is controlled to be ON or OFF depending on the row control signal isMSB. The switch SW11 receives the row control signal isMSB at its control terminal. The switch SW11 remains ON when its control terminal is supplied with an active level row control signal isMSB, and remains OFF when its control terminal is supplied with a non-active level row control signal isMSB. In other words, the switch SW11 remains ON when the bit bi of the weight W of the corresponding row is the MSB, and the switch SW11 remains OFF when the bit bi of the weight W of the corresponding row is not the MSB.
The switch SW12 is controlled to be ON or OFF depending on the row control signal isMSB− in which the row control signal isMSB is logically inverted. The switch SW12 receives the row control signal isMSB− at its control terminal. The switch SW12 remains OFF when its control terminal is supplied with a non-active level row control signal isMSB−, and remains ON when its control terminal is supplied with an active level row control signal isMSB−. In other words, the switch SW12 remains OFF when the bit bi of the weight W of the corresponding row is the MSB, and the switch SW12 remains ON when the bit bi of the weight W of the corresponding row is not the MSB.
The driver DV1 has a variable gain G. The driver DV1 sets the variable gain G to G=1 when its control terminal is supplied with the active level row control signal isMSB, and sets the variable gain G to G=0.5 when its control terminal is supplied with the non-active level row control signal isMSB. When the variable gain G=1, the drive capacity of the driver DV1 corresponds to a first capacity, and when the variable gain G=0.5, the drive capacity of the driver DV1 corresponds to a second capacity. In other words, the drive capacity of the driver DV1 is controlled to be the first capacity when the bit bi of the weight W in the corresponding row is the MSB. The drive capacity of the driver DV1 is controlled to be the second capacity lower than the first capacity when the bit bi of the weight W in the corresponding row is not the MSB.
The input circuit 2 can be configured by connecting the n unit configurations 30_(n−1) to 30_0 in a ladder shape, as illustrated in
The input node 30c_i of the unit configuration 30_i is connected to the output node 30b_(i+1) of the unit configuration 30_(i+1) one level higher. An input node 30c_(i+1) of the unit configuration 30_(i+1) is connected to an output node 30b_(i+2) of a unit configuration 30_(i+2) one level higher. An input node 30c_(i+2) of the unit configuration 30_(i+2) is connected to an output node 30b_(i+3) of a unit configuration 30_(i+3) one level higher.
The switches SW11 and SW12 of the unit configuration 30_i are controlled to be ON or OFF depending on the row control signals isMSBi and isMSBi−, respectively. The switches SW11 and SW12 of the unit configuration 30_(i+1) are controlled to be ON or OFF depending on the row control signals isMSBi+1 and isMSBi+1−, respectively. The switches SW11 and SW12 of the unit configuration 30_(i+2) are controlled to be ON or OFF depending on the row control signals isMSBi+2 and isMSBi+2−, respectively. The switches SW11 and SW12 of the unit configuration 30_(i+3) are controlled to be ON or OFF depending on the row control signals isMSBi+3 and isMSBi+3−, respectively.
The input signals VX,i+3 to VX,i, which are input to the semiconductor integrated circuit 1 from the plurality of unit configurations 30_(i+3) to 30_i, are driven with a gain of 1 or 0.5 by the driver DV1 depending on the bit precision n of the weight W. In other words, in the unit configuration 30 corresponding to the MSB row among the plurality of rows corresponding to the multiple bits n of the weight W, the variable gain G of the driver DV1 is set to G=1. In the unit configuration 30 corresponding to the other row lines, the variable gain G of the driver DV1 is set to G=0.5.
In one example, in the case of the bit precision n=2 of the weight W, the connection of the circuit illustrated in
In the example illustrated in
Such a configuration enables an analog signal VX,1, which is output from the D/A converter DA1 in the row line (i+3) depending on the input data Din,1, to be transmitted as the input voltage VX,1 to unit cell UC_(i+3) of the arithmetic logic operation circuit 20 via the driver DV1 in the row line (i+3). In addition, the analog signal VX,1 is transmitted to the unit cell UC_(i+2) of the arithmetic logic operation circuit 20 as the input voltage VX,1/2 via the driver DV1 in the row line (i+3) and the driver DV1 in the row line (i+2).
Further, two bits b1 and b0 of a weight W0 are held in two unit cells UC_(i+1) and UC_i. Accordingly, the switch SW11 of the unit configuration 30_(i+1) corresponding to the MSB remains ON, and the switch SW11 of the unit configuration 30_i corresponding to the other bit remains OFF. The switch SW12 of the unit configuration 30_(i+1) corresponding to the MSB remains OFF, and the switch SW12 of the unit configuration 30_i corresponding to the other bit remains ON. Accordingly, the D/A converter DA1 in the row line i is deactivated, and the output node of the driver DV1 in the row line (i+1) is connected to the input node of the driver DV1 in the row line i. The variable gains G of the driver DV1 in the row line (i+1) and the driver DV1 in the row line i are set to 1 and 0.5, respectively.
Such a configuration enables an analog signal VX,0, which is output from the D/A converter DA1 in the row line (i+1) depending on the input data Din,0, to be transmitted as the input voltage VX,0 to unit cell UC_(i+1) of the arithmetic logic operation circuit 20 via the driver DV1 in the row line (i+1). In addition, the analog signal VX,0 is transmitted to the unit cell UC_i of the arithmetic logic operation circuit 20 as the input voltage VX,0/2 via the driver DV1 in the row line (i+1) and the driver DV1 in the row line i.
The connection configuration illustrated in FIG. 14 makes it possible to reduce the number of D/A converters DA1 operating in the input circuit 2 to one-half of the number of rows, decreasing the power consumption of the arithmetic logic operation system 100 including the input circuit 2.
In one example, in the case of the bit precision n=4 of the weight W, the connection of the circuit illustrated in
In the example illustrated in
Such a configuration enables an analog signal VX,0, which is output from the D/A converter DA1 in the row line (i+3) depending on the input data Din,0, to be transmitted as the input voltage VX,0 to unit cell UC_(i+3) of the arithmetic logic operation circuit 20 via the driver DV1 in the row line (i+3). In addition, the analog signal VX,0 is transmitted to the unit cell UC_(i+2) of the arithmetic logic operation circuit 20 as the input voltage VX,0/2 via the driver DV1 in the row line (i+3) and the driver DV1 in the row line (i+2). The analog signal VX,0 is transmitted as the input voltage VX,0/2^2 to the unit cell UC_(i+1) of the arithmetic logic operation circuit 20 via the driver DV1 in the row line (i+3), the driver DV1 in the row line (i+2), and the driver DV1 in the row line (i+1). The analog signal VX,0 is transmitted as the input voltage VX,0/2^3 to the unit cell UC_i of the arithmetic logic operation circuit 20 via the driver DV1 in the row line (i+3), the driver DV1 in the row line (i+2), the driver DV1 in the row line (i+1), and the driver DV1 in the row line i.
The connection configuration illustrated in
As described above, according to the fifth embodiment, in the arithmetic logic operation system 100, the input circuit 2 is configured by connecting a plurality of unit configurations 30 in a ladder shape. Such a configuration makes it possible to reduce the number of D/A converters DA1 operating in the input circuit 2, decreasing the power consumption of the arithmetic logic operation system 100 including the input circuit 2.
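The ladder behavior described above can be sketched as follows, assuming ideal drivers whose gain is exactly 1 in an MSB row and exactly 0.5 in every other row. The function name ladder_inputs and the list-based interface are hypothetical. Rows are listed from the highest row line to the lowest.

```python
def ladder_inputs(is_msb, dac_out):
    """Input voltage delivered to the unit cell of each row.

    is_msb:  row control signals isMSB, highest row first.
    dac_out: analog output of the D/A converter DA1 in each row; only the
             values in MSB rows are used (the other converters are inactive).
    """
    outputs = []
    previous = None
    for flag, v_dac in zip(is_msb, dac_out):
        if flag:
            # SW11 ON, SW12 OFF: the driver takes the DAC output with G = 1.
            v = 1.0 * v_dac
        else:
            if previous is None:
                raise ValueError("the highest row must be an MSB row")
            # SW11 OFF, SW12 ON: the driver takes the output of the row one
            # level higher with G = 0.5.
            v = 0.5 * previous
        outputs.append(v)
        previous = v
    return outputs


if __name__ == "__main__":
    # n = 2: two weights, each occupying two rows.
    print(ladder_inputs([True, False, True, False], [0.8, None, 0.4, None]))
    # n = 4: one weight occupying four rows, a single active DAC.
    print(ladder_inputs([True, False, False, False], [0.8, None, None, None]))
```

Only the rows flagged as MSB consume a DAC value, which mirrors the reduction in the number of operating D/A converters.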
Moreover, each unit configuration 30′ can be configured in such a way that the gain of the driver DV1 is fixed, and the voltage of the input node 30c is divided by a plurality of impedance devices and is input to the driver DV1 as illustrated in
Each of the unit configurations 30′ sets the fixed gain of the driver DV1 to 1 and further has a plurality of impedance devices Z1 and Z2. The plurality of impedance devices Z1 and Z2 has impedance equal to each other. Any device having impedance can be used as each impedance device Z1 and Z2, and for example, a capacitive device, a resistance device, a transistor, or the like can be used. The impedance devices Z1 and Z2 can be shared with some devices in the D/A converter DA1 (e.g., a capacitive device CE, see
In each of the unit configurations 30′, the impedance device Z1 has a first end connected to the switch SW12 and a second end connected to the intermediate node 30d. The impedance device Z2 has a first end connected to the intermediate node 30d and a second end connected to a reference potential (e.g., a ground potential).
In one example, in the case where an analog signal VX,i+3 is input to the driver DV1 of a unit configuration 30′_(i+3) and the switch SW12 of a unit configuration 30′_(i+2) remains ON, the driver DV1 of the unit configuration 30′_(i+3) outputs the analog signal VX,i+3. In addition, the analog signal VX,i+3 is divided by the plurality of impedance devices Z1 and Z2 of the unit configuration 30′_(i+2), and the analog signal VX,i+3/2 is input to the driver DV1 of the unit configuration 30′_(i+2). That driver DV1 outputs the analog signal VX,i+3/2.
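The halving follows from the ordinary voltage-divider relation. The sketch below assumes that Z1 and Z2 are devices of the same kind, so that their impedance ratio is a real number, and that the driver input draws no current. The function name divided_input is hypothetical.

```python
def divided_input(v_in, z1, z2):
    """Voltage at the intermediate node 30d of an unloaded Z1-Z2 divider.

    Z1 sits between the switch SW12 and the node 30d, and Z2 sits between
    the node 30d and the reference potential (taken as 0 V here).
    """
    return v_in * z2 / (z1 + z2)


if __name__ == "__main__":
    # Equal impedances halve the signal, so a fixed-gain driver (G = 1)
    # outputs half of the voltage from the row one level higher.
    print(divided_input(0.8, 1000.0, 1000.0))
```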
As described above, the configuration of the input circuit 2′ illustrated in
An arithmetic logic operation system 200 according to a sixth embodiment is now described. The configurations different from the first embodiment to the fifth embodiment are hereinafter mainly described.
Although the second embodiment does not describe in detail the configuration of the input circuit of the semiconductor integrated circuit, the sixth embodiment illustrates the specific configuration of the input circuit of the semiconductor integrated circuit.
The arithmetic logic operation system 200 according to the sixth embodiment has an input circuit 202 instead of the input circuit 2 (see
Each of the unit configurations 230 can be configured as illustrated in
The unit configuration 230_i has a configuration expanded from the unit configuration 30_i (see
The D/A converter DA201 is electrically connected between an input node 230a_i and the switch SW211. The switch SW211 is electrically connected between the D/A converter DA201 and an intermediate node 230d_i. The driver DV201 is electrically connected between the intermediate node 230d_i and an intermediate node 230e_i. The switch SW212 is electrically connected between an input node 230c_i and the intermediate node 230d_i. The OR gate OG201 performs an OR operation for a global signal INV and the row control signal isMSB and supplies a control terminal of the switch SW212 with the arithmetic logic operation result. The input node 230c_i can be electrically connected to an intermediate node 230e_(i+1) of the unit configuration 230_(i+1) adjacent in the row direction (e.g., one level higher) (see
Moreover, it is possible to limit the number of the D/A converters DA201, which operate depending on the number of bits n of the weight W by connecting n unit configurations 230_(n−1) to 230_0 in a ladder shape to form the input circuit 202 (see
The switch SW213 is electrically connected between the intermediate node 230e_i and an output node 230b_i. The impedance device Z201 has a first end connected to the switch SW212 and a second end connected to the intermediate node 230d_i. The impedance device Z202 has a first end connected to the intermediate node 230d_i and a second end connected to an intermediate node 230f_i. The impedance devices Z201 and Z202 can employ any device having an impedance.
The switch SW214 is electrically connected between the intermediate node 230f_i and a reference node 230g_i. The reference node 230g_i is supplied with the global signal INV. The transistor TR201 is, for example, an NMOS transistor and has a source connected to a reference node 230h_i, a drain connected to the intermediate node 230f_i, and a gate that receives the row control signal isMSB−. The row control signal isMSB− is a signal in which the row control signal isMSB is logically inverted. Accordingly, a potential VCB of the intermediate node 230f_i is set to the value of the global signal INV when the row line i corresponds to the MSB of the weight W, and is set to the common voltage VCOM when the row line i corresponds to the bit other than the MSB of the weight W.
The D/A converter DA201 of each unit configuration 230 in the case when the bit precision of inputs is m can be configured as illustrated in
The D/A converter DA201 is a standard charge redistribution D/A converter. The D/A converter DA201 includes m capacitive devices CE(m−1) to CE0, m inverters IV(m−1) to IV0, m exclusive OR (XOR) gates XG(m−1) to XG0, m flip-flops FF(m−1) to FF0, and a flip-flop FFe. The m flip-flops FF(m−1) to FF0, m exclusive OR gates XG(m−1) to XG0, m inverters IV(m−1) to IV0, and m capacitive devices CE(m−1) to CE0 correspond to each other, respectively. Each of the m inverters IV(m−1) to IV0 includes an NMOS transistor and a PMOS transistor connected to the inverter.
The flip-flop FFe holds the row control signal isMSB in synchronization with a clock CLK. At the same time, the flip-flop FFe outputs the row control signal isMSB to the input node of the OR gate OG201 and the control terminal of the switch SW214, and outputs the row control signal isMSB− to the gate of the transistor TR201 (see
The m flip-flops FF(m−1) to FF0 hold the respective values of the m bits dm−1 to d0 of the input data Din in synchronization with the clock CLK. At the same time, each flip-flop FF outputs the value of the bit d to the corresponding exclusive OR gate XG. Each flip-flop FF resets the held value to an initial value (e.g., 0) when a clear signal CLR− reaches the active level (e.g., L level).
The m exclusive OR gates XG(m−1) to XG0 switch whether or not to invert the m bits dm−1 to d0 depending on the global signal INV. Each exclusive OR gate XG performs the exclusive OR operation of the bit d received from the corresponding flip-flop FF and the global signal INV to generate arithmetic logic operation results Bm−1 to B0. Accordingly, in the case of the global signal INV=0, the m exclusive OR gates XG(m−1) to XG0 output the m bits dm−1 to d0 without any modification to the m inverters IV(m−1) to IV0 as the arithmetic logic operation results Bm−1 to B0. In the case of the global signal INV=1, the m exclusive OR gates XG(m−1) to XG0 output the arithmetic logic operation results Bm−1 to B0 obtained by bit-inverting the m bits dm−1 to d0 to the m inverters IV(m−1) to IV0.
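As a minimal illustration of this conditional inversion, the following Python sketch models only the exclusive OR stage. The function name xor_stage and the representation of the bits as a list ordered from dm−1 to d0 are assumptions made for illustration and do not appear in the embodiment; the inverters IV and the capacitive devices CE are not modeled.

```python
def xor_stage(bits, inv):
    """Model the XOR gates XG(m-1)..XG0 (illustrative only).

    bits: list of 0/1 values ordered d(m-1)..d0 (hypothetical representation).
    inv:  value of the global signal INV (0 or 1).
    Returns the results B(m-1)..B0.
    """
    if inv not in (0, 1):
        raise ValueError("INV must be 0 or 1")
    # Each gate XORs its bit with INV: pass-through for INV=0, inversion for INV=1.
    return [d ^ inv for d in bits]


if __name__ == "__main__":
    d = [1, 0, 1]
    print(xor_stage(d, 0))  # bits passed without modification
    print(xor_stage(d, 1))  # bits inverted
```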
The m inverters IV(m−1) to IV0 logically invert the m bit values output from the m exclusive OR gates XG(m−1) to XG0 and output them to the m capacitive devices CE(m−1) to CE0. Each inverter IV logically inverts the bit value output from the corresponding exclusive OR gate XG and supplies the inverted value to the first end of the corresponding capacitive device CE.
The m capacitive devices CE(m−1) to CE0 have binary-weighted capacitance values 2^(m−1)C to C. Each capacitive device CE has a first end connected to the corresponding inverter IV and a second end connected to an output node Nout. Each capacitive device CE accumulates a charge corresponding to the bit value supplied to the first end. In response, the charges at the second ends of the m capacitive devices CE(m−1) to CE0 are redistributed depending on the capacitance ratio, and the voltage corresponding to the charges after the redistribution appears at the output node Nout. Moreover, depending on the control logic, at this time, the switch SW211 illustrated in
As the operation of the D/A converter DA201, the following first and second operations can be considered.
In the first operation, the D/A converter DA201 can perform D/A conversion for the input data Din and output VCOM+VX as a conversion result from the output node Nout. The output VCOM+VX can vary between 0 V and Vdd=2VCOM. The parameter VX included in the output can have a positive or negative value, and |VX| can vary between 0 V and VCOM. The multiple bits of the input data Din have the MSB as a sign bit: they represent a positive integer when the MSB value is 0 and a negative integer when the MSB value is 1. Thus, a positive or negative integer represented by the multiple bits of the input data Din can be mapped to the voltage value of the output VCOM+VX with a pseudo “binary offset”. In the case where the bit precision m of the D/A converter DA201 is 4 bits, the output VCOM+VX can typically vary between VCOM−(8/8)×VCOM and VCOM+(7/8)×VCOM.
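The mapping of the first operation can be sketched as follows, assuming an ideal converter in which the signed integer x represented by the bits of the input data Din is scaled so that VX=(x/2^(m−1))×VCOM, where m denotes the bit precision of the converter. The function name dac_first_operation and this scaling are illustrative assumptions chosen to reproduce the 4-bit range stated above; they are not taken from the embodiment.

```python
def dac_first_operation(x, m, vcom):
    """Ideal offset mapping of a signed m-bit integer to VCOM + VX.

    x:    signed integer in [-2^(m-1), 2^(m-1) - 1] (assumed input range).
    m:    bit precision of the converter (hypothetical parameter name).
    vcom: common voltage VCOM in volts.
    """
    half_scale = 1 << (m - 1)
    if not -half_scale <= x <= half_scale - 1:
        raise ValueError("x is out of range for an m-bit signed value")
    vx = vcom * x / half_scale  # assumed scaling: |VX| stays within VCOM
    return vcom + vx


if __name__ == "__main__":
    # 4-bit example with VCOM = 1.0 V: the output spans 0 V to (15/8) x VCOM.
    for x in (-8, 0, 7):
        print(x, dac_first_operation(x, 4, 1.0))
```

Under these assumptions, the most negative code maps to 0 V, the code 0 maps to VCOM, and the most positive code maps to VCOM+(7/8)×VCOM.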
In the second operation, as expressed in Formulas 4 and 5, in the row corresponding to the MSB of the weight W, the D/A converter DA201 is required to output VCOM−VX, and in the row corresponding to a bit other than the MSB of the weight W, the D/A converter DA201 is required to output VCOM+VX/2^(n−i−1).
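The row-dependent target voltages of the second operation can be tabulated with a short sketch. The function name row_input_voltages and the use of a list indexed by the row line i are illustrative assumptions; the sketch treats the voltages as ideal values and ignores any 1 LSB error.

```python
def row_input_voltages(vx, vcom, n):
    """Ideal input voltage for each row line i of an n-bit weight.

    Row n-1 corresponds to the MSB of the weight W and receives VCOM - VX.
    Any other row i receives VCOM + VX / 2^(n-i-1).
    Returns a list v where v[i] is the voltage for row line i.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    voltages = []
    for i in range(n):
        if i == n - 1:
            voltages.append(vcom - vx)  # MSB row: sign-inverted term
        else:
            voltages.append(vcom + vx / 2 ** (n - i - 1))
    return voltages


if __name__ == "__main__":
    # 3-bit weight, VCOM = 1.0 V, VX = 0.5 V (example values only).
    print(row_input_voltages(0.5, 1.0, 3))
```

For a 3-bit weight, the sketch yields VCOM−VX for row line 2, VCOM+VX/2 for row line 1, and VCOM+VX/2^2 for row line 0.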
The unit configuration 230_i illustrated in
In the D/A converter DA201, the m exclusive OR gates XG(m−1) to XG0 perform the exclusive OR operation between the m bits dm−1 to d0 and the global signal INV, and the m capacitive devices CE(m−1) to CE0 redistribute the charges depending on the capacitance ratio. The first operation is represented by Formula 13 below, and the second operation is represented by Formula 14 below.
[Mathematical 9]
INV=0: D(X)=VCOM+VX (Formula 13)
INV=1: ˜D(X)=VCOM−VX−1 LSB (Formula 14)
As expressed in Formulas 13 and 14, the second operation has an error by 1 LSB with respect to the first operation. This error of 1 LSB is correctable by using a mapping table, as illustrated in
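The origin of the 1 LSB difference can be checked numerically. The sketch below assumes that the input data Din is encoded in two's complement, which is an assumption: the embodiment states only that the MSB is a sign bit. Under that assumption, inverting every bit of a signed value x yields −x−1, which corresponds to the extra term of 1 LSB in Formula 14. The function name invert_signed is hypothetical.

```python
def invert_signed(x, m):
    """Bit-invert an m-bit two's complement value and return it as a signed int.

    Two's complement encoding is assumed here for illustration.
    """
    mask = (1 << m) - 1
    inverted = (x & mask) ^ mask      # invert all m bits, as with INV=1
    if inverted & (1 << (m - 1)):     # reinterpret the MSB as a sign bit
        inverted -= 1 << m
    return inverted


if __name__ == "__main__":
    m = 4
    for x in range(-(1 << (m - 1)), 1 << (m - 1)):
        # Each inverted value equals -x - 1, i.e. the negation minus 1 LSB.
        print(x, invert_signed(x, m))
```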
As in the fifth embodiment, the input circuit 202 can be configured by connecting n unit configurations 230_(n−1) to 230_0 in a ladder shape. For any row line i, the input node 230c_i of the unit configuration 230_i is connected to the intermediate node 230e_(i+1) of the unit configuration 230_(i+1) one level higher. The repetition of this connection enables the n unit configurations 230_(n−1) to 230_0 to be connected in a ladder shape.
In one example, in the case where the bit precision of the weight W held in the arithmetic logic operation circuit 220 (see
An input node 230c_0 of the unit configuration 230_0 is connected to an intermediate node 230e_1 of the unit configuration 230_1 one level higher. An input node 230c_1 of the unit configuration 230_1 is connected to an intermediate node 230e_2 of the unit configuration 230_2 one level higher. Thus, three unit configurations 230_2 to 230_0 can be connected in a ladder shape.
In the ladder-like connection of n unit configurations 230_(n−1) to 230_0, the processing is performed in two phases, and the error of 1 LSB between the unit configurations 230 is correctable by using the global signal INV.
In one example, in the case where the parameter VX has a positive value, the bit precision of the weight W held in the arithmetic logic operation circuit 220 (see
Immediately before timing t31, the switch SW214 and the transistor TR201 of each unit configuration 230 are controlled to be ON or OFF depending on the values of the row control signals isMSB and isMSB−. In other words, the voltage VCB of the intermediate node 230f is set depending on the value of the row control signal isMSB. As illustrated in
The switches SW211 and SW212 of each unit configuration 230 are controlled to be ON or OFF depending on the values of the row control signals isMSB and isMSB−. In the row line 2 corresponding to the MSB of the weight W, the row control signals are isMSB2=1 and isMSB2−=0, so the switch SW211 turns ON (the state shown by the dotted line) and the switch SW212 turns OFF. Thus, the D/A converter DA201 is activated. In the row line 1 corresponding to a bit other than the MSB of the weight W, the row control signals are isMSB1=0 and isMSB1−=1, so the switch SW211 turns OFF and the switch SW212 turns ON (the state shown by the dotted line). Thus, the D/A converter DA201 is deactivated, and the intermediate node 230e_2 in the row line 2 is connected to the first end of the impedance device Z201. In the row line 0 corresponding to the LSB of the weight W, the row control signals are isMSB0=0 and isMSB0−=1, so the switch SW211 turns OFF and the switch SW212 turns ON (the state shown by the dotted line). Thus, the D/A converter DA201 is deactivated, and the intermediate node 230e_1 in the row line 1 is connected to the first end of the impedance device Z201. Accordingly, a ladder-like connection configuration is formed as follows: the D/A converter DA201 of the unit configuration 230_2 → the switch SW211 → the driver DV201 → the impedance device Z201 of the unit configuration 230_1 → the driver DV201 → the impedance device Z201 of the unit configuration 230_0 → the driver DV201. In addition, it is possible to limit the number of D/A converters DA201 operating in the unit configurations 230_2 to 230_0 to one, decreasing the power consumption of the input circuit 202.
As illustrated in
Thus, the voltage of the output node Nout of the D/A converter DA201 of the unit configuration 230_2 gradually increases from 0 V to VCOM+VX, and the voltage VDV2 of the intermediate node 230e_2 of the unit configuration 230_2 gradually increases from 0 V to VCOM+VX. Accordingly, a voltage VDV1 of the intermediate node 230e_1 of the unit configuration 230_1 gradually increases from 0 V to VCOM+VX/2 due to the voltage division of the impedance devices Z201 and Z202 of the unit configuration 230_1. Furthermore, the voltage VDV0 of the intermediate node 230e_0 of the unit configuration 230_0 gradually increases from 0 V to VCOM+VX/2^2 due to the voltage division of the impedance devices Z201 and Z202 of the unit configuration 230_0. In this event, the control signal O is maintained to be 0.
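The settled voltages of the first phase can be reproduced with a simple divider model. The sketch assumes that the impedance devices Z201 and Z202 have equal impedance, that the driver DV201 behaves as an ideal unity-gain buffer, and that the potential VCB of each non-MSB row equals VCOM. These are modeling assumptions, and the function names divider and ladder_phase1 are hypothetical.

```python
def divider(v_in, v_ref, z1=1.0, z2=1.0):
    """Voltage at the node between z1 (fed by v_in) and z2 (tied to v_ref)."""
    return v_ref + (v_in - v_ref) * z2 / (z1 + z2)


def ladder_phase1(vx, vcom, n):
    """Settled voltages VDV(i) of the intermediate nodes 230e_i in phase 1.

    Row n-1 (MSB row) is driven by the D/A converter output VCOM + VX.
    Each lower row divides the voltage of the row above it toward VCOM.
    Returns a list v where v[i] is the voltage of row line i.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    v = [0.0] * n
    v[n - 1] = vcom + vx
    for i in range(n - 2, -1, -1):
        # Equal impedances halve the deviation from VCOM at each stage.
        v[i] = divider(v[i + 1], vcom)
    return v


if __name__ == "__main__":
    # 3 rows, VCOM = 1.0 V, VX = 0.5 V (example values only).
    print(ladder_phase1(0.5, 1.0, 3))
```

With equal impedances, each stage halves the deviation from VCOM, which gives VCOM+VX, VCOM+VX/2, and VCOM+VX/2^2 for row lines 2, 1, and 0, respectively.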
During the period from timings t32 to t33, the processing of the second phase is performed. At timing t32, the global signal INV shifts from 0 to 1, so the voltage VCB2 shifts from 0 to 1. Accordingly, to correct the error of 1 LSB, the exclusive OR gates XG2 to XG0 in the D/A converter DA201 of the unit configuration 230_2 output three operation results B2 to B0 obtained by bit-inverting the 3 bits d2 to d0 to the three inverters IV2 to IV0, respectively. In addition, this causes the voltage of the output node Nout of the D/A converter DA201 of the unit configuration 230_2 to decrease from VCOM+VX to VCOM−VX gradually and the voltage VDV2 of the intermediate node 230e_2 of the unit configuration 230_2 to decrease from VCOM+VX to VCOM−VX gradually.
On the other hand, the switch SW212 of the unit configuration 230_1 and the switch SW212 of the unit configuration 230_0 turn OFF depending on the global signal INV=1 (the state shown by the solid line in
At timing t33, the control signal O shifts from 0 to 1. In each of the unit configurations 230_2 to 230_0, the switch SW213 turns ON, and the voltage of the intermediate node 230e is transmitted to the output node 230b. This causes the unit configuration 230_2 to input the input voltage VCOM−VX to the arithmetic logic operation circuit 220, the unit configuration 230_1 to input the input voltage VCOM+VX/2 to the arithmetic logic operation circuit 220, and the unit configuration 230_0 to input the input voltage VCOM+VX/2^2 to the arithmetic logic operation circuit 220 (see
As described above, according to the sixth embodiment, in the arithmetic logic operation system 200, the input circuit 202 is configured by connecting a plurality of unit configurations 230 in a ladder shape. Such a configuration makes it possible to reduce the number of D/A converters DA201 operating in the input circuit 202, decreasing the power consumption of the arithmetic logic operation system 200 including the input circuit 202.
Moreover, mapping the output voltage to the set of the multiple bits b of the input data Din and the bits of the VCB in the D/A converter DA201 is not limited to the example illustrated in
Further, the input circuit 202 may be configured as the ladder-like connection of n unit configurations 230_(n−1) to 230_0. In this case, the unit configuration 230 in the row corresponding to the bit value 0 of the input data Din can be configured so that the power supply to the driver DV201 is stopped and the switch SW213 remains OFF. This makes it possible to decrease the power consumption of the input circuit 202.
While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel embodiments described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the embodiments described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.
Number | Date | Country | Kind |
---|---|---|---|
2021-102877 | Jun 2021 | JP | national |