Results processing circuits and methods associated with computational memory cells

Description

FIELD

The disclosure relates generally to a computational memory element and in particular to a computational memory element array having results processing circuitry.

BACKGROUND

Memory cells have traditionally been used to store bits of data. It is also possible to architect a memory cell so that the memory cell is able to perform some simple logical functions when multiple memory cells are connected to the same read bit line. For example, when memory cells A, B, and C are connected to a particular read bit line and are read simultaneously, and the memory cells and read bit line circuitry are designed to produce a logical AND result, then the result that appears on the read bit line is AND (a,b,c) (i.e. “a AND b AND c”), where a, b, and c represent the binary data values stored in memory cells A, B, and C respectively. More particularly, in these computational memory cells, the read bit line is pre-charged to a logic “1” before each read operation, and the activation of one or more read enable signals to one or more memory cells discharges the read bit line to a logic “0” if the data stored in any one or more of those memory cells=“0”; otherwise, the read bit line remains a logic “1” (i.e. in its pre-charge state). In this way, the read bit line result is the logical AND of the data stored in those memory cells. The memory cells may be subdivided into a plurality of sections, and each section may have a plurality of bit line sections that each perform a logical function; these logical functions may be used to execute a wide variety of computational algorithms.

Typically, all bit line sections in a section execute the same computational algorithm on their respective data because the bit line sections share the same read and write control signals. Oftentimes, it is desirable to be able to combine the results across multiple bit line sections to generate a single result for those multiple bit line sections. For example, if the algorithm is designed to search for a particular data pattern within the data stored in the bit line sections, it is desirable to know if any one or more of the bit line sections contain that data pattern. It is to this end that this disclosure is directed.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates an example of a semiconductor memory that may include a plurality of computation memory cells and RSP circuitry;

FIG. 2 illustrates an example of a computer system that may include a plurality of computation memory cells and RSP circuitry;

FIG. 3A illustrates an example of a processing array with computational memory cells that may be incorporated into a semiconductor memory or computer system;

FIG. 3B illustrates the processing array with computational memory cells having one section and multiple bit line sections;

FIG. 3C illustrates the processing array with computational memory cells having multiple sections and multiple bit line sections;

FIGS. 4A and 4B illustrate examples of two different types of computational memory cells that may be used in the semiconductor memory of FIG. 1, the computer system of FIG. 2 or the processing array of FIGS. 3A-3C;

FIG. 5 illustrates read/write logic including read logic, read data storage, and write logic associated with each bit line section in the processing array device depicted in FIG. 3C;

FIG. 6 illustrates the read/write logic that includes a results processing (RSP) data line and RSP circuitry in each bit line section that is used to produce combined computational results across all bit line sections on the RSP data line;

FIG. 7 illustrates a single bit line section with the RSP circuitry including RSP logic that produces the combined computational results on the RSP line and an additional line to the write multiplexer that provides the ability to store the logic state of the RSP data line;

FIG. 8 illustrates an example of the signal timing associated with the RSP logic shown in FIG. 7;

FIG. 9 illustrates high level RSP functionality for “K” sections with “n” bit lines in the processing array, including the generation of the RSP result from each section;

FIG. 10 illustrates a processing array with 2048 bit lines and four RSP data lines;

FIG. 11 illustrates a single bit line section from the processing array in FIG. 10;

FIG. 12 illustrates the detailed RSP logic implemented in FIG. 11;

FIG. 13 illustrates an example of the signal timing for generating an rsp2K result and an rsp32K result and transmitting those results out of the processing array;

FIG. 14 illustrates an example of the signal timing for generating an rsp16 result and transmitting that result to the write multiplexer in each bit line section;

FIG. 15 illustrates an example of the signal timing for generating an rsp256 result, transmitting the result back to rsp16 and transmitting that result to the write multiplexer in each bit line section;

FIG. 16 illustrates an example of the signal timing for generating an rsp2K result, transmitting the result back to rsp256 and then to rsp16 and transmitting that result to the write multiplexer in each bit line section;

FIG. 17 illustrates an example of the signal timing for generating an rsp32K result, transmitting the result back to rsp2K, to rsp256 and then to rsp16 and transmitting that result to the write multiplexer in each bit line section; and

FIG. 18 illustrates the high level RSP functionality for sixteen sections with 2K bit lines.

DETAILED DESCRIPTION OF ONE OR MORE EMBODIMENTS

The disclosure is particularly applicable to a processing array, semiconductor memory or computer that utilizes a plurality of computational memory cells (with each cell being formed with a static random access memory (SRAM) cell) and additional results processing (RSP) circuitry to provide a mechanism to logically combine the computation results across multiple bit line sections in a section and across multiple sections, and transmit the combined result as an output of the processing array and/or store the combined result into one or more of those multiple bit line sections. It will be appreciated, however, that each computational memory cell may be another type of volatile or non-volatile memory cell that is within the scope of the disclosure, that other additional circuitry (including more, less or different logic) may be used and is within the scope of the disclosure, or that computational memory cell architectures different from those disclosed below are within the scope of the disclosure.

The disclosure is directed to a memory/processing array that has a plurality of computing memory cells in an array with additional RSP circuitry. Each computing memory cell in a column in the array may have a read bit line and the read bit line for each of the computing memory cells in the column may be tied together as a single read bit line. The memory/processing array may be subdivided into one or more sections (an example of which is shown in FIGS. 3B and 3C) wherein each section has a unique set of “n” bit lines (each bit line being part of a bit line section) where each bit line section (bl-sect) comprises a single read bit line and a pair of positive and negative write bit lines, with each bit line connected to “m” computational memory cells. Each bit line section also may have a read data storage that is used to capture and store the read result from the read bit line during read operations (so a read data storage is implemented per read bit line) and read circuitry for routing the read data or the selected write data for performing logical operations. In the disclosure, BL-Sect[x,y] is a shorthand notation indicating a bit line section with bit line “y” in section “x” and “bl-sect” means bit line section.

FIG. 1 illustrates an example of a semiconductor memory 10 that may include a plurality of computation memory cells and circuitry that provides an RSP capability that is described below in more detail. The below disclosed plurality of computation memory cells and RSP circuitry allow the semiconductor memory 10 to logically combine computation results across bit line sections of the plurality of computation memory cells. FIG. 2 illustrates an example of a computer system 20 that may include a plurality of computation memory cells and the RSP circuitry that are described below in more detail. The below disclosed plurality of computation memory cells and RSP circuitry allow the semiconductor memory 10 or computer system 20 and memory 24 to logically combine computation results across bit line sections of the plurality of computation memory cells. The computer system 20 in FIG. 2 may have at least one processor 22 and a memory 24 that may include the plurality of computation memory cells and read circuitry for selecting read or write data.

FIG. 3A illustrates an example of a processing array 30 with computational memory cells in an array that may be incorporated into a semiconductor memory or computer system and may include RSP circuitry. The processing array 30 may include an array of computational memory cells (cell 00, . . . , cell 0n and cell m0, . . . , cell mn). In one embodiment, the array of computational memory cells may be rectangular as shown in FIG. 3A and may have a plurality of columns and a plurality of rows wherein the computational memory cells in a particular column may also be connected to the same read bit line (RBL0, . . . , RBLn). The processing array 30 may further include a wordline (WL) generator and read/write logic control circuit 32 that may be connected to and generate signals for the read word line (RE) and write word line (WE) for each memory cell (such as RE0, . . . , REn and WE0, . . . , WEn) to control the read and write operations, as is well known, and one or more instances of read/write circuitry 34 that are connected to the read and write bit lines of the computational memory cells. In the embodiment shown in FIG. 3A, the processing array may have read/write circuitry 34 for each set of bit line signals of the computational memory cells (e.g., for each column of the computational memory cells whose read bit lines are connected to each other). For example, BL0 read/write logic 340 may be coupled to the read and write bit lines (WBLb0, WBL0 and RBL0) for the computational memory cells in column 0 of the array and BLn read/write logic 34n may be coupled to the read and write bit lines (WBLbn, WBLn and RBLn) for the computational memory cells in column n of the array as shown in FIG. 3A.

The wordline (WL) generator and read/write logic control circuit 32 may also generate one or more control signals that control each read/write circuitry 34. For example, for the different embodiments of the read/write logic described in the co-pending U.S. patent application Ser. No. 16/111,178 filed Aug. 23, 2018 and entitled “Read Data Processing Circuits and Methods Associated with Computational Memory Cells” and incorporated herein by reference, the one or more control signals may include a Read_Done control signal, an XORacc_En control signal, an ANDacc_En control signal and an ORacc_En control signal whose operation and details are described in the above incorporated by reference application. Note that for each different embodiment, a different one or more of the control signals is used so that the wordline (WL) generator and read/write logic control circuit 32 may generate different control signals for each embodiment or the wordline (WL) generator and read/write logic control circuit 32 may generate each of the control signals, but then only certain of the control signals or all of the control signals may be utilized as described in the above incorporated by reference co-pending patent application.

During a read operation, the wordline (WL) generator and read/write logic control circuit 32 may activate one or more word lines that activate one or more computational memory cells so that the read bit lines of those one or more computational memory cells may be read out. Further details of the read operation are not provided here since the read operation is well known.

FIGS. 3B and 3C illustrate the processing array 30 with computational memory cells arranged in sections, with the same elements as shown in FIG. 3A. The array 30 in FIG. 3B has one section (Section 0) with “n” bit lines (bit line 0 (BL0), . . . , bit line n (BLn)) in different bit line sections (bl-sect), where each bit line connects to “m” computational memory cells (cell 00, . . . , cell m0 for bit line 0, for example). In the example in FIG. 3B, the m cells may be the plurality of computational memory cells that are part of each column of the array 30. FIG. 3C illustrates the processing array 30 with computational memory cells having multiple sections. In the example in FIG. 3C, the processing array device 30 comprises “k” sections with “n” bit lines each, where each bit line within each section connects to “m” computational memory cells. Note that the other elements of the processing array 30 are present in FIG. 3C, but not shown for clarity. In FIG. 3C, the BL-Sect(0,0) block shown corresponds to the BL-Sect(0,0) shown in FIG. 3B with the plurality of computational memory cells and the read/write logic 340 and each other block shown in FIG. 3C corresponds to a separate portion of the processing array. As shown in FIG. 3C, the set of control signals, generated by the wordline generator and read/write logic controller 32, for each section may include one or more read enable control signals (for example S[0]_RE[m:0] for section 0), one or more write enable control signals (for example S[0]_WE[m:0] for section 0) and one or more read/write control signals (for example S[0]_RW_Ctrl[p:0] for section 0). As shown in FIG. 3C, the array 30 may have a plurality of sections (0, . . . , k in the example in FIG. 3C) and each section may have multiple bit line sections (0, . . . , n per section, in the example in FIG. 3C).

FIG. 4A illustrates an example of a dual port SRAM cell 20 that may be used for computation. The dual port SRAM cell may include two cross coupled inverters 121, 122 and two access transistors M23 and M24 that are interconnected together to form a 6T SRAM cell. The SRAM may be operated as a storage latch and may have a write port. The two inverters are cross coupled since the input of the first inverter is connected to the output of the second inverter and the output of the first inverter is coupled to the input of the second inverter as shown in FIG. 4A. A Write Word line carries a signal and is called WE and a write bit line and its complement are called WBL and WBLb, respectively. The Write word line WE is coupled to the gates of the two access transistors M23, M24 that are part of the SRAM cell. The write bit line and its complement (WBL and WBLb) are each coupled to one side of the respective access transistors M23, M24 as shown in FIG. 4A while the other side of each of those access transistors M23, M24 is coupled to the corresponding side of the cross coupled inverters (labeled D and Db in FIG. 4A).

The circuit in FIG. 4A may also have a read word line RE, a read bit line RBL and a read port formed by transistors M21, M22 coupled together to form an isolation circuit as shown. The read word line RE may be coupled to the gate of transistor M21 that forms part of the read port while the read bit line is coupled to the source terminal of transistor M21. The gate of transistor M22 may be coupled to the Db output from the cross coupled inverters 121, 122.

During reading, multiple cells (with only a single cell being shown in FIG. 4A) can turn on to perform an AND function. Specifically, at the beginning of the read cycle, RBL is pre-charged high and if the Db signal of all cells that are turned on by RE is “0”, then RBL stays high since, although the gate of transistor M21 is turned on by the RE signal, the gate of M22 is not turned on and the RBL line is not connected to the ground to which the drain of transistor M22 is connected. If the Db signal of any or all of the cells is “1” then RBL is discharged to 0 since the gate of M22 is turned on and the RBL line is connected to ground. As a result, RBL=NOR (Db0, Db1, etc.) where Db0, Db1, etc. are the complementary data of the SRAM cells that have been turned on by the RE signal. Alternatively, RBL=NOR (Db0, Db1, etc.)=AND (D0, D1, etc.), where D0, D1, etc. are the true data of the cells that have been turned on by the RE signal.
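The read behavior described above can be illustrated with a small behavioral model. This is only a sketch under simplifying assumptions: logic levels are idealized as 0 and 1, timing is ignored, and the function name `read_bit_line` and its arguments are hypothetical names chosen for illustration.

```python
def read_bit_line(stored_bits, read_enables):
    """Behavioral model of one pre-charged read bit line (RBL).

    stored_bits[i]  -- true data D held in cell i (0 or 1)
    read_enables[i] -- RE signal applied to cell i (0 or 1)
    Returns the RBL level after the read cycle.
    """
    rbl = 1  # RBL is pre-charged high at the beginning of the read cycle
    for d, re in zip(stored_bits, read_enables):
        db = 1 - d  # complement data Db drives the gate of M22
        if re and db:
            # M21 (gated by RE) and M22 (gated by Db) both conduct,
            # so the bit line is discharged to ground.
            rbl = 0
    return rbl


# All enabled cells hold D=1, so RBL stays high: AND(1, 1, 1) = 1.
print(read_bit_line([1, 1, 1], [1, 1, 1]))
# One enabled cell holds D=0, so RBL is discharged: AND(1, 0, 1) = 0.
print(read_bit_line([1, 0, 1], [1, 1, 1]))
# The cell holding 0 is not enabled, so it does not take part: AND(1, 1) = 1.
print(read_bit_line([1, 0, 1], [1, 0, 1]))
```

In this model a cell whose RE is low never discharges the line, which is why only the enabled cells act as operands of the AND function.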

As shown in FIG. 4A, the Db signal of the cell 20 may be coupled to a gate of transistor M22 to drive the RBL. However, unlike the typical 6T cell, the Db signal is isolated from the RBL line and its signal/voltage level by the transistors M21, M22. Because the Db signal/value is isolated from the RBL line and signal/voltage level, the Db signal is not susceptible to the lower bit line level caused by multiple “0” data stored in multiple cells, in contrast to the typical SRAM cell. Therefore, for the cell in FIG. 4A, there is no limitation on how many cells can be turned on to drive RBL. As a result, the cell (and the device made up of multiple cells) offers more operands for the AND function. Furthermore, in the cell in FIG. 4A, the RBL line is pre-charged (rather than held by a static pull up transistor as with the typical 6T cell), so this cell can provide much faster sensing because the current generated by the cell is all used to discharge the bit line capacitance, with no current being consumed by a static pull up transistor, so that the bit line discharging rate can be faster by more than 2 times. The sensing for the disclosed cell is also lower power, since no extra current is consumed by a static pull up transistor and the discharging current is reduced by more than half.

The write port of the cell in FIG. 4A is operated in the same manner as the typical 6T SRAM cell. As a result, the write cycle and Selective Write cycle for the cell have the same limitation as the typical 6T cell. In addition to the AND function described above, the SRAM cell 20 in FIG. 4A also may perform a NOR function by storing inverted data. Specifically, if D is stored at the gate of M22, instead of Db, then RBL=NOR (D0, D1, etc.). One skilled in the art understands that the cell configuration shown in FIG. 4A would be slightly altered to achieve this, but that modification is within the scope of the disclosure. Further details of this exemplary computational memory cell are found in co-pending U.S. patent application Ser. Nos. 15/709,379, 15/709,382 and 15/709,385 all filed on Sep. 19, 2017 and entitled “Computational Dual Port Sram Cell And Processing Array Device Using The Dual Port Sram Cells” which are incorporated herein by reference.

FIG. 4B illustrates an implementation of a dual port SRAM cell 100 with an XOR function. The dual port SRAM cell 100 may include two cross coupled inverters 131, 132 and two access transistors M33 and M34 that are interconnected together as shown in FIG. 4B to form the basic SRAM cell. The SRAM may be operated as a storage latch and may have a write port. The two inverters 131, 132 are cross coupled since the input of the first inverter is connected to the output of the second inverter (labeled D) and the output of the first inverter (labeled Db) is coupled to the input of the second inverter as shown in FIG. 4B. The cross coupled inverters 131, 132 form the latch of the SRAM cell. The access transistors M33 and M34 may have their respective gates connected to the write bit line and its complement (WBL, WBLb) respectively. A Write Word line carries a signal WE. The Write word line WE is coupled to the gate of a transistor M35 that is part of the access circuitry for the SRAM cell.

The circuit in FIG. 4B may also have a read word line RE, a read bit line RBL and a read port formed by transistors M31, M32 coupled together to form an isolation circuit as shown. The read word line RE may be coupled to the gate of transistor M31 that forms part of the read port while the read bit line RBL is coupled to the drain terminal of transistor M31. The gate of transistor M32 may be coupled to the Db output from the cross coupled inverters 131, 132. The isolation circuit isolates the latch output Db (in the example in FIG. 4B) from the read bit line and signal/voltage level so that the Db signal is not susceptible to the lower bit line level caused by multiple “0” data stored in multiple cells, in contrast to the typical SRAM cell.

The cell 100 may further include two more read word line transistors M36, M37 and one extra complementary read word line, REb. When the read port is active, either RE or REb is high and the REb signal/voltage level is the complement of the RE signal/voltage level. RBL is pre-charged high, and if either of the (M31, M32) or (M36, M37) series transistor pairs is on, RBL is discharged to 0. If neither of the (M31, M32) or (M36, M37) series transistor pairs is on, then RBL stays high as 1 since it was pre-charged high. The following equation, where D is the data stored in the cell and Db is the complement data stored in the cell, describes the operation of the cell:

RBL=AND(NAND(RE,Db),NAND(REb,D))=XNOR(RE,D) (EQ1)
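The identity stated in EQ1 can be checked exhaustively over the four combinations of RE and D. The sketch below assumes, as stated above, that REb is the complement of RE while the read port is active; the helper names `nand`, `xnor` and `rbl_eq1` are hypothetical.

```python
from itertools import product


def nand(a, b):
    return 1 - (a & b)


def xnor(a, b):
    return 1 - (a ^ b)


def rbl_eq1(re, d):
    """Left-hand side of EQ1: AND(NAND(RE, Db), NAND(REb, D))."""
    reb = 1 - re  # REb is the complement of RE when the read port is active
    db = 1 - d    # Db is the complement of the stored data D
    return nand(re, db) & nand(reb, d)


# Compare both sides of EQ1 for every (RE, D) combination.
for re, d in product((0, 1), repeat=2):
    assert rbl_eq1(re, d) == xnor(re, d)
    print(f"RE={re} D={d} -> RBL={rbl_eq1(re, d)}")
```

The loop completes without an assertion failure, which confirms that the two series transistor pairs together implement XNOR(RE, D) under the stated assumption.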

If the word size is 8, then the word needs to be stored in 8 cells (with one cell being shown in FIG. 4B) on the same bit line. In a search operation, an 8 bit search key can be entered using the RE, REb lines of the eight cells to compare the search key with the cell data. If the search key bit is 1, then the corresponding RE=1 and REb=0 for that cell. If the search key bit is 0, then the corresponding RE=0 and REb=1. If all 8 bits match the search key, then RBL will be equal to 1. If any one of the 8 bits does not match, then RBL will be discharged and be 0. Therefore, this cell 100 (when used with 7 other cells for an 8 bit search key) can perform the same XNOR function but uses half the number of cells as the typical SRAM cell. The following equation for the multiple bits on the bit line may describe the operation of the cells as:

RBL=AND(XNOR(RE1,D1),XNOR(RE2,D2), . . . ,XNOR(REi,Di)), where i is the number of active cells. (EQ2)

By controlling either RE or REb to be a high signal/on, the circuit 100 may also be used to do logic operations mixing true and complement data as shown below:

RBL=AND(D1,D2, . . . ,Dn,Dbn+1,Dbn+2, . . . Dbm) (EQ3)

where D1, D2, . . . Dn are “n” number of data with RE on and Dbn+1, Dbn+2, . . . Dbm are m-n number of data with REb on.

Furthermore, if the cell 100 stores inverse data, meaning WBL and WBLb shown in FIG. 4B are swapped, then the logic equation EQ1 becomes an XOR function and logic equation EQ3 becomes a NOR function, which can be expressed as EQ4 and EQ5:

RBL=XOR(RE,D) (EQ4)
RBL=NOR(D1,D2, . . . ,Dn,Dbn+1,Dbn+2, . . . Dbm) (EQ5)

where D1, D2, . . . Dn are n number of data with RE on and Dbn+1, Dbn+2, . . . Dbm are m-n number of data with REb on.
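The step from EQ3 to EQ5 is an application of De Morgan's law, and it can be confirmed by brute force. The sketch below is a behavioral model only: `rbl_and` is a hypothetical helper that ANDs, for each active cell, either the latched value (RE on) or its complement (REb on), and "storing inverse data" is modeled by latching the complement of each data bit.

```python
from itertools import product


def rbl_and(latched, re_on):
    """EQ3-style read: AND over the selected output of each active cell.

    latched[i] -- value physically held in cell i
    re_on[i]   -- True if RE is on for cell i (reads the latched value),
                  False if REb is on (reads its complement)
    """
    rbl = 1
    for value, use_true in zip(latched, re_on):
        rbl &= value if use_true else 1 - value
    return rbl


def nor(terms):
    return 0 if any(terms) else 1


# Exhaustive check for three cells: with inverse data latched, the AND of
# EQ3 equals the NOR of EQ5 taken over the same terms of the true data.
for data in product((0, 1), repeat=3):
    for re_on in product((True, False), repeat=3):
        terms = [d if t else 1 - d for d, t in zip(data, re_on)]
        inverse = [1 - d for d in data]
        assert rbl_and(inverse, re_on) == nor(terms)
print("EQ5 agrees with EQ3 applied to inverse data")
```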

In another embodiment, the read port of the circuit 100 in FIG. 4B may be reconfigured to achieve a different Boolean equation. Specifically, transistors M31, M32, M36 and M37 may be changed to PMOS, the source of M32 and M37 is VDD instead of VSS, the bit line is pre-charged to 0 instead of 1 and the word line RE active state is 0. In this embodiment, the logic equation EQ1 is inverted so that RBL is an XOR function of RE and D (EQ6). EQ3 is rewritten as an OR function (EQ7) as follows:

RBL=XOR(RE,D) (EQ6)
RBL=OR(D1,D2, . . . ,Dn,Dbn+1,Dbn+2, . . . Dbm) (EQ7)

where D1, D2, . . . Dn are n number of data with RE on and Dbn+1, Dbn+2, . . . Dbm are m-n number of data with REb on.

If the cell stores the inverse data of the above discussed PMOS read port, meaning WBL and WBLb are swapped, then

RBL=XNOR(RE,D) (EQ8)
RBL=NAND(D1,D2, . . . ,Dn,Dbn+1,Dbn+2, . . . Dbm) (EQ9)

where D1, D2, . . . Dn are n number of data with RE on and Dbn+1, Dbn+2, . . . Dbm are m-n number of data with REb on.

For example, consider a search operation where a digital word needs to be found in a memory array, in which the memory array is configured so that each bit of the word is stored on the same bit line. To compare 1 bit of the word, the data is stored in a cell and the search key is applied as its RE, so that EQ1 can be written as below:

RBL=XNOR(Key,D) (EQ10)

If Key=D, then RBL=1. If the word size is 8 bits, stored as D[0:7], and the search key Key[0:7] is applied as the RE of the corresponding cells, then EQ2 can be expressed as the search result and written as below:

RBL=AND(XNOR(Key[0],D[0]),XNOR(Key[1],D[1]), . . . ,XNOR(Key[7],D[7])) (EQ11)

If every Key[i] is equal to D[i] for i=0-7, then the search result RBL is a match. If any one of Key[i] is not equal to D[i], then the search result is not a match. A parallel search can be performed in 1 operation by arranging multiple data words along the same word lines and on parallel bit lines, with each word on 1 bit line. Further details of this computation memory cell may be found in U.S. patent application Ser. Nos. 15/709,399 and 15/709,401 both filed on Sep. 19, 2017 and entitled “Computational Dual Port Sram Cell And Processing Array Device Using The Dual Port Sram Cells For Xor And Xnor Computations”, which are incorporated herein by reference.
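The parallel search of EQ11 can be sketched as follows. This is an illustrative behavioral model rather than a description of the circuit: each inner list stands for the word stored on one bit line, the key bits drive RE and REb as described for the cell of FIG. 4B, and the function name `parallel_search` is hypothetical.

```python
def parallel_search(bit_lines, key):
    """Return one RBL value per bit line: 1 for a match, 0 otherwise.

    bit_lines[j][i] -- data bit D[i] of the word stored on bit line j
    key[i]          -- search key bit Key[i], shared by all bit lines
    """
    results = []
    for word in bit_lines:
        rbl = 1  # every read bit line is pre-charged high
        for k, d in zip(key, word):
            re, reb = k, 1 - k   # key bit 1 -> RE=1, REb=0; key bit 0 -> RE=0, REb=1
            db = 1 - d
            if (re and db) or (reb and d):
                rbl = 0          # a mismatching bit discharges the bit line
        results.append(rbl)
    return results


stored_words = [
    [1, 0, 1, 1, 0, 0, 1, 0],  # equal to the key
    [1, 0, 1, 1, 0, 0, 1, 1],  # differs in the last bit
    [1, 0, 1, 1, 0, 0, 1, 0],  # equal to the key
]
key = [1, 0, 1, 1, 0, 0, 1, 0]
print(parallel_search(stored_words, key))  # [1, 0, 1]
```

Because the same key is applied to the word lines shared by all bit lines, every bit line produces its match result in the same read operation.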

FIG. 5 illustrates more details of the read/write circuitry 34 including read logic, read data storage, and write logic for each bl-sect in the processing array device depicted in FIG. 3C. The read/write circuitry 34 for each bit line section may include read circuitry 50, a read storage 52, implemented as a register, and write circuitry 54. The read/write circuitry 34 may also implement one embodiment of the RSP circuitry as described below. The read circuitry 50 and read storage 52 allow the data on the read bit lines connected to the particular read circuitry and read storage to accumulate so that more complex Boolean logic operations may be performed. Various implementations of the read circuitry 50 and read storage 52 may be found in U.S. patent application Ser. No. 16/111,178 filed Aug. 23, 2018 and entitled “Read Data Processing Circuits and Methods Associated with Computational Memory Cells” that is co-pending and co-owned and is incorporated herein by reference. The write circuitry 54 manages the writing of data from each section. Each of the read circuitry 50, read storage 52 and write circuitry 54 may be connected to one or more control signals (S[x]_RW_Ctrl[p:0] in the example implementation shown in FIG. 5) that control the operation of each of the circuits. The control signals may include the read control signals that are described above in the incorporated by reference patent application.

The read circuitry 50 may receive inputs from the read bit line of the computing memory cells of the section (S[x]_RBL[y]) and the write circuitry 54 may receive an input from the read data storage 52 and output data to the write bit lines of the computing memory cells of the section (S[x]_WBL[y] and S[x]_WBLb[y] in the example in FIG. 5).

First Embodiment of RSP Circuitry

One way to provide a mechanism to logically combine the computation results across multiple bl-sects in a section, where each bl-sect produces a computation result that is ultimately captured in its Read Register 52, is to:

- Implement an RSP data line (S[x]_RSP for example) that spans all bl-sects in the section.
- Utilize additional circuitry in the bl-sects to produce a pre-defined logical function—in particular, a logical OR—on the RSP data line of the computation results captured in all bl-sect Read Registers in the section. That is, when the section has “n” bl-sects, to produce on the RSP data line the function:
  
  RBL[0]_Reg_Out OR RBL[1]_Reg_Out OR . . . RBL[n−1]_Reg_Out

A logical OR is chosen because a common use case of RSP functionality is to determine if the algorithm (e.g. a search algorithm) produced a result=“1” (e.g. indicating a positive match on the search data) in any one or more of the bl-sects in the section. In the embodiments below, each of the RSP logic circuits and the control logic may be implemented using known circuits including Boolean logic circuits.

One way to produce the logical OR on the RSP data line is to define its default state as “0”, and enable (by default) a pull-down transistor on the RSP data line to pull it to its “0” default state. Then when RSP functionality is engaged, temporarily and unconditionally disable the pull-down transistor on the RSP data line, and temporarily enable a pull-up transistor—one per bl-sect—on the RSP data line if the computation result captured in its associated bl-sect Read Register=“1”. In that way, the RSP data line remains in its “0” default state only if the computation result captured in all bl-sect Read Registers is “0”; otherwise, the RSP data line is pulled to “1”, thereby producing a logical OR of all bl-sect Read Register states on the RSP data line.
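The pull-down/pull-up scheme described above can be modeled behaviorally as a wired OR. The sketch assumes idealized logic levels and ignores timing; the function name `rsp_data_line` and its arguments are hypothetical.

```python
def rsp_data_line(read_register_outputs, rsp_engaged):
    """Behavioral model of the RSP data line of one section.

    read_register_outputs -- one captured computation result (0 or 1) per bl-sect
    rsp_engaged           -- True while RSP functionality is engaged
    """
    if not rsp_engaged:
        # Default state: the pull-down transistor is enabled and holds
        # the RSP data line at 0.
        return 0
    # RSP engaged: the pull-down is disabled, and each bl-sect enables its
    # pull-up transistor only if its Read Register holds a 1.
    line = 0
    for reg_out in read_register_outputs:
        if reg_out == 1:
            line = 1  # any single pull-up is enough to pull the line to 1
    return line


print(rsp_data_line([0, 0, 0, 0], rsp_engaged=True))   # no bl-sect holds 1
print(rsp_data_line([0, 0, 1, 0], rsp_engaged=True))   # one bl-sect holds 1
print(rsp_data_line([1, 1, 1, 1], rsp_engaged=False))  # RSP not engaged
```

The line returns to 0 whenever RSP is not engaged, regardless of the Read Register contents, which mirrors the default pull-down behavior.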

Functionally, this scheme only requires one pull-down transistor on the RSP data line. However, several pull-down transistors (controlled by the same control signal) may be implemented on the RSP data line if needed (e.g. to decrease the amount of time required to discharge the RSP data line to “0”).

Furthermore, additional RSP circuitry is implemented in each bl-sect such that the logical OR result produced on the RSP data line can be stored in a computational memory cell in the bl-sect.

And furthermore, additional RSP circuitry is implemented in the section such that the logical OR result produced on the RSP data line is driven to circuitry outside the processing array.

FIG. 6 illustrates the read/write logic 34 that includes a results processing (RSP) data line and RSP circuitry in each bit line section that is used to produce combined computational results across all bit line sections on the RSP data line. The read/write logic 34 has the same read logic 50 and read register 52 as the circuitry in FIG. 5, which operate in the same manner. As shown in FIG. 6, the write logic 54 may further have a write multiplexer 62 (such as a 6:1 multiplexer) that selects, based on the RW_Ctrl[p:0] control signal, the data to be written into the memory cells for this bit line section, wherein the data is selected from the read register 52 output (S[x]_RBL[y]_Reg_Out signal for example) as well as the read register output data from the neighboring bit line sections (S[x−1]_RBL[y]_Reg_Out, S[x+1]_RBL[y]_Reg_Out, S[x]_RBL[y−1]_Reg_Out and S[x]_RBL[y+1]_Reg_Out).

For the RSP circuitry, the read/write circuitry 34 may further include RSP Logic 64 that is used to produce the combined computation result on the RSP data line using the “n” bl-sect Read Register outputs as the data sources. In addition, the RSP data line data may be input to the write MUX 62 as shown that provides the ability to store the logic state of the RSP data line in a bl-sect memory cell. The RSP logic is controlled by the RSPsel and RSPend control signals (generated by the read/write logic control 32 in FIG. 3A) and has an output that drives the RSP data line.
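The selection performed by the write multiplexer can be sketched as below. The six sources follow the text (the bl-sect's own Read Register output, the four neighboring Read Register outputs, and the RSP data line input), but the `WriteSource` names, their numeric encoding, and the behavior at the edge of the array are assumptions made for illustration; the text does not specify how RW_Ctrl[p:0] encodes the selection.

```python
from enum import Enum


class WriteSource(Enum):
    # Hypothetical select encoding; not taken from the disclosure.
    OWN = 0        # S[x]_RBL[y]_Reg_Out
    SECT_PREV = 1  # S[x-1]_RBL[y]_Reg_Out
    SECT_NEXT = 2  # S[x+1]_RBL[y]_Reg_Out
    BL_PREV = 3    # S[x]_RBL[y-1]_Reg_Out
    BL_NEXT = 4    # S[x]_RBL[y+1]_Reg_Out
    RSP_IN = 5     # buffered state of the RSP data line


_OFFSETS = {
    WriteSource.OWN: (0, 0),
    WriteSource.SECT_PREV: (-1, 0),
    WriteSource.SECT_NEXT: (1, 0),
    WriteSource.BL_PREV: (0, -1),
    WriteSource.BL_NEXT: (0, 1),
}


def write_mux(select, regs, x, y, rsp_in):
    """Pick the write data for BL-Sect[x,y].

    regs[x][y] holds the Read Register output of BL-Sect[x,y].
    """
    if select is WriteSource.RSP_IN:
        return rsp_in
    dx, dy = _OFFSETS[select]
    sx, sy = x + dx, y + dy
    if not (0 <= sx < len(regs) and 0 <= sy < len(regs[sx])):
        # Edge handling is an assumption of this sketch.
        raise ValueError("no neighboring bit line section in that direction")
    return regs[sx][sy]


regs = [[0, 1, 0],
        [1, 0, 1],
        [0, 0, 1]]
print(write_mux(WriteSource.SECT_PREV, regs, 1, 1, rsp_in=0))  # reads regs[0][1]
print(write_mux(WriteSource.RSP_IN, regs, 1, 1, rsp_in=1))     # reads the RSP line
```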

FIG. 7 illustrates a single bit line section with the RSP circuitry 70 including RSP logic that produces the combined computational results on the RSP line and an additional line to the write multiplexer that provides the ability to store the logic state of the RSP data line. The circuitry may include a pull down enable control signal (S[x]_RSP_PDen for example) that defaults to “1”, and is only “0” from the rising edge of the RSPsel control signal to the falling edge of the RSPend control signal. The circuitry may further include a latch 72 (“RSP LAT”) whose data input is the “RBL_Reg_Out” output (S[x]_RBL[y]_Reg_Out for example) of the Read Register 52 of the bit line section, whose clock input is the RSPsel control signal, whose reset input is the RSP_PDen control signal, and whose data output is an RSP Pull-Up Enable (“RSP_PUen”) control signal. The functionality of RSP LAT 72 is such that when RSP_PDen=1, then RSP_PUen=0; when RSP_PDen=0 and RSPsel=1, then RSP_PUen=RBL_Reg_Out; otherwise, RSP_PUen remains unchanged. In the depicted implementation, there is one RSP LAT per bit line section.
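The three-case behavior of the RSP latch stated above can be written as a small state-holding model. This is a sketch at the logic level only; the class name `RspLatch` and the method name `update` are hypothetical.

```python
class RspLatch:
    """Behavioral model of the RSP latch (one per bit line section)."""

    def __init__(self):
        self.rsp_puen = 0  # RSP_PUen is 0 while RSP_PDen is at its default of 1

    def update(self, rsp_pden, rspsel, rbl_reg_out):
        if rsp_pden == 1:
            # Reset input active: the pull-up enable is forced low.
            self.rsp_puen = 0
        elif rspsel == 1:
            # Clock input active with reset released: capture the
            # Read Register output.
            self.rsp_puen = rbl_reg_out
        # Otherwise RSP_PUen remains unchanged.
        return self.rsp_puen


latch = RspLatch()
print(latch.update(rsp_pden=1, rspsel=0, rbl_reg_out=1))  # held in reset
print(latch.update(rsp_pden=0, rspsel=1, rbl_reg_out=1))  # captures 1
print(latch.update(rsp_pden=0, rspsel=0, rbl_reg_out=0))  # holds the captured 1
print(latch.update(rsp_pden=1, rspsel=0, rbl_reg_out=1))  # reset again
```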

The RSP circuitry 70 may further include an RSP Pull-Down Transistor 74 (“RSP PD”) implemented on the RSP data line, whose enable input is the RSP_PDen control signal. In the depicted implementation, there is one RSP PD per bl-sect. The circuitry 70 may further include an RSP Pull-Up Transistor 76 (“RSP PU”) implemented on the RSP data line, whose enable input is the RSP_PUen control signal. In the depicted implementation, there is one RSP PU per bl-sect. The circuitry 70 may further include a first Buffer 78 (“BUF1”) whose input is the RSP data line, and whose output “RSP_in” is driven to the bl-sect Write Mux shown in FIG. 6. In the depicted implementation, there is one such buffer per bl-sect. The circuitry 70 may further include a second Buffer 79 (“BUF2”) whose input is the RSP data line, and whose output “RSP_out” is driven out of the processing array. In the depicted implementation, there is one such buffer for the entire section.

FIG. 8 illustrates an example of the signal timing associated with the RSP logic shown in FIG. 7 when RSP is engaged to store the RSP result in the bl-sect and to drive the RSP result out of the processing array.

- 1. In cycle 0, GRE is asserted to “1” for a half-cycle to initiate a read operation that causes a computation result=“1” to be captured in the bl-sect Read Register. In this cycle, RSP_PDen=“1” (its default state), thereby causing RSP_PUen=“0” (its default state), thereby causing the RSP data line=“0” (its default state).
- 2. In cycle 1, RSPsel is asserted to “1” for a half-cycle to engage RSP functionality. The rising edge of RSPsel causes RSP_PDen=“0”, and RSPsel=1 causes RSP LAT to capture the RBL_Reg_Out output of the Read Register (i.e. the logic “1” from cycle 0), thereby causing RSP_PUen=“1”, thereby causing the RSP data line=“1”.
- 3. In cycle 2, RSPend is asserted to “1” for a half-cycle to disengage RSP functionality. The falling edge of RSPend causes RSP_PDen=“1”, thereby causing RSP_PUen=“0”, thereby causing the RSP data line=“0” (back to its default state).
- 4. Simultaneously with the assertion of RSPend, GWE is asserted to “1” for a half-cycle to initiate a write operation that stores the state of the RSP data line in a computational memory cell in the bl-sect. Note that the assertion of GWE is unnecessary if the sole objective of the RSP engagement is to use the RSP result outside the processing array.

The timing diagram illustrates the minimum delay required between the assertion of RSPsel and the assertions of RSPend and GWE. If needed, this delay can be increased without affecting the depicted RSP functionality.
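
The cycle-by-cycle behavior of the RSP data line in FIG. 8 can be approximated with a short behavioral model. The Python sketch below is a simplification under stated assumptions: the helper names `rsp_line` and `fig8_trace` are hypothetical, each list entry is the line state at the end of a cycle, and a line with the pull-down disabled and no pull-up enabled is assumed to retain its “0” default state.

```python
def rsp_line(rsp_pden, puen_per_blsect):
    """State of the shared RSP data line (assumption: wired-OR of pull-ups)."""
    if rsp_pden == 1:
        return 0  # pull-down enabled: line held at its default "0"
    return int(any(puen_per_blsect))  # any enabled pull-up drives the line to "1"


def fig8_trace(read_regs):
    """RSP data line state at the end of cycles 0, 1 and 2 of FIG. 8.

    read_regs: Read Register values (0/1) captured in cycle 0, one per bl-sect.
    """
    n = len(read_regs)
    trace = []
    # Cycle 0: RSP_PDen=1 (default), so every RSP_PUen=0.
    trace.append(rsp_line(1, [0] * n))
    # Cycle 1: rising edge of RSPsel sets RSP_PDen=0; latches capture the registers.
    trace.append(rsp_line(0, list(read_regs)))
    # Cycle 2: falling edge of RSPend sets RSP_PDen=1; latches reset to 0.
    trace.append(rsp_line(1, [0] * n))
    return trace
```

In this model, `fig8_trace([0, 1, 0])` yields a line that is “1” only in cycle 1, which is the value a write initiated by GWE would store.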

The RSP implementation in FIG. 7 allows for an RSP read operation (i.e. a read operation that generates a computation result in the bl-sect Read Register that is subsequently used for RSP engagement) to be initiated as often as every 3 cycles. The RSP implementation in FIG. 7 allows for a non-RSP read operation (i.e. a read operation that generates a computation result in the bl-sect Read Register that is not subsequently used for RSP engagement) to be initiated in any cycle(s) after an RSP read operation is initiated, without affecting the RSP result associated with that RSP read operation.

FIG. 9 illustrates high level RSP functionality for a processing array having “K” sections (Section 0, Section 1, . . . , Section K) with each section having “n” bit line sections (BL-Sect[0,0], . . . , BL-Sect[0,n], etc.) with “n” bit lines in the processing array. As shown in FIG. 9, each section of the processing array has RSP circuitry and logic in each bit line section and a buffer that generates the RSP result (S[0]_RSP_out, etc.) for that section.

Second Embodiment of RSP Circuitry

In a second embodiment, a mechanism to logically combine the computation results across multiple bit line sections (bl-sects) in a section and across multiple sections where each bl-sect produces a computation result that is ultimately captured in its Read Register is provided. The circuitry for the second embodiment may include:

- A set of “s” RSP data lines (an example of which is shown in FIG. 10) that, in stages, span an increasing number of bl-sects, where the stage 1 through stage “s-2” RSP data lines span a subset of bl-sects in a section, the stage “s-1” RSP data line spans all bl-sects in a section, and the stage “s” RSP data line spans all bl-sects across multiple sections.
- The generation of outbound RSP functionality from the stage 1 RSP data line to the stage “s” RSP data line, as follows:
  - Utilize additional circuitry in the bl-sects to produce a logical OR on each stage 1 RSP data line of the computation results captured in the bl-sect Read Registers spanned by each stage 1 RSP data line.
  - Utilize additional circuitry in the section to produce a logical OR on each stage 2 RSP data line of the states of the multiple stage 1 RSP data lines spanned by each stage 2 RSP data line.
  - Utilize additional circuitry in the section to produce a logical OR on each additional stage RSP data line from the stage 3 RSP data line to the stage “s-1” RSP data line (e.g., perform this for each successive stage RSP data line).
  - Utilize additional circuitry in each section to produce a logical OR on the stage “s” RSP data line of the states of the multiple stage “s-1” RSP data lines spanned by the stage “s” RSP data line.

One way to produce the logical OR on each RSP data line during the outbound flow is (as was done in the first embodiment) to define their default states as “0”, and enable (by default) pull-down transistor(s) to pull each RSP data line to its “0” default state. Then, when RSP functionality is engaged:

- At the first stage of the outbound RSP flow, temporarily and unconditionally disable the pull-down transistor(s) on the stage 1 RSP data line, and temporarily enable a pull-up transistor—one per bl-sect—if the computation result captured in the bl-sect Read Register=“1”.
- At the second stage of the outbound RSP flow, temporarily and unconditionally disable the pull-down transistor(s) on the stage 2 RSP data line, and temporarily enable a pull-up transistor—one per stage 1 RSP data line—if the state of the stage 1 RSP data line=“1”.
- At the last stage of the outbound RSP flow, temporarily and unconditionally disable the pull-down transistor(s) on the stage “s” RSP data line, and temporarily enable a pull-up transistor—one per stage “s-1” RSP data line—if the state of the stage “s-1” RSP data line=“1”.
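
The staged outbound flow described above amounts to a repeated group-wise OR. The Python sketch below is illustrative only; the function name `outbound_flow` and the `fan_ins` parameter (the number of lower-stage lines spanned by one line at each stage) are hypothetical, and the number of bl-sects is assumed to divide evenly into groups.

```python
def outbound_flow(read_regs, fan_ins, halt_stage):
    """Return the states of the RSP data lines at stage `halt_stage`.

    read_regs:  Read Register values (0/1), one per bl-sect.
    fan_ins:    fan_ins[k] is how many stage-k lines (or bl-sects, for k=0)
                feed one line at stage k+1. Hypothetical parameterization.
    halt_stage: stage (1-based) at which the outbound flow is halted.
    """
    lines = list(read_regs)
    for fan_in in fan_ins[:halt_stage]:
        # Each line at the next stage is the OR of the lines it spans:
        # a pull-up is enabled for every spanned line whose state is "1".
        lines = [int(any(lines[i:i + fan_in]))
                 for i in range(0, len(lines), fan_in)]
    return lines
```

For example, with eight bl-sects, `fan_ins=[2, 2]`, and a single “1” in the fourth Read Register, stage 1 produces `[0, 1, 0, 0]` and stage 2 produces `[1, 0]`.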

The implementation is such that the outbound RSP flow can be halted at any RSP stage before the inbound RSP flow (see below) is engaged to return the OR result from the RSP stage at which the outbound RSP flow was halted back to all stage 1 RSP data lines, and from there store it in the bl-sects. In this way any RSP stage result, from stage 1 to stage “s”, can be stored in the bl-sects.

- The generation of inbound RSP functionality from the stage “s” RSP data line to the stage “1” RSP data line, as follows:
  - Utilize additional circuitry in each section to enable the state of the stage “s” RSP data line, if =“1”, onto all stage “s-1” RSP data lines spanned by the stage “s” RSP data line. Note that the state of the stage “s” RSP data line can only be “1” if the outbound flow is engaged all the way to stage “s”.
  - Utilize additional circuitry in the section to enable the state of each stage “s-1” RSP data line, if =“1”, onto all stage “s-2” RSP data lines spanned by each stage “s-1” RSP data line. Note that the state of the stage “s-1” RSP data line can only be “1” if the outbound flow is engaged to stage “s-1” or further.
  - Utilize additional circuitry in the section to enable the state of each stage 2 RSP data line, if =“1”, onto all stage 1 RSP data lines spanned by each stage 2 RSP data line. Note that the state of the stage 2 RSP data line can only be “1” if the outbound flow is engaged to stage 2 or further.

The purpose of the inbound RSP functionality is to return the OR result from the RSP stage at which the outbound RSP flow is halted back to the stage 1 RSP data lines, and from there store it in the bl-sects. The inbound RSP flow is only engaged when the objective of an RSP engagement includes storing the RSP result in the bl-sects. If the sole objective of an RSP engagement is to use the RSP result outside the processing array, then the outbound RSP flow is engaged to stage “s” and the inbound RSP flow is not engaged.

When the inbound RSP flow is engaged after the outbound RSP flow is halted at stage “r”, where 1<r<=s, one way to return the OR result from the stage “r” RSP data line to the stage “1” RSP data line is to:

- Enable a pull-up transistor on each stage “s-1” RSP data line if the state of the stage “s” RSP data line that spans it=“1” (which is only possible if r=s).
- Enable a pull-up transistor on each stage “s-2” RSP data line if the state of the stage “s-1” RSP data line that spans it=“1” (which is only possible if r>=s-1).
- Enable a pull-up transistor on each stage 1 RSP data line if the state of the stage 2 RSP data line that spans it=“1” (which is only possible if r>=2).

These pull-up transistors are enabled long enough for the stage “r” RSP data line state, if “1”, to propagate back to the multiple stage 1 RSP data lines spanned by the stage “r” RSP data line.

If the outbound RSP flow is halted at the stage 1 RSP data line (i.e. r=1), then the inbound RSP flow is still engaged as part of the process to store the result in the bl-sects even though, in this case, no return propagation from a latter RSP stage is involved.
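
The combined effect of halting the outbound flow at stage “r” and then engaging the inbound flow can be sketched as follows. This Python model is illustrative only; the name `stored_result` and the `fan_ins` parameter are hypothetical, and the model assumes the return propagation is given enough time to reach every stage 1 RSP data line.

```python
def stored_result(read_regs, fan_ins, r):
    """Value each bl-sect would store after outbound-to-stage-r plus inbound.

    read_regs: Read Register values (0/1), one per bl-sect.
    fan_ins:   lines (or bl-sects) combined into one line at each stage
               (hypothetical parameterization).
    r:         stage (1-based) at which the outbound flow is halted.
    """
    lines = list(read_regs)
    span = 1  # number of bl-sects spanned by one line at the current stage
    for fan_in in fan_ins[:r]:
        lines = [int(any(lines[i:i + fan_in]))
                 for i in range(0, len(lines), fan_in)]
        span *= fan_in
    # Inbound flow: every stage 1 line, and hence every bl-sect, receives
    # the state of the stage-r line that spans it.
    return [lines[i // span] for i in range(len(read_regs))]
```

With eight bl-sects, `fan_ins=[2, 2]`, and a single “1” in the fourth Read Register, halting at r=1 stores “1” in bl-sects 2 and 3 only, while halting at r=2 stores “1” in bl-sects 0 through 3.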

Furthermore, additional RSP circuitry is implemented in each bl-sect such that the logical OR result produced on the stage 1 RSP data line can be stored in a computational memory cell in the bl-sect. Additional RSP circuitry is also implemented in each section such that the logical OR result produced on each stage “s-1” RSP data line (spanning an entire section) is driven to circuitry outside the processing array. Additional RSP circuitry is also implemented such that the logical OR result produced on the single stage “s” RSP data line (spanning multiple sections) is driven to circuitry outside the processing array.

FIG. 10 illustrates a processing array 30 with 2048 bit lines (and cells 0,0, . . . , cell m,2047), 2K bit line sections (BL-Sect[0,0], . . . , BL-Sect[0,2047]) and four RSP data lines including rsp16 (stage 1), rsp256 (stage 2), rsp2K (stage 3), and rsp32K (stage 4). FIG. 10 shows that each bit line section has its read/write logic with RSP circuitry 341, 342 connected to one of the RSP data lines and each section has RSP logic 343 connected to all of the RSP lines. All of the read/write logic and RSP circuits are also connected to the control signals (RW_Ctrl[p:0] and RSP Ctrl) generated by the controller 32 of the processing array 30. The RSP circuitry in each bl-sect (341, 342 in the example in FIG. 10) is used to produce a combined computation result across groups of 16 bl-sects on each of the 128 rsp16 data lines in the section. The RSP circuitry 343 in each section is used to: 1) produce a combined computation result across groups of 256 bl-sects on each of the 8 rsp256 data lines in the section; 2) produce a combined computation result across all 2K bl-sects on the rsp2K data line in the section; and 3) produce a combined computation result across 32K bl-sects in 16 sections on the rsp32K data line associated with those 16 sections.
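
The line counts quoted for FIG. 10 follow directly from the span of each RSP data line. The short Python check below reproduces that arithmetic; the constant and variable names are hypothetical and chosen only for illustration.

```python
BL_SECTS_PER_SECTION = 2048   # 2K bit line sections per section
SECTIONS_PER_RSP32K = 16      # sections spanned by one rsp32K data line

# Number of bl-sects spanned by one data line at each in-section stage.
SPAN = {"rsp16": 16, "rsp256": 256, "rsp2K": 2048}

# Number of data lines of each kind needed to cover one section.
lines_per_section = {name: BL_SECTS_PER_SECTION // span
                     for name, span in SPAN.items()}

# Total bl-sects covered by the single rsp32K data line.
rsp32k_span = BL_SECTS_PER_SECTION * SECTIONS_PER_RSP32K

print(lines_per_section)  # {'rsp16': 128, 'rsp256': 8, 'rsp2K': 1}
print(rsp32k_span)        # 32768
```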

FIG. 11 illustrates a single bit line section (Section[x], BL[y] in this example) from the processing array 30 in FIG. 10 that has the same read logic 50, read register 52 and write multiplexer 54 as described above, which operate in the same way as described above. In addition to the circuitry in the bit line section, FIG. 11 also shows RSP circuitry outside the bl-sect. As shown in FIG. 11, the group of sixteen sections in the computation memory has RSP Control Logic circuitry 1100. In particular, blocks 1100 and 1104-1114 shown in FIG. 11 are implemented outside of the bl-sect. For the RSP out logic and the RSP in logic (1104-1114), there are multiple circuits per rspXXX data line. For example, there is one 1104 block and one 1110 block per rsp16 data line, as noted immediately to the right of those blocks; since each rsp16 data line spans 16 bl-sects, there is one of each of these blocks per 16 bl-sects.

The RSP control logic 1100 has inputs that are rsp16sel, rsp256sel, rsp2Ksel, rsp32Ksel, rspStartRet, and rspEnd control signals that are generated by the control logic 32 of the processing array 30 described above, and whose outputs are rsp16_LATen, rsp16_PDen, rsp256_LATen, rsp256_PDen, rsp2K_LATen, rsp2K_PDen, rsp32K_LATen, rsp32K_PDen, and rspReturn control signals that control the RSP output circuits and logic and RSP input circuits and logic for the bit line section as shown in FIG. 11.

The logic illustrated in FIG. 11 may further comprise:

- RSP16 Out Logic 1102 whose inputs are the output of the bl-sect Read Register and the rsp16sel, rsp16_LATen, and rsp16_PDen control signals, and whose output drives the rsp16 data line (S[x]_rsp16).
- RSP256 Out Logic 1104 whose inputs are the rsp16 data line, rsp256sel, rsp256_LATen, and rsp256_PDen control signals, and whose output drives the rsp256 data line.
- RSP2K Out Logic 1106 whose inputs are the rsp256 data line, rsp2Ksel, rsp2K_LATen, and rsp2K_PDen control signals, and whose output drives the rsp2K data line.
- RSP32K Out Logic 1108 whose inputs are the rsp2K data line, rsp32Ksel, rsp32K_LATen, and rsp32K_PDen control signals, and whose output drives the rsp32K data line.
- RSP2K In Logic 1114 whose inputs are the rsp32K data line and rspReturn control signal, and whose output drives the rsp2K data line.
- RSP256 In Logic 1112 whose inputs are the rsp2K data line and rspReturn control signal, and whose output drives the rsp256 data line.
- RSP16 In Logic 1110 whose inputs are the rsp256 data line and rspReturn control signal, and whose output drives the rsp16 data line.
- The 6:1 Write Mux 54 in the bl-sect whose 6th input is the rsp16 data line.

FIG. 12 illustrates the detailed RSP logic implemented in FIG. 11. The circuits include:

- The RSP control logic 1100 (shown in FIG. 11) that generates RSP control signals including:
  - An rspReturn control signal, that defaults to “0” and is only “1” from the rising edge of rspStartRet to the rising edge of rspEnd.
  - An RSP16 Latch Enable (“rsp16_LATen”) control signal, that defaults to “0” and is only “1” from the rising edge of rsp16sel to the falling edge of rsp256sel or the falling edge of rspReturn.
  - An RSP16 Pull-Down Enable (“rsp16_PDen”) control signal, that defaults to “1” and is only “0” when rsp16_LATen=1 or rspReturn=1.
  - An RSP256 Latch Enable (“rsp256_LATen”) control signal, that defaults to “0” and is only “1” from the rising edge of rsp256sel to the falling edge of rsp2Ksel or the falling edge of rspReturn.
  - An RSP256 Pull-Down Enable (“rsp256_PDen”) control signal, that defaults to “1” and is only “0” when rsp256_LATen=1 or rspReturn=1.
  - An RSP2K Latch Enable (“rsp2K_LATen”) control signal, that defaults to “0” and is only “1” from the rising edge of rsp2Ksel to the falling edge of rsp32Ksel or the falling edge of rspReturn.
  - An RSP2K Pull-Down Enable (“rsp2K_PDen”) control signal, that defaults to “1” and is only “0” when rsp2K_LATen=1 or rspReturn=1.
  - An RSP32K Latch Enable (“rsp32K_LATen”) control signal, that defaults to “0” and is only “1” from the rising edge of rsp32Ksel to the falling edge of rspEnd.
  - An RSP32K Pull-Down Enable (“rsp32K_PDen”) control signal, that defaults to “1” and is only “0” when rsp32K_LATen=1.
- RSP16 Out Logic (one set per bl-sect) consisting of:
  - An RSP16 Latch (“rsp16 LAT”) whose data input is the “RBL_Reg_Out” output of the bl-sect's Read Register, whose clock input is the rsp16sel control signal, whose reset input is the rsp16_LATen control signal, and whose data output is an RSP16 Out Pull-Up Enable (“rsp16_PUen_out”) control signal. The functionality of rsp16 LAT is such that when rsp16_LATen=0, then rsp16_PUen_out=0; when rsp16_LATen=1 and rsp16sel=1, then rsp16_PUen_out=RBL_Reg_Out; otherwise, rsp16_PUen_out remains unchanged.
  - An RSP16 Pull-Down Transistor (“rsp16 PD”) implemented on the rsp16 data line, whose enable input is the rsp16_PDen control signal.
  - An RSP16 Out Pull-Up Transistor (“rsp16 out PU”) implemented on the rsp16 data line, whose enable input is the rsp16_PUen_out control signal.
  - A first Buffer (“BUF1”) whose input is the rsp16 data line, and whose output “rsp16 in” is driven to the bl-sect's Write Mux.
- RSP16 In Logic (one set per rsp16 data line) consisting of:
  - An RSP16 In AND gate (“rsp16 in AND”) whose output is an RSP16 In Pull-Up Enable (“rsp16_PUen_in”) control signal equal to the logical AND of the rsp256 data line and rspReturn.
  - An RSP16 In Pull-Up Transistor (“rsp16 in PU”) implemented on the rsp16 data line, whose enable input is the rsp16_PUen_in control signal.
- RSP256 Out Logic (one set per rsp16 data line) consisting of:
  - An RSP256 Latch (“rsp256 LAT”) whose data input is the rsp16 data line, whose clock input is the rsp256sel control signal, whose reset input is the rsp256_LATen control signal, and whose data output is an RSP256 Out Pull-Up Enable (“rsp256_PUen_out”) control signal. The functionality of rsp256 LAT is such that when rsp256_LATen=0, then rsp256_PUen_out=0; when rsp256_LATen=1 and rsp256sel=1, then rsp256_PUen_out=the state of the rsp16 data line; otherwise, rsp256_PUen_out remains unchanged.
  - An RSP256 Pull-Down Transistor (“rsp256 PD”) implemented on the rsp256 data line, whose enable input is the rsp256_PDen control signal.
  - An RSP256 Out Pull-Up Transistor (“rsp256 out PU”) implemented on the rsp256 data line, whose enable input is the rsp256_PUen_out control signal.
- RSP256 In Logic (one set per rsp256 data line) consisting of:
  - An RSP256 In AND gate (“rsp256 in AND”) whose output is an RSP256 In Pull-Up Enable (“rsp256_PUen_in”) control signal equal to the logical AND of the rsp2K data line and rspReturn.
  - An RSP256 In Pull-Up Transistor (“rsp256 in PU”) implemented on the rsp256 data line, whose enable input is the rsp256_PUen_in control signal.
- RSP2K Out Logic (one set per rsp256 data line) consisting of:
  - An RSP2K Latch (“rsp2K LAT”) whose data input is the rsp256 data line, whose clock input is the rsp2Ksel control signal, whose reset input is the rsp2K_LATen control signal, and whose data output is an RSP2K Out Pull-Up Enable (“rsp2K_PUen_out”) control signal. The functionality of rsp2K LAT is such that when rsp2K_LATen=0, then rsp2K_PUen_out=0; when rsp2K_LATen=1 and rsp2Ksel=1, then rsp2K_PUen_out=the state of the rsp256 data line; otherwise, rsp2K_PUen_out remains unchanged.
  - An RSP2K Pull-Down Transistor (“rsp2K PD”) implemented on the rsp2K data line, whose enable input is the rsp2K_PDen control signal.
  - An RSP2K Out Pull-Up Transistor (“rsp2K out PU”) implemented on the rsp2K data line, whose enable input is the rsp2K_PUen_out control signal.
- RSP2K In Logic (one set per rsp2K data line) consisting of:
  - An RSP2K In AND gate (“rsp2K in AND”) whose output is an RSP2K In Pull-Up Enable (“rsp2K_PUen_in”) control signal equal to the logical AND of the rsp32K data line and rspReturn.
  - An RSP2K In Pull-Up Transistor (“rsp2K in PU”) implemented on the rsp2K data line, whose enable input is the rsp2K_PUen_in control signal.
- rsp32K Out Logic (one set per rsp2K data line) consisting of:
  - An RSP32K Latch (“rsp32K LAT”) whose data input is the rsp2K data line, whose clock input is the rsp32Ksel control signal, whose reset input is the rsp32K_LATen control signal, and whose data output is an RSP32K Out Pull-Up Enable (“rsp32K_PUen_out”) control signal. The functionality of rsp32K LAT is such that when rsp32K_LATen=0, then rsp32K_PUen_out=0; when rsp32K_LATen=1 and rsp32Ksel=1, then rsp32K_PUen_out=the state of the rsp2K data line; otherwise, rsp32K_PUen_out remains unchanged.
  - An RSP32K Pull-Down Transistor (“rsp32K PD”) implemented on the rsp32K data line, whose enable input is the rsp32K_PDen control signal.
  - An RSP32K Out Pull-Up Transistor (“rsp32K out PU”) implemented on the rsp32K data line, whose enable input is the rsp32K_PUen_out control signal.
  - A second Buffer (“BUF2”) whose input is the rsp32K_PUen_out control signal, and whose output “rsp2K_out” is driven out of the processing array. In the depicted implementation, there is one BUF2 per section.
  - A third Buffer (“BUF3”) whose input is the rsp32K data line, and whose output “rsp32K_out” is driven out of the processing array. In the depicted implementation, there is one BUF3 per 16 sections.
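
The per-stage logic listed above repeats the same three building blocks at each RSP stage: a resettable out latch, a pull-down enable, and an in pull-up enable. The Python sketch below models them as pure functions. It is a behavioral illustration only; the function names are hypothetical, and edge-triggered generation of the LATen signals is not modeled.

```python
def out_latch(laten, sel, data_in, prev_puen_out):
    """Out latch of any stage (rsp16 LAT, rsp256 LAT, ...): next PUen_out."""
    if laten == 0:
        return 0            # reset: latch enable is low
    if sel == 1:
        return data_in      # capture Read Register or lower-stage line state
    return prev_puen_out    # hold


def pull_down_enable(laten, rsp_return=0):
    """PDen of a stage: "0" only while its LATen=1 or rspReturn=1.

    For the rsp32K stage, rspReturn is not an input, so leave rsp_return=0.
    """
    return 0 if (laten or rsp_return) else 1


def in_pull_up_enable(upper_line, rsp_return):
    """PUen_in of a stage: AND of the next-higher data line and rspReturn."""
    return upper_line & rsp_return
```

For instance, `in_pull_up_enable(1, 1)` corresponds to rsp16_PUen_in=“1” when the rsp256 data line is “1” during the inbound flow.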

FIGS. 13-17 illustrate the RSP signal timing associated with the circuit in FIG. 12. In all five figures:

- The timing diagrams illustrate the minimum delays required between the assertion of rsp16sel and the assertion of each successive RSP control signal (i.e. rsp256sel, rsp2Ksel, rsp32Ksel, rspStartRet, rspEnd), and the assertion of GWE, for the depicted flow. If needed, these delays can be increased without affecting the depicted RSP functionality.
- The RSP implementation in FIG. 12 allows for a non-RSP read operation to be initiated in any cycle(s) after an RSP read operation is initiated, without affecting the RSP result associated with that RSP read operation.

FIG. 13 illustrates an example of the signal timing for generating an rsp2K result and an rsp32K result and transmitting those results out of the processing array.

- 1. In cycle 0, GRE is asserted to “1” for a half-cycle to initiate a read operation that causes a computation result=“1” to be captured in the bl-sect Read Register. In this cycle, rsp16_LATen=“0” (its default state) and rsp16_PDen=“1” (its default state), thereby causing rsp16_PUen_out=“0” (its default state), thereby causing the rsp16 data line=“0” (its default state).
- 2. In cycle 1, rsp16sel is asserted to “1” for a half-cycle to engage stage 1 outbound RSP functionality. The rising edge of rsp16sel causes rsp16_LATen=“1” and rsp16_PDen=“0”, and rsp16sel=1 causes rsp16 LAT to capture the RBL_Reg_Out output of the Read Register (i.e. the logic “1” from cycle 0), thereby causing rsp16_PUen_out=“1”, thereby causing the rsp16 data line=“1”.
- 3. In cycle 2, rsp256sel is asserted to “1” for a half-cycle to engage stage 2 outbound RSP functionality. The rising edge of rsp256sel causes rsp256_LATen=“1” and rsp256_PDen=“0”, and rsp256sel=1 causes rsp256 LAT to capture the state of the rsp16 data line (i.e. the logic “1” from cycle 1), thereby causing rsp256_PUen_out=“1”, thereby causing the rsp256 data line=“1”. The falling edge of rsp256sel causes rsp16_LATen=“0” and rsp16_PDen=“1”, thereby causing rsp16_PUen_out=“0”, thereby causing the rsp16 data line=“0” (back to its default state).
- 4. In cycle 3, rsp2Ksel is asserted to “1” for a half-cycle to engage stage 3 outbound RSP functionality. The rising edge of rsp2Ksel causes rsp2K_LATen=“1” and rsp2K_PDen=“0”, and rsp2Ksel=1 causes rsp2K LAT to capture the state of the rsp256 data line (i.e. the logic “1” from cycle 2), thereby causing rsp2K_PUen_out=“1”, thereby causing the rsp2K data line=“1”. The falling edge of rsp2Ksel causes rsp256_LATen=“0” and rsp256_PDen=“1”, thereby causing rsp256_PUen_out=“0”, thereby causing the rsp256 data line=“0” (back to its default state).
- 5. In cycle 4, rsp32Ksel is asserted to “1” for a half-cycle to engage stage 4 outbound RSP functionality. The rising edge of rsp32Ksel causes rsp32K_LATen=“1” and rsp32K_PDen=“0”, and rsp32Ksel=1 causes rsp32K LAT to capture the state of the rsp2K data line (i.e. the logic “1” from cycle 3), thereby causing rsp32K_PUen_out=“1”, thereby causing the rsp32K data line=“1”. The falling edge of rsp32Ksel causes rsp2K_LATen=“0” and rsp2K_PDen=“1”, thereby causing rsp2K_PUen_out=“0”, thereby causing the rsp2K data line=“0” (back to its default state).
- 6. In cycle 5, rspEnd is asserted to “1” for a half-cycle to disengage outbound RSP functionality. The falling edge of rspEnd causes rsp32K_LATen=“0” and rsp32K_PDen=“1”, thereby causing rsp32K_PUen_out=“0”, thereby causing the rsp32K data line=“0” (back to its default state).
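
The six steps above can be summarized as a “1” that moves up one RSP stage per cycle while the stage below it returns to its default state. The Python sketch below reproduces the end-of-cycle data line states for a single rsp16/rsp256/rsp2K/rsp32K chain. It is a simplified illustration: the name `fig13_trace` is hypothetical, only one bl-sect feeds the chain, and sub-cycle timing is collapsed into end-of-cycle states.

```python
STAGES = ["rsp16", "rsp256", "rsp2K", "rsp32K"]


def fig13_trace(read_reg):
    """End-of-cycle data line states for cycles 0..5 of the FIG. 13 flow.

    read_reg: the computation result (0/1) captured in the Read Register
              in cycle 0.
    Returns a list of six dicts mapping each data line name to 0 or 1.
    """
    line = {stage: 0 for stage in STAGES}
    trace = [dict(line)]            # cycle 0: all lines at their default "0"
    source = read_reg
    for k, stage in enumerate(STAGES):   # cycles 1..4
        # Rising edge of this stage's sel: latch captures its source and
        # the stage's pull-down is released, so the line takes that value.
        line[stage] = source
        # Falling edge of this stage's sel: the previous stage is reset.
        if k > 0:
            line[STAGES[k - 1]] = 0
        source = line[stage]
        trace.append(dict(line))
    # Cycle 5: falling edge of rspEnd returns the rsp32K line to "0".
    line["rsp32K"] = 0
    trace.append(dict(line))
    return trace
```

Calling `fig13_trace(1)` gives a trace in which exactly one data line is “1” in each of cycles 1 through 4, in the order rsp16, rsp256, rsp2K, rsp32K.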

The RSP implementation in FIG. 9 allows for “RSP” read operations to be initiated every 3 cycles, when RSP is engaged to drive RSP results out of the processing array.

FIG. 14 illustrates an example of the signal timing for generating an rsp16 result and transmitting that result to the write multiplexer in each bit line section when RSP is engaged to produce and store rsp16 results in the bl_sects.

- 1. In cycle 0, GRE is asserted to “1” for a half-cycle to initiate a read operation that causes a computation result=“1” to be captured in the bl-sect Read Register. In this cycle, rsp16_LATen=“0” (its default state) and rsp16_PDen=“1” (its default state), thereby causing rsp16_PUen_out=“0” (its default state), thereby causing the rsp16 data line=“0” (its default state).
- 2. In cycle 1, rsp16sel is asserted to “1” for a half-cycle to engage stage 1 outbound RSP functionality. The rising edge of rsp16sel causes rsp16_LATen=“1” and rsp16_PDen=“0”, and rsp16sel=1 causes rsp16 LAT to capture the RBL_Reg_Out output of the Read Register (i.e. the logic “1” from cycle 0), thereby causing rsp16_PUen_out=“1”, thereby causing the rsp16 data line=“1”.
- 3. In cycle 2, rspStartRet is asserted to “1” for a half-cycle to disengage outbound RSP functionality and engage inbound RSP functionality. The rising edge of rspStartRet causes rspReturn=“1”.
- 4. In cycle 2 (and one cycle before rspEnd is asserted), GWE is asserted to “1” for a half-cycle to initiate a write operation that stores the state of the rsp16 data line in a computational memory cell in the bl-sect.
- 5. In cycle 3, rspEnd is asserted to “1” for a half-cycle to disengage inbound RSP functionality. The rising edge of rspEnd causes rspReturn=“0”, thereby causing rsp16_LATen=“0” and rsp16_PDen=“1”, thereby causing rsp16_PUen_out=“0”, thereby causing the rsp16 data line=“0” (back to its default state).

The RSP implementation in FIG. 11 allows for “RSP” read operations to be initiated every 3 cycles, when RSP is engaged to produce and store rsp16 results in the bl-sects.

FIG. 15 illustrates an example of the signal timing for generating an rsp256 result and transmitting that result to the write multiplexer in each bit line section when RSP is engaged to produce and store rsp256 results in the bl-sects.

- 1. In cycle 0, GRE is asserted to “1” for a half-cycle to initiate a read operation that causes a computation result=“1” to be captured in the bl-sect Read Register. In this cycle, rsp16_LATen=“0” (its default state) and rsp16_PDen=“1” (its default state), thereby causing rsp16_PUen_out=“0” (its default state), thereby causing the rsp16 data line=“0” (its default state).
- 2. In cycle 1, rsp16sel is asserted to “1” for a half-cycle to engage stage 1 outbound RSP functionality. The rising edge of rsp16sel causes rsp16_LATen=“1” and rsp16_PDen=“0”, and rsp16sel=1 causes rsp16 LAT to capture the RBL_Reg_Out output of the Read Register (i.e. the logic “1” from cycle 0), thereby causing rsp16_PUen_out=“1”, thereby causing the rsp16 data line=“1”.
- 3. In cycle 2, rsp256sel is asserted to “1” for a half-cycle to engage stage 2 outbound RSP functionality. The rising edge of rsp256sel causes rsp256_LATen=“1” and rsp256_PDen=“0”, and rsp256sel=1 causes rsp256 LAT to capture the state of the rsp16 data line (i.e. the logic “1” from cycle 1), thereby causing rsp256_PUen_out=“1”, thereby causing the rsp256 data line=“1”. The falling edge of rsp256sel causes rsp16_LATen=“0” and rsp16_PDen=“1”, thereby causing rsp16_PUen_out=“0”, thereby causing the rsp16 data line=“0” (back to its default state).
- 4. In cycle 3, rspStartRet is asserted to “1” for a half-cycle to disengage outbound RSP functionality and engage inbound RSP functionality. The rising edge of rspStartRet causes rspReturn=“1”, thereby causing rsp16_PDen=“0” and (because the rsp256 data line=“1”) rsp16_PUen_in =“1”, thereby causing the rsp16 data line=“1”.
- 5. In cycle 5 (and one cycle before rspEnd is asserted), GWE is asserted to “1” for a half-cycle to initiate a write operation that stores the state of the rsp16 data line in a computational memory cell in the bl-sect.
- 6. In cycle 6, rspEnd is asserted to “1” for a half-cycle to disengage inbound RSP functionality. The rising edge of rspEnd causes rspReturn=“0”, thereby causing rsp256_LATen=“0” and rsp256_PDen=“1”, thereby causing rsp256_PUen_out=“0”, thereby causing the rsp256 data line=“0” (back to its default state). The falling edge of rspReturn also causes rsp16_PDen=“1” and rsp16_PUen_in=“0”, thereby causing the rsp16 data line=“0” (back to its default state).

The RSP implementation in FIG. 9 allows for “RSP” read operations to be initiated every 6 cycles, when RSP is engaged to produce and store rsp256 results in the bl-sects.

FIG. 16 illustrates an example of the signal timing for generating an rsp2K result and transmitting that result to the write multiplexer in each bit line section when RSP is engaged to produce and store rsp2K results in the bl-sects.

- 1. In cycle 0, GRE is asserted to “1” for a half-cycle to initiate a read operation that causes a computation result=“1” to be captured in the bl-sect Read Register. In this cycle, rsp16_LATen=“0” (its default state) and rsp16_PDen=“1” (its default state), thereby causing rsp16_PUen_out=“0” (its default state), thereby causing the rsp16 data line=“0” (its default state).
- 2. In cycle 1, rsp16sel is asserted to “1” for a half-cycle to engage stage 1 outbound RSP functionality. The rising edge of rsp16sel causes rsp16_LATen=“1” and rsp16_PDen=“0”, and rsp16sel=1 causes rsp16 LAT to capture the RBL_Reg_Out output of the Read Register (i.e. the logic “1” from cycle 0), thereby causing rsp16_PUen_out=“1”, thereby causing the rsp16 data line=“1”.
- 3. In cycle 2, rsp256sel is asserted to “1” for a half-cycle to engage stage 2 outbound RSP functionality. The rising edge of rsp256sel causes rsp256_LATen=“1” and rsp256_PDen=“0”, and rsp256sel=1 causes rsp256 LAT to capture the state of the rsp16 data line (i.e. the logic “1” from cycle 1), thereby causing rsp256_PUen_out=“1”, thereby causing the rsp256 data line=“1”. The falling edge of rsp256sel causes rsp16_LATen=“0” and rsp16_PDen=“1”, thereby causing rsp16_PUen_out=“0”, thereby causing the rsp16 data line=“0” (back to its default state).
- 4. In cycle 3, rsp2Ksel is asserted to “1” for a half-cycle to engage stage 3 outbound RSP functionality. The rising edge of rsp2Ksel causes rsp2K_LATen=“1” and rsp2K_PDen=“0”, and rsp2Ksel=1 causes rsp2K LAT to capture the state of the rsp256 data line (i.e. the logic “1” from cycle 2), thereby causing rsp2K_PUen_out=“1”, thereby causing the rsp2K data line=“1”. The falling edge of rsp2Ksel causes rsp256_LATen=“0” and rsp256_PDen=“1”, thereby causing rsp256_PUen_out=“0”, thereby causing the rsp256 data line=“0” (back to its default state).
- 5. In cycle 4, rspStartRet is asserted to “1” for a half-cycle to disengage outbound RSP functionality and engage inbound RSP functionality. The rising edge of rspStartRet causes rspReturn=“1”, thereby causing rsp256_PDen=“0” and (because the rsp2K data line=“1”) rsp256_PUen_in =“1”, thereby causing the rsp256 data line=“1”. The rising edge of rspReturn also causes rsp16_PDen=“0” and (because the rsp256 data line=“1”) rsp16_PUen_in =“1”, thereby causing the rsp16 data line=“1”.
- 6. In cycle 7 (and one cycle before rspEnd is asserted), GWE is asserted to “1” for a half-cycle to initiate a write operation that stores the state of the rsp16 data line in a computational memory cell in the bl-sect.
- 7. In cycle 8, rspEnd is asserted to “1” for a half-cycle to disengage inbound RSP functionality. The rising edge of rspEnd causes rspReturn=“0”, thereby causing rsp2K_LATen=“0” and rsp2K_PDen=“1”, thereby causing rsp2K_PUen_out=“0”, thereby causing the rsp2K data line=“0” (back to its default state). The falling edge of rspReturn also causes rsp256_PDen=“1” and rsp256_PUen_in =“0”, thereby causing the rsp256 data line=“0” (back to its default state). The falling edge of rspReturn also causes rsp16_PDen=“1” and rsp16_PUen_in =“0”, thereby causing the rsp16 data line=“0” (back to its default state).

The RSP implementation in FIG. 11 allows for “RSP” read operations to be initiated every 8 cycles, when RSP is engaged to produce and store rsp2K results in the bl-sects.

- 1. In cycle 0, GRE is asserted to “1” for a half-cycle to initiate a read operation that causes a computation result=“1” to be captured in the bl-sect Read Register. In this cycle, rsp16_LATen=“0” (its default state) and rsp16_PDen=“1” (its default state), thereby causing rsp16_PUen_out=“0” (its default state), thereby causing the rsp16 data line=“0” (its default state).
- 2. In cycle 1, rsp16sel is asserted to “1” for a half-cycle to engage stage 1 outbound RSP functionality. The rising edge of rsp16sel causes rsp16_LATen=“1” and rsp16_PDen=“0”, and rsp16sel=1 causes rsp16 LAT to capture the RBL_Reg_Out output of the Read Register (i.e. the logic “1” from cycle 0), thereby causing rsp16_PUen_out=“1”, thereby causing the rsp16 data line=“1”.
- 3. In cycle 2, rsp256sel is asserted to “1” for a half-cycle to engage stage 2 outbound RSP functionality. The rising edge of rsp256sel causes rsp256_LATen=“1” and rsp256_PDen=“0”, and rsp256sel=1 causes rsp256 LAT to capture the state of the rsp16 data line (i.e. the logic “1” from cycle 1), thereby causing rsp256_PUen_out=“1”, thereby causing the rsp256 data line=“1”. The falling edge of rsp256sel causes rsp16_LATen=“0” and rsp16_PDen=“1”, thereby causing rsp16_PUen_out=“0”, thereby causing the rsp16 data line=“0” (back to its default state).
- 4. In cycle 3, rsp2Ksel is asserted to “1” for a half-cycle to engage stage 3 outbound RSP functionality. The rising edge of rsp2Ksel causes rsp2K_LATen=“1” and rsp2K_PDen=“0”, and rsp2Ksel=1 causes rsp2K LAT to capture the state of the rsp256 data line (i.e. the logic “1” from cycle 2), thereby causing rsp2K_PUen_out=“1”, thereby causing the rsp2K data line=“1”. The falling edge of rsp2Ksel causes rsp256_LATen=“0” and rsp256_PDen=“1”, thereby causing rsp256_PUen_out=“0”, thereby causing the rsp256 data line=“0” (back to its default state).
- 5. In cycle 4, rsp32Ksel is asserted to “1” for a half-cycle to engage stage 4 outbound RSP functionality. The rising edge of rsp32Ksel causes rsp32K_LATen=“1” and rsp32K_PDen=“0”, and rsp32Ksel=1 causes rsp32K LAT to capture the state of the rsp2K data line (i.e. the logic “1” from cycle 3), thereby causing rsp32K_PUen_out=“1”, thereby causing the rsp32K data line=“1”. The falling edge of rsp32Ksel causes rsp2K_LATen=“0” and rsp2K_PDen=“1”, thereby causing rsp2K_PUen_out=“0”, thereby causing the rsp2K data line=“0” (back to its default state).
- 6. In cycle 5, rspStartRet is asserted to “1” for a half-cycle to disengage outbound RSP functionality and engage inbound RSP functionality. The rising edge of rspStartRet causes rspReturn=“1”, thereby causing rsp2K_PDen=“0” and (because the rsp32K data line=“1”) rsp2K_PUen_in=“1”, thereby causing the rsp2K data line=“1”. The rising edge of rspReturn also causes rsp256_PDen=“0” and (because the rsp2K data line=“1”) rsp256_PUen_in=“1”, thereby causing the rsp256 data line=“1”. The rising edge of rspReturn also causes rsp16_PDen=“0” and (because the rsp256 data line=“1”) rsp16_PUen_in=“1”, thereby causing the rsp16 data line=“1”.
- 7. In cycle 9 (and one cycle before rspEnd is asserted), GWE is asserted to “1” for a half-cycle to initiate a write operation that stores the state of the rsp16 data line in a computational memory cell in the bl-sect.
- 8. In cycle 10, rspEnd is asserted to “1” for a half-cycle to disengage inbound RSP functionality. The rising edge of rspEnd causes rspReturn=“0”, thereby causing rsp2K_PDen=“1” and rsp2K_PUen_in=“0”, thereby causing the rsp2K data line=“0” (back to its default state). The falling edge of rspReturn also causes rsp256_PDen=“1” and rsp256_PUen_in=“0”, thereby causing the rsp256 data line=“0” (back to its default state). The falling edge of rspReturn also causes rsp16_PDen=“1” and rsp16_PUen_in=“0”, thereby causing the rsp16 data line=“0” (back to its default state). The falling edge of rspEnd causes rsp32K_LATen=“0” and rsp32K_PDen=“1”, thereby causing rsp32K_PUen_out=“0”, thereby causing the rsp32K data line=“0” (back to its default state).

The RSP implementation in FIG. 11 allows for “RSP” read operations to be initiated every 10 cycles, when RSP is engaged to produce and store rsp32K results in the bl-sects.
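The outbound and inbound sequences listed above may be easier to follow as a small cycle-level model. The following Python sketch is illustrative only and is not the disclosed circuitry: it follows a single path (one bl-sect Read Register feeding one data line per stage), records the state of each data line at the end of each cycle, and ignores half-cycle timing. The function name `simulate_rsp` and its return keys are hypothetical. The relation used for the cycle numbers (rspStartRet in cycle n+1, GWE in cycle 2n+1, rspEnd in cycle 2n+2 for n stages) is an observation fitted to the two sequences listed above, not a formula stated in this disclosure.

```python
def simulate_rsp(read_result, stage_names):
    """Trace RSP data line states for one path through n stages.

    read_result: the bit captured in the Read Register in cycle 0.
    stage_names: e.g. ["rsp16", "rsp256", "rsp2K", "rsp32K"].
    Returns the per-cycle line states and the bit written back by GWE.
    """
    n = len(stage_names)
    lines = [0] * n                    # every data line defaults to "0"
    trace = {0: tuple(lines)}          # cycle 0: GRE read, lines still at default

    # Outbound flow: stage k is engaged in cycle k (k = 1..n).
    for k in range(1, n + 1):
        source = read_result if k == 1 else lines[k - 2]
        lines[k - 1] = source          # select rising edge: latch captures, line follows
        if k >= 2:
            lines[k - 2] = 0           # select falling edge: previous line back to default
        trace[k] = tuple(lines)

    # Inbound flow: rspStartRet drives the last stage's state back down the chain.
    start_ret = n + 1
    for i in range(n - 2, -1, -1):
        lines[i] = lines[i + 1]
    trace[start_ret] = tuple(lines)

    # Lines hold their state until GWE stores the first-stage line in the bl-sect.
    gwe = 2 * n + 1                    # assumption inferred from the listed sequences
    for cycle in range(start_ret + 1, gwe + 1):
        trace[cycle] = tuple(lines)
    written = lines[0]

    # rspEnd returns every line to its default state.
    end = gwe + 1
    lines = [0] * n
    trace[end] = tuple(lines)

    return {"trace": trace, "written": written,
            "start_ret": start_ret, "gwe": gwe, "end": end}
```

Calling `simulate_rsp(1, ["rsp16", "rsp256", "rsp2K", "rsp32K"])` reproduces the four-stage sequence (rspStartRet in cycle 5, GWE in cycle 9, rspEnd in cycle 10), and passing only the first three stage names reproduces the three-stage sequence (cycles 4, 7 and 8). In both cases the rspEnd cycle number matches the stated interval between RSP read operations.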

FIG. 18 illustrates the high level RSP functionality when 16 sections with 2K bit lines each are implemented in the processing array device according to FIGS. 10-12. As shown, within a section there are:

- 128 rsp16 data lines—one per each group of 16 bls of 2K.
- 8 rsp256 data lines—one per each group of 16 rsp16 data lines, i.e. one per each group of 256 bls of 2K.
- 1 rsp2K data line—one per a single group of 8 rsp256 data lines, i.e. one per all 2K bls.

And across all 16 sections there is:

- 1 rsp32K data line—one per a single group of 16 rsp2K data lines, i.e. one per 32K bls.
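The data line counts above follow from the group sizes, and the staged logical OR can be sketched in a few lines of Python. This is a behavioral illustration under stated assumptions, not the disclosed circuitry: each bit line's Read Register contents are modeled as a 0/1 integer, and the helper names `or_groups`, `rsp_reduce` and `rsp32k` are hypothetical.

```python
BLS_PER_SECTION = 2048   # 2K bit lines per section
NUM_SECTIONS = 16


def or_groups(bits, size):
    """OR together each consecutive group of `size` bits."""
    return [int(any(bits[i:i + size])) for i in range(0, len(bits), size)]


def rsp_reduce(section_bits):
    """Return (rsp16 lines, rsp256 lines, rsp2K line) for one section."""
    rsp16 = or_groups(section_bits, 16)   # 2048 / 16 = 128 lines
    rsp256 = or_groups(rsp16, 16)         # 128 / 16 = 8 lines
    rsp2k = or_groups(rsp256, 8)          # 8 / 8 = 1 line
    return rsp16, rsp256, rsp2k[0]


def rsp32k(all_sections):
    """OR of the rsp2K lines of all sections (one line per 32K bit lines)."""
    return int(any(rsp_reduce(bits)[2] for bits in all_sections))
```

With a single "1" placed anywhere in the 16 x 2048 array, `rsp32k` returns 1, and only the rsp16 and rsp256 lines whose groups contain that bit line are set. On the inbound flow, the final result is driven back to every data line spanned by the last engaged stage, so every bl-sect stores the same combined value.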

The foregoing description, for purpose of explanation, has been described with reference to specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the disclosure to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The embodiments were chosen and described in order to best explain the principles of the disclosure and its practical applications, to thereby enable others skilled in the art to best utilize the disclosure and various embodiments with various modifications as are suited to the particular use contemplated.

The system and method disclosed herein may be implemented via one or more components, systems, servers, appliances, other subcomponents, or distributed between such elements. When implemented as a system, such systems may include and/or involve, inter alia, components such as software modules, general-purpose CPU, RAM, etc. found in general-purpose computers. In implementations where the innovations reside on a server, such a server may include or involve components such as CPU, RAM, etc., such as those found in general-purpose computers.

Additionally, the system and method herein may be achieved via implementations with disparate or entirely different software, hardware and/or firmware components, beyond that set forth above. With regard to such other components (e.g., software, processing components, etc.) and/or computer-readable media associated with or embodying the present inventions, for example, aspects of the innovations herein may be implemented consistent with numerous general purpose or special purpose computing systems or configurations. Various exemplary computing systems, environments, and/or configurations that may be suitable for use with the innovations herein may include, but are not limited to: software or other components within or embodied on personal computers, servers or server computing devices such as routing/connectivity components, hand-held or laptop devices, multiprocessor systems, microprocessor-based systems, set top boxes, consumer electronic devices, network PCs, other existing computer platforms, distributed computing environments that include one or more of the above systems or devices, etc.

In some instances, aspects of the system and method may be achieved via or performed by logic and/or logic instructions including program modules, executed in association with such components or circuitry, for example. In general, program modules may include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular instructions herein. The inventions may also be practiced in the context of distributed software, computer, or circuit settings where circuitry is connected via communication buses, circuitry or links. In distributed settings, control/instructions may occur from both local and remote computer storage media including memory storage devices.

The software, circuitry and components herein may also include and/or utilize one or more types of computer readable media. Computer readable media can be any available media that is resident on, associable with, or can be accessed by such circuits and/or computing components. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and can be accessed by a computing component. Communication media may comprise computer readable instructions, data structures, program modules and/or other components. Further, communication media may include wired media such as a wired network or direct-wired connection; however, no media of any such type herein includes transitory media. Combinations of any of the above are also included within the scope of computer readable media.

In the present description, the terms component, module, device, etc. may refer to any type of logical or functional software elements, circuits, blocks and/or processes that may be implemented in a variety of ways. For example, the functions of various circuits and/or blocks can be combined with one another into any other number of modules. Each module may even be implemented as a software program stored on a tangible memory (e.g., random access memory, read only memory, CD-ROM memory, hard disk drive, etc.) to be read by a central processing unit to implement the functions of the innovations herein. Or, the modules can comprise programming instructions transmitted to a general purpose computer or to processing/graphics hardware via a transmission carrier wave. Also, the modules can be implemented as hardware logic circuitry implementing the functions encompassed by the innovations herein. Finally, the modules can be implemented using special purpose instructions (SIMD instructions), field programmable logic arrays or any mix thereof which provides the desired level of performance and cost.

As disclosed herein, features consistent with the disclosure may be implemented via computer-hardware, software and/or firmware. For example, the systems and methods disclosed herein may be embodied in various forms including, for example, a data processor, such as a computer that also includes a database, digital electronic circuitry, firmware, software, or in combinations of them. Further, while some of the disclosed implementations describe specific hardware components, systems and methods consistent with the innovations herein may be implemented with any combination of hardware, software and/or firmware. Moreover, the above-noted features and other aspects and principles of the innovations herein may be implemented in various environments. Such environments and related applications may be specially constructed for performing the various routines, processes and/or operations according to the invention or they may include a general-purpose computer or computing platform selectively activated or reconfigured by code to provide the necessary functionality. The processes disclosed herein are not inherently related to any particular computer, network, architecture, environment, or other apparatus, and may be implemented by a suitable combination of hardware, software, and/or firmware. For example, various general-purpose machines may be used with programs written in accordance with teachings of the invention, or it may be more convenient to construct a specialized apparatus or system to perform the required methods and techniques.

Aspects of the method and system described herein, such as the logic, may also be implemented as functionality programmed into any of a variety of circuitry, including programmable logic devices (“PLDs”), such as field programmable gate arrays (“FPGAs”), programmable array logic (“PAL”) devices, electrically programmable logic and memory devices and standard cell-based devices, as well as application specific integrated circuits. Some other possibilities for implementing aspects include: memory devices, microcontrollers with memory (such as EEPROM), embedded microprocessors, firmware, software, etc. Furthermore, aspects may be embodied in microprocessors having software-based circuit emulation, discrete logic (sequential and combinatorial), custom devices, fuzzy (neural) logic, quantum devices, and hybrids of any of the above device types. The underlying device technologies may be provided in a variety of component types, e.g., metal-oxide semiconductor field-effect transistor (“MOSFET”) technologies like complementary metal-oxide semiconductor (“CMOS”), bipolar technologies like emitter-coupled logic (“ECL”), polymer technologies (e.g., silicon-conjugated polymer and metal-conjugated polymer-metal structures), mixed analog and digital, and so on.

It should also be noted that the various logic and/or functions disclosed herein may be enabled using any number of combinations of hardware, firmware, and/or as data and/or instructions embodied in various machine-readable or computer-readable media, in terms of their behavioral, register transfer, logic component, and/or other characteristics. Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, non-volatile storage media in various forms (e.g., optical, magnetic or semiconductor storage media), though, again, such media do not include transitory media. Unless the context clearly requires otherwise, throughout the description, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in a sense of “including, but not limited to.” Words using the singular or plural number also include the plural or singular number respectively. Additionally, the words “herein,” “hereunder,” “above,” “below,” and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word “or” is used in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list, all of the items in the list and any combination of the items in the list.

Although certain presently preferred implementations of the invention have been specifically described herein, it will be apparent to those skilled in the art to which the invention pertains that variations and modifications of the various implementations shown and described herein may be made without departing from the spirit and scope of the invention. Accordingly, it is intended that the invention be limited only to the extent required by the applicable rules of law.

While the foregoing has been with reference to a particular embodiment of the disclosure, it will be appreciated by those skilled in the art that changes in this embodiment may be made without departing from the principles and spirit of the disclosure, the scope of which is defined by the appended claims.

Claims

1. A processing array device, comprising: a plurality of memory cells arranged in an array having a plurality of columns and a plurality of rows, each memory cell having a storage element wherein the array has a plurality of sections and each section has a plurality of bit line sections and a plurality of bit lines with one bit line per bit line section, wherein the memory cells in each bit line section are all connected to a single read bit line that generates a computation result and the plurality of bit lines in each section are distinct from the plurality of bit lines included in the other sections of the array; each bit line section having a read storage device that captures the computation result; each section having circuitry to logically combine the computation results captured by the bit line sections in the section; each bit line section having circuitry that stores the combined computation results in one or more memory cells in the bit line section; an RSP data line, connected to each bit line section, that communicates the combined computation result outside of the processing array device; and wherein each section has an RSP data line that communicates the computation results for the plurality of bit line sections in the section, and wherein each bit line section has circuitry that generates a logical OR on the RSP data line of the computation results captured in all of the read storage devices in each of the bit line sections in the section and each bit line section stores the combined computation result.
2. The processing array device of claim 1 further comprising one or more pull-down transistors, connected to the RSP data line, that are enabled during a default state of the RSP data line, wherein each bit line section has a pull-up transistor connected to the RSP data line that is disabled during the default state of the RSP line and wherein, when the RSP function is enabled, each pull-down transistor is disabled, the pull-up transistor associated with each bit line section is enabled if the computational result captured in the read storage device is a logical one so that the bit line section circuitry generates the logical OR.
3. The processing array device of claim 2, wherein each bit line section further comprises RSP logic having inputs from the read storage device of the bit line section, an RSP select control signal and an RSP end control signal and an output that drives the RSP data line and wherein each bit line section has a write data selection circuit that selects the data being written into the one or more memory cells of the bit line section wherein an input of the write data selection circuit is the RSP data line.
4. The processing array device of claim 3, wherein the write data selection circuit further comprises a multiplexer.
5. The processing array device of claim 3, wherein the RSP logic in each bit line section generates a RSP pull down enable control signal that is a logic zero from a rising edge of a RSP select control signal to a falling edge of the RSP end control signal, the RSP logic in each bit line section further comprises a latch having a data input that is the output of the read storage device, a clock input connected to the RSP select control signal, a reset input connected to a RSP pull down enable (RSP_PDen) control signal and a data output that is a RSP pull up enable (RSP_PUen) control signal.
6. The processing array device of claim 5, wherein the RSP logic in each bit line section further comprises a pull down transistor whose enable input is the RSP_PDen control signal and whose output is connected to RSP data line and a pull up transistor whose enable input is the RSP_PUen control signal and whose output is connected to the RSP data line.
7. The processing array device of claim 6, wherein the RSP logic in each bit line section further comprises a first buffer whose input is the RSP data line and whose output is an input to a write multiplexer in each bit line section and a second buffer whose input is the RSP data line and whose output is an output signal off of the processing array device.
8. The processing array device of claim 1 further comprising a plurality of RSP data lines wherein each RSP data line is connected to a larger number of bit line sections in the processing array device.
9. The processing array device of claim 8, wherein the plurality of RSP data lines further comprises a plurality of RSP data line stages wherein each RSP data line stage is connected to a subset of the bit line sections in each section, a first particular RSP data line stage that is connected to all of the bit line sections in a section and a second particular RSP data line that is connected to each of the bit lines sections across a plurality of sections.
10. The processing array device of claim 9 further comprising RSP outbound circuitry, connected to each of the plurality of RSP data line stages, that produces a logical OR, on each of the plurality of RSP data line stages, of the computation results captured in the read storage of the bit line sections.
11. The processing array device of claim 10 further comprising RSP outbound circuitry in each section that produces a logical OR on each successive outbound stage RSP data line.
12. The processing array device of claim 11 further comprising RSP inbound circuitry, connected to each of the plurality of RSP data line stages, that propagates the logical OR result, when the RSP outbound flow is halted, from a particular RSP data line stage to each previous RSP data line stage including the first RSP data line stage and stores the result in the bit line sections.
13. The processing array device of claim 12 further comprising RSP inbound circuitry in each section that enables a state of the second particular RSP data line to all first RSP data lines spanned by the second particular RSP data line.
14. The processing array device of claim 11, wherein the plurality of RSP data line stages further comprises a first RSP data line stage and a plurality of subsequent RSP data line stages.
15. The processing array device of claim 11, wherein the plurality of sections further comprises sixteen sections wherein each section further comprises 2048 bit lines and wherein the plurality of RSP data lines further comprises a plurality of first stage RSP data lines with each first stage RSP data line spanning a unique group of sixteen bit line sections in the section, a plurality of second stage RSP data lines with each second stage RSP data line spanning a unique group of sixteen first stage RSP data lines in the section, a third stage RSP data line that spans the plurality of second stage RSP data lines in the section and a fourth stage RSP data line for all sixteen sections that spans the third stage data lines for all sections in the processing array device.
16. The processing array device of claim 15, wherein each section has RSP control logic that generates control signals for the plurality of first stage RSP data lines, the plurality of second stage RSP data lines, a third stage RSP data line and a fourth stage RSP data line.
17. The processing array device of claim 16 further comprising a first stage RSP data line out logic whose input is the output of the bit line section read storage device and that is controlled by the RSP control logic control signals and whose output drives the first stage RSP data line, a second stage RSP data line out logic whose input is the first stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the second stage RSP data line, a third stage RSP data line out logic whose input is the second stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the third stage RSP data line, and a fourth stage RSP data line out logic whose input is the third stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the fourth stage RSP data line.
18. The processing array device of claim 17 further comprising a third stage RSP data line in logic whose input is the fourth stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the third stage RSP data line, a second stage RSP data line in logic whose input is the third stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the second stage RSP data line, and a first stage RSP data line in logic whose input is the second stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the first stage RSP data line.
19. The processing array device of claim 17, wherein the first stage RSP data line out logic in each bit line section further comprises a latch whose input is an output of the read storage for the bit line section and whose output is a pull up enable control signal, a pull down transistor connected to the first stage RSP data line and a pull up transistor connected to the first stage RSP data line, wherein the second stage RSP data line out logic in each section further comprises a latch whose input is the first stage RSP data line and whose output is a pull up enable control signal, a pull down transistor connected to the second stage RSP data line and a pull up transistor connected to the second stage RSP data line, wherein the third stage RSP data line out logic in each section further comprises a latch whose input is the second stage RSP data line and whose output is a pull up enable control signal, a pull down transistor connected to the third stage RSP data line and a pull up transistor connected to the third stage RSP data line and wherein the fourth stage RSP data line out logic in each section further comprises a latch whose input is the third stage RSP data line and whose output is a pull up enable control signal, a pull down transistor connected to the fourth stage RSP data line and a pull up transistor connected to the fourth stage RSP data line.
20. The processing array device of claim 19, wherein the third stage RSP data line in logic further comprises an AND gate that generates a third stage RSP in pull-up enable control signal equal to a logical AND of the fourth stage RSP data line and the RSPreturn control signal and a pull up transistor connected to the third stage RSP data line whose enable line is the third stage RSP in pull-up enable control signal, wherein the second stage RSP data line in logic further comprises an AND gate that generates a second stage RSP in pull-up enable control signal equal to a logical AND of the third stage RSP data line and the RSPreturn control signal and a pull up transistor connected to the second stage RSP data line whose enable line is the second stage RSP in pull-up enable control signal and wherein the first stage RSP data line in logic further comprises an AND gate that generates a first stage RSP in pull-up enable control signal equal to a logical AND of the second stage RSP data line and the RSPreturn control signal and a pull up transistor connected to the first stage RSP data line whose enable line is the first stage RSP in pull-up enable control signal.
21. The processing array device of claim 9, wherein each bit line section further comprises circuitry that stores a state of the RSP data line.
22. The processing array device of claim 21, wherein each section further comprises circuitry that transmits the RSP data line outside of the processing array.
23. A response results method for a processing array device, the method comprising: generating a computation result using a processing array device having a plurality of memory cells arranged in an array having a plurality of columns and a plurality of rows, each memory cell having a storage element wherein the array has a plurality of sections and each section has a plurality of bit line sections and a plurality of bit lines with one bit line per bit line section, wherein the memory cells in each bit line section are all connected to a single read bit line that generates the computation result and the plurality of bit lines in each section are distinct from the plurality of bit lines included in the other sections of the array; logically combining, in each section, the computation results captured by the bit line sections in the section; storing, in each bit line section, the logically combined computation results in one or more memory cells in the bit line section; communicating, by an RSP data line connected to each bit line section, the combined computation result outside of the processing array device; and communicating, by an RSP data line in each section, the computation results for the plurality of bit line sections in the section and generating, in each bit line section, a logical OR on the RSP data line of the computation results captured in all of the read storage devices in each of the bit line sections in the section and storing, in each bit line section, the combined computation result.
24. The method of claim 23 further comprising disabling, in each bit line section, a pull-up transistor connected to the RSP data line during the default state of the RSP line and enabling, when a RSP function is enabled, the pull-up transistor associated with each bit line section if the computational result captured in the read storage device is a logical one so that the bit line section circuitry generates the logical OR.
25. The method of claim 23 further comprising providing a plurality of RSP data lines wherein each RSP data line is connected to a larger number of bit line sections in the processing array device.
26. The method of claim 25, wherein the plurality of RSP data lines further comprises a plurality of RSP data line stages wherein each RSP data line stage is connected to a subset of the bit line sections in each section, a first particular RSP data line stage that is connected to all of the bit line sections in a section and a second particular RSP data line that is connected to each of the bit lines sections across a plurality of sections.
27. The method of claim 26 further comprising generating, by RSP outbound circuitry connected to each of the plurality of RSP data line stages, a logical OR, on each of the plurality of RSP data line stages, of the computation results captured in the read storage of the bit line sections.
28. The method of claim 27 further comprising producing, by RSP outbound circuitry in each section, a logical OR on each successive outbound stage RSP data line.
29. The method of claim 28 further comprising storing, in each bit line section, a state of the RSP data line.
30. The method of claim 29 further comprising transmitting, in each section, the RSP data line outside of the processing array.
31. The method of claim 27 further comprising propagating, at each of the plurality of RSP data line stages, the logical OR result, when the RSP outbound flow is halted, back to each prior RSP data line stage including the first RSP data line stage and storing the result in the bit line sections.
32. The method of claim 31 further comprising enabling, by RSP inbound circuitry in each section, a state of the second particular RSP data line to all first RSP data lines spanned by the second particular RSP data line.
33. The method of claim 26, wherein the plurality of RSP data line stages further comprises a first RSP data line stage and a plurality of subsequent RSP data line stages.
34. The method of claim 26, wherein the plurality of sections further comprises sixteen sections wherein each section further comprises 2048 bit lines and wherein the plurality of RSP data lines further comprises a plurality of first stage RSP data lines with each first stage RSP data line spanning a unique group of sixteen bit line sections in the section, a plurality of second stage RSP data lines with each second stage RSP data line spanning a unique group of sixteen first stage RSP data lines in the section, a third stage RSP data line that spans the plurality of second stage RSP data lines in the section and a fourth stage RSP data line for all sixteen sections that spans the third stage data lines for all sections in the processing array device.
35. The method of claim 34 further comprising providing a first stage RSP data line out logic whose input is the output of the bit line section read storage device and that is controlled by the RSP control logic control signals and whose output drives the first stage RSP data line, a second stage RSP data line out logic whose input is the first stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the second stage RSP data line, a third stage RSP data line out logic whose input is the second stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the third stage RSP data line, and a fourth stage RSP data line out logic whose input is the third stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the fourth stage RSP data line.
36. The method of claim 35 further comprising providing a third stage RSP data line in logic whose input is the fourth stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the third stage RSP data line, a second stage RSP data line in logic whose input is the third stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the second stage RSP data line, and a first stage RSP data line in logic whose input is the second stage RSP data line and that is controlled by the RSP control logic control signals and whose output drives the first stage RSP data line.
37. The method of claim 23 further comprising engaging a RSP function to communicate the combined computation result.
38. The method of claim 37 further comprising temporarily disabling, at a first stage of outbound RSP flow, each pull-down transistor in each first stage RSP data line and temporarily enabling each pull-up transistor in each first stage RSP data line if the computation result is “1”.
39. The method of claim 38 further comprising temporarily disabling, at each successive stage of outbound RSP flow, each pull-down transistor in each successive stage RSP data line and temporarily enabling each pull-up transistor in each successive stage RSP data line if the computation result is “1”.
40. The method of claim 37, wherein engaging the RSP function further comprises engaging each successive stage of an outbound RSP flow one or more clock cycles after a previous stage of the outbound RSP flow is engaged.
41. The method of claim 40, wherein engaging the RSP function further comprises engaging all stages of an inbound RSP flow simultaneously over one or more clock cycles to generate the RSP result.
42. The method of claim 23 further comprising halting the outbound RSP flow wherein the communication of the computation results from the first stage RSP data line to the successive stage RSP data lines is stopped.
43. The method of claim 42 further comprising engaging an inbound RSP flow when the outbound RSP flow is halted wherein the computation results are communicated from the successive stage RSP data lines to the first stage RSP data line.
44. The method of claim 43, wherein the plurality of RSP data line stages further comprises a last RSP data line stage and wherein engaging the inbound RSP flow further comprises temporarily enabling each inbound pull-up transistor in each successive stage RSP data line if the state of the last RSP data line stage is “1”, and temporarily enabling each inbound pull-up transistor in each RSP data line stage except the last RSP data line stage if a prior RSP data line stage is “1”.
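
The staged flow recited in claims 34 through 44 can be illustrated with a short behavioral model. The following Python sketch is not part of the claims and is not a circuit description; it assumes that each RSP data line combines the lines it spans as a logical OR (which is how the pull-up behavior of claims 38 and 39 reads), and that there is one bit line section per bit line, so that each section has 2048/16 = 128 first stage lines and 128/16 = 8 second stage lines. The function names `or_groups`, `rsp_outbound`, and `rsp_inbound` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List

SECTIONS = 16                 # sixteen sections (claim 34)
BIT_LINES_PER_SECTION = 2048  # bit lines per section (claim 34)
GROUP = 16                    # span of each first and second stage line (claim 34)


def or_groups(bits: List[int], group: int) -> List[int]:
    """Combine each consecutive group of inputs onto one line (assumed OR)."""
    return [int(any(bits[i:i + group])) for i in range(0, len(bits), group)]


def rsp_outbound(array: List[List[int]]) -> Dict[str, object]:
    """Outbound flow: per-bit-line results are combined stage by stage."""
    stage1 = [or_groups(section, GROUP) for section in array]  # 128 lines per section
    stage2 = [or_groups(lines, GROUP) for lines in stage1]     # 8 lines per section
    stage3 = [int(any(lines)) for lines in stage2]             # 1 line per section
    stage4 = int(any(stage3))                                  # 1 line for the device
    return {"stage1": stage1, "stage2": stage2, "stage3": stage3, "stage4": stage4}


def rsp_inbound(stages: Dict[str, object]) -> Dict[str, object]:
    """Inbound flow: a "1" on an upper stage pulls up every line it spans."""
    stage4 = stages["stage4"]
    stage3 = [line | stage4 for line in stages["stage3"]]
    stage2 = [[line | stage3[s] for line in lines]
              for s, lines in enumerate(stages["stage2"])]
    stage1 = [[line | stage2[s][i // GROUP] for i, line in enumerate(lines)]
              for s, lines in enumerate(stages["stage1"])]
    return {"stage1": stage1, "stage2": stage2, "stage3": stage3, "stage4": stage4}


# Example: a single matching bit line in section 3 sets the device-wide result.
array = [[0] * BIT_LINES_PER_SECTION for _ in range(SECTIONS)]
array[3][1000] = 1
outbound = rsp_outbound(array)
inbound = rsp_inbound(outbound)
```

In this model, the outbound pass leaves a “1” only on the lines that span the matching bit line, while the inbound pass drives the fourth stage state back to every first stage line. The per-stage clock cycle delays of claims 40 and 41 are not modeled.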

PRIORITY CLAIM/RELATED APPLICATIONS

This application is a continuation-in-part of and claims priority under 35 USC 120 to U.S. patent application Ser. No. 15/709,399, filed Sep. 19, 2017 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells For XOR And XNOR Computations”, U.S. patent application Ser. No. 15/709,401, filed Sep. 19, 2017 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells For XOR And XNOR Computations”, U.S. patent application Ser. No. 15/709,379, filed Sep. 19, 2017 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells”, U.S. patent application Ser. No. 15/709,382, filed Sep. 19, 2017 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells”, and U.S. patent application Ser. No. 15/709,385, filed Sep. 19, 2017 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells” that in turn claim priority under 35 USC 119(e) and 120 and claim the benefit of U.S. Provisional Patent Application No. 62/430,767, filed Dec. 6, 2016 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells For XOR And XNOR Computations” and U.S. Provisional Patent Application No. 62/430,762, filed Dec. 6, 2016 and entitled “Computational Dual Port SRAM Cell And Processing Array Device Using The Dual Port SRAM Cells”, the entirety of all of which are incorporated herein by reference.

Provisional Applications (2)

Number	Date	Country
62430767	Dec 2016	US
62430762	Dec 2016	US

Continuation in Parts (5)

Relation	Number	Date	Country
Parent	15709399	Sep 2017	US
Child	16152374		US
Parent	15709401	Sep 2017	US
Child	15709399		US
Parent	15709379	Sep 2017	US
Child	15709401		US
Parent	15709382	Sep 2017	US
Child	15709379		US
Parent	15709385	Sep 2017	US
Child	15709382		US
