The present invention relates to the field of integrated circuits, and more particularly to a storage with in-situ string-searching capabilities.
Big data is a term for data sets that are so large or complex that conventional data-processing methods are inadequate to deal with them. The big-data philosophy encompasses unstructured, semi-structured and structured data; however, the main focus is on unstructured data. With high volume, high velocity and high variety, big-data analytics demands cost-effective and innovative forms of information processing.
An important aspect of big-data analytics is string searching. The basic string-searching operations are pattern matching and/or pattern recognition. Pattern matching and pattern recognition are the acts of searching a target pattern (i.e. the pattern to be searched) for the presence of the constituents or variants of a search pattern (i.e. the pattern used for searching). The match usually has to be “exact” for pattern matching, while it could be “likely to a certain degree” for pattern recognition. In the case of big-data analytics, the target pattern is the computer data, while the search pattern is a search string. Unless explicitly stated otherwise, the present invention does not differentiate pattern matching and pattern recognition; they are collectively referred to as pattern processing. In addition, search patterns and target patterns are collectively referred to as patterns.
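As an illustration of this distinction only (not an implementation of the preferred storage), the following minimal Python sketch contrasts an “exact” match with a “likely to a certain degree” match; the sliding window, the similarity ratio and the threshold value are assumptions chosen purely for clarity.

```python
import difflib

def pattern_match(target: str, search: str) -> bool:
    """Exact pattern matching: the search pattern must appear verbatim."""
    return search in target

def pattern_recognize(target: str, search: str, threshold: float = 0.8) -> bool:
    """Pattern recognition: a window of the target only needs to be
    'likely to a certain degree', scored here by a similarity ratio."""
    n = len(search)
    for i in range(max(1, len(target) - n + 1)):
        window = target[i:i + n]
        if difflib.SequenceMatcher(None, window, search).ratio() >= threshold:
            return True
    return False

print(pattern_match("big-data analytics", "data"))            # True  (exact)
print(pattern_recognize("big-data analytics", "anallytics"))  # True  (approximate)
print(pattern_match("big-data analytics", "anallytics"))      # False (no exact match)
```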
Big data has become big: its “size” ranges from a few dozen TBs to many PBs and is still growing. This makes it difficult for a conventional computer to perform big-data analytics. In a conventional computer based on the von Neumann architecture, the storage and the processor are separate. Because a conventional storage is “dumb”, i.e. without any analyzing capabilities per se, the data to be analyzed have to be read out from the storage first, which could take hours. Consequently, the von Neumann architecture is inefficient for big-data analytics. At present, big-data analytics generally requires tens, hundreds, or even thousands of servers.
It is a principal object of the present invention to improve the efficiency of big-data analytics.
It is a further object of the present invention to improve the string-searching speed for big data.
It is a further object of the present invention to provide a storage with in-situ string-searching capabilities at a reasonable cost.
In accordance with these and other objects of the present invention, the present invention discloses a storage with in-situ string-searching capabilities.
The present invention discloses a storage with in-situ string-searching capabilities. It is primarily a storage, with string searching as its secondary function. Compared with the prior art, the preferred storage is “smarter”, i.e. it has in-situ pattern-processing capabilities. To be more specific, the primary purpose of the preferred storage is to store data, while its secondary purpose is to search the stored data for at least a search string from an input.
The preferred storage comprises at least a three-dimensional memory (3D-M) die. The 3D-M die is a monolithic integrated circuit comprising a plurality of storage-processing units (SPU). Each SPU comprises a pattern-processing circuit and at least a 3D-M array. The 3D-M array stores computer data, while the pattern-processing circuit searches the computer data for the search string. The 3D-M array is stacked above the pattern-processing circuit and is communicatively coupled with the pattern-processing circuit through a plurality of contact vias. These contact vias are collectively referred to as the inter-storage-processor (ISP) connection. Because the two circuits are vertically stacked, this type of integration is referred to as three-dimensional (3-D) integration. With in-situ string-searching capabilities, the preferred 3D-M of the present invention is referred to as a 3D-MSS.
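The organization described above can be pictured with a behavioral sketch in Python; the class and method names are hypothetical, the data is modeled as plain bytes, and this is not the hardware implementation, only a software analogy of how each SPU pairs its own stored data with its own local search function.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SPU:
    """One storage-processing unit: a 3D-M array (modeled as bytes)
    plus its own pattern-processing function."""
    array: bytes = b""

    def search(self, search_string: bytes) -> bool:
        # The pattern processor scans only the data of its own 3D-M array.
        return search_string in self.array

@dataclass
class MSSDie:
    """A 3D-MSS die modeled as a collection of SPUs on one substrate."""
    spus: List[SPU] = field(default_factory=list)

    def search(self, search_string: bytes) -> List[int]:
        # Every SPU searches its own data; the indices of matching SPUs are reported.
        return [i for i, spu in enumerate(self.spus) if spu.search(search_string)]

die = MSSDie(spus=[SPU(b"alpha beta"), SPU(b"gamma delta"), SPU(b"beta gamma")])
print(die.search(b"beta"))   # [0, 2]
```

In the actual device, each SPU's search is carried out by its pattern-processing circuit rather than by software.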
The 3-D integration of the memory circuit (i.e. the 3D-M array) and the processing circuit (i.e. the pattern-processing circuit) offers many advantages. Although there is a growing trend to integrate a processing circuit into a memory circuit, the prior art uses two-dimensional (2-D) integration: the processing circuit and the memory circuit are formed side-by-side on the surface of a semiconductor substrate. With the 2-D integration, adding pattern-processing circuits to a memory die would increase the die size, which results in a higher die cost.
In contrast, with the 3-D integration, adding pattern-processing circuits to a 3D-M die does not increase the die size, because the pattern-processing circuits are formed under the 3D-M array. It should be noted that most of the substrate area can be used to form the pattern-processing circuits, since the peripheral circuits of the 3D-M array occupy only a small portion of the substrate area. Better yet, because the peripheral circuits of the 3D-M array need to be formed anyway, and the pattern-processing circuits are formed at the same time and can therefore be considered a byproduct of the peripheral circuits, integrating the pattern-processing circuits into the 3D-M die does not increase its overall manufacturing cost. For a given storage capacity, a “smart” 3D-MSS, which has string-searching capabilities, costs almost as much as a conventional “dumb” 3D-M, which is just a simple storage.
Besides the cost advantage, the 3-D integration provides better performance. With the 2-D integration, the connections between the memory circuits and the processing circuits are long (at least tens of microns) and few (tens to hundreds). In comparison, with the 3-D integration, the contact vias between the 3D-M arrays and the pattern-processing circuits are short (microns) and numerous (thousands). As a result, the ISP-connection in the preferred 3D-MSS has a large bandwidth.
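As a rough, software-only illustration of the bandwidth argument, the connection counts quoted above already suggest the scale of the advantage, assuming each connection or via carries a comparable per-line data rate (an assumption made here only so that the rate cancels out of the comparison).

```python
# Rough ratio implied by the counts quoted above; the per-line data rate is
# assumed equal on both sides so that it cancels out of the comparison.
connections_2d = [10, 100]   # "tens to hundreds" of connections (2-D integration)
vias_3d = 1000               # "thousands" of contact vias (3-D integration)

for n in connections_2d:
    print(f"{n:4d} 2-D connections vs {vias_3d} vias -> "
          f"ISP bandwidth advantage on the order of {vias_3d // n}x")
```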
Accordingly, the present invention discloses a storage with in-situ string-searching capabilities, comprising: an input for transferring at least a search string; a semiconductor substrate having transistors thereon; a plurality of storage-processing units (SPU) on said semiconductor substrate, each of said SPUs comprising at least a three-dimensional memory (3D-M) array for storing at least a computer data and a pattern-processing circuit for searching said computer data for said search string; wherein said pattern-processing circuit is formed on said semiconductor substrate; said 3D-M array is stacked above said pattern-processing circuit and communicatively coupled with said pattern-processing circuit by a plurality of contact vias.
As used herein, the term “permanent” is used in its broadest sense to mean any long-term storage; the term “communicatively coupled” is used in its broadest sense to mean any coupling whereby information may be passed from one element to another element; the symbol “/” refers to an “and” or “or” relationship, e.g. “text/code” means “text” only, “code” only, or both “text” and “code”; the term “data” is used in both singular and plural forms.
It should be noted that all the drawings are schematic and not drawn to scale. Relative dimensions and proportions of parts of the device structures in the figures have been exaggerated or reduced in size for the sake of clarity and convenience. The same reference symbols are generally used to refer to corresponding or similar features in the different embodiments.
Those of ordinary skill in the art will realize that the following description of the present invention is illustrative only and is not intended to be in any way limiting. Other embodiments of the invention will readily suggest themselves to such skilled persons from an examination of the within disclosure.
Referring now to
As used herein, a computer (or, a computer system) includes any device(s) with a processor and a memory. Such devices range from non-networked standalone devices as simple as calculators to networked computing devices such as “smart” devices, including smart-phones, televisions, and tiny devices that are part of the Internet of Things (IoT). The computer data could be a part of a document, a file, a message, a program, or the like. The search string could include a key word, a search word, a regular expression, or the like.
The pattern-processing circuit 180 performs pattern matching and/or pattern recognition. It may take many forms. In one example, since a portion of the search string can be represented by a string of characters, the pattern-processing circuit 180 may comprise a text-matching circuit or a code-matching circuit. The text/code-matching circuits could be implemented by a content-addressable memory (CAM) or a comparator including XOR circuits. In another example, since another portion of the search string can be represented by a regular expression, the pattern-processing circuit 180 can be implemented by finite-state automata (FSA) circuits, which include non-deterministic FSA (NFA) circuits or deterministic FSA (DFA) circuits. It should be noted that, besides string searching, the pattern-processing circuit 180 may perform other functions, e.g. filtering, sorting, malware-screening, etc.
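For illustration only, the two matching styles mentioned above can be sketched in software: an XOR-based comparison (a zero result means every bit agrees, mirroring a comparator built from XOR circuits) and a small table-driven DFA for the regular expression 'ab+c'. The function names and the example expression are hypothetical, and the circuit details are of course not captured by a software listing.

```python
def xor_match(stored_word: int, search_word: int) -> bool:
    """Comparator-style exact match: the XOR of two words is zero
    only when every bit position agrees."""
    return (stored_word ^ search_word) == 0

# A small deterministic finite-state automaton (DFA) for the regular
# expression 'ab+c': state -> {input character -> next state}; state 3 accepts.
DFA = {
    0: {'a': 1},
    1: {'b': 2},
    2: {'b': 2, 'c': 3},
    3: {},
}

def dfa_accepts(text: str) -> bool:
    state = 0
    for ch in text:
        state = DFA.get(state, {}).get(ch)   # follow the transition table
        if state is None:
            return False                     # no transition: reject
    return state == 3

print(xor_match(0b101101, 0b101101))   # True
print(dfa_accepts("abbbc"))            # True
print(dfa_accepts("ac"))               # False
```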
Referring now to
Based on the orientation of the memory cells, the 3D-M can be categorized into three-dimensional horizontal memory (3D-MH) and three-dimensional vertical memory (3D-MV). In a 3D-MH, all address lines are horizontal and the memory cells form a plurality of horizontal memory level(s). A well-known 3D-MH is 3D-XPoint. In a 3D-MV, at least one set of the address lines are vertical and the memory cells form a plurality of vertical memory strings. A well-known 3D-MV is 3D-NAND. In general, the 3D-MH (e.g. 3D-XPoint) is faster, while the 3D-MV (e.g. 3D-NAND) is denser.
The 3D-M suitable for computer storage is the three-dimensional writable memory (3D-W), whose cells are electrically programmable. Based on the number of times it can be programmed, a 3D-W can be further categorized into three-dimensional one-time-programmable memory (3D-OTP) and three-dimensional multiple-time-programmable memory (3D-MTP). Types of the 3D-MTP cell include the flash-memory cell, the memristor, the resistive random-access memory (RRAM or ReRAM) cell, the phase-change memory (PCM) cell, the programmable metallization cell (PMC), the conductive-bridging random-access memory (CBRAM) cell, and the like.
The 3D-W comprises a substrate circuit 0K formed on the substrate 0. A first memory level 16A is stacked above the substrate circuit 0K, with a second memory level 16B stacked above the first memory level 16A. The substrate circuit 0K includes the peripheral circuits of the memory levels 16A, 16B, as well as the pattern-processing circuits 180. It comprises transistors 0t and the associated interconnect 0M. Each of the memory levels (e.g. 16A, 16B) comprises a plurality of first address-lines (i.e. y-lines, e.g. 2a, 4a), a plurality of second address-lines (i.e. x-lines, e.g. 1a, 3a) and a plurality of 3D-W cells (e.g. 5aa). The first and second memory levels 16A, 16B are coupled to the substrate circuit 0K through contact vias 1av, 3av, respectively. Coupling the 3D-M array 170 and the pattern-processing circuit 180, the contact vias 1av, 3av are collectively referred to as the inter-storage-processor (ISP) connection 160.
In this preferred embodiment, the 3D-W cell 5aa comprises a programmable layer 12 and a diode layer 14. The programmable layer 12 could be an OTP layer (e.g. an antifuse layer, used for the 3D-OTP) or an MTP layer (e.g. a phase-change layer, used for the 3D-MTP). The diode layer 14 is broadly interpreted as any layer whose resistance at the read voltage is substantially lower than its resistance when the applied voltage has a magnitude smaller than, or a polarity opposite to, that of the read voltage. The diode could be a semiconductor diode (e.g. a p-i-n silicon diode) or a metal-oxide (e.g. TiO2) diode.
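As a behavioral sketch only (hypothetical names, not the bias scheme of any particular device), a cross-point memory level can be pictured as cells addressed by one x-line and one y-line; the level names below follow the text.

```python
# Each memory level is modeled as a mapping from an (x-line, y-line) pair to
# the bit stored in the cell at that crossing.
memory_levels = {
    "16A": {(1, 2): 1, (1, 4): 0, (3, 2): 1},
    "16B": {(1, 2): 0, (3, 4): 1},
}

def read_cell(level: str, x_line: int, y_line: int) -> int:
    # Selecting one x-line and one y-line singles out exactly one cell; the
    # diode layer keeps the unselected (under-biased or reverse-biased)
    # cells from conducting, so only the selected cell is read.
    return memory_levels[level].get((x_line, y_line), 0)

print(read_cell("16A", 1, 2))   # 1
print(read_cell("16B", 1, 2))   # 0
```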
Referring now to
Besides a smaller die size, the 3-D integration provides better performance. With the 2-D integration, the connections between the memory circuits and the processing circuits are long (at least tens of microns) and few (tens to hundreds). In comparison, with the 3-D integration, the contact vias 1av, 3av between the 3D-M arrays 170 and the pattern-processing circuits 180 are short (microns) and numerous (thousands). As a result, the ISP-connection 160 in the preferred 3D-MSS 200 has a large bandwidth.
Referring now to
The embodiment of
The embodiment of
In some embodiments of the present invention, the pattern-processing circuit 180 may perform partial pattern processing. For example, the pattern-processing circuit 180 only performs a preliminary pattern processing (e.g. a simple feature extraction and analysis). After being filtered by the preliminary pattern processing, the remaining patterns are sent to an external processor (e.g. a CPU or a GPU) to complete the full pattern processing. Because a majority of the patterns are filtered out by the preliminary pattern processing, the patterns output from the pattern-processing circuit 180 are far fewer than the patterns stored in the preferred storage. This alleviates the bandwidth requirement on the output bus 120.
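A pipeline sketch of this partial pattern processing, with hypothetical function names: the preliminary stage stands in for the simple in-storage feature check, and the full stage stands in for the work completed by the external processor.

```python
def preliminary_filter(pattern: bytes, search: bytes) -> bool:
    """Cheap in-storage check (stand-in for a simple feature extraction):
    does the pattern contain the first two bytes of the search string?"""
    return search[:2] in pattern

def full_process(pattern: bytes, search: bytes) -> bool:
    """Complete check, performed by the external processor on the survivors."""
    return search in pattern

stored_patterns = [b"alpha beta", b"gamma delta", b"beta gamma", b"epsilon"]
search = b"beta"

survivors = [p for p in stored_patterns if preliminary_filter(p, search)]
hits = [p for p in survivors if full_process(p, search)]

# Only the survivors cross the output bus, which eases its bandwidth requirement.
print(len(stored_patterns), "stored ->", len(survivors), "sent out ->", len(hits), "hits")
```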
One of the great benefits of the 3D-MSS is that the additional string-searching capabilities add little or no cost. With the 3-D integration, adding pattern-processing circuits 180 to a 3D-M die does not increase the die size, because the pattern-processing circuits 180 are formed under the 3D-M array 170. It should be noted that most of the area of the substrate 0 can be used to form the pattern-processing circuits 180, since the peripheral circuits (15, 17 . . . ) of the 3D-M array 170 occupy only a small portion of the substrate area. Better yet, because the peripheral circuits (15, 17 . . . ) of the 3D-M array 170 need to be formed anyway, and the pattern-processing circuits 180 are formed at the same time and can therefore be considered a byproduct of the peripheral circuits (15, 17 . . . ), integrating the pattern-processing circuits 180 into the 3D-M die does not increase its overall manufacturing cost. For a given storage capacity, a “smart” 3D-MSS, which has string-searching capabilities, costs almost as much as a conventional “dumb” 3D-M, which is just a simple storage.
Like a flash memory, the preferred 3D-MSS of the present invention can be used to form a storage card (e.g. an SD card, a TF card) with in-situ string-searching capabilities, or a solid-state drive (SSD) with in-situ string-searching capabilities. To be more specific, a plurality of the preferred 3D-MSS dice 200 can be vertically stacked and/or horizontally placed inside a package to form a storage card; and a plurality of storage cards can be placed together and electrically coupled to form an SSD. The preferred storage card and SSD can not only store computer data, but also perform string searching on the stored computer data in situ.
An amazing benefit of the preferred storage card and SSD with in-situ string-searching capabilities is that their string-searching time does not increase with the storage capacity. Because each SPU 100ij in each 3D-MSS die 200 has its own pattern-processing circuit 180, this pattern-processing circuit 180 only needs to process the computer data stored in the 3D-M array 170 of this SPU 100ij. As a result, no matter how large the capacity of the card/SSD is, the string-searching time for the whole card/SSD is similar to that of a single SPU 100ij. This is much faster than a conventional computer system, whose string-searching time increases linearly with the storage capacity.
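The scaling argument can be illustrated with a back-of-the-envelope model; the per-SPU array size and scan rate below are assumed, illustrative numbers, not characteristics of any actual device.

```python
# Assumed, illustrative parameters (not taken from the disclosure):
spu_array_size = 64 * 2**20   # bytes stored per SPU (64 MB assumed)
spu_scan_rate = 1 * 2**30     # bytes per second one pattern processor scans (1 GB/s assumed)

in_situ_time = spu_array_size / spu_scan_rate   # every SPU works in parallel on its own data

for capacity_tb in (1, 16, 256):
    n_spus = capacity_tb * 2**40 // spu_array_size
    conventional_time = capacity_tb * 2**40 / spu_scan_rate   # all data through one processor
    print(f"{capacity_tb:4d} TB ({n_spus:,} SPUs): "
          f"in-situ ~ {in_situ_time:.3f} s, conventional ~ {conventional_time:,.0f} s")
```

Under these assumptions, the in-situ search time stays constant while the conventional search time grows in proportion to the capacity, which is the behavior described above.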
While illustrative embodiments have been shown and described, it would be apparent to those skilled in the art that many more modifications than those mentioned above are possible without departing from the inventive concepts set forth herein. The invention, therefore, is not to be limited except in the spirit of the appended claims.
Number | Date | Country | Kind
---|---|---|---
201610127981.5 | Mar 2016 | CN | national
201710122861.0 | Mar 2017 | CN | national
201710130887.X | Mar 2017 | CN | national
201710461236.9 | Jun 2017 | CN | national
201710461243.9 | Jun 2017 | CN | national
This application is a continuation-in-part of the application “Distributed Pattern Processor Comprising Three-Dimensional Memory”, application Ser. No. 15/452,728, filed Mar. 7, 2017, which claims priority from Chinese Patent Application No. 201610127981.5, filed Mar. 7, 2016; Chinese Patent Application No. 201710122861.0, filed Mar. 3, 2017; and Chinese Patent Application No. 201710130887.X, filed Mar. 7, 2017, in the State Intellectual Property Office of the People's Republic of China (CN). This application also claims priority from Chinese Patent Application No. 201710461236.9, filed Jun. 18, 2017, and Chinese Patent Application No. 201710461243.9, filed Jun. 19, 2017, in the State Intellectual Property Office of the People's Republic of China (CN), the disclosures of which are incorporated herein by reference in their entireties.
 | Number | Date | Country
---|---|---|---
Parent | 15452728 | Mar 2017 | US
Child | 15784074 | | US