Various embodiments of the invention relate generally to search engines and particularly to search engines employed in storage devices.
The digital information explosion continues to rapidly increase the amount of published information or data available to users, along with the effects of this abundance of information. As the amount of data grows, so does the challenge of finding useful information on network devices. Though search engine technology has improved manyfold, search efficiency remains the bottleneck for search engines. Information is generally stored in many network devices, such as servers. Finding relevant information among thousands of storage/network devices is currently inefficient. One might wonder why these devices do not search the information internally with embedded search engines and report the result to a system. Even if the search results fed back from these devices were not fully accurate, this approach would help to improve the search efficiency of the system.
The days of searching on a single server are long gone due to the limitation of computer speeds. Such a solution is implemented by software rather than by hardware search engines. It has been replaced with distributed storage using multiple hardware engines that share search tasks and distribute them among storage channels, an approach that is now commonplace. However, maintaining the many storage/network devices is very costly, with the added cost of high power consumption.
Accordingly, there is a need for less costly and low power-consuming search engine storage devices.
Briefly, a self-search storage device includes a data buffer coupled between a host and a data storage medium and configured to multi-function by receiving search configure information from the host, caching data from/to the data storage medium, and storing the search result information in the host. The self-search storage device further includes a data compare engine coupled to the data buffer and the data storage medium and including more than one data search unit. The data compare engine is configured to receive a data stream from/to the data storage medium and is operable to employ the more than one data search unit to compare the data to a keyword, each data search unit being configured with a different keyword such that more than one keyword is searched concurrently. The data compare engine is further operable to report the outcome of the comparison for use by the host.
A further understanding of the nature and the advantages of particular embodiments disclosed herein may be realized by reference to the remaining portions of the specification and the attached drawings.
Particular embodiments and methods of the invention disclose a self-search storage device that includes a data buffer coupled between a host and a data storage medium and configured to multi-function by receiving search configure information from the host, caching data from/to the data storage medium, and storing the search result information in the host. The self-search storage device further includes a data compare engine coupled to the data buffer and the data storage medium and including more than one data search unit. The data compare engine is configured to receive a data stream from/to the data storage medium and is operable to employ the more than one data search unit to compare the data to a keyword, each data search unit being configured with a different keyword such that more than one keyword is searched at the same time. The data compare engine is further operable to report the outcome of the comparison for use by the host.
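By way of illustration only, the behavior of a data compare engine having several data search units may be modeled in software as follows. This is a minimal sketch and not the disclosed hardware: the class names `DataSearchUnit` and `DataCompareEngine`, the `feed` and `report` methods, and the use of byte offsets as the reported outcome are all assumptions introduced for this example. Each unit is configured with one keyword and examines every chunk of the data stream, retaining a short tail of the previous chunk so that a keyword split across two chunks is still detected.

```python
from dataclasses import dataclass, field


@dataclass
class DataSearchUnit:
    """Hypothetical model of one data search unit holding a single keyword."""
    keyword: bytes
    hits: list = field(default_factory=list)  # stream offsets of matches
    _tail: bytes = b""                        # last len(keyword)-1 bytes seen
    _consumed: int = 0                        # total bytes seen so far

    def __post_init__(self):
        if not self.keyword:
            raise ValueError("keyword must not be empty")

    def feed(self, chunk: bytes) -> None:
        # Prepend the retained tail so matches spanning chunks are found.
        window = self._tail + chunk
        base = self._consumed - len(self._tail)  # stream offset of window[0]
        start = 0
        while True:
            idx = window.find(self.keyword, start)
            if idx < 0:
                break
            self.hits.append(base + idx)
            start = idx + 1
        self._consumed += len(chunk)
        keep = len(self.keyword) - 1
        # The tail is shorter than the keyword, so no match is counted twice.
        self._tail = window[-keep:] if keep > 0 else b""


class DataCompareEngine:
    """Hypothetical model: one search unit per keyword, all fed the same stream."""

    def __init__(self, keywords):
        self.units = [DataSearchUnit(k) for k in keywords]

    def feed(self, chunk: bytes) -> None:
        # In hardware the units would operate in parallel; here they are
        # simply visited in turn for each chunk of the stream.
        for unit in self.units:
            unit.feed(chunk)

    def report(self) -> dict:
        return {unit.keyword: list(unit.hits) for unit in self.units}
```

For example, feeding the chunks `b"hard di"`, `b"sk and fla"`, and `b"sh disk"` to an engine configured with the keywords `b"disk"` and `b"flash"` yields the report `{b"disk": [5, 20], b"flash": [14]}`, even though both keywords straddle chunk boundaries.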
Referring now to
In operation, the host 1 issues a search command with a list of keywords to the controller 21 which distributes the keywords to multiple data search units (shown in
The self-search storage device controller 21 includes a data compare engine that is capable of searching for more than one keyword in a data stream. The self-search storage device 2 is further capable of monitoring the data stream from the host 1 to the data storage medium 22, monitoring the data stream from the data storage medium 22 to the host 1, automatically reading the data from the data storage medium 22 to the data buffer while concurrently searching for keywords, or monitoring any other kind of data streaming into or out of the self-search storage device controller 21.
The self-search storage device controller 21 is self-contained in terms of searching and in this respect, it is self-searching.
In some embodiments of the invention, the host 1 may be a desktop/notebook computer, a server system, a mobile computing device, or any other suitable device capable of accessing the storage medium 22. The self-search storage device controller 21 may be a hard disk, a Solid State Device (SSD), a Personal Computer Memory Card International Association (PCMCIA) card, a Secure Digital Memory Card (SD)/MultiMediaCard (MMC) card, a universal serial bus (USB) disk, a micro-SD card, an Embedded MultiMediaCard (EMMC) chip, a compact disk (CD), a Digital Video Disc (DVD), or any other device for non-volatile data storage. The data storage medium 22 may be flash, magnetic storage medium, magnetic tape, or any non-volatile memory.
The host side controller 213 is shown coupled to the host 1 and the data buffer 214, the latter coupling being through the data bus0 217. The data buffer 214 is further shown coupled to the main controller 211 and the data storage controller 215 with the latter coupling being through the data bus1 216. The data storage controller 215 is additionally shown coupled to the data storage medium 22. Through the data bus1 216, the data storage controller 215 and the data buffer 214 are shown coupled to the data compare engine 212. The data compare engine 212 is shown coupled to the main controller 211, which is shown coupled to the data buffer 214. The main controller 211 is shown to generate a control signal 218 to the host side controller 213.
The main controller 211 communicates commands and data to the host 1 through the host side controller 213. Further, the main controller 211 accesses and manages the data storage medium 22 using the data storage controller 215 and/or configures keywords for the data compare engine 212.
The data compare engine 212 includes several data search units used to perform real-time keyword data searching with the result of the search being coupled onto the data bus1 216. The host side controller 213 handles the protocol between the host 1 and the data storage medium 22. Examples of such protocols are, without limitation, Serial Advanced Technology Attachment (SATA), Peripheral Component Interconnect (PCI), PCI Express (PCIE), Serial Attached SCSI (SAS), IEEE 1394, USB, SD, MMC, or any other suitable protocol for data exchange.
The data buffer 214 has a multi-function capability. For example, it caches data when the host writes data to the data storage medium or reads data from the data storage medium. It further temporarily stores search configure information data from the host or the search result information that is ultimately sent to the host and in this respect, may be employed by the main controller 211 as a data cache. The data storage controller 215 manages accesses to the data storage medium 22.
In operation, the host side controller 213 receives a command with a list of keywords and a search range (collectively referred to herein as “search information”) from the host 1 and communicates the same to the data buffer 214 through the data bus0 217. The main controller 211 fetches the search information from the data buffer 214 and communicates the keywords to the data compare engine 212 for comparison of the keywords with data that is in the data storage medium 22. The main controller 211 controls the data storage controller 215 to start to read according to the search range communicated by the host 1. The data storage controller 215 receives data from the data storage medium 22 and communicates the same to the data buffer 214 through the data bus1 216. The data that is in the data storage medium 22 is received by the data storage controller 215 and passed on to the data compare engine 212. The data compare engine 212 compares the data to a keyword and reports the result to the main controller 211. The main controller 211 stores the result in the data buffer 214 and reports the result of the comparison back to the host 1 when the search is done. Additional commands, such as write and read commands described in
The control signal 218 indicates that the main controller 211 can control the host side controller 213 to start a command or data transfer. An example of the main controller 211 is a central processing unit that is available to control other modules.
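The search operation described above, in which search information is received from the host, the data storage medium is read over the search range, and the result is stored and reported, may be sketched in software as follows. This is an illustrative model under stated assumptions and not the disclosed controller: the names `SearchInformation`, `SelfSearchDevice`, `host_command`, and the fields `first_block` and `block_count` are hypothetical, the medium is modeled as a list of byte blocks, and, for brevity, a keyword that spans two blocks is not detected.

```python
from dataclasses import dataclass


@dataclass
class SearchInformation:
    """Hypothetical form of the 'search information': keywords plus a range."""
    keywords: list
    first_block: int
    block_count: int


class SelfSearchDevice:
    """Hypothetical software model of the search flow inside the device."""

    def __init__(self, blocks):
        self.medium = list(blocks)  # stands in for the data storage medium
        self.data_buffer = {}       # stands in for the multi-function buffer

    def host_command(self, info: SearchInformation):
        # The host side controller places the search information in the buffer.
        self.data_buffer["search_information"] = info
        return self._run_search()

    def _run_search(self):
        # The main controller fetches the search information from the buffer.
        info = self.data_buffer["search_information"]
        end = min(info.first_block + info.block_count, len(self.medium))
        results = []
        # The storage controller reads each block in the search range and the
        # compare step checks it against every configured keyword.
        for address in range(info.first_block, end):
            data = self.medium[address]
            for keyword in info.keywords:
                if keyword in data:
                    results.append((address, keyword))
        # The result is stored in the buffer, then reported to the host.
        self.data_buffer["search_result"] = results
        return results
```

With the blocks `[b"alpha beta", b"gamma", b"beta gamma", b"delta"]` and search information naming the keywords `b"beta"` and `b"gamma"` over two blocks starting at block 1, the reported result is `[(1, b"gamma"), (2, b"beta"), (2, b"gamma")]`; block 0 lies outside the search range and is never read.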
If at 614, it is determined that the entire block has been processed, at step 618, the search is considered done. Next, at step 620, the storage device 2 sends the host 1 the search result. Next, at step 622, the host 1 reads, from the corresponding address and through the controller 21, the data that contains the keyword the host 1 sent at step 604, according to the search result. That is, at step 604, the keyword information is sent and at step 622, the host knows the address of the data that contains the keyword. Accordingly, the host need not read all of the data from the storage medium, move it to the computer memory, and then search it using the CPU. Instead, the host simply sends the keyword to the storage device, which automatically searches for the keyword(s) and reports the result of the search by reporting the location of the data that includes the keyword. Thus, the time that is required to transfer the data to the host is saved and, because the search is performed by dedicated hardware and in real-time, the CPU search time is eliminated and the host CPU tasks are reduced. The process ends at 624.
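The saving described above may be illustrated from the host's point of view. The following is a minimal sketch, assuming that the search result arrives as a list of (address, keyword) pairs and that the host has some function for reading one block by address; the names `addresses_to_read`, `host_fetch`, and `read_block` are hypothetical and are not part of the disclosure.

```python
def addresses_to_read(search_result):
    """Return each reported address once, in ascending order."""
    return sorted({address for address, _keyword in search_result})


def host_fetch(search_result, read_block):
    """Read only the blocks that the device reported as containing a keyword.

    read_block is a caller-supplied function taking an address and returning
    the data at that address (an assumption made for this sketch).
    """
    return {address: read_block(address)
            for address in addresses_to_read(search_result)}
```

In this sketch, a result that names two of four blocks causes only those two blocks to be transferred to the host, rather than all four.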
If at 714, it is determined that the write operation is completed, the process continues to step 716 and if not, the process repeats starting from step 708 until the write operation is complete. At step 716, the host 1 receives the search result and at 718, the process ends. In summary, the process of
If at 814, it is determined that the read operation is completed, the process continues to step 816 and if not, the process repeats starting from step 808 until the read operation is complete. At step 816, the host 1 is sent the search result and at 818, the process ends.
One of the differences between the process of
Although the description has been described with respect to particular embodiments thereof, these particular embodiments are merely illustrative, and not restrictive.
As used in the description herein and throughout the claims that follow, “a”, “an”, and “the” include plural references unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise.
Thus, while particular embodiments have been described herein, latitudes of modification, various changes, and substitutions are intended in the foregoing disclosures, and it will be appreciated that in some instances some features of particular embodiments will be employed without a corresponding use of other features without departing from the scope and spirit as set forth. Therefore, many modifications may be made to adapt a particular situation or material to the essential scope and spirit.
What we claim is: