Many search engines that search unstructured data, such as web pages and text documents, search through whole documents looking for words in a search string. Some search engines display the results in the form of a list, taking a snippet of the text and highlighting the search terms that were used in the search string.
Structured data may also be searched, and such searches may likewise return many results. However, it is often unclear why particular results were returned. Structured data is data that is stored in a structured manner, such as records having predetermined fields. A text-based search of structured data, such as a search string that does not specify fields, leaves the search application to determine how to apply the search string to the field contents. It can return too many results, as well as results that do not seem logical. In other words, it is difficult to tell why a returned record is related to the search string. Applying the search string to each original field, one after the other, can return too few results if the application expects a matching record's field to contain all words of the search string, or too many results if it expects a matching record's field to contain any word of the search string. Either way, the returned results may not appear logical to a user, and it is difficult to determine why some records are returned. There is a need for a better way to display search results for structured data.
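The too-few and too-many behavior described above can be illustrated with a minimal sketch. The records, field names, and function names below are hypothetical and serve only to show the two per-field matching policies; they are not taken from any particular search application.

```python
# Hypothetical catalog records; field names and values are illustrative only.
RECORDS = [
    {"title": "SAP WAS", "author": "Vella", "publisher": "Galileo Press"},
    {"title": "Press Releases", "author": "Smith", "publisher": "Acme"},
    {"title": "Maple Sap Guide", "author": "Jones", "publisher": "Forest Books"},
]


def field_words(value):
    """Split a field value into a set of lowercase words."""
    return set(value.lower().split())


def match_all_in_one_field(record, terms):
    """True if some single field contains every search term."""
    return any(terms <= field_words(v) for v in record.values())


def match_any_in_any_field(record, terms):
    """True if some field contains at least one search term."""
    return any(terms & field_words(v) for v in record.values())


def search(records, query, matcher):
    """Return the titles of the records accepted by the given matcher."""
    terms = set(query.lower().split())
    return [r["title"] for r in records if matcher(r, terms)]


too_few = search(RECORDS, "sap press", match_all_in_one_field)
too_many = search(RECORDS, "sap press", match_any_in_any_field)
```

With this sample data, the all-words policy finds nothing, because no single field holds both "sap" and "press", while the any-word policy returns all three records, including two that a user searching for "sap press" would probably not expect.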
A search string is received to be applied to structured data. A concatenated version of the structured data is obtained and the search string is applied to the concatenated version to obtain search results. A list of results is generated and consists of information, correlated with the search string, from different portions of the structured data.
In the following description, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific embodiments which may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural, logical and electrical changes may be made without departing from the scope of the present invention. The following description is, therefore, not to be taken in a limited sense, and the scope of the present invention is defined by the appended claims.
The functions or algorithms described herein are implemented in software or a combination of software and human implemented procedures in one embodiment. The software consists of computer executable instructions stored on computer readable media such as memory or other type of storage devices. The term “computer readable media” is also used to represent carrier waves on which the software is transmitted. Further, such functions correspond to modules, which are software, hardware, firmware or any combination thereof. Multiple functions are performed in one or more modules as desired, and the embodiments described are merely examples. The software is executed on a digital signal processor, ASIC, microprocessor, or other type of processor operating on a computer system, such as a personal computer, server or other computer system.
At 110, a search string is received. It may be received from a user using a user interface, or from a process or other entity. The search string is for application to structured data, such as a database having records with multiple fields. In one embodiment, the structured data consists of records relating to items in a catalog. The items may be products, materials, goods, services or other items that may be ordered. In further embodiments, the structured data may be related to information other than product catalogs, such as databases covering literature on various topics, or any other type of information that people may want to search.
At 120, a concatenated form of the structured data is obtained. The concatenated data consists of selected fields of each record placed end to end, or bundled, into a searchable string. While the fields need not be physically stored contiguously, they may be logically bundled in a manner that allows a search string to be applied across the fields of data, as represented at 130. In one embodiment, some fields may contain technical data for cross-references between records, and need not be included. Another name for the concatenated form of a record of the structured data is a generalized description of the record.
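One way the bundling at 120 and the search at 130 might look is sketched below. This is a minimal sketch under stated assumptions: the field names, the `xref_id` cross-reference field, and the all-terms matching rule are hypothetical choices made for illustration, not details given in the description.

```python
# Hypothetical list of fields selected for searching. "xref_id" stands in for
# a technical cross-reference field and is deliberately left out.
SEARCHABLE_FIELDS = ("title", "author", "publisher")


def generalized_description(record, fields=SEARCHABLE_FIELDS):
    """Place the selected, non-empty fields of a record end to end."""
    return " ".join(str(record[f]) for f in fields if record.get(f))


def matches(record, query):
    """True if every search term occurs as a word anywhere in the bundle."""
    words = generalized_description(record).lower().split()
    return all(term in words for term in query.lower().split())


record = {
    "title": "SAP WAS",
    "author": "Vella",
    "publisher": "Galileo Press",
    "xref_id": "press-0042",  # excluded from the searchable string
}

description = generalized_description(record)
found = matches(record, "sap press")
```

Because the terms are checked against the bundle rather than against one field at a time, "sap" in the title and "press" in the publisher together satisfy the query, while the excluded cross-reference field cannot produce a match.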
At 140, a list of search results is generated with information that correlates the search results with the search string. In other words, the context in which the information is relevant to the search string is included in the list of search results. The list may be displayed, printed, electronically distributed or stored in various embodiments.
A search string, “sap press”, may be entered by a user or otherwise received in order to search the database. The search string does not specify in which fields the terms “sap” and “press” occur. Since the record has been concatenated, or logically bundled, the first record 245 is found. “sap” occurs at 245 in the title column 225. “press” occurs at 250 in the publisher column 235. A returned list illustrated at 255 reads “SAP WAS by VELLA published by Galileo Press”. Both “SAP” and “Press” may be given an attribute to highlight them, such as underlining as shown, bolding, or any of many other attributes that may be applied to text.
Context information in one embodiment consists of portions of the generalized record that contain parts of the search string, as illustrated at 255. The context may be enhanced, as indicated first by the word “by” at 260. The enhancement refers to the author column 230. This enhanced context information could instead have read “Authored by” to clearly identify that this part of the context corresponds to the author column or field. The enhanced context information may be specified by the database, such as in a schema for the database. A second item of enhanced context information reads “published by” at 265, referring to the publisher column 235. Thus, in this simple list, the context in which the search result is obtained is clear, both via attributes of words corresponding to the search terms, and via the enhanced context information provided in the results.
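A result line like the one at 255 could be assembled roughly as follows. This is a sketch only: the `CONTEXT_LABELS` mapping is a hypothetical stand-in for enhanced context information supplied by a database schema, and surrounding a word with underscores is an assumed plain-text substitute for the underlining attribute. The sketch includes every labeled field, mirroring the example result, rather than only the fields that contain a search term.

```python
# Hypothetical schema information: field name -> enhanced context label that
# precedes the field value in a result line (an empty string means no label).
CONTEXT_LABELS = {"title": "", "author": "by", "publisher": "published by"}


def highlight(value, terms):
    """Mark each word that equals a search term, ignoring case."""
    return " ".join(
        f"_{word}_" if word.lower() in terms else word for word in value.split()
    )


def result_line(record, query, labels=CONTEXT_LABELS):
    """Build one result line with highlighted terms and context labels."""
    terms = set(query.lower().split())
    parts = []
    for field, label in labels.items():
        value = record.get(field)
        if not value:
            continue  # skip fields that are missing or empty
        text = highlight(value, terms)
        parts.append(f"{label} {text}" if label else text)
    return " ".join(parts)


record = {"title": "SAP WAS", "author": "VELLA", "publisher": "Galileo Press"}
line = result_line(record, "sap press")
# line == "_SAP_ WAS by VELLA published by Galileo _Press_"
```

Keeping the labels in a mapping separate from the records reflects the idea that the enhanced context may be specified once, at the schema level, and reused for every result.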
Search strings may be created in conformance with the capabilities of search engine 320. Typical search engines may perform simple searches for a string of text, or advanced searches such as those including logical connectors or proximity operators, and may support different query languages, such as structured query language and natural language queries. Many other types of queries may be handled by various search engines.
A block diagram of a computer system that executes programming for performing the above functions is shown in
Computer-readable instructions stored on a computer-readable medium are executable by the processing unit 502 of the computer 510. A hard drive, CD-ROM, and RAM are some examples of articles including a computer-readable medium. The term “computer readable medium” is also used to represent carrier waves on which the software is transmitted. For example, a computer program 525 capable of providing a generic technique to perform access control check for data access and/or for doing an operation on one of the servers in a component object model (COM) based system according to the teachings of the present invention may be included on a CD-ROM and loaded from the CD-ROM to a hard drive. The computer-readable instructions allow computer 510 to provide generic access controls in a COM based computer network system having multiple users and servers.
The Abstract is provided to comply with 37 C.F.R. § 1.72(b) to allow the reader to quickly ascertain the nature and gist of the technical disclosure. The Abstract is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims.