1. Field of the Invention
The present invention relates to string processing, and more particularly, to a string processing apparatus for processing a plurality of strings simultaneously.
2. Description of the Prior Art
String comparison is a frequently used function in string processing. For example, text searching, HTML/XML parsing, virus detection, and pattern matching all rely on string comparison. The efficiency of string comparison therefore greatly influences the overall performance of a string processing function. Conventional string comparison is implemented using byte-related instructions. That is, the system processes input strings one byte at a time.
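For illustration only, the byte-at-a-time approach described above might look like the following sketch. The function name `find_byte_conventional` and its return convention (index of the match, or -1 when nothing matches) are assumptions made for this example and are not taken from any particular prior-art system.

```python
def find_byte_conventional(data: bytes, target: int) -> int:
    """Scan data one byte per iteration; return the index of target, or -1."""
    for index in range(len(data)):
        # One load and one compare are issued for every single byte.
        if data[index] == target:
            return index
    return -1
```

Every byte costs a separate load and compare, which is the per-byte overhead the invention aims to reduce.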
Please refer to
In order to execute string comparison more efficiently, the present invention provides a string processing apparatus for processing a plurality of strings simultaneously.
In a first aspect of the present invention, an apparatus for string processing is disclosed. The apparatus includes: a first storage, storing a plurality of first predetermined strings; a second storage; a loading module, coupled to the first storage and the second storage, for loading the first predetermined strings from the first storage into the second storage; a comparing module, coupled to the second storage, for comparing a specific string with the first predetermined strings simultaneously, thereby generating a plurality of comparison results corresponding to the specific string; and a control logic, coupled to the comparing module, for generating a string processing result according to the comparison results.
In a second aspect of the present invention, an apparatus for string processing is provided. The apparatus includes: a first storage, storing a plurality of first predetermined strings; a second storage; a loading module, coupled to the first storage and the second storage, for loading the first predetermined strings from the first storage into the second storage; a comparing module, coupled to the second storage, for comparing the first predetermined strings with a plurality of second predetermined strings, respectively and simultaneously, to generate a plurality of comparison results corresponding to the second predetermined strings, respectively; and a control logic, coupled to the comparing module, for generating a string processing result according to the comparison results.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
Certain terms are used throughout the description and following claims to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following description and in the claims, the terms “include” and “comprise” are used in an open-ended fashion, and thus should be interpreted to mean “include, but not limited to . . . ”. Also, the term “couple” is intended to mean either an indirect or direct electrical connection. Accordingly, if one device is coupled to another device, that connection may be through a direct electrical connection, or through an indirect electrical connection via other devices and connections.
Please refer to
S201: Load one word (4 bytes) as first predetermined strings. Each byte in the loaded word serves as one first predetermined string.
S202: Compare each string of the first predetermined strings with an 8-bit string constant imm8 simultaneously, thereby generating a plurality of comparison results.
S203: Determine whether none of the first predetermined strings is identical to the 8-bit string constant imm8. If yes, go to step S204; otherwise, go to step S205.
S204: Set the string processing result to zero.
S205: Check if the first string found identical to the 8-bit string constant imm8 is the first string of the first predetermined strings according to a data endian of the first predetermined strings. If yes, go to step S206; otherwise, go to step S207.
S206: Set the string processing result to −4.
S207: Check if the first string found identical to the 8-bit string constant imm8 is the second string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S208; otherwise, go to step S209.
S208: Set the string processing result to −3.
S209: Check if the first string found identical to the 8-bit string constant imm8 is the third string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S210; otherwise, go to step S211.
S210: Set the string processing result to −2.
S211: Set the string processing result to −1.
In this embodiment, steps S205, S207, and S209 are configured to refer to the data endian of the first predetermined strings to find a first string from the first predetermined strings which is identical to the 8-bit string constant imm8.
Please note that the values set in steps S206, S208, S210, and S211 indicate the sequence number of the string which is first found identical to the 8-bit string constant imm8 according to the data endian of the first predetermined strings. For instance, assume that the data endian of the first predetermined strings is little and that, within the first predetermined strings, only the string with the second smallest sequence number and the string with the third smallest sequence number are identical to the 8-bit string constant imm8. In this case, where the number of the first predetermined strings is four, the string processing result is −3, which indicates that the sequence number of the first string found identical to the 8-bit string constant imm8 is 4−3=1, i.e., the string with the second smallest sequence number (since 0 indicates the smallest sequence number). In another case, the data endian of the first predetermined strings is big and, again, only the string with the second smallest sequence number and the string with the third smallest sequence number are identical to the 8-bit string constant imm8. In this case, where the number of the first predetermined strings is four, the string processing result is −2, which indicates that the sequence number of the first string found identical to the 8-bit string constant imm8 is 4−2=2, i.e., the string with the third smallest sequence number.
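The flow of steps S201 to S211 can be sketched in software as follows. This is a minimal model under stated assumptions, not the disclosed hardware: the name `find_first_match` and its parameters are hypothetical, the loaded word is represented as a 4-byte `bytes` object indexed by sequence number, and the big-endian search is assumed to start from the largest sequence number, which is inferred from the two cases described above.

```python
def find_first_match(ra: bytes, imm8: int, little_endian: bool = True) -> int:
    """Return 0 if no byte of ra equals imm8, else (sequence number - 4)."""
    if len(ra) != 4:
        raise ValueError("ra must hold exactly four bytes")
    # S202: the four comparisons are modeled as one list of results;
    # in the apparatus they are generated simultaneously.
    hits = [byte == imm8 for byte in ra]
    # S203/S204: no match at all gives a result of zero.
    if not any(hits):
        return 0
    # S205-S211: the data endian decides where the search starts
    # (assumption: big endian starts from sequence number 3).
    order = range(4) if little_endian else range(3, -1, -1)
    first = next(i for i in order if hits[i])
    return first - 4
```

With bytes 1 and 2 equal to imm8, for example `find_first_match(bytes([0x10, 0x2F, 0x2F, 0x30]), 0x2F)`, the little-endian call returns -3 and the same call with `little_endian=False` returns -2, matching the two cases in the text.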
Furthermore, the control logic 150 sets the string processing result to a value that indicates the sequence number of the string which is first found identical to the string constant imm8 according to the data endian of the word Ra. In this embodiment, the data endian of the word Ra is little for illustrative purposes. Assume that, within the word Ra, only the string with the second smallest sequence number (i.e., Ra1) and the string with the third smallest sequence number (i.e., Ra2) are identical to the string constant imm8. In this case, since there are four strings (i.e., Ra0-Ra3) in the word Ra, the control logic 150 sets the string processing result to −3, which indicates that the first string found identical to imm8 is 4−3=1, i.e., Ra1. On the other hand, when the data endian of the word Ra is big, the control logic 150 sets the string processing result to −2, which indicates that the first string found identical to imm8 is 4−2=2, i.e., Ra2.
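The arithmetic used in this example (4−3=1 and 4−2=2) can be captured in a small decoding helper. The name `decode_result` and the choice of returning `None` for a result of zero are assumptions made for this sketch; the description only defines the result values themselves.

```python
from typing import Optional


def decode_result(result: int, count: int = 4) -> Optional[int]:
    """Map a string processing result back to a sequence number.

    A result of 0 means no string matched; otherwise the result lies
    in the range -count..-1 and the sequence number is count + result.
    """
    if result == 0:
        return None
    if not -count <= result <= -1:
        raise ValueError("result out of range")
    return count + result
```

Here `decode_result(-3)` gives 1 (Ra1) and `decode_result(-2)` gives 2 (Ra2).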
The operation of the aforementioned apparatus and method for finding the first byte which is identical to a specific string constant can be briefly summarized using pseudo codes shown in
Please refer to
S301: Load one word (4 bytes) as first predetermined strings, and load one word (4 bytes) as second predetermined strings.
S302: Compare each of the first predetermined strings with the initial string of the second predetermined strings simultaneously, thereby generating comparison results.
S303: Determine whether none of the first predetermined strings is identical to the initial string of the second predetermined strings. If yes, go to step S304; otherwise, go to step S305.
S304: Set the string processing result to zero.
S305: Check if the first string found identical to the initial string of the second predetermined strings is the first string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S306; otherwise, go to step S307.
S306: Set the string processing result to −4.
S307: Check if the first string found identical to the initial string of the second predetermined strings is the second string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S308; otherwise, go to step S309.
S308: Set the string processing result to −3.
S309: Check if the first string found identical to the initial string of the second predetermined strings is the third string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S310; otherwise, go to step S311.
S310: Set the string processing result to −2.
S311: Set the string processing result to −1.
In this embodiment, steps S305, S307, and S309 are configured to refer to a data endian of the first predetermined strings to find a first string from the first predetermined strings which is identical to the initial string of the second predetermined strings. In addition, as a person skilled in the art can readily understand the principle of setting the string processing result after reading the above paragraphs directed to the first exemplary embodiment of the string processing method, further description is omitted here for the sake of brevity.
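Steps S301 to S311 can be modeled in the same spirit. The sketch below is illustrative only: `find_first_match_of_initial` is a hypothetical name, both words are represented as 4-byte `bytes` objects indexed by sequence number, the "initial string" is assumed to be the byte of the second word with the smallest sequence number (the description does not specify this), and the big-endian search is assumed to start from the largest sequence number.

```python
def find_first_match_of_initial(ra: bytes, rb: bytes,
                                little_endian: bool = True) -> int:
    """Return 0 if no byte of ra equals the initial byte of rb,
    else (sequence number - 4) of the first match found."""
    if len(ra) != 4 or len(rb) != 4:
        raise ValueError("ra and rb must each hold exactly four bytes")
    initial = rb[0]  # assumption: initial string = smallest sequence number
    # S302: four comparisons against the same initial byte.
    hits = [byte == initial for byte in ra]
    # S303/S304: nothing matched.
    if not any(hits):
        return 0
    # S305-S311: search order depends on the data endian (assumed as in
    # the first embodiment).
    order = range(4) if little_endian else range(3, -1, -1)
    first = next(i for i in order if hits[i])
    return first - 4
```

The only difference from the first embodiment is the source of the comparison value: it is taken from a second loaded word instead of an immediate constant.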
The operation of the aforementioned apparatus and method for finding the first byte which is identical to the initial byte of a loaded string can be briefly summarized using pseudo codes shown in
Please refer to
S401: Load one word (4 bytes) as first predetermined strings, and load one word (4 bytes) as second predetermined strings.
S402: Compare each of the first predetermined strings with each of the second predetermined strings, respectively and simultaneously, thereby generating comparison results.
S403: Determine whether all of the first predetermined strings are identical to a corresponding string of the second predetermined strings. If yes, go to step S404; otherwise go to step S405.
S404: Set the string processing result to zero.
S405: Check if the first string found not identical to the corresponding string of the second predetermined strings is the first string of the first predetermined strings according to a data endian of the first predetermined strings. If yes, go to step S406; otherwise go to step S407.
S406: Set the string processing result to −4 when the string processing is to find the first mismatch, and set the string processing result to −1 when the string processing is to find the last mismatch.
S407: Check if the first string found not identical to the corresponding string of the second predetermined strings is the second string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S408; otherwise go to step S409.
S408: Set the string processing result to −3 when the string processing is to find the first mismatch, and set the string processing result to −2 when the string processing is to find the last mismatch.
S409: Check if the first string found not identical to the corresponding string of the second predetermined strings is the third string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S410; otherwise go to step S411.
S410: Set the string processing result to −2 when the string processing is to find the first mismatch, and set the string processing result to −3 when the string processing is to find the last mismatch.
S411: Set the string processing result to −1 when the string processing is to find the first mismatch, and set the string processing result to −4 when the string processing is to find the last mismatch.
In this embodiment, steps S405, S407, and S409 are configured to refer to the data endian of the first predetermined strings to find a first/last string from the first predetermined strings which is not identical to the corresponding string of the second predetermined strings. In addition, as a person skilled in the art can readily understand the principle of setting the string processing result after reading the above paragraphs directed to the first exemplary embodiment of the string processing method, further description is omitted here for the sake of brevity.
In this embodiment, the data endians of the words Ra and Rb are both little. Assume only the strings with the second smallest sequence number (i.e., Ra1 and Rb1) and the strings with the third smallest sequence number (i.e., Ra2 and Rb2) are mismatched. In this case, when the objective is to find the first mismatch, since there are four strings (i.e., Ra0-Ra3 and Rb0-Rb3) in the words Ra and Rb, respectively, the control logic 350 sets the string processing result to −3, which indicates that the first mismatch is 4−3=1, i.e., Ra1 and Rb1. However, when the objective is to find the last mismatch, the control logic 350 sets the string processing result to −2, which indicates that the last mismatch is 4−2=2, i.e., Ra2 and Rb2.
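The first-mismatch and last-mismatch behavior in this example can be sketched as follows. The name `find_mismatch` and its parameters are hypothetical, the words are modeled as 4-byte `bytes` objects indexed by sequence number, and the handling of big-endian data (searching in the opposite direction) is an assumption carried over from the first embodiment, since this example only covers the little-endian case.

```python
def find_mismatch(ra: bytes, rb: bytes, find_last: bool = False,
                  little_endian: bool = True) -> int:
    """Return 0 if ra and rb are identical, else (sequence number - 4)
    of the first or last mismatching byte pair."""
    if len(ra) != 4 or len(rb) != 4:
        raise ValueError("ra and rb must each hold exactly four bytes")
    # S402: pairwise comparisons, modeled as one list of mismatch flags.
    diffs = [a != b for a, b in zip(ra, rb)]
    # S403/S404: all pairs identical.
    if not any(diffs):
        return 0
    # Search order: endian-dependent, reversed when the last mismatch
    # is wanted (big-endian direction is an assumption).
    order = list(range(4)) if little_endian else list(range(3, -1, -1))
    if find_last:
        order.reverse()
    index = next(i for i in order if diffs[i])
    return index - 4
```

For `ra = b"abcd"` and `rb = b"aXYd"`, where pairs 1 and 2 differ, the call returns -3 for the first mismatch and -2 with `find_last=True`, as in the example above.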
The operations of the aforementioned apparatus and method for finding the first and last mismatch between two words can be briefly summarized using pseudo codes shown in
Please refer to
S501: Load one word (4 bytes) as first predetermined strings.
S502: Compare each of the first predetermined strings with each of second predetermined strings, respectively and simultaneously, thereby generating comparison results. The second predetermined strings include one word (4 bytes) and a string constant, zero.
S503: Determine whether none of the first predetermined strings is identical to zero and all of the first predetermined strings are identical to corresponding strings of the second predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S504; otherwise, go to step S505.
S504: Set the string processing result to zero.
S505: Check if the first string found identical to zero or found not identical to the corresponding string of the second predetermined strings is the first string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S506; otherwise go to step S507.
S506: Set the string processing result to −4.
S507: Check if the first string found identical to zero or found not identical to the corresponding string of the second predetermined strings is the second string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S508; otherwise go to step S509.
S508: Set the string processing result to −3.
S509: Check if the first string found identical to zero or found not identical to the corresponding string of the second predetermined strings is the third string of the first predetermined strings according to the data endian of the first predetermined strings. If yes, go to step S510; otherwise go to step S511.
S510: Set the string processing result to −2.
S511: Set the string processing result to −1.
In this embodiment, steps S505, S507, and S509 are configured to refer to a data endian of the first predetermined strings to find a first string from the first predetermined strings which is identical to zero or is not identical to the corresponding string of the second predetermined strings. In addition, as a person skilled in the art can readily understand the principle of setting the string processing result after reading the above paragraphs directed to the first exemplary embodiment of the string processing method, further description is omitted here for the sake of brevity.
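Steps S501 to S511 combine a zero check with a pairwise comparison, which is the kind of test a word-at-a-time comparison of zero-terminated strings needs. The sketch below is a hedged software model: `find_zero_or_mismatch` is a hypothetical name, the words are 4-byte `bytes` objects indexed by sequence number, and the big-endian search direction is assumed as in the first embodiment.

```python
def find_zero_or_mismatch(ra: bytes, rb: bytes,
                          little_endian: bool = True) -> int:
    """Return 0 if no byte of ra is zero and every byte equals its
    counterpart in rb, else (sequence number - 4) of the first byte of
    ra that is zero or differs from rb."""
    if len(ra) != 4 or len(rb) != 4:
        raise ValueError("ra and rb must each hold exactly four bytes")
    # S502: each byte of ra is compared with zero and with its
    # counterpart in rb; a flag is set when either condition fires.
    flags = [a == 0 or a != b for a, b in zip(ra, rb)]
    # S503/S504: nothing to report.
    if not any(flags):
        return 0
    # S505-S511: search order depends on the data endian (assumption
    # for big endian: start from sequence number 3).
    order = range(4) if little_endian else range(3, -1, -1)
    first = next(i for i in order if flags[i])
    return first - 4
```

A result of zero tells a caller that the whole word matched and contained no terminator, so the next word can be loaded; any other result identifies the byte at which the comparison must stop.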
It should be noted that the aforementioned examples are for illustrative purposes only, and are not meant to be limitations of the application. For example, those skilled in the pertinent art should readily comprehend that the final string processing result can be set to any convenient number, e.g. 1-4, depending on the design requirements. Further description of these alternative designs obeying the spirit of the present invention is therefore omitted here for the sake of brevity.
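As one possible illustration of such an alternative design, the negative results used in the embodiments could be re-encoded into the range 1-4. The mapping below is purely an assumption for this sketch (1 for the smallest sequence number, 0 kept for "nothing found"); the description states only that other convenient numbers may be used, and `to_one_based` is a hypothetical name.

```python
def to_one_based(result: int, count: int = 4) -> int:
    """Re-encode a result of 0 or -count..-1 as 0 or 1..count."""
    if result == 0:
        return 0  # nothing found stays 0 (assumed convention)
    if not -count <= result <= -1:
        raise ValueError("result out of range")
    # -count maps to 1 (smallest sequence number), -1 maps to count.
    return result + count + 1
```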
In summary, in accordance with the present invention, a string processing method and apparatus process a plurality of bytes (strings) within a word, simultaneously and respectively. In this way, each string is processed more efficiently. Please refer to
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention.