This invention relates generally to tree-based searching of a knowledge base, and more specifically, to the use of local memory to improve the speed of the searching process.
Many different applications require a search of a large database of elements, also referred to as a knowledge base, to locate a match with a given search object. In certain applications, the search object and the elements of the knowledge base each comprise a string of binary bits. An example of such an application is a bridge in a communications system.
A typical communication system includes a number of devices or nodes that communicate over a plurality of connections. The system is organized into a plurality of local connections with a limited number of nodes associated with (connected to) each local connection. A network of bridges interconnects the local connections so that each device can communicate with other devices not associated with the same local connection. The bridge for each local connection monitors input traffic from other bridges in the network to determine if traffic originating at another bridge is addressed to a node connected to it locally. In response, the bridge provides a path that allows the information to pass through to the local connection. Similarly, when information is sourced from the local connection to an external destination node, the bridge allows the information to pass from the local connection to the next bridge on the path to the destination node.
Typically, the information carried between nodes is in the form of packets of binary bits that travel from the source node to the destination node across the system. A packet typically includes bits identifying the addresses of the packet's source node and the destination node. In one addressing protocol, the address portion of the packet is 48 bits long, with the remainder of the packet comprising payload information bits.
In certain systems, a bridge monitors both internally generated traffic (i.e., traffic sourced at nodes connected directly to the bridge) and externally generated traffic (i.e., traffic sourced at nodes external to the bridge) that is broadcast to all bridges of the network. For example, information broadcast over a local area network may not be intended for all network nodes, but is monitored by each network bridge to determine whether any of the intended destination nodes are connected to the bridge. This analysis is performed by maintaining, at each bridge, a knowledge base with an entry for each of the nodes on the bridge's local connection. Thus, the bridge receives externally sourced packets and searches its knowledge base to determine whether the 48-bit destination address matches any of the node addresses located on its local connection. The destination address (i.e., the search object) could have any one of 2^48, or about 281 trillion, possible values. However, the number of entries in the bridge's knowledge base equals only the number of nodes connected locally to it, and therefore will be significantly less than 281 trillion.
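The sparsity described above can be illustrated with a minimal sketch. It assumes addresses are written as six colon-separated hexadecimal octets, and it uses Python's built-in set in place of the tree search discussed later; the names `Bridge`, `parse_address` and `is_local` are hypothetical and do not come from the original text.

```python
# Illustrative sketch only: a bridge's knowledge base holding a handful of
# 48-bit addresses out of the full 2^48 address space.
ADDRESS_BITS = 48
ADDRESS_SPACE = 2 ** ADDRESS_BITS  # number of possible 48-bit addresses


def parse_address(text):
    """Convert 'aa:bb:cc:dd:ee:ff' (assumed notation) into a 48-bit integer."""
    octets = text.split(":")
    if len(octets) != 6:
        raise ValueError("expected six colon-separated octets")
    value = 0
    for octet in octets:
        byte = int(octet, 16)
        if not 0 <= byte <= 0xFF:
            raise ValueError("octet out of range: " + octet)
        value = (value << 8) | byte
    return value


class Bridge:
    """Hypothetical bridge that knows only its locally connected nodes."""

    def __init__(self, local_addresses):
        # One knowledge-base entry per node on the local connection.
        self.knowledge_base = {parse_address(a) for a in local_addresses}

    def is_local(self, destination):
        """Return True if the destination address belongs to a local node."""
        return parse_address(destination) in self.knowledge_base


bridge = Bridge(["00:1a:2b:3c:4d:5e", "00:1a:2b:3c:4d:5f"])
local = bridge.is_local("00:1a:2b:3c:4d:5e")
fraction_used = len(bridge.knowledge_base) / ADDRESS_SPACE
```

Here `fraction_used` is the share of the address space that the knowledge base actually holds, which is why the search structure only needs to scale with the number of local nodes.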
Searching a knowledge base to determine a match to a given search object is an important requirement for many different applications. For example, the following applications rely heavily on the performance of speedy searches: database retrieval; expert systems; robotic and state control strategy; signal recognition, including for example speech and image recognition; communications, including for example data compression and protocol processing for bridging, routing and switching applications; natural language cognitive systems; modeling operations; parsers; and compilers.
One important attribute of any searching scheme is the worst case time required to complete a search. Generally, searching schemes are implemented in a plurality of steps or cycles that each take a predetermined amount of time to complete. Thus, the maximum time to complete a search is generally reduced by minimizing the time spent at each step of the search.
A data network classification engine typically utilizes a tree search process to determine various characteristics associated with each data packet or data block that enters the network device, i.e., to classify the input data according to one or more data attributes. Since the data is conventionally presented in the form of binary bits, the classification engine compares groups of the input bits with known bit patterns, represented by entries in the tree structure. A match between the group of input bits and the bits at a tree entry directs the process to the next sequential entry in the tree. The matching processes progress through each entry of the tree until the end is reached, at which point the input bits have been characterized. Because a large number of bits must be classified in a data network, these trees can require many megabits of memory storage capacity.
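The matching process described above can be sketched as a walk through a tree keyed by fixed-size groups of input bits. This is a simplified illustration, not the engine's actual data structure: the names `TreeEntry`, `add_pattern` and `classify` are hypothetical, bit strings are represented as Python strings of '0' and '1', and the stored patterns are assumed to be prefix-free (no pattern is a prefix of another).

```python
class TreeEntry:
    """One tree entry: maps a group of input bits to the next entry."""

    def __init__(self):
        self.children = {}   # bit-group string -> TreeEntry
        self.result = None   # classification stored at the end of a path


def add_pattern(root, bits, group_size, result):
    """Store a known bit pattern, one group of bits per tree level."""
    if len(bits) % group_size != 0:
        raise ValueError("pattern length must be a multiple of group_size")
    entry = root
    for start in range(0, len(bits), group_size):
        group = bits[start:start + group_size]
        entry = entry.children.setdefault(group, TreeEntry())
    entry.result = result


def classify(root, bits, group_size):
    """Compare successive bit groups with tree entries until the end is reached."""
    entry = root
    position = 0
    while entry.children:
        group = bits[position:position + group_size]
        next_entry = entry.children.get(group)
        if next_entry is None:
            return None  # no known pattern matches this input
        entry = next_entry
        position += group_size
    return entry.result


root = TreeEntry()
add_pattern(root, "0101", 2, "high priority")
add_pattern(root, "0110", 2, "low priority")
label = classify(root, "010111", 2)
```

Each loop iteration corresponds to one read of a tree entry, so the number of levels traversed determines how many memory accesses a classification costs.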
The classification process finds many uses in a data communications network. The input data packets can be classified based on a priority indicator within the packet, using a tree structure where the decision paths represent the different network priority levels. Once the priority level is determined for each packet, based on a match between the input bits and the tree bits representing the available network priority levels, the packets can be processed in priority order. As a result, time-sensitive packets (e.g., those carrying video-conference data) are processed before time-insensitive packets (e.g., those carrying a file transfer protocol (FTP) data transfer). Other packet classification processes determine the source of the packet (for instance, so that a firewall can block all data from one or more sources), examine the packet protocol to determine which web server can best service the data, or determine network customer billing information. Information required for the reassembly of packets that have been broken up into data blocks for processing through a network processor can also be determined by a classification engine that examines certain fields in the data blocks. Packets can also be classified according to their destination address so that packets can be grouped together according to the next device they will encounter as they traverse the communications medium.
The tree structure for performing the classification process is segregated into a plurality of memory elements, providing the processor with parallel and simultaneous access to the levels of the tree structure. According to the present invention, one or more of the lower level branches of the tree can be stored on-chip with the classification engine (i.e., the processor), thereby reducing the read cycle time for the lower level tree entries. Advantageously, there are fewer lower level tree entries because these levels lie near the tree root. Therefore, the on-chip storage requirements are considerably less than the storage requirements for the entire tree.
The present invention can be more easily understood and the further advantages and uses thereof more readily apparent, when considered in view of the description of the invention and the following figures in which:
According to the teachings of the present invention, the tree structure is partitioned between one or more memory elements, such that depending on the memory elements chosen (i.e., faster memory on-chip versus slower off-chip memory) different read access times are available and thus certain tree entries, that is nodes or instructions as discussed above, are accessible faster than others.
As shown in
The use of two separate memory structures is merely exemplary, as additional memory structures can also be employed for storing levels of the tree. Selection of the optimum number of memory elements, the memory access time requirements of each, and the tree levels stored in each memory element can be based on the probability that certain patterns will appear in the incoming data stream. The tree levels or sections of tree levels that are followed by the most probable data patterns are stored in the memory having the fastest access time. For example, all the input patterns traverse the lower levels of the tree; these lower levels can therefore be stored within a memory having a fast read cycle time to speed up the tree analysis process.
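The trade-off between storage and read time can be made concrete with a small cost model. The sketch below assumes a complete tree with a fixed branching factor, one read per level, and read latencies of 1 cycle for the fast memory and 10 cycles for the slow memory; these figures and the function names are illustrative assumptions, not values taken from the text.

```python
def entries_in_levels(first_levels, branching=2):
    """Entries held in the first `first_levels` levels of a complete tree."""
    return sum(branching ** level for level in range(first_levels))


def search_cycles(levels_traversed, fast_levels, fast_cycles=1, slow_cycles=10):
    """Cycles for one search when the first `fast_levels` levels are in fast memory.

    The latencies are assumed values chosen only for illustration.
    """
    in_fast = min(levels_traversed, fast_levels)
    in_slow = levels_traversed - in_fast
    return in_fast * fast_cycles + in_slow * slow_cycles


# Storage needed for the first 4 levels versus all 16 levels of a binary tree.
fast_entries = entries_in_levels(4)
all_entries = entries_in_levels(16)

# Cost of a full-depth search with and without a fast memory for 4 levels.
all_slow = search_cycles(16, fast_levels=0)
split = search_cycles(16, fast_levels=4)

# A search that terminates within the fast levels never touches slow memory.
early_exit = search_cycles(3, fast_levels=4)
```

Under these assumptions the levels near the root account for a small fraction of the entries, yet every search reads them, which is why they are the natural candidates for the fastest memory.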
The teachings of the present invention can also be applied to parallel processing of tree structures. See
In the embodiment of
In another embodiment, a search engine processor is multi-threaded, allowing it to execute a plurality of simultaneous searches throughout one or more search trees. For example, the processor 50 of
In another embodiment, as illustrated in
It has been shown that the storage of the lower tree branches on-chip reduces the number of clock cycles required to traverse through an average size tree from about 30 to 40 clock cycles according to the prior art, to about two or three clock cycles according to the teachings of the present invention. Depending on the structure of the particular tree, many of the search processes may terminate successfully at a lower level branch in the on-chip memory, and thereby avoid traversing the upper level branches stored in the slower memory.
In yet another embodiment, it may be possible to store especially critical or frequently used small trees entirely within the internal memory element 80, thus providing especially rapid tree searches for any tree that is located entirely on-chip. The segregation between the tree levels stored within the internal memory 80 and the external memory 84 can also be made on the basis of the probabilities of certain patterns in the input data.
Typically, the data input to a network processor using a tree characterization process is characterized according to several different attributes. There will therefore be a corresponding number of trees through which segments of the data packet or data block are processed to perform the characterization function. According to the present invention, the lower level branches are stored on-chip and the higher level branches are stored off-chip. To perform the multiple characterizations, a pipelined processor accesses a lower branch of a tree stored in the on-chip memory and then moves to the off-chip memory as the tree analysis progresses. Because the off-chip access time is longer, the processor can, while waiting for the off-chip read cycle to complete, begin to characterize another aspect of the input data by accessing the lower branches of another on-chip tree. In this way, several simultaneous tree analyses can be performed by the processor, taking advantage of the faster on-chip access speeds while waiting for a response from a slower off-chip memory.
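The benefit of overlapping several tree analyses can be sketched with a simple cycle-counting simulation. It assumes that the processor issues at most one read per cycle, that each search must wait for its previous read to return before issuing the next, and that reads belonging to different searches may be outstanding at the same time. The latencies (1 cycle on-chip, 10 cycles off-chip) and the function names are illustrative assumptions.

```python
def sequential_cycles(searches):
    """Total cycles when each search runs to completion before the next starts."""
    return sum(sum(latencies) for latencies in searches)


def interleaved_cycles(searches):
    """Total cycles when the processor switches searches while a read is pending.

    Each search is a list of read latencies, one per tree level visited.
    """
    ready_at = [0] * len(searches)    # cycle at which each search may issue again
    next_read = [0] * len(searches)   # index of the next read for each search
    cycle = 0
    finish = 0
    while any(next_read[i] < len(s) for i, s in enumerate(searches)):
        for i, latencies in enumerate(searches):
            if next_read[i] < len(latencies) and ready_at[i] <= cycle:
                # Issue this search's next read; it returns after its latency.
                ready_at[i] = cycle + latencies[next_read[i]]
                next_read[i] += 1
                finish = max(finish, ready_at[i])
                break  # at most one read issued per cycle
        cycle += 1
    return finish


# Two trees, each with two on-chip levels followed by two off-chip levels.
searches = [[1, 1, 10, 10], [1, 1, 10, 10]]
one_at_a_time = sequential_cycles(searches)
overlapped = interleaved_cycles(searches)
```

In this model the second search's on-chip reads are issued while the first search is waiting on off-chip memory, so `overlapped` comes out lower than `one_at_a_time`.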
In another embodiment, certain portions of the tree (not necessarily an entire tree level) are stored within different memory elements. For example, the most frequently traversed paths can be stored in a fast on-chip or local memory and the less-frequently traversed paths stored in a slower remote or external memory.
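One way such a placement could be derived is to count how often each tree entry is visited in a sample of traversals and assign the most visited entries to the fast memory until it is full. The sketch below is an assumption about how this might be done, not a method stated in the text; `place_entries`, the entry names and the capacity value are hypothetical.

```python
from collections import Counter


def place_entries(traversed_paths, fast_capacity):
    """Split tree entries between fast and slow memory by visit frequency.

    `traversed_paths` is a list of paths, each a list of entry identifiers
    visited from the root downward. `fast_capacity` is the number of entries
    the fast memory can hold.
    """
    visits = Counter(entry for path in traversed_paths for entry in path)
    ranked = [entry for entry, _count in visits.most_common()]
    fast = set(ranked[:fast_capacity])
    slow = set(ranked[fast_capacity:])
    return fast, slow


# Sample traversals observed on some input stream (made-up data).
observed = [
    ["root", "a", "a1"],
    ["root", "a", "a2"],
    ["root", "a", "a1"],
    ["root", "b", "b1"],
]
fast, slow = place_entries(observed, fast_capacity=3)
```

Because every path begins at the root, the root and the entries on the most common path end up in the fast set, while rarely visited entries fall to the slow memory regardless of their level.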
The tree according to the present invention is also adaptable to changing system configurations. Assume that the tree is processing a plurality of TCP/IP addresses. When the process begins, the tree is empty and therefore all of the input addresses default to the same output address. The tree process begins at the root and immediately proceeds to the default output address at the single leaf. Then an intermediate instruction or decision node is added to direct certain input addresses to a first output address and all others to the default address. As more output addresses are added, the tree becomes deeper, i.e., it has more branches or decision nodes. According to the teachings of the present invention, the growth of the tree can occur in both the local and the remote memory elements.
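The growth process described above can be sketched with a binary tree that starts as a single default leaf and gains decision nodes as output addresses are added. This is a simplified illustration: addresses are bit strings, the output names are made up, and the sketch deliberately rejects adding an output at a node that has already been subdivided. The names `Node`, `add_route`, `lookup` and `depth` are hypothetical.

```python
class Node:
    """A leaf when `children` is None, otherwise a one-bit decision node."""

    def __init__(self, output):
        self.output = output
        self.children = None  # or {"0": Node, "1": Node}


def add_route(root, prefix_bits, output):
    """Direct inputs starting with `prefix_bits` to `output`, growing the tree."""
    node = root
    for bit in prefix_bits:
        if node.children is None:
            # Turn the leaf into a decision node; both branches inherit its output.
            node.children = {"0": Node(node.output), "1": Node(node.output)}
        node = node.children[bit]
    if node.children is not None:
        raise ValueError("prefix already subdivided; not handled in this sketch")
    node.output = output


def lookup(root, address_bits):
    """Follow the address bits from the root until a leaf is reached."""
    node = root
    for bit in address_bits:
        if node.children is None:
            break
        node = node.children[bit]
    return node.output


def depth(node):
    """Number of decision levels below this node."""
    if node.children is None:
        return 0
    return 1 + max(depth(child) for child in node.children.values())


tree = Node("default")
depth_when_empty = depth(tree)
before = lookup(tree, "1010")
add_route(tree, "10", "output-A")
after = lookup(tree, "1010")
```

In a partitioned arrangement, the nodes created near the root would be candidates for the local memory and deeper ones for the remote memory; the sketch does not model that placement.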
Number | Date | Country |
---|---|---|
20030120621 A1 | Jun 2003 | US |