This disclosure teaches pre-processing techniques to perform prefix optimization for network search engines.
The following papers provide useful background information and are incorporated herein by reference in their entirety. They are selectively referred to in the remainder of this disclosure by the reference codes in square brackets that accompany them (e.g., [3] for the paper by Dharmapurikar).
An increased use of applications with high bandwidth requirements, such as video conferencing and real-time movies, has resulted in a steep growth in internet traffic as well as an increase in the number of hosts. In order to provide acceptable performance, communication links are being upgraded to support higher data rates. However, higher link speeds translate to better end-to-end performance only if network routers, which filter and route Internet Protocol (IP) packets, are commensurately faster. A significant bottleneck in a network router is the IP address lookup, the process of determining where to forward a packet based on its destination address. This is commonly known as IP forwarding. Other important tasks performed by network routers, such as packet classification, are also made faster if this basic lookup process is accelerated.
Given a set of prefixes, the lookup problem consists of finding the longest prefix that matches an incoming header. A prefix corresponds to an internet address, or its initial portion. This problem is referred to as Longest Prefix Matching, or LPM. Three major categories of hardware solutions for LPM are content-addressable memories (CAMs), tree-based algorithmic solutions [1,2] and hash-based solutions [3,4].
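By way of a concrete illustration, the following minimal Python sketch shows the LPM problem itself in software form. The function name and the toy prefix table are hypothetical, and a real search engine uses the hardware structures discussed below rather than a linear scan.

```python
def longest_prefix_match(prefixes, addr_bits):
    """Naive LPM: scan all prefixes and return the longest one matching addr_bits.

    prefixes:  dict mapping prefix bit-strings (e.g. "0110") to forwarding info.
    addr_bits: the destination address as a bit-string.
    """
    best = None
    for prefix, next_hop in prefixes.items():
        if addr_bits.startswith(prefix) and (best is None or len(prefix) > len(best[0])):
            best = (prefix, next_hop)
    return best

# Toy example with 8-bit addresses and two nested prefixes:
table = {"0110": "port A", "011010": "port B"}
print(longest_prefix_match(table, "01101001"))  # -> ('011010', 'port B')
```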
Bloomier filter-based content addressable memory is discussed in [5], along with an architecture of a network service engine and a network router based on the concept of the Bloomier filter. The architecture was designed for LPM. How several instances of the architecture may be put together to solve the more complex problem of packet classification is discussed in [6]. Thus, reducing the memory usage of the basic LPM architecture benefits both LPM and packet classification in implementations of the architecture discussed above.
A general, architecture-independent prefix-processing technique for LPM that benefits hash-based and tree-based approaches is described in [7]. Two architecture-specific prefix optimization techniques for LPM, which reduce the memory usage and consequently the power and chip area of the implementation, are discussed herein.
A prefix of length L is a regular expression whose L most significant bits are valid, while all other bits are considered "don't-cares". We restrict ourselves to regular expressions that are integers (for instance, an internet address). The number of valid bits in a prefix is referred to as the prefix length.
a) An Example Network Search Engine
The architecture described in [8] is a hash-based lookup architecture based on embedded DRAM technology and is disclosed further in U.S. patent application Ser. No. 10/909,907 filed Aug. 2, 2004. It is characterized by low latency, low power, low cost and high performance. While it can serve as a general-purpose search engine, it is presently tailored to LPM and packet classification applications. When implemented for the LPM application, the described architecture has a guaranteed latency of 8 cycles per lookup, a 250 MHz clock implying 250 million lookups per second, and 3-4 W worst-case power for 512K prefixes.
By way of background, the above architecture is described herein.
b) Technique for Setting Up the Network Service Engine
The described network service engine is based upon a content retrieval data structure called the Bloomier filter. It is an architecture that can store and retrieve information quickly. A function f:t→f(t) may be stored in the described architecture by storing various values of t and corresponding values of f(t). The idea is to quickly retrieve f(t) given t. In this network service engine, this retrieval is achieved in constant time. The following definitions assist in explaining how this may be achieved.
Storing a function f:t→f(t) in the described network service engine data structure in such a way that it can be retrieved in constant time is referred to as function encoding. Given a function f:t→f(t) stored in the described network service engine, the process of retrieving f(t) given t is referred to as performing a lookup.
Function encoding is done by storing the values of the function f(t) for several elements t. The collection of elements stored in the described network service engine data structure is the element set.
The core of the described network service engine consists of a Bloomier-filtering based data structure into which a function may be encoded. This data structure consists of a table indexed by K hash functions. This table is referred to as the Index Table. The K hash values of an element are collectively referred to as its hash neighborhood. The hash neighborhood of an element t is represented by HN(t).
If a hash function of an element produces a value that is not in the hash neighborhood of any other element in the element set, then that value is said to be a singleton.
The index table is set up such that for every t in the element set, any information corresponding to t (such as f(t)) may be safely written to a specific address belonging to the hash neighborhood of t. This address is called τ(t).

Given t, the information stored in τ(t) could be retrieved directly if τ(t) were known; however, τ(t) itself is not stored anywhere.

In order to retrieve f(t) without knowledge of τ(t), a solution is to store some information in every location in the hash neighborhood of t such that a simple Boolean operation on the values in all hash locations necessarily yields f(t). Specifically, once τ(t) has been determined, the Index Table data at τ(t) is encoded as follows:

$$D\big[H_{h_T(t)}(t)\big] = I(t) \oplus \bigoplus_{\substack{i=1 \\ i \neq h_T(t)}}^{K} D\big[H_i(t)\big]$$

Equation 1: Encoding values in the Index Table.

where "⊕" represents the XOR operation, H_i(t) the i'th hash value of t, D[H_i(t)] the index table data at address H_i(t), K the number of hash functions, I(t) any information corresponding to t that we want to store and retrieve, and h_T(t) the index of the hash function that produces τ(t), i.e., τ(t) = H_{h_T(t)}(t).
During a lookup, if the element is t, the information corresponding to t, I(t), may be retrieved by an XOR operation of the values in all hash locations of t:

$$I(t) = \bigoplus_{i=1}^{K} D\big[H_i(t)\big]$$

Equation 2: Index Table Lookup.
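For concreteness, the following minimal Python sketch implements Equations 1 and 2. The hash construction (SHA-256 with an index prepended) and the table size are illustrative assumptions, not the architecture's actual hash functions.

```python
import hashlib

K = 3            # number of hash functions (assumed)
TABLE_SIZE = 64  # index table depth (assumed)

def hash_neighborhood(t):
    """HN(t): K distinct hash values of element t (hash construction is hypothetical)."""
    hn, i = [], 0
    while len(hn) < K:
        h = int(hashlib.sha256(f"{i}:{t}".encode()).hexdigest(), 16) % TABLE_SIZE
        if h not in hn:        # ensure K distinct locations for the XOR math
            hn.append(h)
        i += 1
    return hn

def encode(D, t, info, h_T):
    """Equation 1: write D[H_hT(t)] so that the XOR over HN(t) yields info = I(t)."""
    hn = hash_neighborhood(t)
    acc = info
    for i, h in enumerate(hn):
        if i != h_T:
            acc ^= D[h]
    D[hn[h_T]] = acc           # the only location written for element t

def lookup(D, t):
    """Equation 2: I(t) = XOR of the index table data over HN(t)."""
    acc = 0
    for h in hash_neighborhood(t):
        acc ^= D[h]
    return acc

D = [0] * TABLE_SIZE
encode(D, "100.10", info=5, h_T=0)  # with a single element, any h_T is safe
assert lookup(D, "100.10") == 5
```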
It remains then to find a way of discovering τ(t) for every element in the element set. This is done as follows.
First, an order Γ is defined on the elements to be stored in the described network service engine. Γ dictates that every element t has a corresponding hash value (in the hash neighborhood of the element) that is not hashed to by any of the elements appearing before t in the order. Once such an order is found, the elements are processed in that same order and values are encoded into their hash locations (Equation 1). This is in fact a sufficient condition for that hash location to be τ(t). The reason is as follows. The first element in Γ, t1, has a hash location that is not in the hash neighborhood of any other element. Therefore, information corresponding to t1 can be safely stored in this hash location since no other element has been encoded yet. The second element in Γ, t2, has a hash location that is not in the hash neighborhood of t1. Since t1 has already been encoded, information corresponding to t2 can be safely written into this hash location. Note that during encoding, only one hash location per element is modified (written into). Hence, encoding t2 will not corrupt any of the locations written to or read by t1.
This applies to all elements in the order.
Such an order may be discovered using the following greedy technique. An element t1 with a singleton is found and put at the bottom of a stack. The element t1 is removed from the element set, and the process is repeated recursively on the remaining elements. The final stack obtained represents the elements in the required order Γ. A software sketch of this greedy technique is given below.
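The following Python sketch of the greedy ordering reuses the hypothetical hash_neighborhood() helper from the previous sketch. Per the standard Bloomier filter construction, it records each element's singleton location and encodes in reverse order of removal, so that no write lands in the hash neighborhood of an already-encoded element. This is a sketch under those assumptions, not the engine's actual setup procedure.

```python
def find_order(elements):
    """Greedy ordering: repeatedly find an element owning a singleton hash
    location, record that location's hash-function index, and remove the
    element from the set. Raises if the process gets stuck (in practice the
    setup would be retried with fresh hash functions)."""
    remaining = set(elements)
    removal_order = []            # the stack: first element removed comes first
    h_T = {}                      # element -> index of its singleton hash function
    while remaining:
        hits = {}                 # how many remaining elements hash to each location
        for t in remaining:
            for h in hash_neighborhood(t):
                hits[h] = hits.get(h, 0) + 1
        for t in list(remaining):
            idx = next((i for i, h in enumerate(hash_neighborhood(t)) if hits[h] == 1), None)
            if idx is not None:   # t owns a singleton location
                h_T[t] = idx
                removal_order.append(t)
                remaining.remove(t)
                break
        else:
            raise ValueError("stuck: no singleton; retry with new hash functions")
    return removal_order, h_T

# Encoding then processes elements in reverse removal order (Equation 1):
#   for t in reversed(removal_order): encode(D, t, I[t], h_T[t])
```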
Like Bloom filters, the basic Bloomier filtering data structure also suffers from a small probability of false positives. This means that, when an element t′ is looked up, Equation 2 can produce an apparently legitimate value even though t′ was never in the set of elements originally encoded into the Index Table. False positives are removed in the described network service engine by the addition of a second table called the Filtering Table, so called because it filters false positives. The Filtering Table has as many entries as the number of elements encoded in the Index Table. It contains the actual elements that are encoded in the Index Table, one element per location. During lookups, the idea is to compare the actual element stored in the Filtering Table with the element to be looked up, and thus eliminate false positives.
In the described architecture, this is done using the following method. During function encoding, the information corresponding to element t, I(t), is set to the address in the Filtering Table where t is stored. Assume a lookup for element t′ needs to be done. When I(t′) is retrieved from the Index Table, the stored element t″ can be retrieved from address I(t′) in the Filtering Table. Following this, t′ and t″ are compared. If the comparison fails, the lookup is a false positive. An advantageous method of doing this is to allocate sequential Filtering Table addresses to the elements in the order Γ determined during Index Table encoding.
c) An Architecture for the Described Network Service Engine
The described network service engine architecture consists of three tables: the Index Table, the Filtering Table and a third table called the Result Table. The function f(t) is encoded in the Index Table, and corresponding values of t are inserted in the Filtering Table as described above.
During a lookup for element t′, the Index Table yields an address I(t′) into the Filtering Table, from where the actual element t″ is retrieved for comparison with t′. In addition to the actual element, the function value f(t) may also be stored at address I(t) in the Filtering Table during encoding. This portion of the Filtering Table is called the Result Table since it holds the result. Note that the Result Table could also be implemented as a separate memory, distinct from the Filtering Table. However, the Filtering and Result Tables are "parallel" and have the same number of entries. The described network service engine architecture is illustrated in the accompanying drawings.
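A minimal end-to-end sketch of the three-table organization follows, building on the hypothetical helpers above; I(t) is the element's sequential Filtering Table address, and the Result Table is modeled as a parallel array. The class name and structure are illustrative assumptions, not the engine's actual implementation.

```python
class NetworkServiceEngineSketch:
    """Three-table lookup: Index Table -> Filtering Table (false-positive
    check) -> Result Table (the stored function value)."""

    def __init__(self, encoding_order, h_T, f):
        self.index = [0] * TABLE_SIZE
        self.filtering = []   # the actual elements, one per location
        self.result = []      # f(t), parallel to the Filtering Table
        # Filtering Table addresses are allocated sequentially in the
        # encoding order, and I(t) is set to that address.
        for addr, t in enumerate(encoding_order):
            self.filtering.append(t)
            self.result.append(f[t])
            encode(self.index, t, addr, h_T[t])

    def engine_lookup(self, t):
        addr = lookup(self.index, t)                    # Equation 2
        if addr < len(self.filtering) and self.filtering[addr] == t:
            return self.result[addr]                    # genuine match
        return None                                     # false positive filtered out

# Usage, building on the previous sketches:
#   removal_order, h_T = find_order(elements)
#   engine = NetworkServiceEngineSketch(list(reversed(removal_order)), h_T, f)
```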
a) LPM Using the Above Architecture
A commonly used application, longest prefix matching (LPM), is described hereunder.
A prefix refers to an IP address or its initial portion. For instance, "100.10" is a prefix of "100.10.1.2". A prefix database contains forwarding information for "100.10". However, it may contain more refined forwarding information for the longer prefix "100.10.1.2". Therefore, an incoming IP address must be compared with all prefixes, and the forwarding information corresponding to the longest matching prefix must be discovered.
For instance, consider a router which forwards ".com" packets to port A. If the router is located such that the domain "nec.com" is more easily accessible via port B, it should route "nec.com" packets to port B. Therefore, incoming packets with "nec.com" as their destination address will be forwarded to port B, while all other ".com" packets will be forwarded to port A.
The described network service engine architecture for LPM consists of the tables described above, as illustrated in the accompanying drawings.
A described network service engine-based architecture for packet classification, the multi-dimensional version of the LPM problem, is discussed in [6]. This uses the LPM architecture as building blocks. Hence the prefix optimization techniques described in this work will benefit both LPM and packet classification.
To overcome the disadvantages discussed above, the disclosed teachings provide a network router comprising at least one index table operable to store encoded values of a function associated with an input source address in at least two locations. The encoded values are obtained by hashing the input source address such that all the encoded values must be used to recover the function. At least one filtering table is provided that is operable to store prefixes of at least two different lengths, the prefixes corresponding to network addresses. The filtering table is indexed by entries in said index table. At least one result table is provided. The result table is operable to be indexed by entries in said index table. The result table stores destination addresses. At least one record in the filtering table has a prefix length field that is operable to store a prefix length of a prefix stored in said at least one record.
Another aspect of the disclosed teachings is a method of processing addresses in a network search engine, the method comprising receiving an input source address. The input source address is hashed to create encoded values of a function associated with the input source address such that all the encoded values are needed to recover the function. The encoded values are stored in an index table. A prefix of the input source address is stored in a filtering table, said filtering table operable to store prefixes of at least two different lengths. A length of the prefix is stored in the filtering table. The filtering table is indexed by entries in the index table. Destination addresses are stored in a result table. The result table is indexed by entries in said index table.
Another aspect of the disclosed teachings is a computer program product including computer-readable media that includes instructions to enable a computer to perform the disclosed techniques.
The above objectives and advantages of the disclosed teachings will become more apparent from the following detailed description of preferred embodiments thereof with reference to the attached drawings.
Multiple Prefix Lengths Per Sub-Cell in the Engine
In the described network service engine LPM architecture, each sub-cell stores prefixes of a single length. The above architecture is modified such that multiple prefix lengths can coexist in a single described network service engine sub-cell, as illustrated in the accompanying drawings.
Each sub-cell now has three outputs: “matching prefix length (MPL)”, “valid” and “next hop” values. The MPL is used in the priority encoder to correctly determine the longest prefix match.
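As an illustration of the modified priority encoding, the following sketch selects the longest match using the MPL output rather than the sub-cell's position; the tuple layout is a hypothetical software stand-in for the hardware encoder.

```python
def priority_encode(subcell_outputs):
    """Select the next hop for the longest matching prefix.

    subcell_outputs: one (valid, matching_prefix_length, next_hop) tuple per
    sub-cell. Because a sub-cell may now hold several prefix lengths, priority
    is decided by the MPL value each sub-cell reports, not by its position.
    """
    best = None
    for valid, mpl, next_hop in subcell_outputs:
        if valid and (best is None or mpl > best[0]):
            best = (mpl, next_hop)
    return best[1] if best else None  # None: no match (a default route could go here)

# Hypothetical outputs from three sub-cells:
print(priority_encode([(True, 8, "port A"), (False, 0, None), (True, 16, "port B")]))
# -> 'port B' (the 16-bit match wins over the 8-bit match)
```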
An advantage is that this technique moves prefixes from sparsely populated prefix lengths to other prefix lengths in order to utilize the described network service engine sub-cells more efficiently.
Reducing Filtering Table Size
In a described network service engine sub-cell, the size (depth) of the Filtering Table must be equal to or greater than the number of elements stored in the sub-cell (Section I.C.3.b). The technique presented in this sub-section makes it possible for the Filtering Table to be smaller, i.e., the Filtering Table can have fewer entries than the number of elements stored in the described network service engine sub-cell.
An exemplary implementation of the technique is described below. If there exists a "complete set" S of $2^{P-l}$ prefixes of length P that have a common sub-prefix of length l, the largest subset of prefixes that have the same destination can be extracted and collapsed into a single Filtering Table location. For example, 0000*→E, 0001*→E, 0010*→E, 0011*→F comprise a complete set of $2^{4-2} = 4$ prefixes of length 4 with a common sub-prefix "00" of length 2. Without this technique, 4 Filtering Table locations would be required for these prefixes. However, prefixes 0000*, 0001* and 0010* have the same destination E. The present technique collapses these three prefixes into a single Filtering Table entry.
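A small sketch of one plausible reading of this collapsing step follows: group same-length prefixes by sub-prefix, detect complete sets, and replace the largest same-destination subset with a single entry on the common sub-prefix, keeping differently-routed prefixes as longer-prefix exceptions (which win under LPM semantics). The function name and dictionary representation are illustrative assumptions.

```python
from collections import Counter, defaultdict

def collapse_complete_sets(prefixes, sub_len):
    """Collapse complete sets of length-P prefixes sharing a sub-prefix of
    length sub_len. prefixes maps bit-strings (all of length P) to destinations,
    e.g. {"0000": "E", "0001": "E", "0010": "E", "0011": "F"}."""
    P = len(next(iter(prefixes)))
    groups = defaultdict(dict)
    for p, dest in prefixes.items():
        groups[p[:sub_len]][p] = dest
    collapsed = {}
    for sub, members in groups.items():
        if len(members) == 2 ** (P - sub_len):        # a complete set
            common, _ = Counter(members.values()).most_common(1)[0]
            collapsed[sub] = common                   # one entry for the majority
            for p, dest in members.items():
                if dest != common:
                    collapsed[p] = dest               # exceptions stay as longer prefixes
        else:
            collapsed.update(members)                 # incomplete set: unchanged
    return collapsed

print(collapse_complete_sets({"0000": "E", "0001": "E", "0010": "E", "0011": "F"}, 2))
# -> {'00': 'E', '0011': 'F'}: four Filtering Table entries reduced to two
```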
In alternate embodiments, this technique can also be used in combination with prefix pre-processing that is described, for example, in [7, 9, 10].
The above discussed techniques can be implemented in any suitable computing environment. A computer program product including computer readable media that includes instructions to enable a computer or a computer system to implement the disclosed teachings is also an aspect of the invention.
Other modifications and variations to the invention will be apparent to those skilled in the art from the foregoing disclosure and teachings. Thus, while only certain embodiments of the invention have been specifically described herein, it will be apparent that numerous modifications may be made thereto without departing from the spirit and scope of the invention.
This Application claims priority from co-pending U.S. Provisional Application Ser. No. 60/658,168, with inventors Srihari Cadambi, Srimat Chakradhar, Hirohiko Shibata, filed Mar. 7, 2005, which is incorporated in its entirety by reference.