The present invention relates to packet processing in a packet-switched network. In particular, the present invention is directed to a method and system of controlled delay of packet processing using multiple delay loop paths.
Packet-switched networks, such as the Internet, transport data and information between communicating devices in packets that are routed and switched across one or more links that make up a connection path. As packet-switched networks have grown in size and complexity, their role in the critical functioning of businesses, institutions, and organizations has increased dramatically. At the same time, the need to secure networks against sophisticated internal and external attacks in the forms of viruses, Trojan horses, worms, and malware, among others, has correspondingly taken on heightened importance. Consequently, advances in methods and technologies for network security are needed to keep pace with the rising threats.
One approach is the use of Intrusion Detection Systems (IDSs), which can detect network attacks. However, being passive systems, they generally offer little more than after-the-fact notification. A more active approach is Intrusion Prevention Systems (IPSs), which go beyond traditional security products, such as firewalls, by proactively analyzing network traffic flows and active connections while scanning incoming and outgoing requests. As network traffic passes through the IPS, it is examined for malicious packets. If a potential threat is detected or traffic is identified as being associated with an unwanted application, it is blocked, yet legitimate traffic is passed through the system unimpeded.
An IPS can be implemented as an in-line hardware and/or software based device that can examine each packet in a stream or connection, invoking various levels of intervention based on the results. Thus in addition to routing and switching operations that networks carry out as they route and forward packets between sources and destinations, an IPS can introduce significant packet processing actions that are performed on packets as they travel from source to destination. Other network security methods and devices may similarly act on individual packets, packet streams, and other packet connections.
In carrying out its functions of protecting a network against viruses, Trojan horses, worms, and other sophisticated forms of threats, an IPS effectively monitors every packet bound for the network, subnet, or other devices that it acts to protect. An important aspect of the monitoring is “deep packet inspection” (DPI), a detailed inspection of each packet in the context of the communication in which the packet is transmitted. DPI examines the content encapsulated in packet headers and payloads, tracking the state of packet streams between endpoints of a connection. Its actions may be applied to packets of any protocol or transported application type. As successive packets arrive and are examined, coherence of the inspection and tracking may require continuity of packet content from one packet to the next. Thus if a packet arrives out of sequence, inspection may need to be delayed until an earlier-sequenced packet arrives and is inspected.
Another important aspect of IPS operation is speed. While the primary function of an IPS is network protection, the strategy of placing DPI in the packet streams between endpoints necessarily introduces potential delays, as each packet is subject to inspection. Therefore, it is generally a matter of design principle to perform DPI efficiently and rapidly.
In traversing a network from source to destination, packets may arrive at an IPS out-of-sequence with respect to their original transmission order. When this occurs, it may be desirable to delay, in a controlled manner, the processing of out-of-sequence packets until the in-sequence packets arrive. Under certain operational conditions, it may be possible to predict the latency period between the arrival of an out-of-sequence packet and the later arrival of the adjacent, in-sequence packet. Such predictions could be based, for instance, on empirical measurements observed at the point of arrival (e.g., an IPS or other packet processing platform), known traffic characteristics of incoming (arriving packet) links, known characteristics of traffic types, or combinations of these and other factors. Predicted (or estimated) delay can then be used to match the delay imposed on a given out-of-sequence packet to the predicted arrival of the adjacent, in-sequence packet. By doing so, packet processing that depends on in-order sequencing of packets may be efficiently tuned to properties of the out-of-sequence arrivals encountered by the IPS (or other packet-processing platform).
Accordingly, described herein is a method and system of introducing controlled delay in the processing of packets in a packet-switched data network, the method comprising determining that a packet should be delayed, selecting a delay loop path (DLP) according to a desired delay for the packet, and sending the packet to the selected DLP. The determination that a delay is needed, as well as the selection of DLP according to the desired delay, is preferably based on a property of the packet. In particular, recognizing that a packet has been received out of order with respect to at least one other packet in a communication or connection may be used to determine both that a delay is required, and what the delay should be. There may be other properties of a packet that necessitate controlled delay of processing, as well.
These as well as other aspects, advantages, and alternatives will become apparent to those of ordinary skill in the art by reading the following detailed description, with reference where appropriate to the accompanying drawings. Further, it should be understood that this summary and other descriptions and figures provided herein are intended to illustrate the invention by way of example only and, as such, that numerous variations are possible. For instance, structural elements and process steps can be rearranged, combined, distributed, eliminated, or otherwise changed, while remaining within the scope of the invention as claimed.
The method and system described herein is based largely on introducing controlled delay of packets using a construct called a delay loop path (DLP). More particularly, in order to impose a range of delay sizes (or times) to accommodate a possibly large number of packets and a variety of packet types and delay conditions, multiple delay loop paths are employed. Packets that are determined to have arrived at a packet processing platform (such as an IPS) out of sequence may then be subject to a controlled delay appropriate to the properties of the individual packets. Additionally, other criteria, such as network conditions or routes, may be considered as well in determining the need for and size of a delay.
To facilitate the discussion of controlled delay using multiple DLPs, it is useful to first consider a simplified example of network packet transmission that yields out-of-sequence arrival of packets and packet fragments at a packet processing platform. Such a scenario is depicted in
In the exemplary transmission, each packet is fragmented into smaller packets at some point within the network elements represented by ellipses 107, for example at one or more packet routers. The exemplary fragmentation results in packet P1 being subdivided into two packets, designated P1-A and P1-B. Packet P2 is subdivided into three packets, P2-A, P2-B, and P2-C, while packet P3 is subdivided into two packets, P3-A and P3-B. As indicated, all of the initial fragments are transmitted in order as sequence {P1-A, P1-B, P2-A, P2-B, P2-C, P3-A, P3-B}. During traversal of the network elements represented by ellipses 109, the order of transmission of the packet fragments becomes altered such that they arrive at packet processing platform 106 out of sequence as {P3-B, P2-C, P1-B, P2-B, P3-A, P1-A, P2-A}, that is, out of order with respect to the originally-transmitted fragments. While the cause is not specified in the figure, the re-ordering could be the result of different routers and links traversed by different packet fragments, routing policy decisions at one or more packet routers, or other possible actions or circumstances of the network.
Packet processing platform 106 could be an IPS or other security device, and may require that fragmented packets be reassembled before processing can be carried out. In the exemplary transmission of
Note that depending upon the particular packet processing carried out, it may or may not be necessary to wait for all fragments of a given packet to arrive before processing begins. For example, it may be sufficient that just pairs of adjacent packet fragments be processed in order. Further, it may not be necessary to actually reassemble packets prior to processing, but only to ensure processing packets or fragments in order. Other particular requirements regarding packet ordering or packet fragment ordering are possible as well. The present invention ensures that delay of packet processing may be introduced in a controlled manner, regardless of the specific details of the processing or the reason(s) for controlled delay.
Controlled Delay of Packet Processing with Multiple Delay Loop Paths
With the out-of-sequence arrival of packets at packet processing platform 106 as an exemplary context, various embodiments of multiple DLPs for controlled delay of packet processing may be described.
In a preferred embodiment, packet processing platform 202 could be an IPS system, and packet processing block 204 could incorporate DPI and other, related security functionality, as well as routine packet receipt and transmission tasks. It should be understood that the method and system described herein could apply to other types of packet processing platforms without limiting the scope and spirit of the invention.
Each of a plurality of DLPs could be a physical path or a virtual path within packet processing platform 202, but, in any case, one that functions independently (or largely independently) of packet processing block 204. No DLP significantly impacts the resources of packet processing block 204. That is, each imposes a delay on a packet that enters, but the timing resources, and possibly the storage resources, used to yield the delay operate independently and without impacting performance of packet processing block 204.
There are a variety of techniques that may be employed to achieve tuned delay of each of multiple DLPs. In a preferred embodiment, the delay associated with a DLP is comprised of various components. In the following discussion, each component is defined in general terms, with brief examples noted. Later discussions of exemplary embodiments include further descriptions of the delay components in terms of aspects of the embodiments.
Each respective DLP will have an associated path delay that corresponds to the total delay that a given packet would experience if it were sent to the respective DLP. This is the delay that the method and system seeks to impose on a given packet. As an example, the path delay could be the time between placing a packet in a queue and the packet's arrival back to a processing buffer.
Each respective DLP would also have a loop delay, which corresponds to the delay a given packet would experience from entry point to exit point on the respective DLP under the condition that there are no other packets on the DLP when the packet enters. For the example of a queue, the loop delay would correspond to the time it takes for a packet to enter and arrive at the front of the queue assuming the queue were empty when the packet arrived. If the queue operates according to a clock tick, then the loop delay might correspond to the time between ticks.
Additionally, each DLP would have a transit delay, which corresponds to the delay a given packet would experience from entry point to exit point on the respective DLP accounting for all packets (including the given packet) on the DLP when the given packet enters. Again, for the example of a queue, the transit delay would be the loop delay multiplied by the number of packets in the queue (including the packet in question) when that packet arrives.
Finally, each DLP would have a service time, which corresponds to the time it takes a packet to actually exit (or be removed) from the DLP; e.g., the time it takes for a packet to exit the DLP and be received back at packet processing block 204 in the example depicted in
Summarizing then, for a given packet the transit delay of any particular DLP may be computed as the loop delay for the particular DLP multiplied by the number of packets on the particular DLP (including the given packet). The path delay may then be computed as the transit delay plus the service time. As described below, for any particular DLP this formula may further depend on the relative sizes of loop delay and service time.
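By way of illustration only, the relationships among the delay components may be expressed in the following short C-language sketch; the structure and function names are illustrative and are not part of any particular embodiment.

#include <stdio.h>

/* Illustrative representation of one DLP and its delay components. */
struct dlp {
    double   loop_delay;    /* delay per packet when the DLP is otherwise empty (seconds) */
    double   service_time;  /* time to remove a packet and return it for processing (seconds) */
    unsigned occupancy;     /* packets already on the DLP, not counting the arriving packet */
};

/* Transit delay: loop delay multiplied by the number of packets on the DLP,
 * counting the packet that is about to enter. */
static double transit_delay(const struct dlp *d)
{
    return d->loop_delay * (double)(d->occupancy + 1);
}

/* Path delay: transit delay plus the service time. */
static double path_delay(const struct dlp *d)
{
    return transit_delay(d) + d->service_time;
}

int main(void)
{
    /* A DLP with a 10 ms loop delay, a 5 us service time, and one packet
     * already present yields a path delay of about 20.005 ms. */
    struct dlp example = { 0.010, 0.000005, 1 };
    printf("path delay = %.6f s\n", path_delay(&example));
    return 0;
}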
In a preferred embodiment, a plurality of DLPs is arranged so that the loop delay of each successive DLP is one-half the loop delay of the preceding DLP. According to this configuration, the transit delays of any two DLPs may be the same if the ratio of the number of packets on each respective DLP is inverse to the ratio of the loop delays of each respective DLP. As a specific example, a set of 12 DLPs may be designated as DLPi, i=0, . . . , 11. Each DLP may have a loop delay, ti, given by ti = 2^(−i) × t0, where t0 is a base loop delay defined for DLP0. For instance, for t0 = 20 ms, the set of 12 DLPs is associated with a corresponding set of (rounded) loop delays given by {(20, 10, 5, 2.5, 1.25) ms; (625, 312.5, 156.2, 78.1, 39, 19.5, 9.7) μs}, where the first five values are in milliseconds (ms) and the last seven are in microseconds (μs).
For this example, a given packet entering DLP0 under the condition that DLP0 contains no other packets would see a transit delay of 20 ms. The same transit delay would result for a packet entering DLP1 if there is already one packet on DLP1, or for a packet entering DLP11 if there are already 2,047 packets on DLP11. Similar calculations can be made for the other DLPs as well. At any given time, the distribution of packets across the plurality of DLPs may not necessarily correspond to equal transit delays across DLPs. However, with a plurality of DLPs to choose from, there is a good likelihood that one or more DLPs will have a particular transit delay (or path delay). And as the number of DLPs increases, so does the likelihood that a DLP with a particular path delay will be available. Further, it is not necessarily required that the same transit delays be applied to all packets. Rather, multiple DLPs offer the ability to select a path delay that is as close as possible to the desired delay for any given packet.
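Continuing the example, and again by way of illustration only, the following C-language sketch tabulates the twelve loop delays ti = 2^(−i) × t0 and, for each DLPi, the number of packets that would already have to be present for an arriving packet to see a 20 ms transit delay; the variable names are illustrative.

#include <stdio.h>

int main(void)
{
    const double t0     = 0.020;   /* base loop delay for DLP0, in seconds */
    const double target = 0.020;   /* transit delay to be matched */

    for (int i = 0; i < 12; i++) {
        double   ti    = t0 / (double)(1u << i);         /* loop delay of DLPi */
        unsigned ahead = (unsigned)(target / ti) - 1u;    /* packets already on DLPi */
        printf("DLP%-2d: loop delay %9.4f ms, 20 ms transit delay with %4u packets already present\n",
               i, ti * 1000.0, ahead);
    }
    return 0;
}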
It should be understood that the configuration with 12 DLPs is exemplary, as is the base loop delay of 20 ms. Other arrangements are possible as well, including different numbers of DLPs and different delays. Further, delays need not be multiples of a common base value, and multiple DLPs may yield the same delay values.
According to another embodiment, multiple DLPs could be implemented in the form of multiple packet queues, as illustrated in
In exemplary operation, as packets are received at packet processing block 304 in packet processing platform 302, each is checked to determine if processing can proceed or if a delay is necessary (i.e., if the packet arrives out-of-sequence, in accordance with the IPS example). As illustrated, a given packet that arrives out of sequence is placed in one of queues 306, 310, or 314 (or possibly one of the queues represented by the horizontal ellipses). In order to select which queue to use, packet processing block 304 preferably determines a desired delay for the given packet, and then determines which of the plurality of packet queues would yield a path delay most closely matched to the desired delay.
The determination of the desired delay could be based on one or more properties of the packet, including, without limitation, packet type (e.g., IP protocol), application type (e.g., application protocol), packet size, sequence number, or fragmentation (if any), to name a few. Additional factors for determining a desired delay could include network conditions, and known or observed characteristics associated with traffic type or other classifications that may be inferred from packet properties. For example, empirical observations of TCP traffic at a packet processing platform may indicate that, with some likelihood, the receipt of any given out-of-sequence packet fragment will be followed by receipt of the in-sequence counterpart within a predictable amount of time. More specifically, some network research suggests that 90% of out-of-sequence TCP packet arrivals are followed by their in-sequence counterparts within 100 ms. Thus if a packet is determined to be part of a TCP connection, and also determined to be out-of-sequence, then the empirical observations indicate that if the packet is delayed for 100 ms following its arrival, there is a 90% likelihood that the in-sequence counterpart will arrive by the time the delay completes. Observed properties for other types of packets could differ from those observed for TCP connections, but may nevertheless be useful in determining delays that may be imposed in a controlled manner.
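As a purely hypothetical sketch of how such a determination might be encoded, the C fragment below maps a few packet properties to a desired delay; only the 100 ms figure for out-of-sequence TCP arrivals is drawn from the discussion above, while the remaining types, field names, and the default value for other traffic are assumptions made for illustration.

#include <stdbool.h>
#include <stdio.h>

enum transport { TRANSPORT_TCP, TRANSPORT_UDP, TRANSPORT_OTHER };

struct packet_info {
    enum transport transport;        /* transport protocol of the packet */
    bool           out_of_sequence;  /* received out of order for its connection */
};

/* Return a desired processing delay in microseconds (0 = process immediately). */
static unsigned desired_delay_us(const struct packet_info *p)
{
    if (!p->out_of_sequence)
        return 0;          /* in sequence: no controlled delay needed */
    if (p->transport == TRANSPORT_TCP)
        return 100000;     /* ~90% of in-sequence counterparts arrive within 100 ms */
    return 20000;          /* assumed default for other traffic types */
}

int main(void)
{
    struct packet_info p = { TRANSPORT_TCP, true };
    printf("desired delay = %u us\n", desired_delay_us(&p));
    return 0;
}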
Other properties of a packet that may be used to determine delay may be related to aspects of the application being transported in the packets, or to the particular type of processing that is carried out at the packet processing platform. Alternative embodiments may thus include different algorithms for the determination of desired delay.
The path delay for each queue could be calculated in a manner similar to that described above, or looked up in a table that is dynamically updated according to the current occupancy of the queues, for instance. Path delay for each queue may be further understood by way of example by considering FIFO queues. Each queue may be inspected periodically according to a respective polling cycle, the start of each polling cycle coinciding with the inspection of the respective queue. If a packet is found at the front of a particular queue during inspection, the packet is then removed from the queue and returned to packet processing block 304. When a packet is removed from a queue, all of the remaining packets in the queue are then moved forward. Note that the forward movement could comprise actual movement across queue storage locations or virtual movement corresponding to adjustment of a queue-location pointer (e.g., as in a ring-buffer).
The polling cycle for a queue corresponds to the loop delay defined above, and the time required to remove a packet from a queue and return it for processing corresponds to the service time (e.g., the time required for a memory copy). Assuming the service time for a given queue is shorter than its polling cycle, then packet removal will be complete by the time the next polling cycle begins. In this case, a particular packet entering the given queue will arrive at the front of the queue after a time given approximately by the polling cycle multiplied by the number of packets in the queue (including the particular packet). This waiting time corresponds to the transit delay defined above. (Note that the actual transit delay may also depend on the phase of the polling cycle when a packet enters the queue. For example, the transit delay for a packet that enters a queue at the midpoint of the queue's polling cycle will be shorter by about one-half of a polling cycle than that for a packet that enters at the start of the polling cycle.) Once the particular packet arrives at the front of the queue, it then takes an additional service time before the packet is returned to packet processing block 304. Thus for the given queue, the path delay is determined as the transit time plus the service time.
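The path-delay estimate for a polled FIFO queue, including the polling-phase adjustment noted parenthetically above, may be sketched in C as follows; the function and parameter names are illustrative, and the sketch assumes the service time is shorter than the polling cycle.

/* Estimate the path delay of a polled FIFO queue. 'phase' is the fraction of
 * the current polling cycle already elapsed when the packet enters (0.0 at the
 * start of a cycle, 0.5 at the midpoint). Times are in seconds. */
double polled_queue_path_delay(double polling_cycle,
                               double service_time,
                               unsigned packets_in_queue,  /* including the new packet */
                               double phase)
{
    /* Entering part-way through a cycle shortens the wait by that fraction. */
    double transit = polling_cycle * (double)packets_in_queue
                   - polling_cycle * phase;
    return transit + service_time;
}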
In accordance with the example above of 12 DLPs, an embodiment of controlled delay could comprise 12 packet queues, Qi, i=0, . . . , 11, and 12 corresponding polling cycles given by {(20, 10, 5, 2.5, 1.25) ms; (625, 312.5, 156.2, 78.1, 39, 19.5, 9.7) μs}, where, again, the first five values are in ms and the last seven are in μs. Assuming all service times are shorter than 9.7 μs, then the transit time for each queue could be calculated as the polling cycle multiplied by the number of packets in the queue. A desired delay of 20 ms for a given packet could be closely attained (i.e., ignoring the assumed-negligible service time) by placing the given packet by itself in the first queue (Q0), behind one other packet in the second queue (Q1), behind three other packets in the third queue (Q2), or behind 2,047 packets in the last queue (Q11), for instance. Any of the other queues could also yield a 20 ms path delay (or close to it), depending on the occupancy of packets. Further, the exemplary configuration of 12 queues and polling cycles could be used to yield other path delays, and there could be more or fewer queues and polling times as well. The present invention is not limited to a specific number of queues, or a specific number and value of polling times and path delays.
Thus, once a desired delay for a packet is determined, a calculation like the one above could be performed for each queue in sequence, followed by selection of the queue with the closest matching path delay. Alternatively, path delay calculations could be continually performed as packets enter and exit the queues. The results could be stored in a look-up table, and the table in turn consulted for each packet that needs to be delayed.
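One possible form of the closest-match selection, assuming a look-up table of current path delays maintained as packets enter and leave the queues, is sketched below in C; the names are illustrative only.

#include <math.h>

/* Return the index of the queue whose current path delay is nearest to the
 * desired delay. 'path_delays' is the dynamically maintained table; 'num'
 * is the number of queues (assumed to be at least one). */
int select_closest_queue(const double *path_delays, int num, double desired)
{
    int best = 0;
    for (int i = 1; i < num; i++) {
        if (fabs(path_delays[i] - desired) < fabs(path_delays[best] - desired))
            best = i;
    }
    return best;
}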
As another example, selection of a path delay may be based on an algorithm that traverses a list of queues (or more generally, DLPs) and chooses the first one for which the path delay is no greater than a certain value. Considering again the 12 exemplary queues and polling cycles described above, an algorithm to select the first queue that yields a path delay of no more than 20 ms (ignoring an assumed-negligible service time) may comprise a logical “case” statement, as represented in the following pseudo-code:
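By way of illustration only, pseudo-code of the general form described in the next paragraph might be rendered as the following self-contained C sketch, with the logical "case" statement expressed as an if/else cascade. The stub definitions of Enqueue, Error, and the LoopPath names are included solely so that the sketch compiles and runs, and the occupancy thresholds follow from the 20 ms limit and the polling cycles listed above.

#include <stdio.h>

enum loop_path { LoopPath0, LoopPath1, LoopPath2,  LoopPath3, LoopPath4,  LoopPath5,
                 LoopPath6, LoopPath7, LoopPath8,  LoopPath9, LoopPath10, LoopPath11 };

/* Number of packets currently in each queue. */
static unsigned Q0_occupancy, Q1_occupancy, Q2_occupancy,  Q3_occupancy,
                Q4_occupancy, Q5_occupancy, Q6_occupancy,  Q7_occupancy,
                Q8_occupancy, Q9_occupancy, Q10_occupancy, Q11_occupancy;

static void Enqueue(enum loop_path p) { printf("enqueue on LoopPath%d\n", (int)p); }
static void Error(const char *msg)    { printf("error: %s\n", msg); }

/* Select the first queue whose path delay is no greater than 20 ms, with the
 * service time assumed negligible; the occupancy threshold for queue i is
 * therefore 2^i, given the polling cycles listed above. */
static void select_delay_queue(void)
{
    if      (Q0_occupancy  <    1) Enqueue(LoopPath0);   /*    1 x 20    ms */
    else if (Q1_occupancy  <    2) Enqueue(LoopPath1);   /*    2 x 10    ms */
    else if (Q2_occupancy  <    4) Enqueue(LoopPath2);   /*    4 x  5    ms */
    else if (Q3_occupancy  <    8) Enqueue(LoopPath3);   /*    8 x  2.5  ms */
    else if (Q4_occupancy  <   16) Enqueue(LoopPath4);   /*   16 x  1.25 ms */
    else if (Q5_occupancy  <   32) Enqueue(LoopPath5);   /*   32 x 625   us */
    else if (Q6_occupancy  <   64) Enqueue(LoopPath6);   /*   64 x 312.5 us */
    else if (Q7_occupancy  <  128) Enqueue(LoopPath7);   /*  128 x 156.2 us */
    else if (Q8_occupancy  <  256) Enqueue(LoopPath8);   /*  256 x 78.1  us */
    else if (Q9_occupancy  <  512) Enqueue(LoopPath9);   /*  512 x 39    us */
    else if (Q10_occupancy < 1024) Enqueue(LoopPath10);  /* 1024 x 19.5  us */
    else if (Q11_occupancy < 2048) Enqueue(LoopPath11);  /* 2048 x 9.7   us */
    else                           Error("all delay queues are full");
}

int main(void)
{
    Q0_occupancy = 1;       /* the first queue already holds a packet ... */
    select_delay_queue();   /* ... so the packet lands on LoopPath1       */
    return 0;
}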
As the logical cases are traversed for selecting a delay queue for a given packet, for instance in the course of execution of program instructions, the number of packets already in each successively-examined queue is determined in turn (Q0_occupancy, Q1_occupancy, etc.), and that number is (implicitly) multiplied by the respective queue's polling cycle to yield a path delay that is compared against the 20 ms limit. The first case that yields a path delay no greater than 20 ms triggers the selection of the corresponding queue, according to the "Enqueue" instruction. Any untested cases are then abandoned. The symbolic names for the queues in this example are given as LoopPath0, LoopPath1, and so on. Also in this example, an error is triggered if all of the queues are full. The specific path delays and corresponding case tests could be modified by using different values for polling cycles, for instance.
Note that this exemplary algorithm merely selects the first available queue that yields a delay no greater than 20 ms (base delay). Moreover, alternative algorithms could use a different base delay, or determine a queue selection that yields a specific delay value (or a delay that is closest to a specific value), rather than an upper-limit value.
It should be noted that the assumption that the service time is always smaller than the polling cycle for a queue may not necessarily hold. Service times associated with actual data copy operations will generally vary with packet size, larger packets incurring longer service times. Thus, the service time for a given queue may be more appropriately represented by the average of the service times for all of the packets in the queue. When the average service time for a queue exceeds its respective polling cycle, then loop delay is more accurately represented by the average service time, and the transit delay becomes the average service time multiplied by the number of packets in the queue (this corresponds to the conventional definition for queuing delay). Under these circumstances, the computation of path delay for each queue will preferably take into account the relative sizes of the average service time and the polling cycle for each queue. Further, there may be other circumstances that require additional or alternative methods of path delay computation as well. One skilled in the art will readily recognize that there could be numerous ways to adapt the computation to the particular parameters that apply. The present invention is not limited to one particular form of path delay or path delay computation.
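By way of illustration only, the adjusted computation may be sketched in C as follows, with the effective per-packet delay taken to be the larger of the polling cycle and the average service time; the names are illustrative.

/* Path delay for a queue when the average service time may exceed the polling
 * cycle: the per-packet (loop) delay is whichever of the two is larger, as
 * discussed above. Times are in seconds. */
double adjusted_path_delay(double polling_cycle,
                           double avg_service_time,
                           unsigned packets_in_queue)  /* including the new packet */
{
    double per_packet = (avg_service_time > polling_cycle) ? avg_service_time
                                                           : polling_cycle;
    return per_packet * (double)packets_in_queue + avg_service_time;
}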
As noted, the description of queuing according to a FIFO discipline is exemplary. Other types of queuing could be employed in the present invention that yield path delays that accommodate the particular range of desired delays expected or required by the packet processing application. Examples include last-in-first-out (LIFO) and priority queues. Moreover, servicing of the queues may be accomplished by methods other than polling cycles. None of these possible variations are limiting with respect to the present invention.
In further accordance with the exemplary embodiment, multiple queues could be implemented in such a way as to minimize the use of processor resources. For instance, the entrance to and exit from each queue could comprise DMA operations. In such an arrangement, data copying involving packets could avoid impacting processor resources associated with programmed data transfer. Further, the polling cycle could be interrupt-driven, wherein each queue might require only one timer. Considering again the example of 12 queues described above, a total of 8,191 packets could be accommodated using just 12 timers if each queue uses one timer for its polling cycle. One skilled in the art will recognize that other techniques could be used to further reduce the number of timers required. Thus, the multiple DLPs, and queue-based DLPs in particular, make it possible to incorporate, in a packet processing platform, concurrent controlled delay of processing for a large number of packets in a manner that has minimal impact on packet processing resources.
Exemplary Operation of Multiple Delay Loop Paths
Exemplary operation of multiple DLPs for controlled delay of packet processing is illustrated in
At step S-14, the packet is received at the packet processor, which could be packet processing block 304, for instance. Then, as indicated at step S-16, the packet processor determines whether the packet can be processed immediately or whether it needs to be delayed. As discussed in the examples above, this determination could be based on whether or not the packet was received in or out of sequence with respect to the original transmission order of packets or packet fragments in the communication. Since step S-14 also applies to packets that are returned to the packet processor following a DLP delay, the decision as to whether or not to delay a packet (step S-16) may also account for the number of DLP delays (if any) the packet has already experienced; this is further discussed below.
If a delay in processing is required, then the packet processor determines a value for the desired delay at step S-18. As discussed above, the desired delay could be based on one or more factors including, without limitation, packet type, transport protocol, application type, fragmentation, and the predicted time until the in-sequence counterpart arrives, among others. The desired delay could also take account of how many previous DLP delays (if any) the packet has already experienced. And again, the determination could additionally account for empirical observations related to the reasons for delay, such as the likelihood of an in-sequence counterpart packet arriving within the delay period of an out-of-sequence packet. Further, empirical or other analyses may indicate dynamical considerations that apply to the factors used to determine desired delay. For instance, the desired delay that applies to out-of-sequence TCP packets might vary with time of day. The scope of the present invention includes any reason for requiring a delay and any value of desired delay, and both may vary on a packet-to-packet basis.
At step S-20, the packet processor determines the path delay for each of the DLPs that are part of the system, and at step S-22 a DLP is selected according to the closest match of its respective path delay to the desired delay. Exemplary arrangements of DLPs and corresponding calculations of their respective path delays have been described above. Note that the exemplary method presented in the pseudo-code does not necessarily compute the path delay for each DLP, but rather traverses a list of DLPs and selects the first one that meets the selection criteria.
At step S-24, the packet is sent on the selected DLP. In the example of multiple queues described above, this action corresponds to placing the packet in the selected queue, possibly via a DMA transfer. The occupancy information for the DLPs is then updated at step S-26. In an exemplary embodiment, a dynamic table is used to track queue occupancy and current path delays. Thus, at step S-26, the table is updated so that the next time a path delay determination is made (i.e., step S-20), the updated table can be consulted.
Once the packet delay is complete, the occupancy information is again updated and the packet is returned to the packet processor (step S-28). For a returning packet the process proceeds again from step S-14, but the decision as to whether or not to impose a delay (step S-16) could have a different outcome for a given packet on its second or subsequent visits to the packet processor following a DLP delay. For example, the in-sequence packet (or packet fragment) for which the given packet was delayed may have arrived by the time the given packet returns to the packet processor. Similarly, a packet that is subjected to successive DLP delays may be given a different desired delay value for each one; for instance, each successive delay could be shorter than the previous one.
Referring again to step S-16, if the decision is made to not delay the packet, then the packet is processed directly, as indicated at step S-30. The specific processing will preferably correspond to the particular functions and/or tasks of the packet processing platform or system (e.g., an IPS device), and may further depend on whether or not the packet is newly arrived or returning from a DLP delay. For example, packet processing directly following the decision at step S-16 could apply to a newly arrived packet (i.e., from step S-12) that is received in sequence. Note that processing of a packet that arrives in sequence could proceed immediately, or the in-sequence packet could be held while its out-of-sequence counterpart (if any) completes its delay. Other reasons for directly processing a newly arrived packet are possible as well.
Alternatively, as discussed above, packet processing directly following step S-16 could also apply to a packet that has already completed one or more DLP delays. For instance, by the time a DLP delay of an out-of-sequence packet is complete, the packet's in-sequence counterpart may have arrived so that in-sequence processing may proceed. If instead, the in-sequence counterpart does not arrive by the time an out-of-sequence packet completes one or more DLP delays, the out-of-sequence packet could still proceed to processing at step S-30, but in this case processing may amount to discarding the out-of-sequence packet. Other actions are possible as well.
The process for a given packet completes at step S-32 (the possibility of infinitely looping a packet on DLP delays is assumed to be ruled out by some other aspect of the logic not specified in the figure). While step S-32 represents the end of processing for one packet (or an in-order sequence of packets), new packets may arrive according to step S-12, so that the overall process is continuous.
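By way of summary, and purely for illustration, the per-packet flow of the exemplary method (steps S-14 through S-32) may be sketched in C as follows; every function named below is a placeholder for logic described elsewhere in this description.

struct packet;   /* opaque packet handle */

/* Placeholders for logic described in the text. */
int    needs_delay(struct packet *pkt);        /* step S-16 decision            */
double desired_delay(struct packet *pkt);      /* step S-18                     */
int    select_dlp(double desired);             /* steps S-20 and S-22           */
void   send_to_dlp(struct packet *pkt, int i); /* step S-24 (e.g., DMA enqueue) */
void   update_occupancy(int i, int delta);     /* step S-26                     */
void   process_packet(struct packet *pkt);     /* step S-30                     */

void handle_packet(struct packet *pkt)         /* entered at step S-14 */
{
    if (!needs_delay(pkt)) {          /* in sequence, or delay already served */
        process_packet(pkt);          /* step S-30 */
        return;                       /* step S-32 */
    }
    double desired = desired_delay(pkt);
    int    dlp     = select_dlp(desired);
    send_to_dlp(pkt, dlp);
    update_occupancy(dlp, +1);
    /* When the delay completes, the DLP returns the packet (step S-28) and
     * handle_packet() is invoked again, beginning at step S-14. */
}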
It should be understood that the flowchart of
Exemplary Embodiment Based On a Field-Programmable Gate Array
As mentioned above, an important aspect of certain packet processing platforms or systems, such as IPS and other security devices, is speed. That is, monitoring every packet, e.g., via DPI, in all packet streams and connections that traverse a packet processing platform potentially introduces unwanted delay. Thus it is desirable to carry out packet-inspection operations (or other packet-processing operations) as rapidly as possible. One approach to achieving such processing speed is to implement processing in specialized hardware, such as application-specific integrated circuits (ASICs) and/or field-programmable gate arrays (FPGAs). In such a system, it is further advantageous to implement controlled delay of packet processing using multiple DLPs within the specialized hardware as well. An exemplary FPGA implementation of the present invention is described below.
A typical FPGA comprises a large number of low-level elements physically interconnected on an integrating platform. The low-level elements may each perform simple logical functions that, when sequenced and combined by way of commonly-distributed control signals and/or switched communications between elements, can accomplish complex, but specialized, tasks faster than general purpose processors executing programmed instructions. For instance, certain specialized computations or packet manipulations may be performed more quickly using customized operations of an FPGA than by a general purpose computer.
Modern FPGAs generally also include a number of components that perform higher-level functions that are either common among most general computing devices or at least among a class or classes of devices designed for similar or related purposes. For example, an FPGA device that receives and sends network packets might include an integrated network interface module. Similarly, general data storage may also be included as an integrated module or modules. Other examples are possible as well.
Processor block 506 represents one or more sub-components, such as low-level logic blocks, gates, and additional interconnecting data paths. As such, processor block 506 could support the type of customization and/or specialization that is the basis for the speed and efficiency characteristic of FPGAs. In particular, processor block 506 may carry out operations related to packet processing, including IPS operations, content monitoring and analysis, and DPI, among others.
Network interface module 504 preferably supports connections to one or more types of packet networks, such as Ethernet or Wireless Ethernet (IEEE 802.11), and may comprise one or more input packet buffers for receiving network packets and one or more output packet buffers for transmitting network packets. Further, network interface module 504 may also include hardware, software, and firmware to implement one or more network protocol stacks to support packet communications to and from the interface.
Data storage 508, which could comprise some form of solid state memory, for example, includes program data 510 and user data 512. Program data 510 could comprise program variables, parameters and executable instructions, for instance, while user data 512 could comprise intermediate results applicable to a particular packet stream or communication connection. Other examples are possible as well.
Packet queues 514 comprise one or more individual queues 516, 518, and 520, representing packet queues 1, 2, and N, respectively. As indicated by the horizontal ellipses, there may be additional packet queues between 2 and N, and further, no particular limitation is placed on N (other than that it represents an integer). Being implemented as a distinct component, packet queues 514 may support packet queue buffering, as well as queuing functionality (e.g., FIFO operation), independently of any packet processing that may be performed by processor block 506. Thus, the goal of minimal impact of DLP delay on packet processing resources is achieved.
As indicated, network interface module 504 and processor block 506 are communicatively coupled via data paths 521 and 523, which transfer data, respectively, to and from the processor block (from and to the network interface). Similarly, data storage 508 and processor block 506 are communicatively coupled via data paths 525 and 527, which also transfer data, respectively, to and from the processor block (from and to data storage). Each of queues 516, 518, and 520 is also communicatively coupled with processor block 506 via data path pairs (531, 533), (535, 537), and (541, 543), respectively. Each data path pair transfers data packets to/from the respective packet queue from/to the processor block.
The common clock signal 519 may be used to drive and/or synchronize various functions of the platform components. Each packet queue may receive its own instance of the clock signal, as indicated by the horizontal arrows directed toward packet queues 514. Each queue may then use the signal to control a polling cycle. For instance, considering again the example of 12 queues and polling cycles (e.g., N=12), a common clock signal could be delivered to each queue every 9.7 μs, corresponding to the shortest polling cycle. Thus every clock signal would trigger the start of a polling cycle for the 12th queue, every other clock signal for the 11th queue, and so on up to 2,048 clock signals for the first queue (i.e., 20 ms polling cycle). Thus by counting clock signals, for instance, each queue could determine when its own polling cycle begins. In this way, the queuing operations associated with controlled DLP-based delay are further isolated from packet processing operations of processor block 506. Note that 9.7 μs is exemplary of a clock period, and others are possible as well.
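A simple C-language sketch of such clock counting is given below for illustration; it centralizes the tick counter for brevity, whereas each queue in the embodiment described above may count its own instance of the clock signal, and all names are illustrative.

#define NUM_QUEUES 12

static unsigned long tick_count;   /* common clock ticks, e.g., one every 9.7 us */

static void poll_queue(int i)
{
    (void)i;  /* inspect queue i; remove the head packet, if any (not shown) */
}

/* Invoked on every common clock signal (e.g., from a hardware timer). Queue i
 * starts a polling cycle every 2^(NUM_QUEUES-1-i) ticks, so the first queue
 * polls every 2,048 ticks (about 20 ms) and the last queue polls on every tick. */
void on_clock_tick(void)
{
    tick_count++;
    for (int i = 0; i < NUM_QUEUES; i++) {
        unsigned long divisor = 1ul << (NUM_QUEUES - 1 - i);   /* 2048 ... 1 */
        if (tick_count % divisor == 0)
            poll_queue(i);   /* start of this queue's polling cycle */
    }
}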
In operation, packets will be received at packet processing platform 502 by network interface module 504, and will be transferred to processor block 506 via data path 521. If processor block 506 determines that a given packet needs to be delayed, for example because it arrived out of sequence, then a packet queue will be selected by matching a path delay to the desired delay for the packet, as described above (for example in connection with
The above description of an FPGA-based embodiment of multiple DLPs is exemplary, and details of FPGA operation and implementation have been simplified or omitted. It should be understood that any simplifications or omissions are not limiting with respect to the present invention.
An exemplary embodiment of the present invention has been described above. Those skilled in the art will understand, however, that changes and modifications may be made to the embodiment described without departing from the true scope and spirit of the invention, which is defined by the claims.