The present invention relates generally to packet processing performed by packet forwarding hardware. In particular, the present invention relates to a method for representing and controlling packet data flow through packet forwarding hardware.
Today, numerous independent hardware vendors (IHVs) produce networking application-specific integrated circuits (ASICs) to perform a myriad of packet processing tasks. The current interface to such ASICs is generally a set of memory-mapped registers that have corresponding bit-level behavior and documentation. However, not all IHVs limit their products to register-level descriptions. Some offer C-level or other software interfaces to the hardware, but usually these are merely a convenient reflection of the underlying registers and therefore differ from one IHV to another. These register-level models represent a steep learning curve and tight coupling for an original equipment manufacturer (OEM) or an independent software vendor (ISV) that desires to use the ASICs or networking silicon in a product. At such a micro level of description (i.e., the register bits), it is difficult to write code that is reusable across these various ASICs. It is also difficult to decipher the micro level functionality of the ASICs or networking silicon.
A patent issued to Narad et al. (U.S. Pat. No. 6,157,955), entitled “Packet Processing System Including A Policy Engine Having A Classification Unit,” describes a general purpose, programmable packet processing platform for accelerating network infrastructure applications, which have been structured to separate the stages of classification and action. Narad et al. thus attempts to describe a software model for programming packet data flow. The application programming interface (API) described in Narad et al. defines action/classification engines (ACEs), which form software objects that can be connected together to form a directed graph of data/packet flow. Packet flow, as described herein, refers to the path of a packet from its point of origination to its destination, including all intermediate nodes. However, ACEs have a high level of granularity because each ACE contains both a classification portion and an action portion. Furthermore, the ACE directed graph is not an abstraction of data flow. Rather than providing an abstraction of the underlying hardware that performs the packet processing, the ACE objects perform the packet processing at a software level. Unfortunately, performing packet processing at a software level sacrifices the performance provided by performing packet processing at a hardware level.
A recent trend in the networking industry is the replacement of ASICs, which are relatively inflexible, with more programmable but still performance-oriented network processors. Network processors are in their infancy, and many either do not have an abstract programming model or do not have one expressive and flexible enough to grow with advances in the processor itself. In both cases, the lack of a state-of-the-art programming model hinders both ISVs, who must write their own firmware to a moving API, and silicon vendors. ISVs and silicon vendors inevitably compete for inclusion in the designs of network devices of other network equipment companies.
Therefore, there remains a need to overcome one or more of the limitations in the above described existing art.
The features, aspects, and advantages of the present invention will become more fully apparent from the following detailed description and appended claims when taken in conjunction with accompanying drawings in which:
A method for representing and controlling packet data flow through packet forwarding hardware is described. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced without some of these specific details. The following description provides examples, and the accompanying drawings show various examples for the purposes of illustration. However, these examples should not be construed in a limiting sense as they are merely intended to provide examples of the present invention rather than to provide an exhaustive list of all possible implementations of the present invention. In other instances, well-known structures and devices are shown in block diagram form in order to avoid obscuring the details of the present invention.
In an embodiment, the steps of the present invention are embodied in machine-executable instructions. The instructions can be used to cause a general-purpose or special-purpose processor that is programmed with the instructions to perform the steps of the present invention. Alternatively, the steps of the present invention might be performed by specific hardware components that contain hardwired logic for performing the steps, or by any combination of programmed computer components and custom hardware components.
The present invention may be provided as a computer program product which may include a machine-readable medium having stored thereon instructions which may be used to program a computer (or other electronic devices) to perform a process according to the present invention. The machine-readable medium may include, but is not limited to, floppy diskettes, optical disks, CD-ROMs, magneto-optical disks, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of media/machine-readable media suitable for storing electronic instructions. Moreover, the present invention may also be downloaded as a computer program product, wherein the program may be transferred from a remote computer (e.g., a server) to a requesting computer (e.g., a client) by way of data signals embodied in a carrier wave or other propagation medium via a communication link (e.g., a modem or network connection).
System Architecture
Referring now to
As such, the conventional network provides firewall capabilities, as known to those skilled in the art, and intrusion detection capabilities, as known to those skilled in the art, using the firewall router 110 and the IDS router 130. Additionally, the conventional network 100 is configured as a virtual private network utilizing the VPN router 120. The various network switching/routing devices 110-130 are essentially fixed function ASIC devices or fixed function forwarding elements.
Referring now to
The present invention defines an object-oriented programming model appropriate for both ASIC-based networking silicon as well as network processors. This model obtains this range of expressiveness by identifying the fundamental units of packet processing performed by underlying hardware 176 (either ASIC or network processor). Software objects as described in further detail below, called Stages, are then created to encapsulate and represent these fundamental units of packet processing. At the first level of decomposition, specific types of stages including, for example, links, classifiers, editors, schedulers, queues, and monitors are formed. A link is a stage which represents a physical interface, including framing. A classifier stage represents a filtering or matching algorithm, while schedulers and queues can be combined to represent packet flow. On the other hand, monitor stages gather statistical information about packets and their data flows. The present invention also defines a meta stage or composition of stages such that the meta stage includes the same interface as the stage itself. This enables groups of stages to be treated as one large unit of packet processing.
Referring now to
The API model described by the present invention provides an object-oriented abstraction of forwarding plane packet processing capabilities. These capabilities include packet classification, forwarding, queueing, and transmit scheduling, which are abstracted into objects called Stages. Depending on the underlying hardware programmability, the API model can range from simply allowing a user to discover the static configuration of some Stages, to allowing arbitrary creation and interconnection of Stages. The API model provides a solution by abstracting the macro level functionality of network silicon ASICs. This enables firmware engineers to write reusable code. More particularly, it provides a common understanding of the functionality of the silicon. In other words, the API model provides a framework in which IHVs need write only the lower layers of the API model to map from object-oriented abstractions (i.e., Stages) into their registers.
Stages have three main attributes: a set of numbered inputs, a set of numbered outputs, and a set of named parameters. The API model enables the connection of the inputs and outputs of different Stages to form a data flow topology model of the underlying forwarding hardware. Each Stage has zero or more inputs and zero or more outputs as depicted in FIG. 3. The outputs (inputs) of one Stage are connected to inputs (outputs) of another Stage. These inputs and outputs represent both the packet data traversing the underlying forwarding engine hardware and a tag. This tag is associated with the packet data and carries the inter-Stage state. (Note, however, that the tag is not part of the packet data and is an addition to the packet data.) Some Stages pass the tag through, some read the tag, and others modify the outgoing tag. The parameters of a Stage, along with a few special (internal) synchronization objects, affect the behavior of the Stage as described in further detail below. Also, the parameters of a Stage are not accessed directly but indirectly via methods on the Stage, because parameters can provide asynchronous notification of changes in the underlying hardware via a callback mechanism.
Referring now to
Referring now to
Referring now to
As described above, each Stage, or software object, is designed to describe both ordering and functionality of the underlying forwarding hardware 360. The directed graph 330 is merely a representation of how the packet processing is done by the forwarding hardware. The packet data does not, itself, traverse the software objects of the directed graph 330. Rather, the packets still traverse the actual hardware, thus taking advantage of the performance innovations in the ASICs or network processors. Moreover, the various stage objects can be added or removed to add/remove functionality without affecting the underlying hardware. An API for describing directed graphs of software objects to perform data path packet processing functionality is now described.
Application Programming Interface
API Forwarding Hardware Engine Model Infrastructure
The following describes an application programming interface (API) for modeling underlying forwarding engine hardware using an object-oriented programming model that abstracts the fundamental units of packet processing performed by the hardware into software objects called stages. Those skilled in the art will appreciate that the following API merely represents one possible implementation for such an application programming interface. As a result, changes or modifications to the following API, including various additions or deletions of software object stages or various interconnections therewith to form data flow topologies, are within the scope and the contemplation of the present invention. In other words, the following API description should not be construed in a limiting sense, as this API is merely intended to provide an example of the present invention, rather than to provide an exhaustive list of all possible API implementations of the inventive techniques taught by the present invention. In the following class descriptions, some C++-like code is used. This code has been intentionally simplified for clarity. It has not been compiled, nor does it have sufficient error checking to be considered final.
In addition to these classes,
The Stage class is the central class in the Engine Model. It is the base class for virtually all other classes within this model. Its essential attributes are a set of numbered inputs, numbered outputs, and named parameters.
The inputs and outputs connect with other Stages' outputs and inputs respectively to form the topology (data flow) of the forwarding engine hardware. These inputs and outputs represent both a tag and the actual packet data. The tag represents inter-Stage state. The Engine Model programmer specifies the actual value of the tag (see Classifier, below).
Assuming the underlying forwarding hardware is not fixed function, the topology or data flow is dynamic, i.e., the Engine Model user can establish connectivity at runtime. This approach provides a more expressive, powerful model. If the underlying forwarding engine does not support such dynamic reconfiguration, the connect method can be implemented to unconditionally throw an exception.
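The connection behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the API of the present invention: the method names Connect and Next, the FixedStage subclass, and the ConnectThrows helper are hypothetical and are introduced only to show numbered inputs and outputs being joined at runtime, and a fixed-function Stage rejecting reconfiguration by throwing an exception.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified Stage: only the topology (numbered inputs and
// outputs) is modeled. Parameters and tags are omitted.
class Stage {
public:
    Stage(std::string name, std::size_t numInputs, std::size_t numOutputs)
        : name_(std::move(name)), inputs_(numInputs), outputs_(numOutputs) {}
    virtual ~Stage() = default;

    // Connects output `outNum` of this Stage to input `inNum` of `next`.
    virtual void Connect(std::size_t outNum, Stage& next, std::size_t inNum) {
        if (outNum >= outputs_.size() || inNum >= next.inputs_.size())
            throw std::out_of_range("no such input or output number");
        outputs_[outNum] = Endpoint{&next, inNum};
        next.inputs_[inNum] = Endpoint{this, outNum};
    }

    // Returns the Stage attached to output `outNum`, or nullptr if unconnected.
    Stage* Next(std::size_t outNum) const { return outputs_.at(outNum).peer; }
    const std::string& Name() const { return name_; }

private:
    struct Endpoint {
        Stage* peer;         // nullptr while unconnected
        std::size_t number;  // input or output number on the peer
    };
    std::string name_;
    std::vector<Endpoint> inputs_;
    std::vector<Endpoint> outputs_;
};

// Hypothetical Stage backed by fixed-function hardware: its topology may be
// discovered, but any attempt to change it throws unconditionally.
class FixedStage : public Stage {
public:
    using Stage::Stage;
    void Connect(std::size_t, Stage&, std::size_t) override {
        throw std::logic_error("forwarding hardware does not support reconfiguration");
    }
};

// Helper for callers that want to probe whether a connection is permitted.
inline bool ConnectThrows(Stage& from, Stage& to) {
    try {
        from.Connect(0, to, 0);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

In this sketch the connection is only bookkeeping in the software model; no packet data passes through the objects.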
The parameters, which are synchronized and double-buffered, provide a mechanism by which the internal behavior of a Stage is controlled. Synchronized and double-buffered is akin to a two-phase commit database transaction. See the Stage::Parameter class for more details. An external Engine Model user can register an asynchronous callback for each parameter of a Stage. Whenever the underlying forwarding engine changes the value of a parameter, the corresponding, registered callback is invoked.
Related Types:
Methods
The following methods are defined.
Example:
Refer to
Stage::Parameter Class
The Stage::Parameter class 410 represents a Stage's parameter. The motivation for this simple class is to encapsulate the notion of double buffering (or two-phase commit). As described with reference to
Parameters can also conveniently capture the asynchronous changes in the underlying forwarding engine hardware. To enable this, every Parameter can be given a list of callbacks to invoke when the parameter changes. The actual registration mechanism is exposed through the Stage::Register and Stage::DeRegister methods.
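A double-buffered parameter with change callbacks might look like the following sketch. The member names Set, Get, Commit, Rollback, and NotifyHardwareChange are assumptions made for illustration, and for brevity the callback registration is placed on the parameter itself, whereas the text above exposes it through Stage::Register and Stage::DeRegister.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical double-buffered parameter: writes are staged in `pending_`
// and only take effect when Commit() is called (two-phase commit).
template <typename T>
class Parameter {
public:
    using Callback = std::function<void(const T&)>;

    explicit Parameter(T initial)
        : committed_(initial), pending_(std::move(initial)) {}

    // Phase one: stage a new value without touching the committed one.
    void Set(T value) {
        pending_ = std::move(value);
        dirty_ = true;
    }

    // Returns the value currently in effect.
    const T& Get() const { return committed_; }
    bool IsDirty() const { return dirty_; }

    // Phase two: make the staged value the current one.
    void Commit() {
        if (!dirty_) return;
        committed_ = pending_;
        dirty_ = false;
    }

    // Discards a staged value that was never committed.
    void Rollback() {
        pending_ = committed_;
        dirty_ = false;
    }

    // Adds a callback to invoke when the hardware changes the value.
    void Register(Callback cb) { callbacks_.push_back(std::move(cb)); }

    // Assumed to be called by the hardware-specific layer when the
    // forwarding engine itself changes the value.
    void NotifyHardwareChange(T value) {
        committed_ = value;
        pending_ = std::move(value);
        dirty_ = false;
        for (const auto& cb : callbacks_) cb(committed_);
    }

private:
    T committed_;
    T pending_;
    bool dirty_ = false;
    std::vector<Callback> callbacks_;
};
```

Keeping the staged and committed values separate lets several parameters be changed together and then pushed to the hardware in one step.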
Related Types:
See Stage
Methods
The following methods are defined.
The EngineGraphManager 452, as depicted with reference to
The EngineGraphManager 452 exposes the possible interconnections of the Stages. Each Stage also holds an attribute optional which is true if the Stage can be bypassed. Intra-Stage parameter updating constraints are taken care of by providing read only attributes (i.e. only Get methods) for the corresponding parameters (e.g., in
The other way is to have a mutable attribute for any connection. This has the advantage that a meta-Stage (composite Stage) defined at runtime can be installed and removed as desired. (This can also be done using the isoptional attribute in Stage, but then that meta-Stage has to be defined at compile time.) In this case, to represent an optional Stage, the outconnection of the previous Stage and the inconnection of the following Stage would have to be made mutable.
Note: The capabilities of an FE related to links are in the Link class; other capabilities, such as the ability to do certain types of filtering, can in some sense be represented in the interconnection of Stages. To represent a specific type, that specific Stage (e.g., a Five Tuple Classifier) is used in the interconnected model, whereas to represent a general type of classification, the Classifier Stage is used in the model, and the FEAPI model user can then model that Stage to be the specific type of classifier it wants.
Related Types:
See also LinksContainer.
Methods
The following methods are defined.
This class holds a list of the Links (terminal Stages) in the data-flow topology. The class provides access to an immutable iterator over these Links.
Related Types:
See also EngineGraphManager.
Attributes
Methods
The following methods are defined.
Example:
See EngineGraphManager.
Stage Types
Specific types of Stages are defined in this section. These build on the infrastructure of the directed graph of packet flow 250 depicted in FIG. 4. For each new type of Stage, the number of inputs and outputs is specified. The parameters (or adjustments) of the particular Stage are described in terms of operations on the Stage.
Multiplexing and Demultiplexing Types
Scatterer Class
A scatterer 534, as depicted in
This class is essential to representing many of the parallel operations of the underlying forwarding-engine hardware. For example, it would allow a coprocessor to be switched on and off, or a remote monitoring agent to gather statistics in parallel with standard forwarding Stages.
Related Types:
Methods
The following methods are defined.
Example:
A Gatherer class 540, as depicted in
As with Scatterer, no data actually passes through the Gatherer object, which is merely a representation of the underlying hardware.
Stage Parameters
Methods
The following methods are defined.
Example:
Related Types:
See also Scatterer.
A SwitchFabric class 550, as depicted in
Related Types:
See also Classifier
Methods
The following methods are defined.
Other Stage Types
Additional Stage classes 560, are depicted in
Methods
The following Methods are defined.
Example:
Link Types
The Link will be further specialized into Ethernet links, ATM links, etc.
The Queue class type 556, as depicted in
Methods
The following methods are defined.
A Classifier class 580, as depicted in
The following are characteristics of the Classifier interface:
The following pseudo-code illustrates how a Classifier 602 should work, as depicted with reference to FIG. 18:
In this example, the pattern 616, inTag 618, and outTag 619 are specified by each entry in the patternTable 604.
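The matching semantics implied by a table of (pattern, inTag, outTag) entries can be sketched as follows. This is only an assumed reading: the first-match ordering, the representation of a pattern as a predicate, and the Packet fields are hypothetical. The sketch illustrates the behavior the underlying hardware is modeled as having; it is not meant to suggest that packets are classified in software.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical packet summary; real hardware would examine header fields.
struct Packet {
    std::uint32_t srcIp;
    std::uint32_t dstIp;
    std::uint16_t dstPort;
};

using Tag = std::uint32_t;

// One entry of the pattern table: a pattern to match, the incoming tag it
// applies to, and the tag to attach on a match.
struct PatternEntry {
    std::function<bool(const Packet&)> pattern;
    Tag inTag;
    Tag outTag;
};

// Returns true and sets `outTag` when some entry applies to the incoming
// tag and its pattern matches the packet. Assumes first match wins.
inline bool Classify(const std::vector<PatternEntry>& patternTable,
                     const Packet& pkt, Tag inTag, Tag& outTag) {
    for (const auto& entry : patternTable) {
        if (entry.inTag == inTag && entry.pattern(pkt)) {
            outTag = entry.outTag;
            return true;
        }
    }
    return false;
}
```

The outgoing tag is what later Stages, such as an Editor or a switch fabric, would read to decide what to do with the packet.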
Finally, the abstract Classifier class 602 overrides the UpdateHardware method of the Stage class. This is done so that each specific Classifier implementation does not have to implement UpdateHardware. Rather, Classifier takes over this responsibility and calls three methods: Hw_DeleteEntry( ), Hw_AddEntry( ), and Hw_ModifyEntry( ). Classifier::UpdateHardware( ) calls the Hw_xxxEntry( ) methods for only those entries in the action table that have changed. This can potentially reduce the number of hardware updates.
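One way to call the hardware hooks only for changed entries is to compare the staged table against the last committed one, as in the sketch below. The method names Hw_DeleteEntry, Hw_AddEntry, and Hw_ModifyEntry come from the text; everything else (integer entry identifiers, string patterns, SetEntry, RemoveEntry, and the RecordingClassifier test double) is an assumption made for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical abstract classifier that diffs its staged table against the
// committed table and pushes only the differences to the hardware hooks.
class Classifier {
public:
    virtual ~Classifier() = default;

    void SetEntry(int id, std::string pattern) { pending_[id] = std::move(pattern); }
    void RemoveEntry(int id) { pending_.erase(id); }

    void UpdateHardware() {
        // Entries present before but no longer staged are deleted.
        for (const auto& kv : committed_)
            if (pending_.find(kv.first) == pending_.end()) Hw_DeleteEntry(kv.first);
        // New entries are added; entries whose contents differ are modified.
        for (const auto& kv : pending_) {
            auto it = committed_.find(kv.first);
            if (it == committed_.end())
                Hw_AddEntry(kv.first, kv.second);
            else if (it->second != kv.second)
                Hw_ModifyEntry(kv.first, kv.second);
        }
        committed_ = pending_;
    }

protected:
    virtual void Hw_DeleteEntry(int id) = 0;
    virtual void Hw_AddEntry(int id, const std::string& pattern) = 0;
    virtual void Hw_ModifyEntry(int id, const std::string& pattern) = 0;

private:
    std::map<int, std::string> committed_;  // what the hardware currently holds
    std::map<int, std::string> pending_;    // what the user has staged
};

// Stand-in for a hardware-specific subclass: records calls instead of
// writing registers.
class RecordingClassifier : public Classifier {
public:
    std::vector<std::string> log;

protected:
    void Hw_DeleteEntry(int id) override { log.push_back("delete " + std::to_string(id)); }
    void Hw_AddEntry(int id, const std::string&) override { log.push_back("add " + std::to_string(id)); }
    void Hw_ModifyEntry(int id, const std::string&) override { log.push_back("modify " + std::to_string(id)); }
};
```

A second call to UpdateHardware with no intervening changes produces no hardware calls in this sketch, which is the reduction in updates the text refers to.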
Methods
The following methods are defined.
Example:
IPFiveTupleClassifier
Qualifiers: None.
An IPFiveTupleClassifier is a specific type of Classifier whose patterns are standard IP five-tuples.
Example:
IPv4Classifier
The IPv4Classifier is a specific type of Classifier whose filters are routing entries. Such a Classifier is used to perform the lookup phase of IP routing.
Example:
Editor Type
An Editor class 620, as depicted in
The value in the Editor's action table that corresponds to an incoming tag specifies how to modify the current packet for that tag and on which connection to output. Editors do not modify the outgoing tag. The concrete subclasses of Editor define the format of the action table.
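The action table lookup described above can be sketched as follows, assuming a table keyed by incoming tag whose values hold an edit operation and an output connection number. The type names, the Packet fields, and the ApplyEditor function are hypothetical; concrete Editor subclasses define the real action table format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>

using Tag = std::uint32_t;

// Hypothetical packet fields that an editor might rewrite.
struct Packet {
    std::uint8_t ttl;
    std::uint8_t dscp;
};

// One action table value: how to modify the packet, and on which numbered
// output connection to emit it.
struct EditAction {
    std::function<void(Packet&)> edit;
    std::size_t output;
};

struct EditResult {
    bool matched;        // false if the tag has no action table entry
    std::size_t output;  // output connection chosen by the entry
    Tag tag;             // outgoing tag, always equal to the incoming tag
};

// Looks up the incoming tag, applies the edit, and reports the output
// connection. The tag is passed through unchanged.
inline EditResult ApplyEditor(const std::map<Tag, EditAction>& actionTable,
                              Packet& pkt, Tag tag) {
    auto it = actionTable.find(tag);
    if (it == actionTable.end()) return EditResult{false, 0, tag};
    it->second.edit(pkt);
    return EditResult{true, it->second.output, tag};
}
```

As with the other Stages, this models what the hardware does for a given tag; it is not intended as a software data path.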
Finally, like Classifier, the abstract Editor class overrides the UpdateHardware method of the Stage class. This is done for the same reason, and with the same mechanism, as Classifier. See Classifier for details.
Related Types:
Methods
The following methods are defined.
Example:
See Classifier.
A Scheduler class 630, as depicted in
Example:
This is a concrete Stage that is empty, essentially a null Stage. It only holds on to a monitoring object, so it can be used as a logging Stage. A logger can have either zero or one output. If it has no outputs, it passively monitors the packets, possibly in parallel with other operations. If it has one output, it passes the packet, without modification, to that output.
The monitor class 650, as depicted in
Methods
The following methods are defined.
Examples:
A monitor held by a link can be queried for packets/bytes received and transmitted.
A monitor held by a Queue can be used to query packets received, as well as packets pulled from the queue, and this can be used to calculate the number of packets dropped.
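The drop calculation for a Queue monitor can be written as a small helper. The counter names below are hypothetical, and the packetsQueued field (packets still waiting in the queue) is an added assumption: the text mentions only the received and pulled counts, which give the drop count directly when the queue is empty.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical snapshot of the counters a Queue's monitor might expose.
struct QueueCounters {
    std::uint64_t packetsReceived;  // packets offered to the queue
    std::uint64_t packetsPulled;    // packets removed by the next Stage
    std::uint64_t packetsQueued;    // packets still waiting (assumed counter)
};

// dropped = received - pulled - queued, clamped at zero so that counters
// read at slightly different instants cannot underflow.
inline std::uint64_t PacketsDropped(const QueueCounters& c) {
    const std::uint64_t accounted = c.packetsPulled + c.packetsQueued;
    return c.packetsReceived > accounted ? c.packetsReceived - accounted : 0;
}
```

For example, 1000 packets received, 940 pulled, and 10 still queued would indicate 50 drops.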
A ClassifierMonitor class 656, as depicted in
(Or since the map in classifier is ordered, we could have just a patternid/position as input, and get statistics for it; this would be more generic, but with less intuition about what the pattern actually is.)
Attributes
Methods
The following methods are defined.
Composition of Stages
Individual Stages can be combined to form more complex Stages, as depicted in
Classifier with a Switchfabric
A classifier 676 with a switch fabric 680 is described with reference to
MultiClassifier
A multiclassifier 690, as described with reference to
Link Aggregation
We can use composition of Stages to represent link aggregation too. Thus an output aggregate link 704 (704-1, . . . , 704-n) can be represented as depicted in FIG. 23. The other approach for link aggregation would be to use the composite design pattern. This would allow us to present the same interface for single as well as aggregate links. This would mean adding methods in Link class for adding/deleting links. Users then would always access links through the Link Interface, which could contain single or multiple Ethernet (or possibly other) links. It would internally contain the scheduler if it had more than one physical link.
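The composite design pattern alternative can be sketched as follows. All names here (PhysicalLinkCount, BandwidthMbps, AddLink, RemoveLink, NeedsScheduler) are hypothetical. The text suggests placing the add/delete methods on the Link class itself; this sketch places them on an aggregate subclass to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical Link interface shared by single and aggregate links.
class Link {
public:
    virtual ~Link() = default;
    virtual std::size_t PhysicalLinkCount() const = 0;
    virtual std::uint64_t BandwidthMbps() const = 0;
};

// A single physical Ethernet link.
class EthernetLink : public Link {
public:
    explicit EthernetLink(std::uint64_t mbps) : mbps_(mbps) {}
    std::size_t PhysicalLinkCount() const override { return 1; }
    std::uint64_t BandwidthMbps() const override { return mbps_; }

private:
    std::uint64_t mbps_;
};

// An aggregate that presents the same interface as a single link.
class AggregateLink : public Link {
public:
    void AddLink(std::unique_ptr<Link> link) { members_.push_back(std::move(link)); }

    // Removes the member at `index`; returns false if there is no such member.
    bool RemoveLink(std::size_t index) {
        if (index >= members_.size()) return false;
        members_.erase(members_.begin() + static_cast<std::ptrdiff_t>(index));
        return true;
    }

    std::size_t PhysicalLinkCount() const override {
        std::size_t n = 0;
        for (const auto& m : members_) n += m->PhysicalLinkCount();
        return n;
    }

    std::uint64_t BandwidthMbps() const override {
        std::uint64_t total = 0;
        for (const auto& m : members_) total += m->BandwidthMbps();
        return total;
    }

    // An internal scheduler is needed only with more than one physical link.
    bool NeedsScheduler() const { return PhysicalLinkCount() > 1; }

private:
    std::vector<std::unique_ptr<Link>> members_;
};
```

Code written against the Link interface works unchanged whether it is handed a single EthernetLink or an AggregateLink.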
It is to be understood that even though numerous characteristics and advantages of various embodiments of the present invention have been set forth in the foregoing description, together with details of the structure and function of various embodiments of the invention, this disclosure is illustrative only. Changes may be made in detail, especially in matters of class structure and management of objects to form directed graphs of data/packet flow, within the principles of the present invention to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed. For example, the particular API forwarding engine and underlying forwarding engine hardware may vary depending on the particular application for the novel object abstraction and identical API model while maintaining substantially the same functionality without departing from the scope and spirit of the present invention.
In addition, although an embodiment described herein is directed to a Forwarding Engine API, it will be appreciated by those skilled in the art that the teaching of the present invention can be applied to other systems. In fact, systems for network processing, including ASICs as well as network processors programmed using an API model as described herein, are within the teachings of the present invention, without departing from the scope and spirit of the present invention.
The present invention provides many advantages over known techniques. The present invention provides a unified software programming model that is expressive enough to represent both fixed-function hardware and programmable network processors. This invention is a large step up from the register-level programming models offered by current ASICs because it is a state-of-the-art object-oriented programming model. This invention is a unifying technology for current network processors, which are typically in their infancy in terms of programming model. A key advantage of this approach for network processors is that it does not sacrifice performance by actually executing the packet processing in software; rather, it abstracts the capabilities of the network processor but does not emulate them. In either case, this invention benefits both ISVs, by providing a high-level programming model that does not sacrifice performance, and the silicon vendors themselves, by providing an API that can grow with the evolution of their hardware while not alienating their ISV partners.
Having disclosed exemplary embodiments, modifications and variations may be made to the disclosed embodiments while remaining within the scope of the invention as defined by the following claims.
Number | Name | Date | Kind |
---|---|---|---|
5509123 | Dobbins et al. | Apr 1996 | A |
6157955 | Narad et al. | Dec 2000 | A |
6526062 | Milliken et al. | Feb 2003 | B1 |
6594268 | Aukia et al. | Jul 2003 | B1 |
6675218 | Mahler et al. | Jan 2004 | B1 |
6754219 | Cain et al. | Jun 2004 | B1 |
20020131364 | Virtanen et al. | Sep 2002 | A1 |
20030227871 | Hsu et al. | Dec 2003 | A1 |
20040032829 | Bonn | Feb 2004 | A1 |
Number | Date | Country | |
---|---|---|---|
20020126621 A1 | Sep 2002 | US |