System and method for router virtual networking

Information

  • Patent Grant
  • 8010637
  • Patent Number
    8,010,637
  • Date Filed
    Monday, April 26, 2010
  • Date Issued
    Tuesday, August 30, 2011
Abstract
A host router is logically partitioned into virtual router domains that manage independent processes and routing application copies but share a common operating system. Each v-net manages an independent set of sockets and host router interfaces, each associated with only one v-net at one time, but interchangeably repartitionable. Traffic is removed from an interface during repartitioning. Duplicate arrays of global variables copied to each v-net are accessed by macro references. A v-net facility can separate route tables used internally from the externally visible route tables and can avoid conflicts between internal and external IP addresses that share the same identifier. For example, a common FreeBSD operating system supports a dynamic routing protocol (DRP) application. Each v-net runs an independent copy of the DRP software and is logically independent. A failure in one DRP copy does not adversely affect other copies.
Description
TECHNICAL FIELD

This application relates to the field of communication networks, and particularly to large-scale routers for optical communication networks.


BACKGROUND

Transmission Control Protocol (TCP) is an underlying connection protocol that is typically used for many types of network communication. A route is essentially the mapping of an IP address to an egress port of a router. Network routers set up connections with their peer routers using routing protocols, for example Border Gateway Protocol (BGP) over TCP or Open Shortest Path First (OSPF) over Internet Protocol (IP), to obtain route information from their peers. This information allows each router to construct an internal map of the network, to select the routes that it should use, and to verify that its peers are operating correctly. Verification is accomplished by exchanging keep-alive packets with the peers. Routes are also used internally within a router; for example, a Master Control Processor (MCP) communicates through an Ethernet control network (CNET) within a router with the shelf control processors, each of which has an individual IP address. Processes, including routing applications such as Dynamic Routing Protocol (DRP), run on the router's operating system. Sockets are end points of communication associated with a process. A particular process can have more than one socket.


In a router with a large number of ports, for example 320 ports, that communicates with peer routers, it is advantageous to subdivide that single large router logically into several smaller virtual routers, each of which can be individually configured. There can be separate departments in a large company, or an Internet provider wanting to partition a large router among clients, for example for security reasons. However, previous implementations of subdividing routers having large numbers of ports have been cumbersome.


SUMMARY OF THE INVENTION

The present invention is directed to a system and method which logically partition a host router into virtual router domains that run independent processes and routing application copies but share a common operating system. Each v-net domain manages an independent set of interface ports. Each process manages an independent set of sockets.


In some embodiments a v-net domain architecture is used to partition a host router. Some v-net domains support virtual routers, whereas other v-net domains support only internal router processes and management applications. Thus, not every v-net domain supports a virtual router. A single v-net domain can support more than one process. A v-net facility can advantageously separate route tables used internally from the externally visible routes, making network management easier and more transparent. With separate v-net domains, for example, the IP address of an internal shelf control processor does not conflict with the same IP address that is assigned elsewhere on the Internet. In a v-net implementation, duplicate arrays of global variables are instantiated in each virtual router domain and are accessed by macro references.


A common FreeBSD operating system running on the MCP supports a dynamic routing protocol (DRP) application. Each new virtual router is independently managed by its own copy of the DRP application for as many virtual routers as exist. If something goes awry in one DRP copy, it does not affect other copies. Each v-net domain manages a separate set of the interfaces associated with the host router, which provide connections to peer routers. For example, if a host router has 320 ports, one v-net domain can manage 120 ports or interfaces, and another v-net domain can manage another 120 ports. All of these ports and interfaces can be interchangeably partitioned. For each Synchronous Optical Network (SONET) port on a line card, there is an interface (IF) data structure in FreeBSD that represents that SONET port. Any interface can be associated with only one v-net at one time, but can be moved among v-nets to reconfigure the host router. Traffic is removed from an interface while it is being moved. At a high level the host router is partitioned, and each partition normally is managed by an independent copy of the DRP software. In an administrative sense, each of these partitions is logically independent.


Certain activities are still managed across the entire host router, for example failure reporting of hardware in the host router, which is machine specific, and therefore is a resource shared by all of the partitions.


This partitioning also allows the routes between the individual components such as the line cards and processors internal to a router to be contained in route tables separate from externally visible routes. Partitioning the router also facilitates testing, such that one partition might be used for normal network traffic and another might be used to test for example new software or new network configurations for new types of protocols. Additionally, a degree of redundancy is achieved, such that failure of one partition generally does not adversely affect another partition sharing the same host router.


Various aspects of the invention are described in co-pending and commonly assigned U.S. application Ser. No. 09/703,057, entitled “System And Method For IP Router With an Optical Core,” filed Oct. 31, 2000, the disclosure of which has been incorporated herein by reference.


The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims. The novel features which are believed to be characteristic of the invention, both as to its organization and method of operation, together with further objects and advantages will be better understood from the following description when considered in connection with the accompanying figures. It is to be expressly understood, however, that each of the figures is provided for the purpose of illustration and description only and is not intended as a definition of the limits of the present invention.





BRIEF DESCRIPTION OF THE DRAWING

For a more complete understanding of the present invention, reference is now made to the following descriptions taken in conjunction with the accompanying drawing, in which:



FIG. 1 is a logical diagram illustrating the principles of router virtual networking, according to an embodiment of the present invention.





DETAILED DESCRIPTION

In embodiments of the present invention, a host network router is logically partitioned into multiple virtual networking domains sharing a common operating system. FIG. 1 is a logical diagram illustrating the principles of router virtual networking, according to an embodiment of the present invention. In the implementation of FIG. 1, a host router 10 is logically partitioned into v-net domains 12, 14, and 16 that are associated with networking systems. Each v-net 12, 14, 16 has a unique v-net ID address 13, 15, 17, in accordance with network protocols. Host router 10 and each of v-nets 12, 14, 16 are further logically subdivided into two spaces, shown in FIG. 1 separated horizontally by a solid line, namely a user level 18 and a kernel level 20 of the shared common operating system (OS), for example a version of FreeBSD. The present FreeBSD operating system runs on the host router Master Control Processor (MCP), described for example in U.S. application Ser. No. 09/703,057, entitled “System And Method For IP Router With an Optical Core,” filed Oct. 31, 2000, cited above, the disclosure of which has been incorporated herein by reference, and the dynamic routing protocol (DRP) application software runs on top of FreeBSD.


An operating system contains within it logical notions called processes 22-26, for example Internet Management Application 22, DRP 23, 25, or Simple Network Management Protocol (SNMP) agent application 24, 26, running on v-nets 12, 14, and 16. Different individual v-nets can manage the same, different, single, or multiple processes. V-net domains 14 and 16, each running DRP and SNMP processes, are virtual routers, whereas v-net domain 12, running only an internal management application, is not a virtual router. The present FreeBSD operating system supports multiple processes, among which are DRP 23, 25, SNMP 24, 26, and Internal Management Application 22. Each process occupies some user level space 18 and also some operating system kernel level space 20. User level space 18 includes the application and the values of all the application variables (not shown in FIG. 1), whereas OS or kernel level space 20 of the process includes internal data that the kernel maintains with each process. Typical examples of internal kernel data include descriptors or descriptions of open files and the ID of the user that owns the process, attributes that are added to each process associated with a particular v-net.


Among other things associated with a particular v-net are interfaces, for example interfaces 42-1 through 42-3 associated with v-net 12. An interface represents for example a particular physical hardware Ethernet card, gigabit Ethernet card, or SONET line card interconnected with a remote router. This allows partitioning of host router interfaces, such that for example interfaces 42-1 through 42-3 contain v-net ID 13 of v-net 12 with which they are associated. V-net domain 12 maintains an interface list 42-0 pointing to interfaces 42-1 through 42-3. Similarly v-net domain 14 maintains an interface list 43-0 pointing to interfaces 43-1 through 43-3 carrying v-net ID 15 of v-net domain 14, and v-net domain 16 maintains an interface list 45-0 pointing to interfaces 45-1 through 45-3 carrying v-net ID 17 of v-net domain 16.
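The ownership rule for interfaces can be illustrated with a minimal sketch in C. It assumes a hypothetical record type if_rec, a fixed table if_table, and the constants NVNET and NIF, none of which appear in the source; the FreeBSD interface structure is considerably richer. The sketch only shows that each interface carries exactly one v-net ID and that traffic is taken off the interface while it is reassigned.

```c
#include <assert.h>
#include <stddef.h>

#define NVNET 3   /* assumed number of v-net domains */
#define NIF   6   /* assumed number of host router interfaces */

/* Hypothetical interface record, not the FreeBSD IF structure. */
struct if_rec {
    int vnet_id;          /* the one v-net that currently owns this interface */
    int carrying_traffic; /* nonzero while the interface forwards traffic */
};

/* Zero-initialized: every interface starts in v-net 0 with traffic off. */
static struct if_rec if_table[NIF];

/* Move an interface to another v-net, removing traffic for the duration. */
void if_move(size_t ifidx, int new_vnet)
{
    assert(ifidx < NIF);
    assert(new_vnet >= 0 && new_vnet < NVNET);
    if_table[ifidx].carrying_traffic = 0;   /* quiesce before repartitioning */
    if_table[ifidx].vnet_id = new_vnet;     /* exactly one owner at any time */
    if_table[ifidx].carrying_traffic = 1;   /* resume in the new domain */
}

/* Count the interfaces that a given v-net currently owns. */
size_t vnet_if_count(int vnet)
{
    size_t n = 0;
    assert(vnet >= 0 && vnet < NVNET);
    for (size_t i = 0; i < NIF; i++)
        if (if_table[i].vnet_id == vnet)
            n++;
    return n;
}
```

Because the v-net ID is a single field of the interface record, an interface cannot belong to two domains at once; a per-domain interface list such as 42-0 could be derived from, or kept in step with, that field.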


Each process 22-26 can create sockets, which are end points of communication associated with a process, for example sockets 32-1 through 32-3 associated with process 22 in v-net domain 12. A particular process can have more than one socket. Each socket has a v-net ID associated with it, for example sockets 32-1 through 32-3 each contain v-net ID 13 of v-net 12. In v-net 12, management application 22 maintains a descriptor table, for example file descriptor table 32-0 of v-net 12, holding references to sockets 32-1 through 32-3 and to files, which are each associated with specific application 22. Similarly, in v-net 14, DRP application 23 maintains descriptor table 33-0, holding references to sockets 33-1 through 33-3 and to files associated with application 23, and SNMP application 24 maintains descriptor table 34-0 holding references to sockets 34-1 through 34-3 and to files associated with application 24. Likewise in v-net 16, DRP application 25 maintains descriptor table 35-0, holding references to sockets 35-1 through 35-3 and to files associated with application 25, and SNMP application 26 maintains descriptor table 36-0 holding references to sockets 36-1 through 36-3 and to files associated with application 26.


Sockets are partitioned according to the domain in which communication takes place. Each operation on a socket is interpreted in the context of the particular v-net in which the socket was created, and therefore the socket carries that v-net's identifier. A process also has a v-net identifier, because each new socket that the process creates is created in the v-net named by the process's current identifier. For example, if a process associated with v-net 0 creates a socket, then that socket is automatically associated with v-net 0, gets its routing tables from v-net 0, and can then use all of the interfaces that are assigned to v-net 0. A process can, however, change its v-net identifier and thereby its v-net association, for example by moving logically from v-net 0 to v-net 1, and can then create a new socket associated with v-net 1, which uses the routing tables and interfaces of v-net 1, which are disjoint from the interfaces of v-net 0.


Once a socket is created, it cannot be moved to another v-net, but remains in the domain in which it was created. However, a process, by changing its v-net identifier, can then create sockets in multiple domains. Consequently, a process can essentially communicate across domains by creating a socket in each one, but each socket, throughout its existence, is fixed in its original domain. Multiple sockets created by a process are distinctly different from a single socket that is simply interpreted in different ways. For example a single process can create ten distinct sockets in one domain and five distinct sockets in another domain. For example, socket 35-4 is created in v-net domain 12 by DRP application 25 and carries v-net ID 13, although socket 35-4 is referenced in descriptor table 35-0 of DRP application 25, which is now in v-net domain 16. Likewise, socket 33-4 is created in v-net domain 12 by DRP application 23 and thus carries v-net ID 13, although socket 33-4 is referenced in descriptor table 33-0, which is now in v-net domain 14. A socket is destroyed when a process exits or when a process closes down the communication end point represented by that socket. After a socket is destroyed, it is no longer associated with any domain, and the memory associated with it is freed.
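The socket rules described above can be sketched in C as follows. The types vproc and vsocket, the fixed socket_pool, and all function names are hypothetical stand-ins introduced for illustration; they are not the kernel structures of the source. The sketch shows a socket inheriting the v-net ID of its creating process, the process changing its own association afterward, and the earlier socket staying in its original domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's process and socket records. */
struct vproc {
    int vnet_id;   /* v-net the process is currently associated with */
};

struct vsocket {
    int vnet_id;   /* fixed at creation; no function here ever changes it */
    int in_use;    /* nonzero while the socket exists */
};

#define MAX_SOCKETS 16   /* assumed pool size */
static struct vsocket socket_pool[MAX_SOCKETS];

/* Create a socket in whatever v-net the calling process currently belongs to. */
struct vsocket *vsocket_create(const struct vproc *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < MAX_SOCKETS; i++) {
        if (!socket_pool[i].in_use) {
            socket_pool[i].in_use = 1;
            socket_pool[i].vnet_id = p->vnet_id;
            return &socket_pool[i];
        }
    }
    return NULL;   /* pool exhausted */
}

/* A process may change its own association; its existing sockets are untouched. */
void vproc_set_vnet(struct vproc *p, int vnet_id)
{
    assert(p != NULL);
    p->vnet_id = vnet_id;
}

/* Destroying a socket releases its slot; it no longer belongs to any domain. */
void vsocket_destroy(struct vsocket *s)
{
    assert(s != NULL);
    s->in_use = 0;
    s->vnet_id = -1;
}
```

A process that calls vsocket_create, then vproc_set_vnet, then vsocket_create again ends up holding two sockets in two different domains, which corresponds to the cross-domain case of sockets 33-4 and 35-4.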


If for example v-net 14 and v-net 16 are two networking domains of host router 10, and if v-net 14 is a production network carrying live traffic with production code in it, or production network connections carrying real customer traffic, then a socket associated with v-net 14 is operating in that v-net's space and has routing tables 48 for that v-net to route live traffic. Consequently, if the socket were to select a particular IP address, that IP address would use production routing tables 48. A different socket in a different v-net 16 is for example used for a small test bed and contains a different set of routing tables 50. Accordingly, when a message is sent on v-net 16 with an IP address, that IP address is interpreted in the context of v-net 16 running the small test bed.
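To make the point about address interpretation concrete, the following C sketch keeps one small route table per v-net and resolves the same destination address differently depending on the domain. The names route_entry, route_add, and route_lookup, the table sizes, and the exact-match lookup are all assumptions made for brevity; a real router would use longest-prefix matching over a radix tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NVNET      2   /* assumed number of v-net domains */
#define MAX_ROUTES 4   /* assumed capacity of each table */

struct route_entry {
    uint32_t dest;       /* destination IPv4 address, host byte order */
    int      egress_if;  /* index of the egress interface */
};

/* One independent route table per v-net domain. */
static struct route_entry route_table[NVNET][MAX_ROUTES];
static size_t route_count[NVNET];

/* Add a route to one v-net's table; returns 0 on success, -1 if the table is full. */
int route_add(int vnet, uint32_t dest, int egress_if)
{
    assert(vnet >= 0 && vnet < NVNET);
    if (route_count[vnet] == MAX_ROUTES)
        return -1;
    route_table[vnet][route_count[vnet]].dest = dest;
    route_table[vnet][route_count[vnet]].egress_if = egress_if;
    route_count[vnet]++;
    return 0;
}

/* Exact-match lookup in one v-net's table; returns the egress interface or -1. */
int route_lookup(int vnet, uint32_t dest)
{
    assert(vnet >= 0 && vnet < NVNET);
    for (size_t i = 0; i < route_count[vnet]; i++)
        if (route_table[vnet][i].dest == dest)
            return route_table[vnet][i].egress_if;
    return -1;
}
```

Because every lookup takes the v-net as an argument, a production domain and a test-bed domain can both hold a route for the same address without conflict.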


Global variables are variables that are accessible to all the various logical contexts or threads of execution that are running concurrently within an operating system. Thus a global variable is not on the stack of a particular thread. Accordingly, all global variables are available to every process that is running within the operating system. Global variables include at least at the top level, for example, the IP address of a machine or a copy of the routing tables so that a process knows where to send packets. There is a certain set of global variables associated with the networking code, and in order to make the networking code support partitioning, the set of global variables associated with networking is replicated, one copy 47 for each v-net domain, such that the operating system effectively contains, rather than one copy of the networking data structures, N instantiations of the networking stack, replicating all the various functions of the networking code, including replicated routing tables and replicated TCP control blocks linked together throughout the basic data structure. Thus, effectively all of the important variables in the networking system are replicated, so that they can be independently managed. This can be thought of as an operating system with N instantiations of the networking system.


The basic approach of the v-net code is to take global variables that need to be replicated for each v-net domain, and to make an array of them. As an example, tcpstat, the TCP statistics structure, is declared in tcp_var.h as struct tcpstat { . . . } and defined in tcp_input.c as struct tcpstat tcpstat. Having a separate set of statistics for each v-net domain requires changing the definition to struct tcpstat tcpstat[NVNET] and changing all references to index by the appropriate v-net domain number.
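The array approach can be sketched in a few lines of C. The structure below is an abbreviated stand-in with two counters (the real struct tcpstat has many more), the value of NVNET is an assumption, and count_tcp_send is a hypothetical helper that does not appear in the source.

```c
#include <assert.h>

#define NVNET 4   /* assumed number of v-net domains */

/* Abbreviated stand-in; the real struct tcpstat holds many more counters. */
struct tcpstat {
    unsigned long tcps_sndtotal;   /* total packets sent */
    unsigned long tcps_rcvtotal;   /* total packets received */
};

/* Before: struct tcpstat tcpstat;   After: one instance per v-net domain. */
struct tcpstat tcpstat[NVNET];

/* Every reference now indexes by the v-net domain number. */
void count_tcp_send(int vnet)
{
    assert(vnet >= 0 && vnet < NVNET);
    tcpstat[vnet].tcps_sndtotal++;
}
```

Each domain's counters advance independently, so statistics gathered in one v-net are not mixed with those of another.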


To make the v-net facility a configuration option, the declarations and references are encapsulated in macros. The macros generate arrays when v-nets are configured in and scalars when v-nets are deconfigured. As an example, the tcpstat declaration becomes VDECL (struct tcpstat, tcpstaT), in which the first macro argument is the type, and the second macro argument is the name. It will be noted that the variable name is changed from tcpstat to tcpstaT. This convention is followed throughout the global variable generation, i.e., variables that are virtualized and global across more than one file are changed to have the final letter in their name capitalized. This is done for three reasons:

    • 1) to differentiate global variables from local variables and/or types of the same name for readability,
    • 2) to ensure that all references to global variables are fixed appropriately (by causing a compile error if the variable name is not changed); and
    • 3) to denote global variables plainly for possible future changes.
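One way the declaration macro could be written is sketched below in C. The source gives only the macro's name and arguments; the macro body, the configuration switch VNET, the value of NVNET, and the helper tcpstat_copies are assumptions made for illustration.

```c
#define VNET  1   /* assumed configuration switch; set to 0 to deconfigure v-nets */
#define NVNET 4   /* assumed number of v-net domains */

#if VNET
#define VDECL(type, name) type name[NVNET]   /* array: one copy per v-net */
#else
#define VDECL(type, name) type name          /* scalar: a single shared copy */
#endif

/* Abbreviated stand-in for the real statistics structure. */
struct tcpstat {
    unsigned long tcps_sndtotal;
};

/* The renamed variable: final letter capitalized, as the convention requires. */
VDECL(struct tcpstat, tcpstaT);

/* Number of copies the declaration produced: NVNET if configured, 1 otherwise. */
unsigned long tcpstat_copies(void)
{
    return (unsigned long)(sizeof(tcpstaT) / sizeof(struct tcpstat));
}
```

With this form, any leftover reference to the old name tcpstat as a variable fails to compile, which is the effect described in the second reason above.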


References to virtualized variables are made using one of two macros, _v(name), or _V(name, index), where name is the variable name and index is the v-net domain index to be used. The macro _v uses a per CPU global index variable vnetindex. It will be noted that all references to virtualized variables must be made with these macros, without exception, so that the references are correct without requiring #ifdef's when v-nets are configured or deconfigured.
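A possible shape for the two reference macros is sketched below in C. Only the macro names, their arguments, and the variable vnetindex come from the source; the macro bodies, the switch VNET, and the helper functions are assumptions. The source describes vnetindex as per CPU, whereas this sketch uses a single plain global for simplicity.

```c
#include <assert.h>

#define VNET  1   /* assumed configuration switch; set to 0 to deconfigure v-nets */
#define NVNET 4   /* assumed number of v-net domains */

int vnetindex;    /* per CPU in the source; a single global in this sketch */

#if VNET
#define VDECL(type, name) type name[NVNET]
#define _V(name, index)   (name[(index)])        /* explicit v-net domain */
#define _v(name)          _V(name, vnetindex)    /* current v-net domain */
#else
#define VDECL(type, name) type name
#define _V(name, index)   (name)                 /* index ignored: one copy */
#define _v(name)          (name)
#endif

/* Abbreviated stand-in for the real statistics structure. */
struct tcpstat {
    unsigned long tcps_sndtotal;
};

VDECL(struct tcpstat, tcpstaT);

/* Count a send in whichever v-net is current. */
void count_send_current(void)
{
    _v(tcpstaT).tcps_sndtotal++;
}

/* Read the send counter of a specific v-net. */
unsigned long sends_in(int vnet)
{
    assert(vnet >= 0 && vnet < NVNET);
    return _V(tcpstaT, vnet).tcps_sndtotal;
}
```

The calling code is identical in both configurations, so no #ifdef is needed at the point of reference. Names beginning with an underscore are reserved in standard C; they are used here only because the source names the macros that way.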


In addition to defining a methodology that handles virtualization of variables, a selection is needed of the correct set of global variables to be replicated for each v-net domain, and the replicated variables need to be correctly referenced by macros in the appropriate v-net domain. For example, global variables can be identified by using a script that analyzes object (.o) files for the global variables they define, by code inspection, or by information from other sources (see for example the tables of global variables in TCP/IP Illustrated, Volume 2: The Implementation, Gary R. Wright and W. Richard Stevens, Addison-Wesley 1995, pp. 64, 97, 128, 158, 186, 207, 248, 277, 305, 340, 383, 398, 437, 476, 572, 680, 715, 756, 797, 1028, and 1051).


Appendix A below is a table of the global variables analyzed in some implementations, listing the name and the purpose of each variable. Variables that are virtualized are marked “Virtualized” in the table; other variables in the table have been analyzed but excluded from virtualization. All of the virtualized variables are replicated, such that each v-net maintains its own set of these variables. Macros, which are program conventions that allow textual substitution, are then provided, such that everywhere a global variable is accessed, the access is replaced by a macro reference that selects from the correct set of variables based on the correct v-net.


In the present embodiment, multiple networking domains are implemented by the same operating system, unlike previous approaches, in which for example a computer is subdivided into virtual domains that partition the hardware and run separate operating systems in each domain.









APPENDIX A







VARIABLE ANALYSIS















Analysis/


Variable
Data Type
Defining File
Description
Disposition





Head
static struct
igmp.c
Head of router_info linked list.
Virtualized.



router_info *


Addmask_key
static char *
radix.c
Temporary storage for
Invariant.





rn_addmask.


arp_allocated
static int
if_ether.c
Total number of llinfo_arp
Virtualized.





structures allocated.


arp_inuse
static int
if_ether.c
Current number of llinfo_arp
Virtualized.





structures in use.


arp_maxtrics
static int
if_ether.c
Tunable. Maximum number of
Tunable. Not





retries for an arp request.
virtualized.


arp_proxyall
static int
if_ether.c
Tunable. Enables forming a
Tunable. Not





proxy for all arp requests.
virtualized.


arpinit_done
static int
if_ether.c
Indicates initialization is done.
Invariant.






Initialization






handles all






vnets.


arpintrq
struct ifqueue
if_ether.c
Arp interrupt request queue.
Invariant.





Shared by all vnets. Vnet





switching when pulled off





queue.


arpt_down
static int
if_ether.c
Tunable. No. of seconds
Tunable. Not





between ARP flooding
virtualized.





algorithm.


arpt_keep
static int
if_ether.c
Tunable. No. seconds ARP
Tunable. Not





entry valid once resolved.
virtualized.


arpt_prune
static int
if_ether.c
Tunable. No. seconds between
Tunable. Not





checking ARP list.
virtualized.


bpf_bufsize
static int
bpf.c
Tunable.
Tunable. Not






virtualized.


bpf_cdevsw
static struct
bpf.c
Table of entry point function
Invariant.



cdevsw

pointers.


bpf_devsw_installed
static int
bpf.c
Initialization flag.
Invariant.


bpf_dtab
static struct
bpf.c
Descriptor structure, one per
Invariant.



bpf_d

open bpf device.



(NBPFILTER)


bpf_dtab_init
static int
bpf.c
Another initialization flag.
Invariant.


bpf_iflist
static struct
bpf.c
Descriptor associated with each
Invariant.



bpf_if

attached hardware interface.


clns_recvspace
static u_long
raw_clns.c
Constant (patchable). Amount
Not virtualized.





of receive space to reserve in





socket.


clns_sendspace
static u_long
raw_clns.c
Constant (patchable). Amount
Not virtualized.





of send space to reserve in





socket.


clns_usrreqs
struct pr_usrreqs
raw_clns.c
Function pointers for clns user
Invariant.





requests.


clnsg
struct clnsglob
raw_clns.c
Global state associated with
Virtualized.





ray_clns.c, including list heads





and counters.


clnsintrq
struct ifqueue
raw_clns.c
Clns interrupt request queue.
Invariant.





Shared by all vnets. Vnet





switching done when removed





from queue.


clnssw
struct protosw
raw_clns.c
Pointers to protocol entry
Invariant.





points & associated data.


counter
static u_int64_t
ip_fw.c
Counter for ipfw_report.
Virtualized.


div_recvspace
static u_long
ip_divert.c
Amount of receive space to
Invariant.





reserve in socket.


div_sendspace
static u_long
ip_divert.c
Amount of send space to
Invariant.





reserve in socket


divcb
static struct
ip_divert.c
Head of inpcb structures for
Virtualized.



inpcbhead

divert processing.


divcbinfo
static struct
ip_divert, c
Pcbinfo structure for divert
Virtualized.



inpcbinfo

processing.


dst
static struct
bpf.c
Sockaddr prototype.
Invariant.



sockaddr


err_prefix
char[ ]
ip_fw.c
Constant string for printfs.
Invariant.


etherbroadcastaddr
u_char [6]
if_ethersubr.c
Constant. Ethernet broadcast
Invariant.





link address.


expire_upcalls_ch
static struct
ip_mroute.c
Callout handle for
Virtualized.



callout_handle

expire_upcalls.


fcstab
static u_short
ppp_tty.c
Constant. Table for FCS
Invariant.



[256]

lookup.


frag_divert_port
static u_short
ip_input.c
Divert protocol port.
?





Conditionally compiled iwith





IPDIVERT.


fw_debug
static int
ip_fw.c
Tunable. Enables debug print.
Not virtualized.


fw_one_pass
static int
ip_fw.c
Tunable. Enables accepting
Not virtualized.





packet if passes first test.


fw_verbose
static int
ip_fw.c
Tunable; controls verbosity of
Not virtualized.





firewall debugging messages.


fw_verbose_limit
static int
ip_fw.c
Tunable. Limits amount of
Not virtualized.





logging.


have_encap_tunnel
static int
ip_mroute.c
Indicates presence of an
Virtualized.





encapsulation tunnel.


icmpbmcastecho
static int
ip_icmp.c
Tunable flag. Disables
Not virtualized.





broadcasting of ICMP echo and





timestamp packets.


icmpdst
static struct
ip_icmp.c
Saves the source address for
Virtualized.



sockaddr_in

ifaof_ifpforaddr.


icmpgw
static struct
ip_icmp.c
Holds the ip source address in
Virtualized.



sockaddr_in

icmp_input.
May not be






necessary


icmplim
static int
ip_icmp.c
Tunable. ICMP error-response
Not virtualized.





band with limiting sysctl.


icmpmaskrepl
static int
ip_icmp.c
Tunable flag. Enables ICMP
Not virtualized.





mask replacement.


icmpprintfs
int
ip_icmp.c
Enables printfs in icmp code.
Not virtualized.


icmpsrc
static struct
ip_icmp.c
Holds the ip dest address in
Virtualized.



sockaddr_in

icmp_input.
May not be






necessary


icmpstat
static struct
ip_icmp.c
Icmp statistics.
Virtualized.



icmpstat


if_indeX
int
if.c
Number of configured
Virtualized.





interfaces.


if_indexliM
static int
if.c
Number of entries in
Virtualized.





ifnet_addrS array.


ifneT
struct ifnethead
if.c
Head of list of ifnet structures.
Virtualized.


ifnet_addrS
struct iffaddr **
if.c
Array of pointers to link level
Virtualized.





interface addresses.


ifqmaxlen
int
if.c
Constant. Maximum queue
Invariant.





length for interface queue.


igmp_all_hosts_group
static u_long
igmp.c
Host order of
Invariant.





INADDR_ALLHOSTS_GROUP





constant


igmp_all_rtrs_group
static u_long
igmp.c
Host order of
Invariant.





INADDR_ALLRTS_GROUP





constant.


igmp_timers_are_running
static int
igmp.c
Flag indicating any igmp timer
Virtualized.





is active.


igmprt
static struct route
igmp.c
Temporary variable.
Invariant.


igmpstat
static struct
igmp.c
Igmp statistics.
Virtualized.



igmpstat


in_ifaddrheaD
struct
ip_input.c
Head of in_ifaddr structure list.
Virtualized.



in_ifaddrhead


in_interfaces
static int
in.c
Incremented each time a non-
Invariant.





loopback interface is added to
Never read.





in_ifaddrheaD. Not read.
Dead code.


in_multiheaD
struct
in.c
Head of list of
Virtualized.



in_multihead

in_multistructures (multicast





address).


inetclerrmap
u_char [ ]
ip_input.c
Array of constants (error
Invariant.





numbers).


inetdomain
struct domain
in_proto.c
Pointers to switch table,
Invariant.





initialization, etc. for internet





domain.


inetsw
struct protosw
in_proto.c
Pointers to entry points for
Invariant.





various internet protocols.


inited
static int
if.c
Flag indicating initialization
Invariant.





has been performed.





Initialization does all vnets.


ip_acceptsourceroute
static int
ip_input.c
Tunable flag. Enables
Tunable. Not





acceptance of source routed
virtualized.





packets.


ip_defttl
int
ip_input.c
Tunable. Default time to live
Tunable. Not





from RFC 1340.
virtualized.


ip_divert_cookiE
u_int16_t
ip_divert.c
Cookie passed to user process.
Virtualized.


ip_divert_porT
u_short
ip_divert.c
Global “argument” to
Virtualized.





div_input. Used to avoid





changing prototype.


ip_dosourceroute
static int
ip_input.c
Tunable flag. Enables acting as
Tunable. Not





a router.
virtualized.


ip_fw_chaiN
struct ip_fw_head
ip_fw.c
Head of ip firewall chains.
Virtualized.


ip_fw_chk_ptr
ip_fw_chk_t *
ip_input.c
IP firewall function callout
Invariant.





pointer; value depends on





loading fw module.


ip_fw_ctl_ptr
ip_fw_ctl_t *
ip_input.c
IP firewall function callout
Invariant.





pointer; value depends on





loading fw module.


ip_fw_default_rulE
struct
ip_fw.c
Pointer to default rule for
Virtualized.



ip_fw_chain*

firewall processing.


ip_fw_fwd_addR
struct
ip_input.c
IP firewall address.
Virtualized.



sockaddr_in *


ip_ID
u_short
ip_output.c
IP packet identifier
Virtualized.





(increments).


ip_mcast_src
ulong (*)(int)
ip_mroute.c
Pointer to function; selection
Invariant.





depends on compile options.


ip_mforward
int(*)(struct ip *,
ip_mroute.c
Function pointer set by module
Invariant.



struct ifnet *, . . .)

installation.


ip_mrouteR
struct socket *
ip_mroute.c
Socket of multicast router
Virtualized.





program.


ip_mrouter_done
int (*)(void)
ip_mroute.c
Function pointer set by module
Invariant.





installation.


ip_mrouter_get
int (*)(struct
ip_mroute.c
Function pointer selected by
Invariant.



socket *, struct

compile options.



sockopt *)


ip_mrouter_set
int (*)(struct
ip_mroute.c
Function pointer selected by
Invariant.



socket *, struct

compile options.



sockopt *)


ip_nat_clt_ptr
ip_nat_ctl_t *
ip_input.c
IP firewall function callout
Invariant.





hook; set by module install.


ip_nat_ptr
ip_nat_t *
ip_input.c
IP firewall function callout
Invariant.





hook; set by module install.


ip_nhops
static int
ip_input.c
Hop count for previous source
Virtualized.





route.


ip_protox
u_char
ip_input.c
Maps protocol numbers to
Invariant.



[PROTO_MAX]

inetsw array.


ip_rsvpD
struct socket *
ip_input.c
Pointer to socket used by rsvp
Virtualized.





daemon.


ip_rsvp_on
static int
ip_input.c
Boolean indicating rsvp is
Virtualized.





active.


ip_srcrt
struct ip_srcrt
ip_input.c
Previous source route.
Virtualized.


ipaddR
struct
ip_input.c
Holds ip destination address for
Virtualized.



sockaddr_in

option processing.


ipflowS
static struct
ip_flow.c
Hash table head for ipflow
Virtualized.



ipflowhead

structs.


ipflow_active
static int
ip_flow.c
Tunable. Enables “fast
Invariant.





forwarding” flow code.


ipflow_inuse
static int
ip_flow.c
Count of active flow structures.
Virtualized.


ipforward_rt
static struct route
ip_input.c
Cached route for ip forwarding.
Virtualized.


iforwarding
int
ip_input.c
Tunable that enabales ip
Virtualized.





forwarding.


ipintrq
struct ifqueue
ip_input.c
Ip interrupt request queue for
Invariant.





incoming packets. Vnet set





when packets dequeued.


ipport_firstauto
static int
ip_pcb.c
Bounds on ephemeral ports.
Invariant.


ipport_hifirstauto
static int
ip_pcb.c
Bounds on ephemeral ports.
Invariant.


ipport_hilastauto
static int
ip_pcb.c
Bounds on ephemeral ports.
Invariant.


ipport_lastauto
static int
ip_pcb.c
Bounds on ephemeral ports.
Invariant.


ipport_lowfirstauto
static int
ip_pcb.c
Bounds on ephemeral ports.
Invariant.


ipport_lowlastauto
static int
ip_pcb.c
Bounds on ephemeral ports.
Invariant.


ipprintfs
static int
ip_input.c
Flag for debug print.
Invariant.


ipq
static struct ipq
ip_input.c
Head of ip reassembly hash
Virtualized.



[IPREASS_NHASH]

lists.


ipqmaxlen
static int
ip_input.c
Patchable constant that sets maximum queue length for ipintrq.
Invariant.


ipsendredirects
static int
ip_input.c
Tunable that enables sending redirect messages.
Invariant.


ipstaT
struct ipstat
ip_input.c
Ip statistics counters.
Virtualized.


k_igmpsrc
static struct sockaddr_in
ip_mroute.c
Prototype sockaddr_in.
Invariant.


last_adjusted_timeout
static int
ip_rmx.c
Time value of last adjusted timeout.
Virtualized.


last_encap_src
static u_long
ip_mroute.c
Cache of last encapsulated source address?
Virtualized.


last_encap_vif
struct vif *
ip_mroute.c
Last encapsulated virtual interface (vif).
Virtualized.


last_zeroed
static int
radix.c
Number of bytes zeroed last time in addmask_key.
Invariant.


legal_vif_num
int (*)(int)
ip_mroute.c
Pointer to function selected by module installation.
Invariant.


llinfo_arP
struct llinfo_arp_head
if_ether.c
Head of llinfo_arp linked list.
Virtualized.


log_in_vain
static int
tcp_input.c, udp_usrreq.c
Tunables that enable logging of “in vain” connections.
Invariant.


loif
struct ifnet [NLOOP]
if_loop.c
Array of ifnet structs for loopback device. One per device, therefore invariant.
Invariant.


mask_rnhead
struct radix_node_head *
radix.c
Head of mask tree.
Invariant.


max_keylen
static int
radix.c
Maximum key length of any domain.
Invariant.


maxnipq
static int
ip_input.c
Constant (nmbclusters/4) that is maximum number of ip fragments awaiting reassembly. Note: should this be scaled by VNET?
Invariant? Scaled?


mfctable
static struct mfc * [MFCTBLSIZ]
ip_mroute.c
Head of mfc hash table.
Virtualized.


mrt_ioctl
int (*)(int, caddr_t, struct proc*)
ip_mroute.c
Function pointer selected by module initialization.
Invariant.


mrtdebug
static u_int
ip_mroute.c
Enables debug log messages.
Invariant.


mrtstat
static struct mrtstat
ip_mroute.c
Multicast routing statistics.
Virtualized.


mtutab
static int [ ]
ip_icmp.c
Static table of constants.
Invariant.


multicast_decap_if
static struct ifnet [MAXVIFS]
ip_mroute.c
Fake encapsulator interfaces.
Virtualized.


multicast_encap_iphdr
static struct ip
ip_mroute.c
Multicast encapsulation header.
Invariant.


nexpire
static u_char [MFCTBLSIZ]
ip_mroute.c
Count of number of expired entries in hash table?
Virtualized.


nipq
static int
ip_input.c
Number of ip fragment chains awaiting reassembly.
Virtualized.


normal_chars
static char [ ]
radix.c
Static table of mask constants.
Invariant.


nousrreqs
static struct pr_usrreqs
in_proto.c, ipx_proto.c
Static structure of null function pointers.
Invariant.


null_sdl.96
static struct sockaddr_dl
if_ether.c
Static null sockaddr_dl structure.
Invariant.


numvifs
static vifi_t
ip_mroute.c
Number of virtual interface structures.
Virtualized.


old_chk_ptr
static ip_fw_chk_t
ip_fw.c
Function pointer holding previous state when module loads.
Invariant.


old_ctl_ptr
static ip_fw_ctl_t
ip_fw.c
Function pointer holding previous state when module loads.
Invariant.


paritytab
static unsigned [8]
ppp_tty.c
Static array of parity constants.
Invariant.


pim_assert
static int
ip_mroute.c
Enables pim assert processing.
Virtualized.


ppp_compressors
static struct compressor [8]
if_ppp.c
Static list of known ppp compressors.
Invariant.


ppp_softc
struct ppp_softc
if_ppp.c
Array of softc structures for
Invariant.


pppdisc
[NPPP]

ppp driver; one per device.


raw_recvspace
static u_long
raw_cb.c
Patchable constant that is amount of receive space to reserve in socket.
Invariant.


raw_sendspace
static u_long
raw_cb.c
Patchable constant that is amount of send space to reserve in socket.
Invariant.


raw_usrreqs
struct protosw
raw_usrreq.c
Table of function pointers.
Invariant.


rawcb_lisT
struct rawcb_list_head
raw_cb.c
Head of rawcb (raw protocol control blocks) list.
Virtualized.


rawclnsdomain
struct domain
raw_clns.c
Table of function pointers.
Invariant.


rip_recvspace
static u_long
raw_ip.c
Tunable, amount of receive space to reserve in socket.
Tunable. Not virtualized.


rip_sendspace
static u_long
raw_ip.c
Tunable, amount of send space to reserve in socket.
Tunable. Not virtualized.


rip_usrreqs
struct pr_usrreqs
raw_ip.c
Table of function pointers.
Invariant.


ripcb
static struct inpcbhead
raw_ip.c
Head of raw ip control blocks.
Virtualized.


ripcbinfo
struct inpcbinfo
raw_ip.c
Pcb info. structure for raw ip.
Virtualized.


ripsrc
static struct sockaddr_in
raw_ip.c
Static temporary variable in rip_input.
Invariant.


rn_mkfreelist
static struct radix_mask *
radix.c
Cache of free radix_mask structures.
Invariant.


rn_ones
static char *
radix.c
Ones mask computed from maximum key length.
Invariant.


rn_zeros
static char *
radix.c
Zeros mask computed from maximum key length.
Invariant.


ro
static struct route
ip_mroute.c
Temporary variable to hold
Invariant.



ro

route.


route_cB
struct route_cb
route.c
Counts on the number of routing socket listeners per protocol.
Virtualized.


route_dst
static struct sockaddr
rtsock.c
Null address structure for route destination.
Invariant.


route_proto
static struct sockproto
rtsock.c
Static prototype of structure used to pass routing info.
Invariant.


route_src
static struct sockaddr
rtsock.c
Null address structure for source.
Invariant.


route_usrreqs
static struct pr_usrreqs
rtsock.c
Table of function pointers for entry points.
Invariant.


routedomain
struct domain
rtsock.c
Table of function pointers for entry points.
Invariant.


router_alert
static struct mbuf *
igmp.c
Statically constructed router alert option.
Invariant.


routesw
struct protosw
rtsock.c
Table of function pointers for entry points.
Invariant.


rsvp_oN
int
ip_input.c
Count of number of open rsvp control sockets.
Virtualized.


rsvp_src
static struct sockaddr_in
ip_mroute.c
Sockaddr prototype.
Invariant.


rsvpdebug
static u_int
ip_mroute.c
Enables debug print.
Invariant.


rt_tableS
struct radix_node_head * [AF_MAX + 1]
route.c
Head of the routing tables (a table per address family).
Virtualized.


rtq_minreallyold
static int
in_rmx.c
Tunable; minimum time for old routes to expire.
Invariant.


rtq_reallyold
static int
in_rmx.c
Amount of time before old routes expire.
Virtualized.


rtq_timeout
static int
in_rmx.c
Patchable constant timeout value for walking the routing tree.
Invariant.


rtq_toomany
static int
in_rmx.c
Tunable that represents the number of active routes in the tree.
Invariant.


rtstaT
struct rtstat
route.c
Routing statistics structure.
Virtualized.


rttrash
static int
route.c
Number of rtentrys not linked to the routing table. Never read, dead code.
Dead code. Not virtualized.


sa_zero
struct sockaddr
rtsock.c
Zero address returned in error conditions.
Invariant.


sin
static struct sockaddr_inarp
if_ether.c, ip_mroute.c
Sockaddr prototype passed to rtalloc1.
Invariant.


sl_softc
static struct sl_softc [NSL]
if_sl.c
Softc structure for slip driver; one per device.
Invariant.


slipdisc
static struct linesw
if_sl.c
Table of function pointers to slip entry points.
Invariant.


srctun
static int
ip_mroute.c
Counter throttling error message to log.
Invariant.


subnetsarelocal
static int
in.c
Tunable flag indicating subnets are local.
Virtualized.


tbfdebug
static u_int
ip_mroute.c
Tbf debug level.
Invariant.


tbftable
static struct tbf [MAXVIFS]
ip_mroute.c
Token bucket filter structures.
Virtualized.


tcB
struct inpcbhead
tcp_input.c
Head structure for tcp pcb structures.
Virtualized.


tcbinfO
struct inpcbinfo
tcp_input.c
PCB info structure for tcp.
Virtualized.


tcp_backoff
int [ ]
tcp_timer.c
Table of times for tcp backoff processing.
Invariant.


tcp_ccgeN
tcp_cc (u_int32_t)
tcp_input.c
Connection count (per rfc 1644).
Virtualized.


tcp_delack_enabled
int
tcp_input.c
Tunable that enables delayed acknowledgments.
Tunable. Not virtualized.


tcp_do_rfc1323
static int
tcp_subr.c
Tunable enables rfc 1323 (window scaling and timestamps).
Tunable. Not virtualized.


tcp_do_rfc1644
static int
tcp_subr.c
Tunable enables rfc 1644.
Tunable. Not virtualized.


tcp_keepcnt
static int
tcp_timer.c
Patchable constant for maximum number of probes before a drop.
Invariant.


tcp_keepidle
int
tcp_timer.c
Tunable value for keep alive idle timer.
Tunable. Not virtualized.


tcp_keepinit
int
tcp_timer.c
Tunable value for initial connect keep alive.
Tunable. Not virtualized.


tcp_maxidle
int
tcp_timer.c
Product of tcp_keepcnt * tcp_keepintvl; recomputed in slow timeout.
Invariant.


tcp_maxpersistidle
static int
tcp_timer.c
Patchable constant that is default time before probing.
Invariant.


tcp_mssdflt
int
tcp_subr.c
Tunable default maximum segment size.
Tunable. Not virtualized.


tcp_noW
u_long
tcp_input.c
500 msec. counter for RFC1323 timestamps.
Virtualized.


tcp_outflags
u_char [TCP_NSTATES]
tcp_fsm.h
Static table of flags in tcp_output.
Invariant.


tcp_rttdflt
static int
tcp_subr.c
Tunable. Dead code, value not accessed.
Invariant. Dead code.


tcp_sendspace
u_long
tcp_usrreq.c
Tunable value for amount of send space to reserve on socket.
Tunable. Not virtualized.


tcp_totbackoff
static int
tcp_timer.c
Sum of tcp_backoff.
Invariant.


tcp_usrreqs
struct pr_usrreqs
tcp_usrreq.c
Table of function pointers for tcp user request functions.
Invariant.


tcprexmtthresh
static int
tcp_input.c
Patchable constant; number of duplicate acks to trigger fast retransmit.
Invariant.


tcpstaT
struct tcpstat
tcp_input.c
TCP statistics structure.
Virtualized.


tun_cdevsw
struct cdevsw
if_tun.c
Table of function pointers for tunnel interface entry points.
Invariant.


tun_devsw_installed
static int
if_tun.c
Flag indicating tun devsw table installed.
Invariant.


tunctl
static struct tun_softc [NTUN]
if_tun.c
Softc structure for tunnel interface; one per device.
Invariant.


tundebug
static int
if_tun.c
Flag enables debug print.
Invariant.


udb
static struct inpcbhead
udp_usrreq.c
UDP inpcb head structure.
Virtualized.


udbinfo
static struct inpcbinfo
udp_usrreq.c
UDP inpcb info. structure.
Virtualized.


udp_in
static struct sockaddr_in
udp_usrreq.c
Prototype sockaddr for AF_INET.
Invariant.


udp_recvspace
static u_long
udp_usrreq.c
Tunable; amount of receive space to reserve on socket.
Tunable. Not virtualized.


udp_sendspace
static u_long
udp_usrreq.c
Tunable; amount of send space to reserve on socket.
Tunable. Not virtualized.


udp_usrreqs
struct pr_usrreqs
udp_usrreq.c
Table of function pointers for entry points.
Invariant.


udpcksum
static int
udp_usrreq.c
Tunable; enables udp checksumming.
Tunable. Not virtualized.


udpstat
struct udpstat
udp_usrreq.c
Udp statistics structure.
Virtualized.


useloopback
static int
if_ether.c
Tunable; enables use of loopback device for localhost.
Tunable. Not virtualized.


version
static int
ip_mroute.c
Version number of MRT protocol.
Invariant.


viftable
static struct vif [MAXVIFS]
ip_mroute.c
Table of vifs (virtual interface structure).
Virtualized.


zeroin_addr
struct in_addr
in_pcb.c
Zero'd internet address.
Invariant.





NOTE:


In the Analysis/Disposition column, “Virtualized” means the variable becomes an array when vnets are configured (see the description above); “Invariant” means a separate instance of the variable is not needed for different vnet domains; and “Not Virtualized” means there was a choice about virtualization (e.g., whether a Tunable could have a different value in different domains), but the choice was made not to virtualize the variable.
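
The dispositions in the table can be illustrated with a minimal C sketch. It is not the patent's actual code: apart from nipq, every name here (VNET_COUNT, cur_vnet, VNET_DECLARE, VNET_VAR, chain_limit, vnet_select, frag_chain_enqueue, frag_chain_count) is a hypothetical chosen for illustration, and the limit value is arbitrary. A "Virtualized" variable is declared through a macro that yields an array with one slot per vnet when vnets are configured and a plain scalar otherwise, while an "Invariant" (or "Not virtualized") variable stays a single shared instance.

```c
#include <assert.h>

/* Hypothetical build-time switch: number of vnet domains (0 = vnets deconfigured). */
#ifndef VNET_COUNT
#define VNET_COUNT 4
#endif

#if VNET_COUNT > 0
/* Hypothetical index of the vnet whose context is currently active. */
static int cur_vnet = 0;
/* "Virtualized": storage becomes an array with one slot per vnet, */
#define VNET_DECLARE(type, name) static type name##_v[VNET_COUNT]
/* and every access goes through a macro that picks the active slot. */
#define VNET_VAR(name) (name##_v[cur_vnet])
#else
/* Vnets deconfigured: the same macros collapse to a plain scalar. */
#define VNET_DECLARE(type, name) static type name##_v
#define VNET_VAR(name) (name##_v)
#endif

VNET_DECLARE(int, nipq);            /* Virtualized: per-vnet count of fragment chains */
static const int chain_limit = 16;  /* Invariant: one shared instance (illustrative value) */

/* Make vnet `id` the active context (a no-op when vnets are deconfigured). */
static void vnet_select(int id)
{
#if VNET_COUNT > 0
    assert(id >= 0 && id < VNET_COUNT);
    cur_vnet = id;
#else
    (void)id;
#endif
}

/* Count one more fragment chain in the active vnet; returns -1 at the shared limit. */
static int frag_chain_enqueue(void)
{
    if (VNET_VAR(nipq) >= chain_limit)
        return -1;
    VNET_VAR(nipq)++;
    return 0;
}

/* Number of fragment chains currently counted in the active vnet. */
static int frag_chain_count(void)
{
    return VNET_VAR(nipq);
}
```

In this sketch, enqueuing while vnet 0 is selected leaves the count seen under vnet 1 untouched, whereas chain_limit is shared by all domains. Building with -DVNET_COUNT=0 compiles the same source against a single scalar.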






Although the present invention and its advantages have been described in detail, it should be understood that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims. Moreover, the scope of the present application is not intended to be limited to the particular embodiments of the process, machine, manufacture, composition of matter, means, methods and steps described in the specification. As one of ordinary skill in the art will readily appreciate from the disclosure of the present invention, processes, machines, manufacture, compositions of matter, means, methods, or steps, presently existing or later to be developed that perform substantially the same function or achieve substantially the same result as the corresponding embodiments described herein may be utilized according to the present invention. Accordingly, the appended claims are intended to include within their scope such processes, machines, manufacture, compositions of matter, means, methods, or steps.

Claims
  • 1. A routing system comprising: a physical router; and a plurality of virtual routers hosted on the physical host router and controlled by a common operating system residing on the physical host router; wherein a first set of variables in the common operating system on the physical router are replicated for and managed independently by the virtual routers; and wherein a second set of variables which control network functions in the operating system are shared by the virtual routers.
  • 2. The system of claim 1, wherein said operating system runs on a control processor within the physical router.
  • 3. The system of claim 2, wherein the operating system is based on UNIX.
  • 4. The system of claim 1, wherein the operating system manages reporting of hardware failures across multiple virtual routers.
  • 5. The system of claim 1, wherein a respective virtual router comprises routing software applications.
  • 6. The system of claim 5, wherein a respective routing software application comprises an instance of a dynamic routing protocol (DRP).
  • 7. The system of claim 5, wherein a respective routing software application comprises an instance of simple network management protocol (SNMP).
  • 8. The system of claim 1, wherein a respective virtual-router-specific variable is a component of an array; and wherein a virtual-router-specific variable is accessible by a macro reference in a respective virtual router.
  • 9. The system of claim 8, wherein said macros generate scalar variables when said virtual router is deconfigured.
  • 10. A method, comprising: configuring an operating system on a physical router; and configuring a plurality of virtual routers hosted on the physical router and controlled by the operating system on the physical router; wherein a first set of variables in the operating system are replicated for and managed independently by the virtual routers; and wherein a second set of variables which control network functions in the operating system are shared by the virtual routers.
  • 11. The method of claim 10, wherein the virtual-router-specific variables are generated by macros.
  • 12. The method of claim 11, wherein the macros generate arrays of virtual-router-specific variables when the virtual router is configured.
  • 13. The method of claim 12, wherein the macros generate scalar variables when the virtual router is deconfigured.
  • 14. The method of claim 10, wherein the operating system runs on a control processor within the physical router.
  • 15. The method of claim 14, wherein the operating system is based on the UNIX operating system.
  • 16. The method of claim 10, wherein a respective virtual router comprises routing software applications.
  • 17. The method of claim 16, wherein a respective virtual router comprises an instance of DRP.
  • 18. The method of claim 16, wherein a process in a respective virtual router manages instantiation of a common networking code.
  • 19. A system comprising: an operating system running on a processor; a plurality of virtual routers controlled by the operating system; wherein a first set of variables in the operating system are replicated for and managed independently by the virtual routers; and wherein a second set of variables which control network functions in the operating system are shared by the virtual routers.
  • 20. The system of claim 19, wherein the processor is a master control processor; and wherein the first set of variables are associated with a network protocol stack.
RELATED APPLICATIONS

This application is a continuation of, and hereby claims priority under 35 U.S.C. §120 to, pending U.S. patent application Ser. No. 12/210,957, entitled “System and Method for Router Virtual Networking,” by inventors Thomas Lee Watson and Lance Arnold Visser, filed on 15 Sep. 2008, which is a continuation of U.S. patent application Ser. No. 09/896,228, filed 29 Jun. 2001, by the same inventors (now U.S. Pat. No. 7,441,017).

US Referenced Citations (39)
Number Name Date Kind
5159592 Perkins Oct 1992 A
5278986 Jourdenais et al. Jan 1994 A
5550816 Hardwick et al. Aug 1996 A
5649110 Ben-Nun Jul 1997 A
5878232 Marimuthu Mar 1999 A
5970232 Passint et al. Oct 1999 A
6104700 Haddock Aug 2000 A
6233236 Nelson May 2001 B1
6282678 Snay et al. Aug 2001 B1
6374292 Srivastava et al. Apr 2002 B1
6570875 Hegde May 2003 B1
6587469 Bragg Jul 2003 B1
6597699 Ayres Jul 2003 B1
6608819 Mitchem Aug 2003 B1
6633916 Kauffman Oct 2003 B2
6674756 Rao et al. Jan 2004 B1
6678248 Haddock Jan 2004 B1
6691146 Armstrong et al. Feb 2004 B1
6859438 Haddock Feb 2005 B2
6910148 Ho Jun 2005 B1
6938179 Iyer Aug 2005 B2
6975639 Hill Dec 2005 B1
7039720 Alfieri et al. May 2006 B2
7093160 Lau Aug 2006 B2
7236453 Visser Jun 2007 B2
7269133 Lu Sep 2007 B2
7292535 Folkes Nov 2007 B2
7382736 Mitchem Jun 2008 B2
7664119 Adams et al. Feb 2010 B2
7668166 Rekhter et al. Feb 2010 B1
7720076 Dobbins et al. May 2010 B2
7742420 Chapman et al. Jun 2010 B2
7792058 Yip et al. Sep 2010 B1
7818452 Matthews et al. Oct 2010 B2
7885207 Sarkar et al. Feb 2011 B2
20020035641 Kurose et al. Mar 2002 A1
20020129166 Baxter et al. Sep 2002 A1
20080225859 Mitchem Sep 2008 A1
20110040949 Hoese et al. Feb 2011 A1
Foreign Referenced Citations (1)
Number Date Country
0926859 Jun 1999 EP
Related Publications (1)
Number Date Country
20100208738 A1 Aug 2010 US
Continuations (2)
Number Date Country
Parent 12210957 Sep 2008 US
Child 12767210 US
Parent 09896228 Jun 2001 US
Child 12210957 US