Today, bandwidth billing in a datacenter environment is performed in a very coarse-grained manner. Bandwidth statistics are collected from a switch port or virtual interface attached to a virtual machine (VM). Interfaces are either billed or free, but all traffic on a billed interface must be billed at the same rate. Thus whether a VM communicates with the VM next to it, a server in another datacenter region of the same service provider, or a system on the public Internet, the traffic is billed at the same rate regardless of destination.
This billing operation does not reflect the true cost of carrying the traffic, since that cost progressively increases as traffic transits within a datacenter, between different datacenters of a service provider over dedicated circuits, or to the Internet.
In an embodiment, a method for performing distributed network billing in a datacenter environment may take the following form. Note that this method may be implemented in various hardware, firmware, and/or software of the datacenter, and can leverage information obtained from distributed sources to generate billing information for a datacenter customer with reduced complexity. As one such example, the method may be implemented in a computer readable medium that includes instructions that enable one or more systems of the datacenter to perform the distributed network billing operations.
The method includes receiving a network packet including a destination identifier in a virtual switch from a virtual machine of a server of the datacenter associated with a customer of the datacenter, determining if the destination identifier is present in an accounting list of the virtual switch that includes destination identifiers each associated with a first billing rate, and if so, updating a counter associated with the accounting list according to a size of the network packet.
In an embodiment, the method includes updating a second counter associated with a default billing rate according to the size of the network packet if the destination identifier is not present in any accounting list. In addition, an announcement of a route to a network block of a second datacenter coupled to the datacenter via a network backbone is received, where the announcement includes a message having a destination identifier of the network block, a route to the network block, and a billing tag associated with the destination identifier, and a list of routes in a route server of the datacenter is updated with an entry based on the message information.
Still further, the method may include processing information from an entry of the list of routes to generate aggregated route information, populating an entry in a local route database with the aggregated route information and the billing tag, and updating a first accounting list of a middleware server of the datacenter, where the first accounting list is for a billing rate corresponding to the billing tag of the aggregated route information.
Another aspect is directed to a system for performing the distributed billing. More specifically, this system is a distributed billing system for a multi-tenant datacenter having a plurality of datacenter regions.
One such region includes a local route server to receive route messages and to update one or more routing tables based on the route messages, where each of the route messages includes a destination identifier to identify a network block and a billing tag to be associated with a billing rate to be applied to traffic destined to the network block. The region also includes an integration server coupled to the route server to receive and process the destination identifier and the billing tag to generate an entry for storage in one of multiple accounting lists of the integration server, where each of the accounting lists is associated with a billing rate corresponding to the billing tag. Also, the region includes a software defined network (SDN) controller or other cluster controller coupled to the integration server to receive updates to the accounting lists and to send the updates to a plurality of virtual switches, where each of the virtual switches includes counters to count network traffic destined to a location present on an accounting list of that virtual switch.
The region may also include one or more billing servers coupled to the SDN controller to communicate a request for billing information of a first customer of the multi-tenant datacenter via the distributed billing system, where the billing server is to receive count values responsive to a query from the SDN controller to one or more switches coupled to virtual machines associated with the first customer, where the count values are each associated with an accounting list of one or more virtual switches associated with the first customer.
Using an embodiment of the present invention, data traffic can be billed at different rates depending upon different network destinations of the traffic. Still further, embodiments provide for automatic updates to traffic destinations based upon real-time information extracted from the datacenter network. Thus dynamic changes to datacenter infrastructure can be reflected in billing functions in a transparent and autonomous manner. In addition, billing operations can be done in a distributed manner such that the overhead cost of billing is reduced.
In this way, a service provider can offer granular billing based upon which links the customer's traffic traverses. Embodiments can automatically account for the dynamic nature of datacenter networks, where many new routes are added on a daily basis. Such new routes can automatically be associated with the appropriate billing rate based upon route tagging at the announcement source.
Using an embodiment, the need to collect statistical data for every packet via routers throughout the datacenter and report these statistics to a central data aggregation point (normally via a protocol such as sFlow or NetFlow) can be avoided. In this conventional system the aggregation point then processes all of the incoming data in order to derive the actual amount of data transiting between any two end stations. The majority of hardware deployed cannot provide complete statistical data for every packet, so the resulting data is not at a sufficiently high resolution to be used for financial purposes such as billing. Collecting and processing this data is not a scalable approach for large datacenters with dense virtualization, due to the sheer volume of IP addresses generating data and the volume of network resources being used. Embodiments thus resolve this problem by shifting to a distributed model, where every host calculates the bandwidth consumption for any locally running VMs and exposes that information via an application programming interface (API) for a billing system to collect.
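As an illustrative, non-limiting sketch of this distributed model, the following Python code shows how a per-host agent could expose locally maintained per-VM byte counts over HTTP for a billing system to poll. The counter names, VM identifier, port number, and URL layout are hypothetical assumptions, not part of any particular implementation.

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store: per-VM byte counts keyed by accounting list,
# maintained by the local virtual switch as traffic passes.
LOCAL_COUNTERS = {
    "vm-1234": {"ord1-rate": 512_000, "default": 48_000},
}

class CounterHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /counters/vm-1234 returns that VM's counters as JSON
        vm_id = self.path.rsplit("/", 1)[-1]
        body = json.dumps(LOCAL_COUNTERS.get(vm_id, {})).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), CounterHandler).serve_forever()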
Referring now to
A route, which is an identification of a path to a destination network block, is propagated outward from its point of origination. Certain routing protocols, such as the Border Gateway Protocol (BGP), allow routes to be tagged with one or more tags at any point within the BGP routing domain. These tags can be used to express common traits about the routes (such as geographical location). Using an embodiment, routes may be tagged, e.g., at their creation point (within a route server) with a billing tag. In an embodiment, this billing tag may be associated with a general location of the network block. In an embodiment, route tags may be integers, though in some cases the tags may take on different formats to be more human readable. Of course other routing protocols may be used, as described below.
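As a minimal sketch of such tagging, the following Python code models a route announcement carrying a destination identifier, a route (next hop), and a billing tag attached at the creation point. The data-structure names and the RAX:ORD1 tag value are illustrative only; a real deployment would typically carry the tag as a BGP community or routing protocol tag.

from dataclasses import dataclass, field

@dataclass
class RouteAnnouncement:
    prefix: str                                # destination network block
    next_hop: str                              # route to the network block
    tags: list = field(default_factory=list)   # e.g. ["RAX:ORD1"]

def tag_route_at_origin(announcement, billing_tag):
    """Attach a billing tag at the route's creation point (the route server)."""
    if billing_tag not in announcement.tags:
        announcement.tags.append(billing_tag)
    return announcement

route = tag_route_at_origin(RouteAnnouncement("10.1.1.0/24", "192.0.2.1"),
                            "RAX:ORD1")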
As one example, assume a service provider has a datacenter in each of multiple regions. For purposes of discussion, assume a first datacenter (ORD1) is located in Chicago and a second datacenter (DFW1) is located in Dallas. If all major network blocks are announced out of ORD1 with routes having a common tag, and traffic between ORD1 and DFW1 costs the service provider a certain rate (e.g., $4/Mbps) to transfer data, the remote datacenter can use the tag of the routes to determine which IP ranges local customers might transfer data to. If another route is added to the first datacenter later, the second datacenter does not need any configuration update—it continues to look for routes with a given tag and simply finds a new route in the list.
By combining the near real-time convergence of routing protocols that support route tagging with a scalable, configurable IP accounting system, more advanced customer billing can occur. To this end, each datacenter may include one or more middleware servers (referred to herein as integration servers) including logic configured as an application to determine the association of billing rates with corresponding billing tags of route information. In a basic implementation, the logic can determine that all routes from a given remote datacenter to a local datacenter (e.g., ORD1 to DFW1) be billed at the same rate. However, understand that the scope of the present invention is not limited in this regard, and in other embodiments different billing rates can be applied to different billing tags of the same datacenter, e.g., discrete billing rates for given destinations in a remote DC at a product level granularity, or in order to support an agreed upon contracted rate for a third party service hosted within the datacenter.
As will be described herein, when a routing server generates a new route, a tag is associated with the route. More specifically, this billing tag provides an identification associated with a destination of the route (e.g., the corresponding network block). In turn, this billing tag can be used to enable a billing rate to be associated with the tag as described herein. Note that, along with the route information itself, this billing tag may be propagated throughout the distributed network. Thus routes originating in remote route server 125 may be propagated through backbone 120 to a route server 110 of a local datacenter.
Still referring to
Thus as further shown in
Alternative example controller clusters may include vendor-specific network APIs or other services that interact with switches and routers. In an embodiment, the controller may be accessed through an abstracted API, which in an embodiment may be the OpenStack Quantum API. Quantum has backend plugins that interact with different networking APIs. Thus part of a datacenter may use an SDN controller plugin while a different portion of the datacenter uses traditional switches supported through a separate plugin. In both cases, the integration server interacts with a single API (Quantum) but can still advertise new or updated accounting lists to ports associated with both virtual switches and physical switch ports.
In turn, these accounting lists can further be propagated to switches to which are coupled VMs that execute on servers 150. Note that while only a single server is shown, understand that in a given embodiment a datacenter can include a set, e.g., a large number, of such servers. As is well known, each server may include a hypervisor and host a plurality of virtual machines, each configured to communicate with other entities via a virtual switch that in turn may include a number of logical switch ports. For example, within a cloud server system, every physical server runs multiple virtual machines. These virtual machines interact with the physical network via a virtual software switch that runs within the server. Each virtual machine connects to an individually configurable virtual port on a software switch that can be queried for traffic statistics.
By placing a list of routes onto the virtual port and allowing traffic to pass, a count of the bytes of data that are directed to any destination on this list may be generated as the traffic passes. From this count an accurate billing profile for traffic of a particular billing rate for the VM can be maintained. Traffic that does not match any list for the port can be billed at either another rate for a destination found in another list, or at a default rate. Thus in an embodiment, each virtual switch may include a set of counters, where each counter is associated with a given accounting list (that in turn is associated with a particular billing rate).
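The following Python sketch illustrates one possible data model for such a logical port, with one byte counter per installed accounting list plus a counter for the default rate. The class and field names are assumptions for illustration, not any specific virtual switch implementation.

import ipaddress

class LogicalPort:
    """Sketch of a per-VM virtual switch port: one byte counter per installed
    accounting list, plus a counter billed at the default rate."""
    def __init__(self):
        # accounting list id -> network blocks billed at that list's rate
        self.accounting_lists = {}
        # accounting list id -> byte count of matching traffic
        self.counters = {"default": 0}

    def install_list(self, list_id, prefixes):
        self.accounting_lists[list_id] = [ipaddress.ip_network(p)
                                          for p in prefixes]
        self.counters.setdefault(list_id, 0)

port = LogicalPort()
port.install_list("ord1-rate", ["10.1.0.0/16"])  # queryable per-list counter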
Thus these virtual ports are programmed with the accounting lists. In this way, when packets are to be sent from a VM through a given virtual switch, a count can be updated for the type of traffic (e.g., corresponding to a given billing rate). Given this information regarding the rate at which traffic to a given destination is to be billed, per-VM statistics may be collected. More specifically, a counter may be associated with each accounting list (each of which may include destination identifiers for destination locations to be charged at a common billing rate). This counter is configured to count how much traffic is being directed from the corresponding logical port to a given set of destinations. In an embodiment, the counter is configured as a byte counter to count bytes communicated. Although shown at this high level in the embodiment of
Consider now a scenario where a new major route is added to a first datacenter (e.g., ORD1) and how a dynamic change to the billing profile for that route occurs in a VM of a second datacenter (e.g., DFW1). A major route (e.g., an aggregate route) is a summarization of multiple smaller routes contained within the datacenter. As an example, the route 10.1.0.0/16 is advertised out to the Internet. This encompasses every IP address from 10.1.0.0 to 10.1.255.255, more than 65,000 IP addresses. Within the datacenter this block could be broken up into 256 blocks of 256 IPs each, but other datacenters and Internet peering points do not need that level of knowledge, and simply need to know that all 65,000+ IPs are reached via the same datacenter. By tagging the aggregates, every destination in a DC is marked while introducing a minimal amount of change/tagging into the network.
Assume that this new route, identified as 10.1.1.0/24, is advertised out of ORD1. A routing server in the local datacenter, using pre-defined policies, tags the route with a billing tag (simplified here as RAX:ORD1). This route is advertised over a backbone to other service provider facilities and arrives in DFW1 with the tag intact. An integration server that peers with the local route server scans for any RAX:ORD1 routes and assembles a list. The server then uses common IP libraries to aggregate the routes to the minimal number of routes and populates a local database with the results. This database may be monitored, e.g., by a separate process executing on the server, for changes. When the new route is discovered, it is added to a pre-defined bucket or accounting list of routes that are billed at a uniform rate. Next, the integration server updates or replaces a copy of this access list on an SDN controller. The SDN controller then automatically pushes the update or the replacement list to all downstream software switches, which update the list on any ports configured to participate in that accounting list.
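The aggregation step above can be illustrated with Python's standard ipaddress module as one example of such a common IP library. The sketch below collapses a scanned set of tagged routes to the minimal covering set before the local database is populated.

import ipaddress

def aggregate(prefixes):
    """Collapse a list of tagged routes into the minimal set of covering
    networks before populating the local route database."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# 256 contiguous /24s under 10.1.0.0/16 collapse to the single aggregate
print(aggregate([f"10.1.{i}.0/24" for i in range(256)]))  # ['10.1.0.0/16']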
Then the next time any of the VMs configured with the updated list sends traffic to a destination within the network block corresponding to this route (e.g., 10.1.1.0/24), the counter associated with the list is updated. Since this counter can be queried, the next time the VM reaches the end of a billing cycle, a billing system can query the absolute value of the counter for the access list on the VM's logical switch port, apply the appropriate rate for traffic within this list, and add it to the total bill. The counter can then be reset so it can begin incrementing for the next billing cycle. Another method is to never reset the counters, but to maintain a historical record of the counter at various intervals in an external system and bill based upon the counter's delta from one period to another.
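The never-reset variant can be sketched as follows, billing each accounting list on the delta between two counter snapshots. The function and dictionary names, as well as the per-byte rates, are illustrative assumptions.

def bill_for_cycle(prev_snapshot, curr_snapshot, rates, default_rate):
    """Bill each accounting list on the delta between two counter snapshots
    read from a VM's logical switch port at successive billing intervals."""
    total = 0.0
    for list_id, count in curr_snapshot.items():
        delta = count - prev_snapshot.get(list_id, 0)
        total += delta * rates.get(list_id, default_rate)
    return total

# e.g. 4,000 new bytes on the ord1-rate list since the last snapshot
cost = bill_for_cycle({"ord1-rate": 1_000},
                      {"ord1-rate": 5_000},
                      rates={"ord1-rate": 1e-9},  # illustrative $/byte
                      default_rate=2e-9)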
Referring now to
Assume for purposes of the discussion of
In addition, this routing table information can be analyzed by other entities of the datacenter. For example, an integration server may peer with this route server to obtain updates. In an embodiment, the integration server receives routes from the network via a dynamic protocol such as BGP (or an Open Shortest Path First (OSPF) or Intermediate System to Intermediate System (IS-IS) protocol) that include pre-configured tags added to the individual routes at their source of advertisement. Note that more generally a tag is an attribute or distinct piece of metadata that can be applied to a route and later referenced by another system. This could be an integer in the case of a routing protocol tag, a BGP community, or any similar transitive property. The integration server scans through all routes and, according to its pre-configuration, considers only routes that have specific tags (or metadata) assigned, grouping them based upon tags to generate a set of lists. Example routes are shown in Table 1 below. Since different tags translate into different billing costs per bandwidth, tags may be enumerated equal to the level of discrete destination costs. For tracking purposes, the enumeration may be more discrete than needed for billing (for instance, if DFW and ORD both bill at the same rate per GB, their lists may be concatenated for billing, yet they are still tagged separately to track where customers are sending data).
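A minimal sketch of this scan-and-group step follows. The route values shown are illustrative stand-ins in the spirit of Table 1, not actual table contents.

def group_routes_by_tag(routes, tags_of_interest):
    """Scan received routes, keep only those carrying a configured tag, and
    group them per tag (the basis of one accounting list per billing rate)."""
    groups = {tag: [] for tag in tags_of_interest}
    for prefix, tags in routes:
        for tag in tags:
            if tag in tags_of_interest:
                groups[tag].append(prefix)
                break  # one accounting group per route in this sketch
    return groups

# Illustrative stand-ins in the spirit of Table 1
routes = [("10.1.1.0/24", ["RAX:ORD1"]),
          ("10.2.0.0/24", ["RAX:DFW1"]),
          ("203.0.113.0/24", [])]  # untagged: billed at the default rate
print(group_routes_by_tag(routes, {"RAX:ORD1", "RAX:DFW1"}))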
Still referring to
Thus in general, the integration server collects routes, aggregates them to larger network constructs, and stores them for later update to the set of SDN controllers. Although shown at this high level in the embodiment of
Referring now to
If so, control passes to block 320 where an accounting list can be updated based on this new route information. More specifically, the integration server may maintain a set of accounting lists, each associated with a particular billing rate. Note that these accounting lists may be maintained by an integration server that peers with the local route servers to determine whether an update has occurred. This billing rate may also correspond to a particular tag or set of tags associated with the route information. The corresponding accounting list for an updated route entry having the matching tag may thus be updated at block 320. In this way, the integration server provides an entry in its accounting list for a billing rate corresponding to the billing tag of the route information. Control next passes to block 330 where this updated accounting list can be passed to an SDN controller. In some embodiments the entire accounting list can be sent on an update, while in other embodiments only the updated entry itself is communicated.
Finally at block 340, the information from the SDN controller can be distributed to various virtual machines of the datacenter. More specifically this same updated accounting list information can be applied to one or more logical ports of virtual machines within the local datacenter. In an embodiment, each logical port of a virtual switch (or hardware switch) stores a set of accounting lists, each list associated with a different billing rate. As such, accurate network traffic information for traffic being communicated from the VM through a given logical port can be obtained.
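Blocks 330 and 340 can be sketched as follows. The controller facade, its push_accounting_list method, and the port stub are hypothetical names used only to illustrate the fan-out from integration server to controller to logical ports.

class PortStub:
    """Stand-in for a logical switch port that accepts accounting lists."""
    def __init__(self):
        self.accounting_lists = {}

    def install_list(self, list_id, prefixes):
        self.accounting_lists[list_id] = list(prefixes)

class SDNController:
    """Hypothetical controller facade: fans an updated accounting list out
    to every downstream port configured to participate in that list."""
    def __init__(self, ports):
        self.ports = ports

    def push_accounting_list(self, list_id, prefixes):
        # Block 340: distribute the list to the virtual switch ports
        for port in self.ports:
            port.install_list(list_id, prefixes)

# Block 330: the integration server hands the updated list to the controller
controller = SDNController([PortStub(), PortStub()])
controller.push_accounting_list("ord1-rate", ["10.1.0.0/16"])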
Referring now to
Next control passes to diamond 420 where it can be determined whether a destination identifier of this network packet is present in an accounting list stored in the logical port. If so, control passes to block 430 where a counter for the accounting list can be updated according to the packet size. For example, in an implementation in which a byte counter is associated with each accounting list, the corresponding byte counter is updated based on the size of the packet; for a 9000-byte packet, the counter is incremented by that amount. In some embodiments the byte counter may count in units of bytes, while in other embodiments different units of measure can be used.
Note that if the destination identifier for the outgoing packet is not present in any of the accounting lists of the logical port, control passes to block 440 where another counter that is associated with a default billing rate (e.g., an Internet billing rate) is updated according to the packet size. Note that in yet other embodiments, rather than applying a default billing rate, a rate of another one of the accounting lists can be used by updating that counter. Accordingly, this default billing rate may accommodate locations that have yet to be included in an accounting list for whatever reason.
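The decision of diamond 420 and the counter updates of blocks 430 and 440 can be sketched as follows, with the default-rate counter acting as the fallback. All names and prefixes are illustrative.

import ipaddress

def account_packet(accounting_lists, counters, dst_ip, packet_bytes):
    """Diamond 420 and blocks 430/440: match the packet's destination against
    each accounting list; on a hit, add the packet size to that list's
    counter, otherwise to the default-rate counter."""
    dst = ipaddress.ip_address(dst_ip)
    for list_id, networks in accounting_lists.items():
        if any(dst in net for net in networks):
            counters[list_id] = counters.get(list_id, 0) + packet_bytes
            return list_id
    counters["default"] = counters.get("default", 0) + packet_bytes
    return "default"

lists = {"ord1-rate": [ipaddress.ip_network("10.1.0.0/16")]}
counters = {}
account_packet(lists, counters, "10.1.1.5", 9000)      # block 430 path
account_packet(lists, counters, "198.51.100.7", 1500)  # block 440: default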
Referring now to
As seen in
Still referring to
One alternate method of communicating billing/usage data is for the switch software itself to be configured to publish events at regular intervals to a publisher/subscriber system. In this way, if different consumers (e.g., network operations, billing, and resource planning) would all like to see the same data, each system need not query every software switch for usage metrics. In this model, they all subscribe to a live feed that allows them to consume the events that every logical switch publishes.
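A minimal in-process sketch of this model follows. The Bus class stands in for a real message broker, whose API is assumed here rather than taken from any particular system; the topic name and event layout are likewise illustrative.

import json
import time

class Bus:
    """In-process stand-in for a publish/subscribe system; a real deployment
    would use a message broker (this API is an assumption)."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, handler):
        self.subscribers.append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers:
            handler(topic, event)

def publish_usage(bus, port_id, counters, interval_s=60, cycles=1):
    # Switch-side loop: emit a counter snapshot each interval; billing,
    # network operations, and resource planning all consume the same feed.
    for _ in range(cycles):
        bus.publish("switch.usage",
                    json.dumps({"port": port_id, "counters": counters}))
        time.sleep(interval_s)

bus = Bus()
bus.subscribe(lambda topic, event: print(topic, event))
publish_usage(bus, "port-1", {"ord1-rate": 4096}, interval_s=0)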
Note that while the routes are layer 3 (IP destinations), the accounting lists have more granularity. For instance, if a datacenter allocates all user datagram protocol (UDP) traffic to be free, the integration system could be made aware of this policy. In this case, the accounting lists would be assembled and ordered as follows: count UDP traffic to destination 1; then count IP traffic to destination 1. This separates the more specific UDP traffic from the less specific IP traffic for a given destination. The billing system could then discard all the UDP counters and only bill based upon the IP line, making the UDP traffic free.
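The ordered-list behavior can be sketched as follows, where a first-match rule lets the more specific UDP entry shadow the general IP entry for the same destination. The entry identifiers and prefixes are illustrative.

import ipaddress

# Ordered accounting entries: first match wins, so the more specific UDP rule
# shadows the general IP rule for the same destination.
ENTRIES = [
    ("udp-dest1", "udp", ipaddress.ip_network("10.1.0.0/16")),
    ("ip-dest1",  "ip",  ipaddress.ip_network("10.1.0.0/16")),
]

def classify(dst_ip, protocol):
    """Return the first accounting entry matching the packet, or None."""
    dst = ipaddress.ip_address(dst_ip)
    for list_id, proto, net in ENTRIES:
        if dst in net and proto in (protocol, "ip"):
            return list_id
    return None

print(classify("10.1.1.5", "udp"))  # 'udp-dest1': counted, then billed as free
print(classify("10.1.1.5", "tcp"))  # 'ip-dest1': billed at the list's rate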
Embodiments may be implemented in code and may be stored on a storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of non-transitory storage medium suitable for storing electronic instructions.
While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.