Computer networks may provide centralized resources to multiple clients, or tenants, over communication links. A tenant is any entity that uses the resources of a network. As used herein, tenant segregation refers to the isolation of each tenant that accesses the network, such that the networking policies of each tenant are met by the network provider. In this manner, each tenant is unaware of other tenants using the resources of the network. A networking policy may include the networking services used by the tenant as well as the amount of data the tenant will place on the network. Tenant segregation ensures each tenant accesses the information belonging to that tenant and not the information of other tenants that access the same network.
As used herein, a communication link, or link, is a physical or wireless connection between the various resources of the network, between resources of the network and tenants that use the network, or between multiple networks. Communication links within a network are typically shared on a best effort basis. In a best effort scheme, each packet of data, regardless of the tenant where the packet originated, has an equal probability of accessing the link. Network protocols such as TCP/IP use a best effort scheme and may attempt to implement data flow fairness, but a tenant can negatively impact other tenants' network usage by having multiple data flows or by not using the TCP/IP protocol. As a result, a tenant may use more than the tenant's designated share of data flow across the network.
The quality of service (QoS) for a tenant of a network can dictate aspects of resource sharing across the network, including the designated amount of data flow for each tenant across the network. The designated data flow for a tenant can define the fair share of data flow for the tenant. The QoS that each tenant expects from a network provider may be formally agreed upon in a service level agreement (SLA). The network provider is tasked with providing services to each tenant that meet the QoS agreed upon under the terms of the SLA. To meet the terms of the SLA for each tenant, the network provider may implement over-provisioning of network resources or other mechanisms to control data flows and access to resources within the network.
Certain examples are described in the following detailed description and in reference to the drawings, in which:
Traditional QoS tools, such as differentiated services (DiffServ), can be used to control how network resource sharing is done and can share network links according to the chosen QoS policies. However, traditional QoS frameworks may not fully implement tenant segregation. The goals of traditional QoS frameworks typically include prioritizing traffic and enforcing latency guarantees. However, these goals do not ensure tenant segregation, as a tenant may be aware of other tenants on the network as traffic across the network is prioritized and latency guarantees are enforced. Additionally, traditional QoS tools may operate under a principle of traffic classification, in which the data from each tenant is placed into a limited number of traffic classes as opposed to differentiating network traffic based on each tenant's flow of traffic. Each traffic class can be treated differently according to the specified QoS of that class. The traffic classes may be assigned different rate limits or be prioritized. As used herein, a rate limit refers to the maximum amount of traffic that can be sent using a network. The number of traffic classes may be limited in traditional QoS tools. Further, the limited number of classes may not support a large number of tenants, as the different QoS policies may outnumber the traffic classes within a network.
Examples described herein allocate network bandwidth. Specifically, some examples allocate network bandwidth using distributed rate limiting (DRL). As used herein, bandwidth describes a rate of data transfer, or throughput, of each communication link. Each tenant of a network is allocated a fair share of the network bandwidth based on the QoS expected by the tenant and a DRL assignment of the tenant. As used herein, a fair share refers to the designated quantity of network bandwidth a tenant may access in accordance with a specified QoS, as determined by the capacity of the network, or as specified in an SLA that is designed to exploit the bandwidth of the communication links. As a result of the fair allocation of bandwidth across the communication links of the network, data congestion across the links is reduced. In examples, each tenant has a global rate target. If a tenant has a high rate target relative to the capacity of one link and uses few other links of the network, the tenant may be allocated a large portion of the one link. If the tenant has a small rate target relative to the capacity of one link and uses many other links of the network, the tenant may be allocated a small portion of the one link, relative to the capacity of the link. In this manner, the probability that each tenant is close to its global rate target is maximized. Additionally, the tenants do not exceed their respective global rate targets and are limited such that they do not consume all resources of the network. Furthermore, such an allocation of network bandwidth enables each tenant to access the network at the terms agreed upon in the SLA or some other QoS arrangement, effectively segregating the tenants by keeping each tenant within the tenant's specified rate target.
For ease of description, a link is congested when a bandwidth cap of the communication link is met. The bandwidth cap is the specified maximum bandwidth of a network component. The bandwidth cap of a component of the network may be specified by the manufacturer of the component or determined during testing. A link is uncongested when the bandwidth cap has not been met. Accordingly, when the bandwidth cap has not been met, there is additional bandwidth available on the link. It is envisioned that other standards may be used to define congested and uncongested links, and thus the present techniques are not limited to a single definition of congested and uncongested links. For example, a network service provider may set standards regarding when a link is deemed congested or uncongested by using a percent of the link's total capacity as a threshold for congestion.
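By way of illustration only, the two definitions of congestion described above may be sketched as a single predicate. The function name `is_congested`, its parameters, and the example threshold of 0.8 are assumptions introduced for this sketch and do not appear in the described examples.

```python
def is_congested(load_mbps: float, cap_mbps: float, threshold: float = 1.0) -> bool:
    """Return True when the offered load reaches threshold * cap.

    threshold=1.0 models "congested when the bandwidth cap is met".
    A provider-chosen fraction (hypothetical, e.g. 0.8) models the
    percentage-of-capacity rule.
    """
    if cap_mbps <= 0:
        raise ValueError("cap_mbps must be positive")
    return load_mbps >= threshold * cap_mbps


# A 1000 Mb/s link carrying 850 Mb/s is uncongested under the cap rule,
# but congested under an assumed 80 percent threshold.
cap_rule = is_congested(850.0, 1000.0)
percent_rule = is_congested(850.0, 1000.0, threshold=0.8)
```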
In examples, the tenant 102A may send traffic across the network 100 by using traffic sources 104A and 104B. Thus, traffic sources 104A and 104B are designated as being allocated to the tenant 102A. Similarly, the tenant 102B may send traffic across the network 100 by using traffic source 104C. Traffic source 104C is shown as being allocated to the tenant 102B. The traffic sources 104A, 104B, and 104C may send traffic to the switch 106A and the switch 106B. The switch 106B may send the traffic to network destinations 108A and 108B. As shown in network 100, the traffic from the tenant 102A is routed to the network destination 108A, while the traffic from the tenant 102B is routed to the network destination 108B. Additionally, a traffic source 104D may send traffic to another network destination 108C through switches 106C and 106D. In this example, the tenant 102A is using traffic sources 104A, 104B, and 104D, while the tenant 102B is using the traffic source 104C.
A network controller 110 may be a device that controls the switches 106A, 106B, 106C, and 106D and determines how traffic is routed through the network. In examples, the network 100 is a data center network, and the traffic from tenants 102A and 102B contains data that is to be processed within the network 100. The tenants 102A and 102B may use the resources connected to the network to process data or perform some networking functions that are traditionally done by network devices. In some examples, the tenants are corporations, businesses, organizations, individuals, or combinations thereof that use resources on the network. Additionally, in some examples, multiple tenants use multiple traffic sources, links, controllers, network destinations, computing nodes, network devices, network programs, other network resources, or combinations thereof, at the same time. The tenants 102A and 102B may request that the data be processed on the network, but the network controller 110 itself controls the processing requested by the tenants. Furthermore, the network controller 110 may track and allocate resources of the network on a per tenant basis. In some examples, the network controller 110 organizes all or a portion of the devices in the network. In other examples, the network is a peer-to-peer network where controls of the network are distributed among multiple devices in the network 100.
In the example of
The network 100 may have devices or mechanisms that prevent the capacity of network destinations 108A, 108B, and 108C from being exceeded by the traffic sources or tenants, such as rate limiter devices. However, the communication links 112 and 114 of the network may also be susceptible to congestion when traffic demands exceed the capacity of the communication links. The communication links shown are illustrative of the types of communication links that may be present in a network. However, the communication links shown are not exhaustive. Furthermore, it is assumed that other communication links may exist within the network, such as communication links between various software modules and hardware devices. The communication links 112 and 114 can become congested as the network allocates bandwidth of the links on a best effort basis. When allocating links on a best effort basis, the network provider makes an attempt to provide each tenant with enough bandwidth to satisfy that tenant's workload. However, an assurance of a particular quality of service (QoS) is not made, nor is any tenant assured a certain priority within the network.
A field 208 representing the rate of traffic at the traffic source 104A indicates that the traffic source 104A sends traffic across link 112 at 500 megabits per second. Similarly, fields 210 and 212 indicate that the traffic sources 104B and 104C each send traffic across link 112 at a rate of 500 megabits per second. Furthermore, field 214 indicates that the traffic source 104D sends traffic across link 114 at a rate of 500 megabits per second. In this example, the link 112 is congested, as the sum of the traffic from the traffic sources 104A, 104B, and 104C exceeds the capacity of the link 112. The link 114 is uncongested, as the traffic from the single traffic source assigned to link 114 does not exceed the capacity of link 114. Further, since tenant 102A has access to more traffic sources when compared to tenant 102B, tenant 102A can implement multiple flows to use more bandwidth than designated by the SLA.
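The load figures above may be tallied as in the following sketch, offered as an illustration rather than as the described implementation. It assumes a capacity of 1 gigabit per second for each link and models best effort sharing in an idealised way, with capacity split in proportion to each flow's offered rate. The names `flows`, `link_loads`, and `best_effort_tenant_share` are hypothetical.

```python
from collections import defaultdict

LINK_CAPACITY_MBPS = 1000.0  # assumption: each link carries 1 Gb/s

# (traffic source, tenant, link, sending rate in Mb/s)
flows = [
    ("104A", "102A", "112", 500.0),
    ("104B", "102A", "112", 500.0),
    ("104C", "102B", "112", 500.0),
    ("104D", "102A", "114", 500.0),
]


def link_loads(flows):
    """Sum the offered rate of every traffic source on each link."""
    loads = defaultdict(float)
    for _source, _tenant, link, rate in flows:
        loads[link] += rate
    return dict(loads)


def best_effort_tenant_share(flows, link, capacity):
    """Idealised best effort: scale every flow on the link by the same factor."""
    on_link = [(tenant, rate) for _src, tenant, lk, rate in flows if lk == link]
    total = sum(rate for _tenant, rate in on_link)
    scale = 1.0 if total == 0 else min(1.0, capacity / total)
    shares = defaultdict(float)
    for tenant, rate in on_link:
        shares[tenant] += rate * scale
    return dict(shares)


loads = link_loads(flows)  # link 112 carries 1500 Mb/s, link 114 carries 500 Mb/s
shares = best_effort_tenant_share(flows, "112", LINK_CAPACITY_MBPS)
```

Under these assumptions, tenant 102A obtains roughly two thirds of link 112 simply because it drives two flows across the link, which is the imbalance the paragraph above describes.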
Distributed rate limiting (DRL) may be used to limit network congestion. DRL is a mechanism by which a total rate limit of the network is distributed across multiple traffic sources. The rate limit refers to the amount of traffic that crosses particular points within the network. The global aggregate rate limit of the network is the sum of the rate limit of each traffic source at any point in time. Using DRL, the global aggregate rate limit may be applied to multiple traffic sources by subdividing the global aggregate rate limit and allocating the subdivided global aggregate rate limit piecewise among the traffic sources. In a DRL implementation where all traffic across a communication link is attributed to a single tenant, that tenant may be assigned the entire aggregate rate limit of the communication link, as the tenant is the only traffic sender to which the subdivided rate limit may be allocated. In this scenario, the global aggregate rate limit is allocated to a single tenant without considering the other tenants that may be sharing the traffic source at a future point in time. Accordingly, the single tenant has an unfair allocation across a particular traffic source when other tenants attempt to access the traffic source at a future point in time. Alternatively, DRL may also be implemented such that the capacity of a communication link is not exceeded by the global aggregate rate limit. For example, all tenants sharing a link may place their entire traffic allocation on the link, with the sum of the global aggregate rate limit for those tenants being less than the capacity of the link. This implementation may cause congestion on one link, while under-utilizing the other links within the network. Most often, DRL is implemented so that the global aggregate rate limit is close to the aggregate capacity of the network as a whole. 
As a result, some of the links of the network may be over-utilized, or congested, due to the instantaneous traffic pattern of the tenants.
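As a rough sketch of the subdivision step, a global aggregate rate limit may be split among traffic sources in proportion to their current demand. Proportional splitting is an assumption chosen for this illustration; the description does not prescribe a particular subdivision policy. The function name and the source labels `s1`, `s2`, and `s3` are hypothetical.

```python
def subdivide_rate_limit(global_limit_mbps, demands_mbps):
    """Split a global aggregate rate limit across traffic sources.

    Assumed policy: each source receives a share proportional to its
    demand, so the per-source limits always sum to the global limit.
    """
    if not demands_mbps:
        return {}
    total = sum(demands_mbps.values())
    if total == 0:
        # No demand anywhere: fall back to an equal split.
        equal = global_limit_mbps / len(demands_mbps)
        return {src: equal for src in demands_mbps}
    return {src: global_limit_mbps * d / total for src, d in demands_mbps.items()}


# Three senders share a 1000 Mb/s aggregate limit.
limits = subdivide_rate_limit(1000.0, {"s1": 600.0, "s2": 200.0, "s3": 0.0})

# A lone sender receives the entire aggregate limit, which is the
# single-tenant scenario discussed above.
lone = subdivide_rate_limit(1000.0, {"s1": 300.0})
```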
To mitigate congestion across links of a network, a weighted fair sharing mechanism may be used to allocate bandwidth across contended links to multiple tenants. The weighted fair sharing mechanism may be implemented, in part, through the use of rate limiters, which are a mechanism that limits the traffic that is sent or received at a particular point within the network. Limiters may be located at each traffic source, and each limiter may operate independently at each sender, without inter-limiter coordination. However, the use of limiters operating independently at each sender may prevent the use of a global aggregate rate limit across multiple traffic sources, as each limiter operates independently. Further, such per link weighted fair sharing also unfairly penalizes tenants that have a higher portion of their traffic on congested links when compared to tenants that have a higher portion of their traffic on uncongested links. The penalty occurs when the tenants that have a higher portion of their traffic on uncongested links use more than their fair share of the network.
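The penalty described above may be seen in a small sketch of per-link weighted fair sharing. Equal weights for the two tenants and a link capacity of 1 gigabit per second are assumptions made for the illustration, as is the function name `weighted_fair_share`.

```python
def weighted_fair_share(capacity_mbps, weights):
    """Split one link's capacity among tenants in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return {tenant: capacity_mbps * w / total for tenant, w in weights.items()}


# Congested link 112, shared with equal (assumed) weights.
link_112 = weighted_fair_share(1000.0, {"102A": 1.0, "102B": 1.0})

# Tenant 102A also carries 500 Mb/s on uncongested link 114.
# Tenant 102B has no other link to fall back on.
total_102a = link_112["102A"] + 500.0
total_102b = link_112["102B"]
```

Because each limiter considers only its own link, tenant 102B ends up with half of the network-wide throughput of tenant 102A in this sketch, even though both tenants are weighted equally.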
To avoid penalizing tenants that have a higher portion of their traffic on congested links, a traffic matrix for each tenant may be used to allocate traffic. The traffic matrix may describe the load of each tenant on each link, and an analysis of the matrix can assure that each tenant gets a fair allocation on each link by rejecting tenants whose traffic matrix is not satisfied by the system. For example, the traffic matrix of a tenant may attempt to consume more network bandwidth than is available in the network. Such a tenant is rejected by the network, as the network is incapable of servicing the traffic matrix. Other tenants may be rejected because their traffic matrix attempts to consume more network bandwidth than is allowed by the QoS. Each tenant pre-defines its traffic matrix, which can be done for a tenant whose traffic load is predictable and static. Network tenants whose traffic is dynamic or unpredictable can either define their traffic matrix for a worst case scenario by requesting resources that will be mostly idle, or they can define their traffic matrix for an average case and be arbitrarily constrained on some links while underutilizing other links when the actual traffic of the tenant does not correspond to its traffic matrix. Such a system does not offer the ability to move allocated resources to optimize for dynamic traffic flows.
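A traffic matrix admission check of the kind described above may be sketched as follows. The representation of a traffic matrix as a mapping from link to requested load, the `residual_capacity` mapping, and the function name `admit` are assumptions made for illustration only.

```python
def admit(traffic_matrix, residual_capacity, qos_cap_mbps):
    """Decide whether a tenant's pre-defined traffic matrix can be accepted.

    traffic_matrix:    {link: requested load in Mb/s} for one tenant
    residual_capacity: {link: capacity still unallocated in Mb/s}
    qos_cap_mbps:      total bandwidth the tenant's QoS allows
    """
    # Reject a matrix that asks for more than the QoS permits.
    if sum(traffic_matrix.values()) > qos_cap_mbps:
        return False
    # Reject a matrix that any single link cannot service.
    return all(
        load <= residual_capacity.get(link, 0.0)
        for link, load in traffic_matrix.items()
    )


free = {"112": 1000.0, "114": 1000.0}
fits = admit({"112": 500.0, "114": 500.0}, free, qos_cap_mbps=1000.0)
too_big_for_link = admit({"112": 1200.0}, free, qos_cap_mbps=2000.0)
over_qos = admit({"112": 800.0, "114": 800.0}, free, qos_cap_mbps=1000.0)
```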
In examples, a system may coordinate and enforce aggregate rate limits for multiple tenants across a distributed set of data-center network devices. The system may implement a mechanism that segregates the multiple tenants using the network by taking into account the negotiated global rate, the demands, and the uplink capacities of each tenant. In this manner, the traffic of the tenants is allocated so that rate-limited tenants fairly share contended links while each tenant receives performance as close as possible to its assigned rate. Additionally, in examples, the congested and uncongested links may be identified. The DRL assignment for each tenant on each link is determined. The global amount of bandwidth owed to each tenant is calculated by subtracting the total traffic assignments on uncongested links from the bandwidth cap for each tenant. Additionally, the global amount of bandwidth owed may be distributed to the congested links of the tenant.
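The owed bandwidth calculation may be sketched as below. The subtraction follows the description above. Distributing the owed bandwidth across congested links in proportion to the DRL assignment on each of those links is an assumption of this sketch, since the description leaves the distribution rule open. The function names are hypothetical, and the figures are those of the two-tenant example in this description.

```python
def global_owed(cap_mbps, drl_assignment, congested_links):
    """Bandwidth cap minus the tenant's assignments on uncongested links."""
    on_uncongested = sum(
        a for link, a in drl_assignment.items() if link not in congested_links
    )
    return max(0.0, cap_mbps - on_uncongested)


def per_link_owed(cap_mbps, drl_assignment, congested_links):
    """Spread the global owed bandwidth over the tenant's congested links.

    Assumed rule: proportional to the DRL assignment on each congested link.
    """
    owed = global_owed(cap_mbps, drl_assignment, congested_links)
    on_congested = {
        link: a for link, a in drl_assignment.items() if link in congested_links
    }
    total = sum(on_congested.values())
    if total == 0:
        return {link: 0.0 for link in on_congested}
    return {link: owed * a / total for link, a in on_congested.items()}


congested = {"112"}
# Tenant 102A: 1 Gb/s cap, 500 Mb/s assigned on each of links 112 and 114.
owed_102a = per_link_owed(1000.0, {"112": 500.0, "114": 500.0}, congested)
# Tenant 102B: 1 Gb/s cap, 750 Mb/s assigned on link 112 only.
owed_102b = per_link_owed(1000.0, {"112": 750.0}, congested)
```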
At block 506, each tenant is allocated bandwidth on a congested link based on the per-link tenant owed bandwidth. At block 508, it is determined whether the sum of the allocated bandwidth for each tenant on the congested link is greater than that link's capacity. If the sum of the allocated bandwidth for each tenant on a link is greater than that link's capacity, process flow continues to block 510. If the sum of the allocated bandwidth for each tenant on a link is not greater than that link's capacity, process flow continues to block 512.
At block 510, the allocated bandwidth for each tenant is proportionally scaled down when the sum of the allocated bandwidth for all tenants using the link is greater than the link capacity. In this manner, the capacity of a link is not exceeded and the link is not congested. Each tenant is allocated a share of bandwidth on the link based on the link capacity.
At block 512, it is determined if the allocated bandwidth for a tenant on a congested link is greater than the tenant's demand for bandwidth on that link. If the allocated bandwidth for a tenant on a congested link is greater than the tenant's demand for bandwidth on that link, process flow continues to block 514. If the allocated bandwidth for a tenant on a congested link is not greater than the tenant's demand for bandwidth on that link, process flow continues to block 516.
At block 514, the tenant's unused allocated bandwidth is shared across the other tenants on the same congested link in proportion to each tenant's allocated bandwidth on the congested link, and process flow continues to block 516. Unused allocated bandwidth is the allocated bandwidth minus the demand of the tenant on the congested link.
At block 516, the allocated bandwidth is distributed on each congested link where the tenant has a demand for bandwidth. In this manner, the tenants are segregated by identifying contended links and sharing the links in the presence of multiple network tenants. The fairness occurs in that each tenant is allocated the quantity of bandwidth that each tenant is owed based on each tenant's global usage of the network, and not merely the usage of a link.
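The flow of blocks 506 through 516 for a single congested link may be sketched as follows. This is a simplified, single-pass illustration under stated assumptions: redistributed bandwidth is capped at the receiving tenant's demand, and any remainder is not redistributed a second time. The function name `allocate_link` and its parameters are hypothetical.

```python
def allocate_link(owed, demand, capacity_mbps):
    """Allocate one congested link among tenants.

    owed:   {tenant: per-link owed bandwidth in Mb/s}
    demand: {tenant: demand on this link in Mb/s}
    """
    # Block 506: start from the per-link tenant owed bandwidth.
    alloc = dict(owed)

    # Blocks 508 and 510: scale down proportionally if the link is oversubscribed.
    total = sum(alloc.values())
    if total > capacity_mbps:
        alloc = {t: a * capacity_mbps / total for t, a in alloc.items()}

    # Block 512: find tenants whose allocation exceeds their demand.
    unused = {t: alloc[t] - demand[t] for t in alloc if alloc[t] > demand[t]}
    spare = sum(unused.values())

    # Block 514: share the unused bandwidth among the other tenants in
    # proportion to their allocation (assumption: capped at their demand).
    takers = {t: a for t, a in alloc.items() if t not in unused}
    base = sum(takers.values())
    final = {t: min(alloc[t], demand[t]) for t in alloc}
    if spare > 0 and base > 0:
        for t, a in takers.items():
            final[t] = min(demand[t], a + spare * a / base)

    # Block 516: the result is the bandwidth distributed on this link.
    return final


result = allocate_link(
    owed={"102A": 500.0, "102B": 1000.0},
    demand={"102A": 1000.0, "102B": 750.0},
    capacity_mbps=1000.0,
)
```

With the owed bandwidths of 500 and 1000 megabits per second on a 1 gigabit per second link, the sketch scales the allocations to roughly one third and two thirds of the link, and neither tenant has unused bandwidth to give back.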
The network capacity is shown as the global aggregate rate limit of 2 gigabits per second in row 604 of table 600. Accordingly, each link has a capacity of 1 gigabit per second, as shown in row 606. The DRL assignment for each tenant may be calculated using an estimated traffic demand of the tenant on each link and the bandwidth cap of the tenant. Accordingly, for tenant 102A, the DRL assignment on link 114 may be calculated using the bandwidth cap of 1 gigabit per second. During allocation, the bandwidth cap is shared equally among each link where the tenant has traffic. Since tenant 102A has a bandwidth cap of 1 gigabit per second, shared across two links, the DRL assignment of tenant 102A on link 114 in row 608 is 500 megabits per second. The DRL assignment of tenant 102A on link 112 in row 608 is also 500 megabits per second. The demand of traffic source 104D is greater than the DRL assignment of tenant 102A on link 114. As a result, link 114 is uncongested and shows a final allocation in row 610 of 500 megabits per second to tenant 102A.
For tenant 102B, the entire bandwidth cap of 1 gigabit per second is placed on a single link, specifically link 112. However, the demand of traffic source 104C is less than the bandwidth cap of tenant 102B on link 112. As a result, the DRL assignment of tenant 102B on link 112 in row 608 is limited to the demand of traffic source 104C at 750 megabits per second. The tenant owed bandwidth for tenant 102B on link 112 is 1 gigabit per second. The final allocation of bandwidth is determined by dividing the tenant owed bandwidth by the sum of bandwidth owed to all tenants on the link and applying the resulting fraction to the capacity of the link. In this example, the total owed bandwidth of link 112 is 1500 megabits per second, and the capacity of link 112 is 1 gigabit per second. Accordingly, the final allocation in row 610 of tenant 102B on link 112 is approximately 666 megabits per second. Similarly, the final allocation in row 610 of tenant 102A on link 112 is approximately 333 megabits per second, as the owed bandwidth for tenant 102A on link 112 is 500 megabits per second.
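The arithmetic behind the final allocations on link 112 may be written out as follows. The symbols are introduced here for illustration only: $o_{t,112}$ is the owed bandwidth of tenant $t$ on link 112, $C_{112}$ is the capacity of link 112, and $a_{t,112}$ is the final allocation. Scaling the owed fraction by the 1 gigabit per second link capacity is what reproduces the figures quoted for row 610.

```latex
\begin{align*}
a_{102B,112} &= \frac{o_{102B,112}}{o_{102A,112} + o_{102B,112}}\, C_{112}
              = \frac{1000}{500 + 1000} \times 1000 \approx 666.7 \text{ Mb/s} \\
a_{102A,112} &= \frac{o_{102A,112}}{o_{102A,112} + o_{102B,112}}\, C_{112}
              = \frac{500}{500 + 1000} \times 1000 \approx 333.3 \text{ Mb/s}
\end{align*}
```

The two allocations sum to the 1000 megabits per second capacity of link 112, so the link is fully used without being exceeded.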
The various software components discussed herein may be stored on the tangible, non-transitory computer-readable medium, as indicated in
While the present techniques may be susceptible to various modifications and alternative forms, the examples discussed above have been shown only by way of example. It is to be understood that the techniques are not intended to be limited to the particular examples disclosed herein. Indeed, the present techniques include all alternatives, modifications, and equivalents falling within the true spirit and scope of the appended claims.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2012/035866 | 4/30/2012 | WO | 00 | 10/20/2014 |