The computing industry faces increasing challenges in its efforts to improve the speed and efficiency of software-driven computing devices, e.g., due to power limitations and other factors. Software-driven computing devices employ one or more central processing units (CPUs) that process machine-readable instructions in a conventional temporal manner. To address this issue, the computing industry has proposed using hardware acceleration components (such as field-programmable gate arrays (FPGAs)) to supplement the processing performed by software-driven computing devices. However, software-driven computing devices and hardware acceleration components are dissimilar types of devices having fundamentally different architectures, performance characteristics, power requirements, program configuration paradigms, interface features, and so on. It is thus a challenging task to integrate these two types of devices together in a manner that satisfies the various design requirements of a particular data processing environment.
A service mapping component (SMC) is described herein for allocating services to hardware acceleration components in a data processing system to satisfy general demand for the services, individual requests for the services, and/or other factors. The data processing system is characterized by a hardware acceleration plane that is made up of the hardware acceleration components, together with a software plane that is made up of a plurality of software-driven host components. In one mode of operation, the SMC is configured to select, in response to a triggering event, at least one hardware acceleration component in the hardware acceleration plane to perform a service, based on at least one mapping consideration and based on availability information that describes a pool of available hardware acceleration components. A configuration component may then configure the selected hardware acceleration component(s) to perform the service, provided that they are not already configured to do so. Each host component in the software plane is configured to access the service provided by one or more of the selected hardware acceleration component(s) via an associated local hardware acceleration component or some other path(s).
Without limitation, the mapping considerations can include any one or more of: service level agreement considerations, load-balancing considerations, bandwidth-related considerations, latency-related considerations, power-related considerations, line-rate considerations (indicating whether the service is a line-rate service), security-related considerations, migration cost considerations, historical demand considerations, monetary cost considerations, host loading considerations, type-of-service considerations (e.g., indicating whether the service is characterized by bursty or steady traffic patterns), thermal-related considerations, and so on.
The above-summarized functionality can be manifested in various types of systems, devices, components, methods, computer readable storage media, data structures, graphical user interface presentations, articles of manufacture, and so on.
This Summary is provided to introduce a selection of concepts in a simplified form; these concepts are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
The same numbers are used throughout the disclosure and figures to reference like components and features. Series 100 numbers refer to features originally found in
This disclosure is organized as follows. Section A describes an illustrative data processing system that includes a hardware acceleration plane and a software plane. Section B describes management functionality that is used to manage the data processing system of Section A. Section C sets forth one implementation of an illustrative hardware acceleration component in the hardware acceleration plane.
As a preliminary matter, some of the figures describe concepts in the context of one or more structural components, variously referred to as functionality, modules, features, elements, etc. The various components shown in the figures can be implemented in any manner by any physical and tangible mechanisms, for instance, by software running on computer equipment, hardware (e.g., chip-implemented logic functionality), etc., and/or any combination thereof. In one case, the illustrated separation of various components in the figures into distinct units may reflect the use of corresponding distinct physical and tangible components in an actual implementation. Alternatively, or in addition, any single component illustrated in the figures may be implemented by plural actual physical components. Alternatively, or in addition, the depiction of any two or more separate components in the figures may reflect different functions performed by a single actual physical component.
Other figures describe the concepts in flowchart form. In this form, certain operations are described as constituting distinct blocks performed in a certain order. Such implementations are illustrative and non-limiting. Certain blocks described herein can be grouped together and performed in a single operation, certain blocks can be broken apart into plural component blocks, and certain blocks can be performed in an order that differs from that which is illustrated herein (including a parallel manner of performing the blocks). The blocks shown in the flowcharts can be implemented in any manner by any physical and tangible mechanisms, for instance, by software running on computer equipment, hardware (e.g., chip-implemented logic functionality), etc., and/or any combination thereof.
As to terminology, the phrase “configured to” encompasses any way that any kind of physical and tangible functionality can be constructed to perform an identified operation. The functionality can be configured to perform an operation using, for instance, software running on computer equipment, hardware (e.g., chip-implemented logic functionality), etc., and/or any combination thereof.
The term “logic” encompasses any physical and tangible functionality for performing a task. For instance, each operation illustrated in the flowcharts corresponds to a logic component for performing that operation. An operation can be performed using, for instance, software running on computer equipment, hardware (e.g., chip-implemented logic functionality), etc., and/or any combination thereof. When implemented by computing equipment, a logic component represents an electrical component that is a physical part of the computing system, however implemented.
Any of the storage resources described herein, or any combination of the storage resources, may be regarded as a computer readable medium. In many cases, a computer readable medium represents some form of physical and tangible entity. The term computer readable medium also encompasses propagated signals, e.g., transmitted or received via physical conduit and/or air or other wireless medium, etc. However, the specific terms “computer readable storage medium” and “computer readable medium device” expressly exclude propagated signals per se, while including all other forms of computer readable media.
The following explanation may identify one or more features as “optional.” This type of statement is not to be interpreted as an exhaustive indication of features that may be considered optional; that is, other features can be considered as optional, although not explicitly identified in the text. Further, any description of a single entity is not intended to preclude the use of plural such entities; similarly, a description of plural entities is not intended to preclude the use of a single entity. Further, while the description may explain certain features as alternative ways of carrying out identified functions or implementing identified mechanisms, the features can also be combined together in any combination. Finally, the terms “exemplary” or “illustrative” refer to one implementation among potentially many implementations.
A. Overview
The term “hardware acceleration component” is also intended to broadly encompass different ways of leveraging a hardware device to perform a function, including, for instance, at least: a) a case in which at least some tasks are implemented in hard ASIC logic or the like; b) a case in which at least some tasks are implemented in soft (configurable) FPGA logic or the like; c) a case in which at least some tasks run as software on FPGA software processor overlays or the like; d) a case in which at least some tasks run on MPPAs of soft processors or the like; e) a case in which at least some tasks run as software on hard ASIC processors or the like, and so on, or any combination thereof. Likewise, the data processing system 102 can accommodate different manifestations of software-driven devices in the software plane 104.
To simplify repeated reference to hardware acceleration components, the following explanation will henceforth refer to these devices as simply “acceleration components.” Further, the following explanation will present a primary example in which the acceleration components correspond to FPGA devices, although, as noted, the data processing system 102 may be constructed using other types of acceleration components. Further, the hardware acceleration plane 106 may be constructed using a heterogeneous collection of acceleration components, including different types of FPGA devices having different respective processing capabilities and architectures, a mixture of FPGA devices and other devices, and so on.
A host component generally performs operations using a temporal execution paradigm, e.g., by using each of its CPU hardware threads to execute machine-readable instructions, one after the other. In contrast, an acceleration component may perform operations using a spatial paradigm, e.g., by using a large number of parallel logic elements to perform computational tasks. Thus, an acceleration component can perform some operations in less time compared to a software-driven host component. In the context of the data processing system 102, the “acceleration” qualifier associated with the term “acceleration component” reflects its potential for accelerating the functions that are performed by the host components.
In one example, the data processing system 102 corresponds to a data center environment that includes a plurality of computer servers. The computer servers correspond to the host components in the software plane 104 shown in
In one implementation, each host component in the data processing system 102 is coupled to at least one acceleration component through a local link. That fundamental unit of processing equipment is referred to herein as a “server unit component” because that equipment may be grouped together and maintained as a single serviceable unit within the data processing system 102 (although not necessarily so). The host component in the server unit component is referred to as the “local” host component to distinguish it from other host components that are associated with other server unit components. Likewise, the acceleration component(s) of the server unit component are referred to as the “local” acceleration component(s) to distinguish them from other acceleration components that are associated with other server unit components.
For example,
The local host component 108 may further indirectly communicate with any other remote acceleration component in the hardware acceleration plane 106. For example, the local host component 108 has access to a remote acceleration component 116 via the local acceleration component 110. More specifically, the local acceleration component 110 communicates with the remote acceleration component 116 via a link 118.
In one implementation, a common network 120 is used to couple host components in the software plane 104 to other host components, and to couple acceleration components in the hardware acceleration plane 106 to other acceleration components. That is, two host components may use the same network 120 to communicate with each other as do two acceleration components. As another feature, the interaction among host components in the software plane 104 is independent of the interaction among acceleration components in the hardware acceleration plane 106. This means, for instance, that two or more acceleration components may communicate with each other in a transparent manner from the perspective of host components in the software plane 104, outside the direction of the host components, and without the host components being “aware” of the particular interactions that are taking place in the hardware acceleration plane 106. A host component may nevertheless initiate interactions that take place in the hardware acceleration plane 106 by issuing a request for a service that is hosted by the hardware acceleration plane 106.
According to one non-limiting implementation, the data processing system 102 uses the Ethernet protocol to transmit IP packets over the common network 120. In one implementation, each local host component in a server unit component is given a single physical IP address. The local acceleration component in the same server unit component may adopt the same IP address. The server unit component can determine whether an incoming packet is destined for the local host component as opposed to the local acceleration component in different ways. For example, packets that are destined for the local acceleration component can be formulated as user datagram protocol (UDP) packets specifying a specific port; host-destined packets, on the other hand, are not formulated in this way. In another case, packets belonging to the acceleration plane 106 can be distinguished from packets belonging to the software plane 104 based on the value of a status flag in each of the packets (e.g., in the header or body of a packet).
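By way of illustration only, the two ways of distinguishing acceleration-plane packets from host-destined packets described above can be sketched as follows. The port number, the Packet fields, and the flag name are hypothetical assumptions introduced for this sketch; the disclosure does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical UDP port reserved for acceleration-plane traffic (an assumption).
ACCEL_UDP_PORT = 50000


@dataclass
class Packet:
    protocol: str                   # e.g., "UDP" or "TCP"
    dst_port: Optional[int] = None
    accel_flag: bool = False        # hypothetical status flag in the header or body


def classify_by_port(pkt: Packet) -> str:
    """First approach: acceleration-bound packets are UDP packets on a specific port."""
    if pkt.protocol == "UDP" and pkt.dst_port == ACCEL_UDP_PORT:
        return "acceleration"
    return "host"


def classify_by_flag(pkt: Packet) -> str:
    """Second approach: a status flag marks packets that belong to the acceleration plane."""
    return "acceleration" if pkt.accel_flag else "host"
```

Either function lets a server unit component that shares a single IP address decide which of its two components should receive an incoming packet.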
In view of the above characteristic, the data processing system 102 may be conceptualized as forming two logical networks that share the same physical communication links. The packets associated with the two logical networks may be distinguished from each other by their respective traffic classes in the manner described above. But in other implementations (e.g., as described below with respect to
Finally, management functionality 122 serves to manage the operations of the data processing system 102. As will be set forth in greater detail in Section B (below), the management functionality 122 can be physically implemented using different control architectures. For example, in one control architecture, the management functionality 122 may include plural local management components that are coupled to one or more global management components.
By way of introduction to Section B, the management functionality 122 can include a number of sub-components that perform different respective logical functions (which can be physically implemented in different ways). A location determination component 124, for instance, identifies the current locations of services within the data processing system 102, based on current allocation information stored in a data store 126. As used herein, a service refers to any function that is performed by the data processing system 102. For example, one service may correspond to an encryption function. Another service may correspond to a document ranking function. Another service may correspond to a data compression function, and so on.
In operation, the location determination component 124 may receive a request for a service. In response, the location determination component 124 returns an address associated with the service, if that address is present in the data store 126. The address may identify a particular acceleration component that hosts the requested service.
A service mapping component (SMC) 128 maps services to particular acceleration components. The SMC 128 may operate in at least two modes depending on the type of triggering event that invokes its operation. In a first case, the SMC 128 processes requests for services made by instances of tenant functionality. An instance of tenant functionality may correspond to a software program running on a particular local host component, or, more specifically, a program executing on a virtual machine that, in turn, is associated with the particular local host component. That software program may request a service in the course of its execution. The SMC 128 handles the request by determining an appropriate component (or components) in the data processing system 102 to provide the service. Possible components for consideration include: a local acceleration component (associated with the local host component from which the request originated); a remote acceleration component; and/or the local host component itself (whereupon the local host component will implement the service in software). The SMC 128 makes its determinations based on one or more mapping considerations, such as whether the requested service pertains to a line-rate service.
In another manner of operation, the SMC 128 generally operates in a background and global mode, allocating services to acceleration components based on global conditions in the data processing system 102 (rather than, or in addition to, handling individual requests from instances of tenant functionality). For example, the SMC 128 may invoke its allocation function in response to a change in demand that affects one or more services. In this mode, the SMC 128 again makes its determinations based on one or more mapping considerations, such as the historical demand associated with the services, etc.
The SMC 128 may interact with the location determination component 124 in performing its functions. For instance, the SMC 128 may consult the data store 126 when it seeks to determine the address of an already allocated service provided by an acceleration component. The SMC 128 can also update the data store 126 when it maps a service to one or more acceleration components, e.g., by storing the addresses of those acceleration components in relation to the service.
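As a non-limiting sketch, the interaction between the location determination component 124, the SMC 128, and the data store 126 might be modeled as a simple service-to-address map. The class and method names below are hypothetical and serve only to illustrate the lookup and update roles described above.

```python
from typing import Dict, List, Optional


class DataStore:
    """Maps each service name to the addresses of the components that host it."""

    def __init__(self) -> None:
        self._allocations: Dict[str, List[str]] = {}

    def lookup(self, service: str) -> Optional[List[str]]:
        """Lookup role: return the stored addresses, or None if the service is absent."""
        return self._allocations.get(service)

    def record(self, service: str, addresses: List[str]) -> None:
        """Update role: store new addresses in relation to the service."""
        self._allocations.setdefault(service, []).extend(addresses)


store = DataStore()
store.record("compression", ["a3"])  # the service name and address are illustrative
```

In this sketch, a lookup for a service that has not yet been allocated returns None, which a caller could treat as a cue to invoke the mapping operation.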
Although not shown in
Note that
In many cases, a requested service is implemented on a single acceleration component (although there may be plural redundant such acceleration components to choose from among). But in the particular example of
In the particular case of
First, note that the operations that take place in the hardware acceleration plane 106 are performed independently of operations performed in the software plane 104. In other words, the host components in the software plane 104 do not manage the operations in the hardware acceleration plane 106. However, the host components may invoke the operations in the hardware acceleration plane 106 by issuing requests for services that are hosted by the hardware acceleration plane 106.
Second, note that the hardware acceleration plane 106 performs its transactions in a manner that is transparent to a requesting host component. For example, the local host component 204 may be “unaware” of how its request is being processed in the hardware acceleration plane, including the fact that the service corresponds to a multi-component service.
Third, note that, in this implementation, the communication in the software plane 104 (e.g., corresponding to operation (1)) takes place using the same common network 120 as communication in the hardware acceleration plane 106 (e.g., corresponding to operations (3)-(6)). Operations (2) and (7) may take place over a local link, corresponding to the localH-to-localS coupling 114 shown in
The multi-component service shown in
For example,
Moreover, a multi-component service does not necessarily need to employ a single head component, or any head component. For example, a multi-component service can employ a cluster of acceleration components which all perform the same function. The data processing system 102 can be configured to invoke this kind of multi-component service by contacting any arbitrary member in the cluster. That acceleration component may be referred to as a head component because it is the first component to be accessed, but it otherwise has no special status. In yet other cases, a host component may initially distribute plural requests to plural members of a collection of acceleration components.
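For illustration, invoking a cluster-style multi-component service by contacting an arbitrary member can be sketched as follows. The function name and the use of random selection are assumptions; any member-selection rule would serve equally well.

```python
import random
from typing import List, Optional


def pick_entry_point(cluster: List[str], rng: Optional[random.Random] = None) -> str:
    """Return the address of an arbitrary cluster member to contact first.

    The chosen member acts as a head component only in the sense that it is
    accessed first; it has no other special status.
    """
    if not cluster:
        raise ValueError("cluster has no members")
    return (rng or random).choice(cluster)
```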
Finally, note that the local acceleration component 418 is coupled to the TOR switch 410. Hence, in this particular implementation, the local acceleration component 418 represents the sole path through which the host component 412 interacts with other components in the data center 402 (including other host components and other acceleration components). Among other effects, the architecture of
Note that the local host component 412 may communicate with the local acceleration component 418 through the local link 420 or via the NIC 422. Different entities may leverage these two paths in different respective circumstances. For example, assume that a program running on the host component 412 requests a service. In one implementation, assume that the host component 412 provides a local instantiation of the location determination component 124 and the data store 126. Or a global management component may provide the location determination component 124 and its data store 126. In either case, the host component 412 may consult the data store 126 to determine the address of the service. The host component 412 may then access the service via the NIC 422 and the TOR switch 410, using the identified address.
In another implementation, assume that local acceleration component 418 provides a local instantiation of the location determination component 124 and the data store 126. The host component 412 may access the local acceleration component 418 via the local link 420. The local acceleration component 418 can then consult the local data store 126 to determine the address of the service, upon which it accesses the service via the TOR switch 410. Still other ways of accessing the service are possible.
The routing infrastructure shown in
The data center 402 shown in
Generally note that, while
Also note that, in the examples set forth above, a server unit component may refer to a physical grouping of components, e.g., by forming a single serviceable unit within a rack of a data center. In other cases, a server unit component may include one or more host components and one or more acceleration components that are not necessarily housed together in a single physical unit. In that case, a local acceleration component may be considered logically, rather than physically, associated with its respective local host component.
Alternatively, or in addition, a local host component and one or more remote acceleration components can be implemented on a single physical component, such as a single MPSoC-FPGA die. The network switch may also be incorporated into that single component.
In other cases, local hard CPUs, and/or soft CPUs, and/or acceleration logic provided by a single processing component (e.g., as implemented on a single die) may be coupled via diverse networks to other elements on other processing components (e.g., as implemented on other dies, boards, racks, etc.). An individual service may itself utilize one or more recursively local interconnection networks.
Further note that the above description was framed in the context of host components which issue service requests that are satisfied by acceleration components. But alternatively, or in addition, any acceleration component can also make a request for a service which can be satisfied by any other component, e.g., another acceleration component and/or even a host component. The SMC 128 can address such a request in a similar manner to that described above. Indeed, certain features described herein can be implemented on a hardware acceleration plane by itself, without a software plane.
More generally stated, certain features can be implemented by any first component which requests a service, which may be satisfied by the first component, and/or by one or more local components relative to the first component, and/or by one or more remote components relative to the first component. To facilitate explanation, however, the description below will continue to be framed mainly in the context in which the entity making the request corresponds to a local host component.
Finally, other implementations can adopt different strategies for coupling the host components to the hardware components, e.g., other than the localH-to-localS coupling 114 shown in
In block 908, the associated local acceleration component may locally perform the service, assuming that the address that has been identified pertains to functionality that is locally implemented by the local acceleration component. Alternatively, or in addition, in block 910, the local acceleration component routes the request to a remote acceleration component. As noted above, the local acceleration component is configured to perform routing to the remote acceleration component without involvement of the local host component. Further, plural host components communicate in the data processing system 102 with each other over the same physical network as do plural acceleration components.
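The choice between blocks 908 and 910 can be sketched, under simplifying assumptions, as a dispatch performed by the local acceleration component. The function and parameter names are hypothetical, and the sketch covers only the either/or case, not the case in which both blocks are performed.

```python
from typing import Callable


def handle_request(
    service_address: str,
    local_address: str,
    payload: bytes,
    perform_locally: Callable[[bytes], bytes],
    route_to_remote: Callable[[str, bytes], bytes],
) -> bytes:
    """Dispatch a request at the local acceleration component."""
    if service_address == local_address:
        # Block 908: the identified address pertains to locally implemented functionality.
        return perform_locally(payload)
    # Block 910: forward to the remote acceleration component; the local host
    # component plays no part in this routing step.
    return route_to_remote(service_address, payload)
```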
In conclusion to Section A, the data processing system 102 has a number of useful characteristics. First, the data processing system 102 uses a common network 120 (except for the example of
B. Management Functionality
As described in the introductory Section A, the location determination component 124 identifies the current location of services within the data processing system 102, based on current allocation information stored in the data store 126. In operation, the location determination component 124 receives a request for a service. In response, it returns an address of the service, if present within the data store 126. The address may identify a particular acceleration component that implements the service.
The data store 126 may maintain any type of information which maps services to addresses. In the small excerpt shown in
In some implementations, the data store 126 may optionally also store status information which characterizes each current service-to-component allocation in any manner. Generally, the status information for a service-to-component allocation specifies the way that the allocated service, as implemented on its assigned component (or components), is to be treated within the data processing system 102, such as by specifying its level of persistence, specifying its access rights (e.g., “ownership rights”), etc. In one non-limiting implementation, for instance, a service-to-component allocation can be designated as either reserved or non-reserved. When performing a configuration operation, the SMC 128 can take into account the reserved/non-reserved status information associated with an allocation in determining whether it is appropriate to change that allocation, e.g., to satisfy a current request for a service, a change in demand for one or more services, etc. For example, the data store 126 indicates that the acceleration components having addresses a1, a6, and a8 are currently configured to perform service w, but that only the assignments to acceleration components a1 and a8 are considered reserved. Thus, the SMC 128 will view the allocation to acceleration component a6 as a more appropriate candidate for reassignment (reconfiguration), compared to the other two acceleration components.
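The example above, in which service w is allocated to a1, a6, and a8 with only a1 and a8 reserved, can be expressed as a short sketch. The Allocation record and the ordering rule are illustrative assumptions; an actual implementation may weigh reserved status together with other mapping considerations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Allocation:
    service: str
    address: str
    reserved: bool


def reassignment_candidates(allocations: List[Allocation]) -> List[str]:
    """Order addresses so that non-reserved allocations are considered first."""
    # False sorts before True, and sorted() is stable, so the original order
    # is preserved within each group.
    ordered = sorted(allocations, key=lambda a: a.reserved)
    return [a.address for a in ordered]


allocations = [
    Allocation("w", "a1", reserved=True),
    Allocation("w", "a6", reserved=False),
    Allocation("w", "a8", reserved=True),
]
```

Applied to these three allocations, the function places a6 ahead of a1 and a8, matching the preference described above.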
In addition, or alternatively, the data store 126 can provide information which indicates whether a service-to-component allocation is to be shared by all instances of tenant functionality, or dedicated to one or more particular instances of tenant functionality (or some other indicated consumer(s) of the service). In the former (fully shared) case, all instances of tenant functionality vie for the same resources provided by an acceleration component. In the latter (dedicated) case, only those clients that are associated with a service allocation are permitted to use the allocated acceleration component.
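A minimal sketch of the shared versus dedicated distinction follows. The SharingPolicy name and the use of tenant identifiers are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SharingPolicy:
    # None means fully shared; otherwise the set of tenant identifiers
    # (hypothetical) to which the allocation is dedicated.
    dedicated_to: Optional[FrozenSet[str]] = None

    def permits(self, tenant_id: str) -> bool:
        """Return True if the given tenant may use the allocated acceleration component."""
        return self.dedicated_to is None or tenant_id in self.dedicated_to
```

A policy created with no argument admits every instance of tenant functionality, whereas a policy created with a set of identifiers admits only the consumers named in that set.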
The SMC 128 may also interact with a data store 1002 that provides availability information. The availability information identifies a pool of acceleration components that have free capacity to implement one or more services. For example, in one manner of use, the SMC 128 may determine that it is appropriate to assign one or more acceleration components as providers of a function. To do so, the SMC 128 draws on the data store 1002 to find acceleration components that have free capacity to implement the function. The SMC 128 will then assign the function to one or more of these free acceleration components. Doing so will change the availability-related status of the chosen acceleration components.
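The draw-and-assign behavior described above might be sketched as follows, assuming the simplest availability rule, in which each acceleration component is either wholly free or wholly in use. The class name and the deterministic selection order are illustrative only.

```python
from typing import List, Set


class AvailabilityPool:
    """Tracks which acceleration components currently have free capacity."""

    def __init__(self, free: Set[str]) -> None:
        self._free = set(free)

    def is_available(self, address: str) -> bool:
        return address in self._free

    def assign(self, count: int) -> List[str]:
        """Take `count` free components out of the pool and return their addresses."""
        if count > len(self._free):
            raise RuntimeError("not enough free acceleration components")
        chosen = sorted(self._free)[:count]  # deterministic order, for illustration
        self._free.difference_update(chosen)  # assignment changes availability status
        return chosen
```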
The SMC 128 also manages and maintains the availability information in the data store 1002. In doing so, the SMC 128 can use different rules to determine whether an acceleration component is available or unavailable. In one approach, the SMC 128 may consider an acceleration component that is currently being used as unavailable, and an acceleration component that is not currently being used as available. In other cases, the acceleration component may have different configurable domains (e.g., tiles), some of which are currently being used and others of which are not. Here, the SMC 128 can specify the availability of an acceleration component by expressing the fraction of its processing resources that are currently not being used. For example,
In other cases, the SMC 128 can take into consideration pending requests for an acceleration component in registering whether it is available or not available. For example, the SMC 128 may indicate that an acceleration component is not available because it is scheduled to deliver a service to one or more instances of tenant functionality, even though it may not be engaged in providing that service at the current time.
In other cases, the SMC 128 can also register the type of each acceleration component that is available. For example, the data processing system 102 may correspond to a heterogeneous environment that supports acceleration components having different physical characteristics. The availability information in this case can indicate not only the identities of processing resources that are available, but also the types of those resources.
In other cases, the SMC 128 can also take into consideration the status of a service-to-component allocation when registering an acceleration component as available or unavailable. For example, assume that a particular acceleration component is currently configured to perform a certain service, and furthermore, assume that the allocation has been designated as reserved rather than non-reserved. The SMC 128 may designate that acceleration component as unavailable (or some fraction thereof as being unavailable) in view of its reserved status alone, irrespective of whether the service is currently being actively used. In practice, the reserved status of an acceleration component therefore serves as a lock which prevents the SMC 128 from reconfiguring the acceleration component, at least in certain circumstances.
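The availability rules described above (the fraction of unused tiles, pending requests, and reserved status) can be combined in one hedged sketch. The field names, and the choice to count pending and reserved tiles against availability in the same way as tiles in use, are assumptions rather than requirements of this disclosure.

```python
from dataclasses import dataclass


@dataclass
class ComponentState:
    tiles_total: int
    tiles_in_use: int
    tiles_pending: int = 0   # tiles scheduled to deliver a service, not yet active
    tiles_reserved: int = 0  # idle tiles held by reserved allocations


def available_fraction(state: ComponentState) -> float:
    """Fraction of processing resources not in use, not scheduled, and not reserved."""
    if state.tiles_total <= 0:
        raise ValueError("tiles_total must be positive")
    committed = state.tiles_in_use + state.tiles_pending + state.tiles_reserved
    committed = min(state.tiles_total, committed)
    return (state.tiles_total - committed) / state.tiles_total
```

Under these assumptions, a component whose remaining tiles are all reserved reports an available fraction of zero even when those tiles are idle, which reflects the lock-like effect of reserved status.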
Now referring to the core mapping operation of the SMC 128 itself, the SMC 128 allocates or maps services to acceleration components in response to triggering events. More specifically, the SMC 128 operates in different modes depending on the type of triggering event that has been received. In a request-driven mode, the SMC 128 handles requests for services by tenant functionality. Here, each triggering event corresponds to a request by an instance of tenant functionality that resides, at least in part, on a particular local host component. In response to each request by a local host component, the SMC 128 determines an appropriate component to implement the service. For example, the SMC 128 may choose from among: a local acceleration component (associated with the local host component that made the request), a remote acceleration component, or the local host component itself (whereupon the local host component will implement the service in software), or some combination thereof.
In a second background mode, the SMC 128 operates by globally allocating services to acceleration components within the data processing system 102 to meet overall anticipated demand in the data processing system 102 and/or to satisfy other system-wide objectives and other factors (rather than narrowly focusing on individual requests by host components). Here, each triggering event that is received corresponds to some condition in the data processing system 102 as a whole that warrants allocation (or reallocation) of a service, such as a change in demand for the service.
Note, however, that the above-described modes are not mutually exclusive domains of analysis. For example, in the request-driven mode, the SMC 128 may attempt to achieve at least two objectives. As a first primary objective, the SMC 128 will attempt to find an acceleration component (or components) that will satisfy an outstanding request for a service, while also meeting one or more performance goals relevant to the data processing system 102 as a whole. As a second objective, the SMC 128 may optionally also consider the long term implications of its allocation of the service with respect to future uses of that service by other instances of tenant functionality. In other words, the second objective pertains to a background consideration that happens to be triggered by a request by a particular instance of tenant functionality.
For example, consider the following simplified case. An instance of tenant functionality may make a request for a service, where that instance of tenant functionality is associated with a local host component. The SMC 128 may respond to the request by configuring a local acceleration component to perform the service. In making this decision, the SMC 128 may first of all attempt to find an allocation which satisfies the request by the instance of tenant functionality. But the SMC 128 may also make its allocation based on a determination that many other host components have requested the same service, and that these host components are mostly located in the same rack as the instance of tenant functionality which has generated the current request for the service. In other words, this supplemental finding further supports the decision to place the service on an in-rack acceleration component.
In addition, an instance of tenant functionality (or a local host component) may specifically request that it be granted a reserved and dedicated use of a local acceleration component. The status determination logic 1004 can use different environment-specific rules in determining whether to honor this request. For instance, the status determination logic 1004 may decide to honor the request, providing that no other triggering event is received which warrants overriding the request. The status determination logic 1004 may override the request, for instance, when it seeks to fulfill another request that is determined, based on any environment-specific reasons, as having greater urgency than the tenant functionality's request.
In some implementations, note that an instance of tenant functionality (or a local host component or some other consumer of a service) may independently control the use of its local resources. For example, a local host component may pass utilization information to the management functionality 122 which indicates that its local acceleration component is not available or not fully available, irrespective of whether the local acceleration component is actually busy at the moment. In doing so, the local host component may prevent the SMC 128 from “stealing” its local resources. Different implementations can use different environment-specific rules to determine whether an entity is permitted to restrict access to its local resources in the above-described manner, and if so, in what circumstances.
In another example, assume that the SMC 128 determines that there has been a general increase in demand for a particular service. In response, the SMC 128 may find a prescribed number of free acceleration components, corresponding to a “pool” of acceleration components, and then designate that pool of acceleration components as reserved (but fully shared) resources for use in providing the particular service. Later, the SMC 128 may detect a general decrease in demand for the particular service. In response, the SMC 128 can decrease the pool of reserved acceleration components, e.g., by changing the status of one or more acceleration components that were previously registered as “reserved” to “non-reserved.”
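The growth and shrinkage of a reserved pool, as described above, might be sketched as follows. The function name `resize_reserved_pool`, the use of plain lists of component identifiers, and the notion of a numeric target size are assumptions made for illustration; an actual implementation can derive the target from demand in any environment-specific manner.

```python
def resize_reserved_pool(reserved, free, target_size):
    """Move components between the free list and the reserved pool.

    Returns new (reserved, free) lists; the inputs are not modified.
    """
    reserved, free = list(reserved), list(free)
    # Demand increased: promote free components to "reserved".
    while len(reserved) < target_size and free:
        reserved.append(free.pop(0))
    # Demand decreased: demote components back to "non-reserved".
    while len(reserved) > target_size:
        free.append(reserved.pop())
    return reserved, free
```

If fewer free components exist than the target requires, this sketch simply reserves as many as it can.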
Note that the particular dimensions of status described above (reserved vs. non-reserved, dedicated vs. fully shared) are cited by way of illustration, not limitation. Other implementations can adopt any other status-related dimensions, or may accommodate only a single status designation (and therefore omit use of the status determination logic 1004 functionality).
As a second component of analysis, the SMC 128 may use size determination logic 1006 to determine a number of acceleration components that are appropriate to provide a service. The SMC 128 can make such a determination based on a consideration of the processing demands associated with the service, together with the resources that are available to meet those processing demands.
As a third component of analysis, the SMC 128 can use type determination logic 1008 to determine the type(s) of acceleration components that are appropriate to provide a service. For example, consider the case in which the data processing system 102 has a heterogeneous collection of acceleration components having different respective capabilities. The type determination logic 1008 can determine one or more of a particular kind of acceleration components that are appropriate to provide the service.
As a fourth component of analysis, the SMC 128 can use placement determination logic 1010 to determine the specific acceleration component (or components) that are appropriate to address a particular triggering event. This determination, in turn, can have one or more aspects. For instance, as part of its analysis, the placement determination logic 1010 can determine whether it is appropriate to configure an acceleration component to perform a service, where that component is not currently configured to perform the service.
The above facets of analysis are cited by way of illustration, not limitation. In other implementations, the SMC 128 can provide additional phases of analyses.
Generally, the SMC 128 performs its various allocation determinations based on one or more mapping considerations. For example, one mapping consideration may pertain to historical demand information provided in a data store 1012. The explanation (below) will provide additional description of different mapping considerations, as they apply to the operation of the placement determination logic 1010.
Note, however, that the SMC 128 need not perform multi-factor analysis in all cases. In some cases, for instance, a host component may make a request for a service that is associated with a single fixed location, e.g., corresponding to the local acceleration component or a remote acceleration component. In those cases, the SMC 128 may simply defer to the location determination component 124 to map the service request to the address of the service, rather than assessing the costs and benefits of executing the service in different ways. In other cases, the data store 126 may associate plural addresses with a single service, each address associated with an acceleration component that can perform the service. The SMC 128 can use any mapping consideration(s) in allocating a request for a service to a particular address, to be described below, such as a load balancing consideration.
As a result of its operation, the SMC 128 can update the data store 126 with information that maps services to addresses at which those services can be found (assuming that this information has been changed by the SMC 128). The SMC 128 can also store status information that pertains to new service-to-component allocations.
To configure one or more acceleration components to perform a function (if not already so configured), the SMC 128 can invoke a configuration component 1014. In one implementation, the configuration component 1014 configures acceleration components by sending a configuration stream to the acceleration components. A configuration stream specifies the logic to be “programmed” into a recipient acceleration component. The configuration component 1014 may use different strategies to configure an acceleration component, several of which are set forth below.
A failure monitoring component 1016 determines whether an acceleration component has failed. The SMC 128 may respond to a failure notification by substituting a spare acceleration component for a failed acceleration component.
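By way of illustration, not limitation, the substitution of a spare for a failed acceleration component might look like the following sketch. The function `substitute_spare` and the dictionary layout (service name mapped to a list of component identifiers) are hypothetical and chosen only to make the idea concrete.

```python
def substitute_spare(assignments, spares, failed):
    """Replace a failed component with a spare wherever it is assigned.

    assignments: dict mapping service name -> list of component ids.
    Returns (updated assignments, remaining spares); inputs are untouched.
    """
    remaining = list(spares)
    updated = {}
    for service, components in assignments.items():
        new_components = []
        for comp in components:
            if comp != failed:
                new_components.append(comp)
            elif remaining:
                new_components.append(remaining.pop(0))
            # If no spare is left, the failed component is simply dropped.
        updated[service] = new_components
    return updated, remaining
```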
B.1. Operation of the SMC in a Request-Driven Mode
In operation (1), the local host component 1102 may send its request for the service to the SMC 128. In operation (2), among other analyses, the SMC 128 may determine at least one appropriate component to implement the service. In this case, assume that the SMC 128 determines that a remote acceleration component 1104 is the most appropriate component to implement the service. The SMC 128 can obtain the address of that acceleration component 1104 from the location determination component 124. In operation (3), the SMC 128 may communicate its answer to the local host component 1102, e.g., in the form of the address associated with the service. In operation (4), the local host component 1102 may invoke the remote acceleration component 1104 via its local acceleration component 1106. Other ways of handling a request by tenant functionality are possible. For example, the local acceleration component 1106 can query the SMC 128, rather than, or in addition to, the local host component 1102.
Path 1108 represents an example in which a representative acceleration component 1110 (and/or its associated local host component) communicates utilization information to the SMC 128. The utilization information may identify whether the acceleration component 1110 is available or unavailable for use, in whole or in part. The utilization information may also optionally specify the type of processing resources that the acceleration component 1110 possesses which are available for use. As noted above, the utilization information can also be chosen to purposively prevent the SMC 128 from later utilizing the resources of the acceleration component 1110, e.g., by indicating in whole or in part that the resources are not available.
Although not shown, any acceleration component can also make directed requests for specific resources to the SMC 128. For example, the host component 1102 may specifically ask to use its local acceleration component 1106 as a reserved and dedicated resource. As noted above, the SMC 128 can use different environment-specific rules in determining whether to honor such a request.
Further, although not shown, other components besides the host components can make requests. For example, a hardware acceleration component may run an instance of tenant functionality that issues a request for a service that can be satisfied by itself, another hardware acceleration component (or components), a host component (or components), etc., or any combination thereof.
Further assume that a local acceleration component 1208 is coupled to the local host component 1202, e.g., via a PCIe local link or the like. At the current time, the local acceleration component 1208 hosts A1 logic 1210 for performing the acceleration service A1, and A2 logic 1212 for performing the acceleration service A2.
According to one management decision, the SMC 128 assigns T1 to the A1 logic 1210, and assigns T2 to the A2 logic 1212. However, this decision by the SMC 128 is not a fixed rule; the SMC 128 may make its decision based on plural factors, some of which may reflect conflicting considerations. As such, based on other factors (not described at this juncture), the SMC 128 may choose to assign jobs to acceleration logic in a different manner from that illustrated in
In the scenario of
In response to the above scenario, the SMC 128 may choose to assign T1 to the A1 logic 1310 of the acceleration component 1308. The SMC 128 may then assign T2 to the A2 logic 1312 of a remote acceleration component 1314, which is already configured to perform that service. Again, the illustrated assignment is set forth here in the spirit of illustration, not limitation; the SMC 128 may choose a different allocation based on another combination of input considerations. In one implementation, the local host component 1302 and the remote acceleration component 1314 can optionally compress the information that they send to each other, e.g., to reduce consumption of bandwidth.
Note that the host component 1302 accesses the A2 logic 1312 via the local acceleration component 1308. But in another case (not illustrated), the host component 1302 may access the A2 logic 1312 via the local host component (not illustrated) that is associated with the acceleration component 1314.
Generally, the SMC 128 can perform configuration in a full or partial manner to satisfy any request by an instance of tenant functionality. The SMC performs full configuration by reconfiguring all of the application logic provided by an acceleration component. The SMC 128 can perform partial configuration by reconfiguring part (e.g., one or more tiles) of the application logic provided by an acceleration component, leaving other parts (e.g., one or more other tiles) intact and operational during reconfiguration. The same is true with respect to the operation of the SMC 128 in its background mode of operation, described below. Further note that additional factors may play a role in determining whether the A3 logic 1412 is a valid candidate for reconfiguration, such as whether or not the service is considered reserved, whether or not there are pending requests for this service, etc.
Finally, the above examples were described in the context of instances of tenant functionality that run on host components. But as already noted above, the instances of tenant functionality may more generally correspond to service requestors, and those service requestors can run on any component(s), including acceleration components. Thus, for example, a requestor that runs on an acceleration component can generate a request for a service to be executed by one or more other acceleration components and/or by itself and/or by one or more host components. The SMC 128 can handle the requestor's request in any of the ways described above.
B.2. Operation of the SMC in a Background Mode
In the particular example of
The SMC 128 can also operate in the background mode to allocate one or more acceleration components, which implement a particular service, to at least one instance of tenant functionality, without necessarily requiring the tenant functionality to make a request for this particular service each time. For example, assume that an instance of tenant functionality regularly uses a compression function, corresponding to “service z” in
B.3. Physical Implementations of the Management Functionality
The architecture of
Further, the local management component 1804 can send utilization information to a global management component on any basis, such as periodic basis and/or an event-driven basis (e.g., in response to a change in utilization). The global management component can use the utilization information to update its master record of availability information in the data store 1002.
In operation, the low-level management components (2004, 2012, . . . ) handle certain low-level management decisions that directly affect the resources associated with individual server unit components. The mid-level management components (2018, 2020) can make decisions which affect a relevant section of the data processing system 102, such as an individual rack or a group of racks. The top-level management component (2022) can make global decisions which broadly apply to the entire data processing system 102.
B.4. The Configuration Component
Finally,
B.5. Illustrative Operation of the SMC
As also noted above, the SMC 128 may perform different phases of analysis, such as: (1) determining the status associated with a service-to-component allocation (e.g., reserved vs. non-reserved, dedicated vs. fully shared, etc.), which is performed by the status determination logic 1004; (2) determining a number of acceleration components to be used, which is performed by the size determination logic 1006; (3) determining the type(s) of acceleration components to be used, which is performed by the type determination logic 1008; and/or (4) determining individual acceleration components to be used within the data processing system 102, which is performed by the placement determination logic 1010, and so on.
To facilitate explanation, the operation of the SMC 128 will be principally explained with respect to the fourth determination performed by the placement determination logic 1010. To further simplify the explanation, the following explanation will be initially set forth in the context of the assignment of a single service to one or more acceleration components, where plural consumers are not yet contending for the same resources.
Generally, in a request-driven mode, the placement determination logic 1010 of the SMC 128 may satisfy a request by instructing a requesting instance of tenant functionality as to where it can access a requested service. In doing so, the SMC 128 can optionally call on the configuration component 1014 to configure an acceleration component (or components) to perform the requested service, if these components are not already configured to perform the service. Alternatively, or in addition, the SMC 128 can assign a request to an already configured service on an identified acceleration component. Similarly, in the background mode, the placement determination logic 1010 of the SMC 128 can satisfy overall demand for a service in the data processing system 102 by calling on the configuration component 1014 to configure one or more acceleration components to provide the service, and/or draw from one or more already configured acceleration components.
Upon invocation, the SMC 128 can make a decision based on several factors, referred to below as “mapping considerations.” The SMC 128 can obtain input information pertaining to these mapping considerations from various sources, such as host components and acceleration components within the data processing system 102, external entities which provide information regarding performance parameter values and the like (which may be accessed via one or more network connections, such as the Internet), etc.
Some mapping considerations are relatively narrow in focus, e.g., by emphasizing the extent to which an allocation decision satisfies a particular request generated by an instance of tenant functionality that runs on a local host component. Other mapping considerations are more global in focus, e.g., by emphasizing an effect that an allocation decision will have on the data processing system 102 as a whole. Other mapping considerations take into account both particular and global factors. The following explanation identifies a representative but non-exhaustive list of mapping considerations.
a. Location of Consumers
One mapping consideration pertains to the location(s) of the entity(ies) which have requested the service under consideration, or the location(s) of the entity(ies) that will likely consume that service in the future. For example, when performing an allocation in the background mode, the SMC 128 can determine whether a service under consideration has just a few main consumers. If so, then the SMC 128 may attempt to place one or more acceleration components “close” to those consumers. More specifically, in one non-limiting case, the SMC 128 may load the service onto one or more local acceleration components associated with respective host components which regularly request the service. On the other hand, if the service has many random consumers spread over the data processing system 102, then the SMC 128 may consider it less important to place the service in proximity to any one consumer.
b. Current Mapping Considerations
Another mapping consideration pertains to what service or services are currently loaded onto the acceleration components in the hardware acceleration plane 106, e.g., as reflected in the current allocation information provided in the data store 126. For example, assume that the SMC 128 seeks to fulfill a request for a service by an instance of tenant functionality associated with a local host component. The SMC 128 may favor allocating the requested service to the local acceleration component (which is associated with the local host component) when that local acceleration component is already configured to perform that service. Similarly, when operating in the background mode to select a pool of reserved acceleration components for performing a service, the SMC 128 may favor allocating the service to an acceleration component when that component is already configured to perform that service. This factor is also related to a cost-of-migration consideration described below.
c. Image Availability Considerations
Another related consideration pertains to whether a configuration image for a requested service even exists. Not all services are good candidates for hardware acceleration, so not all requested software services have counterpart configuration images. The SMC 128 may leverage this consideration in the request-driven mode by immediately instructing a host component to perform a service in software, where that service cannot be implemented in hardware.
d. Acceleration Benefit Considerations
Another mapping consideration pertains to whether a performance boost can be expected by deploying a service on acceleration hardware, as opposed to performing the function in software by a host component. If negligible performance benefit is likely, then the SMC 128 can instruct a local host component to implement a requested service in software. In the background mode, the SMC 128 may decline to create a pool of acceleration components dedicated to a particular service if negligible acceleration benefit can be expected.
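The image availability and acceleration benefit considerations can be combined into a simple early-exit check, sketched below. The function `choose_execution`, the `min_speedup` threshold, and its default value of 1.2 are assumptions introduced for illustration; the description above does not prescribe any particular threshold or measure of benefit.

```python
def choose_execution(service, images, speedup, min_speedup=1.2):
    """Decide whether a service runs in hardware or falls back to software.

    images:  set of service names that have a configuration image.
    speedup: dict mapping service name -> estimated hardware speedup factor.
    """
    if service not in images:
        return "software"  # no configuration image exists
    if speedup.get(service, 1.0) < min_speedup:
        return "software"  # negligible acceleration benefit expected
    return "hardware"
```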
e. Current Availability Considerations
Another mapping consideration pertains to the available capacity of each acceleration component under consideration (e.g., as reflected in the data store 1002 that provides availability information), with respect to its ability to handle an identified service. As noted above, the availability of an acceleration component can be specified as binary Yes/No information, percentile information, etc. The availability of an acceleration component can also take into account pending requests for the acceleration component, etc., e.g., in which the acceleration component is scheduled to perform identified processing in the future. The SMC 128 can leverage this consideration to determine whether it is feasible to configure a given acceleration component under consideration to perform a service.
f. SLA Considerations
Another mapping consideration pertains to a service level agreement (SLA) associated with the service. For example, an SLA associated with a service may specify one or more parameter values which reflect the requested speed at which the service is to be delivered to end users, such as by specifying worst-case latency performance that is to be permitted, and/or other worst-case performance parameter values. The SMC 128 may choose one or more acceleration components to satisfy the SLA requirements of the service, which may entail selecting a certain number of acceleration components, and/or choosing certain types of acceleration components, and/or selecting the locations of those acceleration components, etc.
g. Type-of-Demand Considerations
Another mapping consideration pertains to the nature of traffic patterns associated with a service. For instance, some services are characterized by relatively steady-state traffic flow. Other services exhibit highly “bursty” traffic, meaning that they are subject to large and perhaps unpredictable spikes in traffic. In one non-limiting strategy, the SMC 128 may seek to avoid dedicating a single bursty service to a single acceleration component (or components), as the bursty service may generally fail to efficiently utilize the resources of the dedicated component (e.g., due to underutilization). Instead, the SMC 128 may choose to allocate plural bursty services to a pool of acceleration components. Such an allocation strategy is based on the premise that the intermittent bursts associated with plural services will be uncorrelated, and the average demand associated with several of these bursty services can be reasonably predicted and taken into account, thus permitting a more efficient utilization of the resources of the allocated acceleration components.
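The premise that pooling uncorrelated bursty services yields more predictable aggregate demand can be illustrated numerically. The sketch below assumes a deliberately simple model that is not taken from the description above: each service is independently either idle or bursting with a fixed probability, and every burst has the same size. The function names are hypothetical.

```python
import math


def pooled_demand(n_services, burst_prob, burst_size):
    """Mean and standard deviation of total demand for n independent services.

    Assumed model: each service bursts independently with probability
    burst_prob, and each burst consumes burst_size units of capacity.
    """
    mean = n_services * burst_prob * burst_size
    std = burst_size * math.sqrt(n_services * burst_prob * (1 - burst_prob))
    return mean, std


def relative_spread(n_services, burst_prob):
    """Standard deviation divided by mean; lower means more predictable."""
    mean, std = pooled_demand(n_services, burst_prob, 1.0)
    return std / mean
```

Under this model the relative spread falls in proportion to the inverse square root of the number of pooled services, so a pool serving nine such services sees one third of the relative variability of a component dedicated to a single service.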
h. Historical Demand Considerations
Another mapping consideration pertains to the historical demand associated with a service. In the background mode, the SMC 128 will attempt to allocate a sufficient number of acceleration components to satisfy the expected demand for a service, which may vary throughout the day, throughout the week, etc. The SMC 128 can also take into account the manner in which demand changes for the service, e.g., whether it is typically bursty vs. relatively steady (as described above), unpredictable vs. predictable, trending up vs. trending down, etc.
When handling a specific request, the SMC 128 can take demand information into account in different ways. In one scenario, the SMC 128 may consider historical demand information associated with a particular candidate acceleration component when deciding whether to use that acceleration component to satisfy a current request for a service. For instance, an acceleration component that is soon to be overloaded may not be a good candidate to satisfy the request. The SMC 128 can also leverage such known demand patterns to determine the likely resource requirements associated with the current request (where those are not specified in advance), and then use that information as another factor in determining how to most effectively handle the request.
i. Line-Rate Service Considerations
Another mapping consideration pertains to whether or not the service under consideration is a line-rate service. A line-rate service is a service that is performed on information flowing on a link (or through some other point of analysis) at a prescribed rate, preferably without delaying the transmission of that information. The SMC 128 may choose to place line-rate services close to their respective consumers to ensure that the heightened processing demands associated with these services are met. For example, a line-rate service may be rendered inoperable when that service is located remotely from a consumer of the line-rate service, e.g., due to the latencies involved in contacting the remote acceleration component and/or bandwidth overload caused by interacting with the remote acceleration component, etc.
j. Load Balancing Considerations
Another mapping consideration pertains to load balancing. When handling particular requests for a service, the SMC 128 will seek to allocate the requests to acceleration components in a manner that does not overburden any acceleration component (and/or other processing component) in the data processing system 102. This can be achieved by using any load-balancing strategy to spread the requests out over plural acceleration components that provide the service. Similarly, when performing a more general background allocation, the SMC 128 will seek to distribute a service over acceleration components in such a manner that no one acceleration component (and/or other computing resource associated with the data processing system 102) is overburdened.
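The description above permits any load-balancing strategy. As one example, a least-loaded strategy might be sketched as follows; the function `assign_requests` and the representation of load as a count of outstanding requests per component are assumptions made for illustration.

```python
def assign_requests(num_requests, load):
    """Assign each request to the currently least-loaded component.

    load: dict mapping component id -> number of outstanding requests.
    Returns (list of chosen component ids, updated copy of load).
    """
    load = dict(load)
    chosen = []
    for _ in range(num_requests):
        # Ties are broken by component id so the result is deterministic.
        target = min(load, key=lambda c: (load[c], c))
        load[target] += 1
        chosen.append(target)
    return chosen, load
```

For instance, three requests offered to components with loads of 2, 0, and 1 leave all three components with a load of 2.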
k. Bandwidth Considerations
Another mapping consideration pertains to bandwidth in the data processing system 102. When handling particular requests for a service, the SMC 128 will seek to assign the requests to acceleration components in such a manner that no link in the data processing system 102 is overburdened. Similarly, when performing a more general background allocation, the SMC 128 will seek to distribute a service in such a manner that no link in the data processing system 102 is overburdened.
l. Latency Considerations
Another mapping consideration pertains to latency incurred in accessing a service. The SMC 128 will seek to provide a service in such a manner that the latencies involved in accessing the service are within acceptable ranges. As noted above, a line-rate service may be effectively rendered inoperable if the service is located too “far” from the expected consumer(s) of the service. Generally note that, in many cases, the SMC 128 can satisfy several allocation constraints (such as latency, bandwidth, etc.) by placing a service on the same rack as its expected consumers, and preferably on the same server unit component as an expected consumer.
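The stated preference for the same server unit component, then the same rack, then elsewhere can be sketched as a ranking of candidate components. The dictionary keys `server`, `rack`, and `id`, and the assumption that server identifiers are unique across racks, are hypothetical details introduced for this illustration.

```python
def proximity(consumer, candidate):
    """Lower is closer: 0 = same server unit, 1 = same rack, 2 = elsewhere."""
    if candidate["server"] == consumer["server"]:
        return 0
    if candidate["rack"] == consumer["rack"]:
        return 1
    return 2


def rank_by_proximity(consumer, candidates):
    """Order candidate acceleration components from nearest to farthest."""
    return sorted(candidates, key=lambda c: proximity(consumer, c))
```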
m. CPU Performance Considerations
Another mapping consideration pertains to the load placed on host components in the software plane 104. When processing a particular request, the SMC 128 may avoid performing the requested service in software on the local host component if doing so will overburden the CPUs of that component. Similarly, when operating in the background mode, the SMC 128 may attempt to identify any software-related services that are contributing to overloading of a CPU, and then offload some of that processing to one or more acceleration components.
n. Migration Cost Considerations
Another mapping consideration pertains to costs that will be incurred upon reconfiguring the hardware acceleration plane 106 in a particular manner under consideration. Here, the SMC 128 will generate an assessment of the amount of time and/or other resources that are required to perform the reconfiguration (e.g., based on known and pre-stored configuration data). Based on that knowledge, the SMC 128 will then determine the impact that the reconfiguration process will have on other functions performed by the data processing system 102. For example, the SMC 128 may prohibit a reconfiguration process when that process is projected to interfere with a critical process performed by the data processing system 102.
o. Power and Thermal Considerations
Another mapping consideration pertains to power and/or thermal effects. The SMC 128 may consult a reference table or the like to determine the amount of power that will be consumed, and the amount of heat that will be generated, in running a service on a particular candidate acceleration component or components. The SMC 128 may use this information to choose an allocation option that satisfies appropriate power and/or thermal constraints. The SMC 128 can also consult real-time temperature and power measurements in making its decision. For example, the SMC 128 may seek to distribute a service over plural racks if performing the service on a single rack will exceed thermal limits for that rack.
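Distributing a service over plural racks to respect per-rack thermal limits might be sketched as a greedy fill. The function `spread_over_racks`, the per-component heat figure, and the notion of a remaining thermal budget per rack are assumptions made for illustration; an actual implementation may instead consult reference tables or real-time measurements, as noted above.

```python
def spread_over_racks(units, heat_per_unit, headroom):
    """Place `units` components across racks without exceeding thermal budgets.

    headroom: dict mapping rack id -> remaining thermal budget (same units
    as heat_per_unit). Returns dict rack id -> count, or None if infeasible.
    """
    placement = {}
    remaining = units
    # Fill the racks with the most thermal headroom first.
    for rack, budget in sorted(headroom.items(), key=lambda kv: -kv[1]):
        fit = min(remaining, int(budget // heat_per_unit))
        if fit > 0:
            placement[rack] = fit
            remaining -= fit
        if remaining == 0:
            return placement
    return None  # thermal limits cannot be met with the given racks
```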
p. Monetary Considerations
Another mapping consideration pertains to monetary considerations. In some cases, a service confers a known monetary benefit (e.g., as measured by ad revenue, product sales revenue, etc.). Further, a service running on one or more acceleration components may have known costs, such as the cost of the devices themselves (or fractions thereof that are being used to run the service), the cost of supplying power to the components, the cost of utilizing computational resources (e.g., as assessed by a data center administrator), the opportunity cost of forgoing another service or services, and so on. The SMC 128 can compute the monetary benefits and costs for different allocation options, and use this information in determining how and where to allocate the service. In one scenario, the SMC 128 can leverage this consideration to maximize overall profit provided by a data center.
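Comparing the monetary benefits and costs of different allocation options can be sketched as selecting the option with the highest net value. The function `best_option` and the option names used in the test values are hypothetical; how benefits and costs are actually estimated is environment-specific and is not specified here.

```python
def best_option(options):
    """Return the allocation option with the highest benefit minus cost.

    options: dict mapping option name -> (benefit, cost) in any one currency.
    """
    return max(options, key=lambda name: options[name][0] - options[name][1])
```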
q. Security Considerations
Another mapping consideration pertains to the security implications of allocating a service to one or more proposed acceleration components. For example, security considerations may prohibit two services of a certain type from being placed on the same acceleration component. Alternatively, or in addition, security considerations may prohibit a service from being placed on a remote acceleration component, with respect to its consumer (e.g., with respect to the local host component which consumes the service). The SMC 128 may take these factors into account when determining how to allocate a service in the hardware acceleration plane 106.
r. Co-Location Considerations
Another consideration pertains to the manner in which two or more services are typically hosted or used together on a same computing device or other platform. For example, consider a hypothetical environment in which many users use a document compression service in conjunction with an encryption service, e.g., by first using the document compression service to compress a document, and then using the encryption service to encrypt the compressed document. To the extent that such co-location information is available, the SMC 128 can allocate commonly-grouped services to the same acceleration component, or to the same rack, etc. Doing so may be advantageous, for instance, to facilitate the management of services in the hardware acceleration plane 106. Co-location information can be obtained by examining actual usage patterns within the data processing system 102, and/or by consulting more general statistical information regarding usage habits.
s. Received Request Considerations
Another consideration pertains to whether an entity (such as a local acceleration component, local host component, local management component, an instance of tenant functionality, etc.) has made a request for a specific kind of allocation. For instance, an instance of tenant functionality may ask the SMC 128 to grant it dedicated use of a service that runs on its local acceleration component. The SMC 128 can balance this request against all of the other factors described above.
The above considerations are cited by way of example, not limitation. Other implementations can take into account additional considerations, and/or can omit one or more considerations described above.
Note that the above description sometimes assumed that the SMC 128 uses a single acceleration component to implement a complete instance of a service. In multi-component services, however, a collection of acceleration components implements a single service. That is, each acceleration component in the collection implements a part of the multi-component service. The SMC 128 can apply special considerations when allocating multi-component services to acceleration components.
For example, the SMC 128 may take into account the manner in which a multi-component service's acceleration components are distributed in the data processing system 102, as this factor may affect the performance of the multi-component service (and the data processing system 102 as a whole) in terms of latency, bandwidth, load balancing, etc. For instance, the SMC 128 may choose to allocate a collection of acceleration components associated with a multi-component service to a single rack or group of racks to reduce latency and bandwidth bottlenecks. By doing so, for instance, the SMC 128 can reduce the bandwidth consumed in the higher nodes of the switching fabric.
Further note that the above description was framed in the context of the allocation of complete services. But the SMC 128 may also allocate and reallocate fragments of any service of any size to various hardware and/or software processing elements in a dynamic manner, rather than, or in addition to, assigning the complete service to a single processing element.
The SMC 128 can use different algorithms to process all of the mapping considerations described above to arrive at a final conclusion. In one technique, once invoked, the SMC 128 can apply a rules-based process to determine how to allocate a service among a pool of available acceleration components. In one implementation, the rules may be structured as a graph of IF-THEN decisions. Generally, different rules may be appropriate to different data processing systems, based on environment-specific considerations associated with those systems.
To cite one representative case, the SMC 128 can process a request by an instance of tenant functionality that runs on a local host component by first determining whether the requested service is already present on a local acceleration component associated with the local host component. If so, the SMC 128 will determine whether the service is a line-rate service or some other service having relatively high processing demands. If so, the SMC 128 will use the local acceleration component to fulfill the tenant functionality's request, unless there are security constraints which make this allocation inappropriate. On the other hand, if the service is a relatively non-critical task that does not affect key performance metrics of the tenant functionality's operation, the SMC 128 may choose to use a remote acceleration component to fulfill the tenant functionality's request, thereby freeing up the local acceleration component to handle more urgent jobs. The SMC 128 may perform similar multi-factor analysis when allocating services to acceleration components in the background mode.
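The representative case above can be expressed as a small chain of IF-THEN rules. The following Python sketch is illustrative only: the `ServiceRequest` fields and the `"undecided"` outcome are assumptions introduced here, standing in for cases the description leaves to other mapping considerations (for example, when the service is not yet present locally, or when a security constraint rules out local use).

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    service: str
    present_locally: bool               # already loaded on the local acceleration component
    high_demand: bool                   # line-rate or otherwise demanding service
    security_allows_local: bool = True  # False if a security constraint forbids local use


def choose_target(req: ServiceRequest) -> str:
    """Return 'local', 'remote', or 'undecided' for one request."""
    if not req.present_locally:
        return "undecided"      # not covered by this rule chain
    if req.high_demand:
        if req.security_allows_local:
            return "local"      # keep demanding services on the local component
        return "undecided"      # security constraint: defer to other analysis
    return "remote"             # non-critical task: free up the local component
```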
In other algorithmic approaches, upon encountering a triggering event, the SMC 128 can enumerate the possible allocation options for a service at the present time. Each option reflects the allocation of the service to a specific feasible set of acceleration components within the data processing system 102 (where the set includes one or more acceleration components). The SMC 128 can then assign a score to each option which reflects a weighted combination of the above-described considerations. The weights associated with these scores can be empirically generated for a particular processing environment. The SMC 128 can then choose and apply the allocation option having the highest (most favorable) score.
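A minimal Python sketch of such a scoring approach follows. The consideration names, the weight values, and the convention that each per-consideration score lies between 0 and 1 (higher being more favorable) are hypothetical; in practice the weights would be derived empirically for the environment in question.

```python
from typing import Dict, Tuple

# Hypothetical weights for three of the mapping considerations.
WEIGHTS: Dict[str, float] = {"latency": 0.5, "power": 0.3, "migration_cost": 0.2}


def score(metrics: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-consideration scores for one allocation option."""
    return sum(w * metrics.get(name, 0.0) for name, w in weights.items())


def best_option(options: Dict[Tuple[str, ...], Dict[str, float]]) -> Tuple[str, ...]:
    """Return the set of acceleration components with the highest score."""
    return max(options, key=lambda components: score(options[components]))


# Two candidate options: one component, or a pair of components.
options = {
    ("fpga-1",): {"latency": 0.9, "power": 0.4, "migration_cost": 1.0},
    ("fpga-2", "fpga-3"): {"latency": 0.6, "power": 0.8, "migration_cost": 0.5},
}
chosen = best_option(options)
```

Here the single-component option scores 0.77 against 0.64 for the pair, so it is the one selected.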
In other approaches, the SMC 128 can employ a model produced by a machine-learning process to make allocation decisions. The model is trained based on a training set that reflects the prior-assessed performance of the management functionality 122. That is, the training set may specify different mapping considerations that have been encountered in the data processing system 102, together with resultant allocation decisions that were considered desirable and undesirable (as assessed by human evaluators and/or other labelling techniques). The model that is learned reflects the relationships between input mapping considerations and desirable (and undesirable) allocation decisions.
In other approaches, the SMC 128 can treat the allocation task as a problem of finding an optimal solution within a search space, subject to specified constraints. In the present case, the constraints correspond to the above-described mapping considerations, or some subset thereof. The SMC 128 can use various techniques for quickly searching the space (such as a best-fit search technique) to find an optimal solution, or at least a satisfactory solution, even though not optimal.
Further note that the processing associated with the SMC 128 can be “layered on top of,” or otherwise integrated with, any existing scheduling, resource allocation, and/or forecasting algorithm(s). For example, assume that a local host component has plural instances of tenant functionality that have issued plural respective service requests. Any conventional resource algorithm can be used to determine the order in which the requests are to be processed. For example, the conventional resource algorithm can process requests based on a first-in-first-out rule, any type of fairness calculus, any priority-based ranking of the requests (e.g., where some instances of tenant functionality may have superior priority over other instances due to the urgency of their tasks, their generally privileged statuses, and/or other factors), and so on. Once the conventional resource algorithm chooses a request to process, the SMC 128 may then apply the above considerations to determine an appropriate resource (or resources) to process the request. The SMC 128 can perform a similar allocation function when more generally considering competing demands among multiple services in its background mode of operation.
In another case, the SMC 128 can integrate forecasting analysis into the above-described logic by projecting when services will be needed (e.g., based on historical demand patterns). The SMC 128 can then automatically and proactively load those services into the acceleration plane 106 at the appropriate times.
In any of the above scenarios, the SMC 128 can also make allocation decisions based on the totality of the requests that are pending (and/or anticipated) at any given time (as opposed to considering each request in isolation). For example, assume that the SMC 128 observes that there are many pending requests for a particular service. In response, the SMC 128 can reserve a pool of acceleration components to handle these requests. In another case, the SMC 128 may take into consideration the respective locations of consumers associated with pending requests in making its allocation decisions, e.g., by favoring the selection of components that are near many of the pending consumers. The SMC 128 can perform similar analysis in the background mode when more generally considering prevailing demands for different services.
Finally, the above description was framed in the illustrative context of the placement determination logic 1008, which determines the placement of allocated components within the data processing system 102. Similar analyses to that described above can be applied to other aspects of the operation of the SMC 128. For example, the status determination logic 1004 can conclude that it is appropriate to label a service-to-component allocation as reserved (vs. non-reserved) based on: (a) a determination that there is significant historical demand for the service; and/or (b) a determination that the service is relatively important (e.g., due to monetary considerations and/or other factors); and/or (c) a determination that the consumer(s) themselves are important for any reason (e.g., because they have privileged rights in the data processing system 102 for any environment-specific reason); and/or (d) a determination that the service imposes relatively strict demands (due to SLA considerations, line-rate considerations, etc.), and so on.
The status determination logic 1004 can also determine whether or not the service should be dedicated to one or more particular consumers (as opposed to fully shared) based on the same analysis set forth above, but framed in the context of specific consumers. For example, the status determination logic 1004 may decide to grant a particular consumer dedicated access to a service being run on an acceleration component based on a determination that this particular consumer has frequently requested the service within a short period of time.
Advancing to
In block 2704, the local management component determines whether a request for a service has been received from an instance of tenant functionality running on a local host component. In block 2706, the local management component determines whether it is appropriate to perform the requested service using the local acceleration component. If so, in block 2708, the local management component instructs the local acceleration component to perform the service using the local acceleration component. Alternatively, in block 2710, the local management component contacts the global management component to determine whether it is appropriate for a remote acceleration component to perform the requested service. If so, the global management component returns an address associated with this service. In block 2712, the local management component determines whether an address has been identified. If so, in block 2714, the local management component instructs the local host component to use the address that has been provided to contact the identified remote acceleration component. If no address is provided, then, in block 2716, the local management component (or the global management component) instructs the local host component to perform the service itself in software.
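The decision flow of blocks 2706 through 2716 can be summarized in the following Python sketch. The function and parameter names are hypothetical; the two callables stand in for the local appropriateness check and for the query to the global management component, neither of which is specified in detail here.

```python
from typing import Callable, Optional, Tuple


def handle_request(service: str,
                   local_is_appropriate: Callable[[str], bool],
                   lookup_remote_address: Callable[[str], Optional[str]]
                   ) -> Tuple[str, Optional[str]]:
    """Decide where a requested service runs.

    Returns ('local', None), ('remote', address), or ('software', None).
    """
    if local_is_appropriate(service):           # block 2706
        return ("local", None)                  # block 2708
    address = lookup_remote_address(service)    # block 2710
    if address is not None:                     # block 2712
        return ("remote", address)              # block 2714
    return ("software", None)                   # block 2716: host runs it in software
```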
Although not shown in
In block 2804 of
In block 2806, the SMC 128 determines whether it has received a triggering event that generally warrants reassignment of one or more services in the hardware acceleration plane 106. In other words, block 2806 asks whether an event has occurred which triggers the operation of the SMC 128 when operating in its background mode. But as noted above, the request-driven invocation in block 2804 may also entail background analysis, as a component thereof.
In block 2808, the SMC 128 determines one or more acceleration assignments in direct or indirect response to whatever triggering event has been received. The SMC 128 may indirectly respond to the triggering event, for instance, by buffering it and acting on it at a later time. In block 2810, the SMC 128 may optionally invoke the configuration component 1014 to configure one or more acceleration components, if, in fact, the allocation entails such configuration. In block 2812, the SMC 128 conveys information and/or instructions to appropriate recipient entities which will have the effect of carrying out the allocation. For example, the SMC 128 can convey an address to a local host component, which allows it to access a remote acceleration component.
In block 3004 of
C. Illustrative Implementation of a Hardware Acceleration Component
From a high-level standpoint, the acceleration component 3102 may be implemented as a hierarchy having different layers of functionality. At a lowest level, the acceleration component 3102 provides an “outer shell” which provides basic interface-related components that generally remain the same across most application scenarios. A core component 3104, which lies inside the outer shell, may include an “inner shell” and application logic 3106. The inner shell corresponds to all the resources in the core component 3104 other than the application logic 3106, and represents a second level of resources that remain the same within a certain set of application scenarios. The application logic 3106 itself represents a highest level of resources which are most readily subject to change. Note however that any component of the acceleration component 3102 can technically be reconfigured.
In operation, the application logic 3106 interacts with the outer shell resources and inner shell resources in a manner analogous to the way a software-implemented application interacts with its underlying operating system resources. From an application development standpoint, the use of common outer shell resources and inner shell resources frees a developer from having to recreate these common components for each application that he or she creates. This strategy also reduces the risk that a developer may alter core inner or outer shell functions in a manner that causes problems within the data processing system 102 as a whole.
Referring first to the outer shell, the acceleration component 3102 includes a bridge 3108 for coupling the acceleration component 3102 to the network interface controller (via a NIC interface 3110) and a local top-of-rack switch (via a TOR interface 3112). The bridge 3108 supports two modes. In a first mode, the bridge 3108 provides a data path that allows traffic from the NIC or TOR to flow into the acceleration component 3102, and traffic from the acceleration component 3102 to flow out to the NIC or TOR. The acceleration component 3102 can perform any processing on the traffic that it “intercepts,” such as compression, encryption, etc. In a second mode, the bridge 3108 supports a data path that allows traffic to flow between the NIC and the TOR without being further processed by the acceleration component 3102. Internally, the bridge may be composed of various FIFOs (3114, 3116) which buffer received packets, and various selectors and arbitration logic which route packets to their desired destinations. A bypass control component 3118 controls whether the bridge 3108 operates in the first mode or the second mode.
A memory controller 3120 governs interaction between the acceleration component 3102 and local memory 3122 (such as DRAM memory). The memory controller 3120 may perform error correction as part of its services.
A host interface 3124 provides functionality that enables the acceleration component to interact with a local host component (not shown in
Finally, the shell may include various other features 3126, such as clock signal generators, status LEDs, error correction functionality, and so on.
In one implementation, the inner shell may include a router 3128 for routing messages between various internal components of the acceleration component 3102, and between the acceleration component 3102 and external entities (via a transport component 3130). Each such endpoint is associated with a respective port. For example, the router 3128 is coupled to the memory controller 3120, host interface 3124, application logic 3106, and transport component 3130.
The transport component 3130 formulates packets for transmission to remote entities (such as remote acceleration components), and receives packets from the remote entities.
A 3-port switch 3132, when activated, takes over the function of the bridge 3108 by routing packets between the NIC and TOR, and between the NIC or TOR and a local port associated with the acceleration component 3102 itself.
Finally, an optional diagnostic recorder 3134 stores transaction information regarding operations performed by the router 3128, transport component 3130, and 3-port switch 3132 in a circular buffer. For example, the transaction information may include data about a packet's origin and destination IP addresses, host-specific data, timestamps, etc. A technician may study a log of the transaction information in an attempt to diagnose causes of failure or sub-optimal performance in the acceleration component 3102.
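The circular-buffer behavior of such a recorder can be illustrated in software as follows. This Python sketch is only an analogy for what would be a hardware structure; the class name, field names, and capacity are hypothetical.

```python
from collections import deque
from typing import Dict, List


class DiagnosticRecorder:
    """Keeps only the most recent `capacity` transaction records."""

    def __init__(self, capacity: int):
        # A deque with maxlen silently discards the oldest entry when full,
        # which is the defining property of a circular buffer.
        self._entries: deque = deque(maxlen=capacity)

    def record(self, src_ip: str, dst_ip: str, timestamp: float) -> None:
        self._entries.append({"src": src_ip, "dst": dst_ip, "ts": timestamp})

    def log(self) -> List[Dict]:
        """Return the retained records, oldest first."""
        return list(self._entries)
```

A technician reading `log()` would therefore see the most recent transactions only, with older ones overwritten.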
In some implementations, the data processing system 102 of
C.1. The Local Link
In operations (4) and (5), the application logic 3312 retrieves the data from the input buffer 3310, processes it to generate an output result, and places the output result in an output buffer 3314. In operation (6), the acceleration component 3304 copies the contents of the output buffer 3314 into an output buffer in the host logic's memory. In operation (7), the acceleration component notifies the host logic 3306 that the data is ready for it to retrieve. In operation (8), the host logic thread wakes up and consumes the data in the output buffer 3316. The host logic 3306 may then discard the contents of the output buffer 3316, which allows the acceleration component 3304 to reuse it in the next transaction.
C.2. The Router
In one non-limiting implementation, the router 3128 supports a number of virtual channels (such as eight) for transmitting different classes of traffic over a same physical link. That is, the router 3128 may support multiple traffic classes for those scenarios in which multiple services are implemented by the application logic 3106, and those services need to communicate on separate classes of traffic.
The router 3128 may govern access to the router's resources (e.g., its available buffer space) using a credit-based flow technique. In that technique, the input units (3402-3408) provide upstream entities with credits, which correspond to the exact number of flits available in their buffers. The credits grant the upstream entities the right to transmit their data to the input units (3402-3408). More specifically, in one implementation, the router 3128 supports “elastic” input buffers that can be shared among multiple virtual channels. The output units (3410-3416) are responsible for tracking available credits in their downstream receivers, and provide grants to any input units (3402-3408) that are requesting to send a flit to a given output port.
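The credit-based technique can be illustrated with the following Python sketch, which models one input unit and one upstream sender. The class and method names are hypothetical, and the sketch omits virtual channels and elastic buffer sharing; it shows only the core invariant that a sender never transmits more flits than the receiver has free buffer slots.

```python
class InputUnit:
    """Receiver side: owns a fixed number of flit buffer slots."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.buffer = []

    def accept(self, flit) -> None:
        if len(self.buffer) >= self.capacity:
            raise RuntimeError("sender violated the credit protocol")
        self.buffer.append(flit)

    def drain(self):
        """Remove the oldest flit; a credit should then be returned upstream."""
        return self.buffer.pop(0)


class UpstreamSender:
    """Sender side: tracks how many credits (free slots) it holds."""

    def __init__(self, unit: InputUnit):
        self.unit = unit
        self.credits = unit.capacity  # initial credits equal the free slots

    def try_send(self, flit) -> bool:
        if self.credits == 0:
            return False              # no credit: must wait
        self.credits -= 1
        self.unit.accept(flit)
        return True

    def return_credit(self) -> None:
        self.credits += 1
```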
C.3. The Transport Component
A packet processing component 3504 processes messages arriving from the router 3128 which are destined for a remote endpoint (e.g., another acceleration component). It does so by buffering and packetizing the messages. The packet processing component 3504 also processes packets that are received from some remote endpoint and are destined for the router 3128.
For messages arriving from the router 3128, the packet processing component 3504 matches each message request to a Send Connection Table entry in the Send Connection Table, e.g., using header information and virtual channel (VC) information associated with the message as a lookup item, as provided by router 3128. The packet processing component 3504 uses the information retrieved from the Send Connection Table entry (such as a sequence number, address information, etc.) to construct packets that it sends out to the remote entity.
More specifically, in one non-limiting approach, the packet processing component 3504 encapsulates packets in UDP/IP Ethernet frames, and sends them to a remote acceleration component. In one implementation, the packets may include an Ethernet header, followed by an IPv4 header, followed by a UDP header, followed by a transport header (specifically associated with the transport component 3130), followed by a payload.
For packets arriving from the network (e.g., as received on a local port of the 3-port switch 3132), the packet processing component 3504 matches each packet to a Receive Connection Table entry provided in the packet header. If there is a match, the packet processing component retrieves a virtual channel field of the entry, and uses that information to forward the received message to the router 3128 (in accordance with the credit-flow technique used by the router 3128).
A failure handling component 3506 buffers all sent packets until it receives an acknowledgement (ACK) from the receiving node (e.g., the remote acceleration component). If an ACK for a connection does not arrive within a specified time-out period, the failure handling component 3506 can retransmit the packet. The failure handling component 3506 will repeat such retransmission for a prescribed number of times (e.g., 128 times). If the packet remains unacknowledged after all such attempts, the failure handling component 3506 can discard it and free its buffer.
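The header ordering described above can be illustrated with the following Python sketch, which concatenates the headers using the standard `struct` module. The transport header shown here (a single 4-byte sequence number) is purely hypothetical, since the actual transport header format is not specified; the IPv4 header checksum and the UDP checksum are also left at zero for brevity, so the result is a layout illustration rather than a frame ready for transmission.

```python
import struct


def build_frame(dst_mac: bytes, src_mac: bytes,
                src_ip: bytes, dst_ip: bytes,
                src_port: int, dst_port: int,
                seq: int, payload: bytes) -> bytes:
    """Ethernet | IPv4 | UDP | transport header (hypothetical) | payload."""
    transport = struct.pack("!I", seq)                # assumed 4-byte transport header
    udp_len = 8 + len(transport) + len(payload)       # UDP header is 8 bytes
    udp = struct.pack("!HHHH", src_port, dst_port, udp_len, 0)
    ipv4 = struct.pack("!BBHHHBBH4s4s",
                       0x45,            # version 4, header length 5 words
                       0,               # DSCP/ECN
                       20 + udp_len,    # total length
                       0, 0,            # identification, flags/fragment offset
                       64,              # TTL
                       17,              # protocol 17 = UDP
                       0,               # header checksum (not computed here)
                       src_ip, dst_ip)
    eth = struct.pack("!6s6sH", dst_mac, src_mac, 0x0800)  # EtherType IPv4
    return eth + ipv4 + udp + transport + payload
```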
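The retransmission policy can be sketched in Python as follows. The function name and the two callables are hypothetical: `transmit` stands in for sending the buffered packet, and each call to `ack_received` stands in for waiting one time-out period and reporting whether an ACK arrived.

```python
from typing import Callable


def send_with_retries(packet: bytes,
                      transmit: Callable[[bytes], None],
                      ack_received: Callable[[], bool],
                      max_retransmissions: int = 128) -> bool:
    """Send a packet, retransmitting on time-out up to a fixed limit.

    Returns True if the packet was acknowledged, or False if it was given
    up on (at which point its buffer can be freed).
    """
    transmit(packet)
    retransmissions = 0
    while not ack_received():
        if retransmissions == max_retransmissions:
            return False          # still unacknowledged: discard
        transmit(packet)
        retransmissions += 1
    return True
```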
C.4. The 3-Port Switch
The 3-port switch 3132 connects to the NIC interface 3110 (corresponding to a host interface), the TOR interface 3112, and a local interface associated with the local acceleration component 3102 itself. The 3-port switch 3132 may be conceptualized as including receiving interfaces (3602, 3604, 3606) for respectively receiving packets from the host component and TOR switch, and for receiving packets at the local acceleration component. The 3-port switch 3132 also includes transmitting interfaces (3608, 3610, 3612) for respectively providing packets to the TOR switch and host component, and receiving packets transmitted by the local acceleration component.
Packet classifiers (3614, 3616) determine the class of packets received from the host component or the TOR switch, e.g., based on status information specified by the packets. In one implementation, each packet is either classified as belonging to a lossless flow (e.g., remote direct memory access (RDMA) traffic) or a lossy flow (e.g., transmission control protocol/Internet Protocol (TCP/IP) traffic). Traffic that belongs to a lossless flow is intolerant to packet loss, while traffic that belongs to a lossy flow can tolerate some packet loss.
Packet buffers (3618, 3620) store the incoming packets in different respective buffers, depending on the class of traffic to which they pertain. If there is no space available in the buffer, the packet will be dropped. (In one implementation, the 3-port switch 3132 does not provide packet buffering for packets provided by the local acceleration component (via the local port) because the application logic 3106 can regulate the flow of packets through the use of “back pressuring.”) Arbitration logic 3622 selects among the available packets and transmits the selected packets.
As described above, traffic that is destined for the local acceleration component is encapsulated in UDP/IP packets on a fixed port number. The 3-port switch 3132 inspects incoming packets (e.g., as received from the TOR) to determine if they are UDP packets on the correct port number. If so, the 3-port switch 3132 outputs the packet on the local RX port interface 3606. In one implementation, all traffic arriving on the local TX port interface 3612 is sent out of the TOR TX port interface 3608, but it could also be sent to the host TX port interface 3610. Further note that
Priority Flow Control (PFC) processing logic 3624 allows the 3-port switch 3132 to insert PFC frames into the flow of traffic transmitted to either the TOR or the host component. That is, for lossless traffic classes, if a packet buffer fills up, the PFC processing logic 3624 sends a PFC message to the link partner, requesting that traffic on that class be paused. If a PFC control frame is received for a lossless traffic class on either the host RX port interface 3602 or the TOR RX port interface 3604, the 3-port switch 3132 will cease sending packets on the port that received the control message.
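The interaction between per-class packet buffering and PFC can be illustrated with the following Python sketch. The class name and the return convention are hypothetical, and the sketch requests a pause only when the buffer is completely full, as in the description above; a practical design might use an earlier threshold.

```python
from collections import deque


class TrafficClassBuffer:
    """Packet buffer for one traffic class (lossless or lossy)."""

    def __init__(self, capacity: int, lossless: bool):
        self.capacity = capacity
        self.lossless = lossless
        self.packets = deque()
        self.dropped = 0

    def enqueue(self, packet) -> bool:
        """Store the packet if space remains; otherwise drop it.

        Returns True when a PFC pause message should be sent to the link
        partner, i.e., the class is lossless and the buffer is now full.
        """
        if len(self.packets) < self.capacity:
            self.packets.append(packet)
        else:
            self.dropped += 1
        return self.lossless and len(self.packets) >= self.capacity
```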
C.5. An Illustrative Host Component
The host component 3702 also includes an input/output module 3710 for receiving various inputs (via input devices 3712), and for providing various outputs (via output devices 3714). One particular output mechanism may include a presentation device 3716 and an associated graphical user interface (GUI) 3718. The host component 3702 can also include one or more network interfaces 3720 for exchanging data with other devices via one or more communication conduits 3722. One or more communication buses 3724 communicatively couple the above-described components together.
The communication conduit(s) 3722 can be implemented in any manner, e.g., by a local area network, a wide area network (e.g., the Internet), point-to-point connections, etc., or any combination thereof. The communication conduit(s) 3722 can include any combination of hardwired links, wireless links, routers, gateway functionality, name servers, etc., governed by any protocol or combination of protocols.
The following summary provides a non-exhaustive list of illustrative aspects of the technology set forth herein.
According to a first aspect, a data processing system is described that includes two or more host components, each of which uses one or more central processing units to execute machine-readable instructions, the two or more host components collectively providing a software plane. The data processing system also includes two or more hardware acceleration components that collectively provide a hardware acceleration plane. The data processing system also includes a location determination component configured to maintain a first data store that provides current allocation information that describes current locations of services, as currently allocated to components within the data processing system. The data processing system also includes a service mapping component configured to: maintain a second data store that provides availability information that describes a pool of available hardware acceleration components; receive a triggering event; in direct or indirect response to the triggering event, determine an assignment of a service to at least one selected hardware acceleration component in the hardware plane, based on at least one mapping consideration and based on the availability information; and update the current allocation information in response to the assignment. The data processing system also includes a configuration component for configuring one or more of the selected hardware acceleration component(s) to perform the service, providing that the selected hardware acceleration component(s) is/are not already configured to perform the service, and providing that the hardware acceleration component(s) is/are identified in the pool of available hardware acceleration components. Each host component in the software plane is configured to access the service provided by the selected hardware acceleration component(s).
According to a second aspect, the triggering event corresponds to a change in demand associated with the service or a particular request for the service.
According to a third aspect, the service mapping component is further configured to: receive an update regarding utilization of a hardware acceleration component within the data processing system; and update the availability information in the second data store in response to receipt of the update regarding utilization.
According to a fourth aspect, the configuration component is configured to configure a selected hardware acceleration component by sending a configuration stream from a global management component to the selected hardware acceleration component.
According to a fifth aspect, the configuration component is alternatively configured to configure a selected hardware acceleration component by sending an instruction to a local management component that is associated with the selected hardware acceleration component, whereupon the local management component sends a configuration stream to the selected hardware acceleration component.
According to a sixth aspect, the service mapping component is also configured to determine a status of the assignment, the status specifying one or more conditions which govern treatment of the service that is assigned within the data processing system.
According to a seventh aspect, the service mapping component is configured to determine the assignment by, at least in part: determining a number of acceleration components that make up a group associated with above-referenced selected hardware acceleration component(s); and determining a placement of the service within the data processing system by selecting one or more particular acceleration components that constitute the group.
According to an eighth aspect, one mapping consideration pertains to a location of at least one consumer associated with the service relative to a hardware acceleration component that is assigned to provide the service.
According to a ninth aspect, one mapping consideration pertains to an extent to which a service level agreement associated with the service is satisfied by the assignment.
According to a tenth aspect, one mapping consideration pertains to a load balancing effect that will be caused in the data processing system in response to the assignment, compared to load balancing effects associated with other potential assignments.
According to an eleventh aspect, one mapping consideration pertains to a bandwidth-related effect that will be caused in the data processing system in response to the assignment, compared to bandwidth-related effects associated with other potential assignments.
According to a twelfth aspect, one mapping consideration pertains to a power-related and/or a thermal-related effect that will be caused in the data processing system in response to the assignment, compared to power and/or thermal-related effects associated with other potential assignments.
According to a thirteenth aspect, one mapping consideration pertains to a security-related implication that is relevant to the assignment of the service to the above-referenced at least one selected hardware acceleration component, compared to security-related implications associated with other potential assignments.
According to a fourteenth aspect, one mapping consideration pertains to a determination of whether the service is a line-rate service.
According to a fifteenth aspect, one mapping consideration pertains to a nature of historical demand associated with the service.
According to a sixteenth aspect, one mapping consideration pertains to a monetary cost associated with the assignment, compared to monetary costs associated with other potential assignments.
According to a seventeenth aspect, one mapping consideration pertains to a cost to migrate the service from a current allocation to a new allocation associated with the assignment, compared to migration costs associated with other potential assignments.
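Several of the foregoing mapping considerations compare an effect of one assignment against the effects of other potential assignments. One possible way to express such a comparison, offered only as a sketch, is a weighted sum of per-consideration costs. The weighted-sum rule, the `Candidate` type, the consideration names, and the numeric values are all assumptions made for illustration; the description does not prescribe any particular scoring technique.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A potential assignment and its (hypothetical) normalized costs."""
    name: str
    # consideration name -> cost, where lower is better
    costs: dict = field(default_factory=dict)


def total_cost(candidate, weights):
    """Weighted sum of the candidate's costs; missing costs count as 0."""
    return sum(w * candidate.costs.get(k, 0.0) for k, w in weights.items())


def pick_assignment(candidates, weights):
    """Return the candidate with the lowest weighted cost."""
    return min(candidates, key=lambda c: total_cost(c, weights))


# Hypothetical weights: migration cost counts twice as much as the others.
weights = {"load": 1.0, "bandwidth": 1.0, "migration": 2.0}
a = Candidate("A", {"load": 0.2, "bandwidth": 0.5, "migration": 0.4})
b = Candidate("B", {"load": 0.6, "bandwidth": 0.3, "migration": 0.1})
best = pick_assignment([a, b], weights)
```

Under these assumed weights, candidate B has the lower total cost and is selected; a different weighting could favor candidate A.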
According to an eighteenth aspect, a method is described for allocating a service within a data processing system. The method includes receiving a triggering event corresponding to a condition in the data processing system that impacts the service, and/or a particular request for the service. The method also includes, directly or indirectly in response to the triggering event, determining an assignment of the service to at least one selected hardware acceleration component in the data processing system, based, at least in part, on availability information that describes a pool of available hardware acceleration components. Each hardware acceleration component is locally coupled to at least one host component, and each host component uses one or more central processing units to execute machine-readable instructions. The method also includes configuring one or more of the selected hardware acceleration component(s) to perform the service, providing that the selected hardware acceleration component(s) is/are not already configured to perform the service, and providing that the selected hardware acceleration component(s) is/are identified in the availability information. The method also includes updating current allocation information in response to the assignment, the current allocation information describing current locations of services, as currently allocated to components within the data processing system.
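The method of the eighteenth aspect can be pictured with the following minimal sketch. The `ServiceMapper` class, its attribute names, and the rule of preferring components that are already configured for the service are hypothetical; the actual configuration of a hardware acceleration component is represented here only by a dictionary update.

```python
class ServiceMapper:
    """Illustrative stand-in for the allocation method; not an actual SMC."""

    def __init__(self, available):
        self.available = set(available)  # availability information
        self.configured = {}             # component -> service it is configured for
        self.allocation = {}             # current allocation: service -> components
        self.configure_calls = []        # components that were (re)configured

    def handle_trigger(self, service, count=1):
        """Respond to a triggering event by assigning `service`."""
        # Assumption: prefer components already configured for the service,
        # then break ties by name.
        pool = sorted(
            self.available,
            key=lambda c: (self.configured.get(c) != service, c),
        )
        if len(pool) < count:
            raise RuntimeError("not enough available acceleration components")
        selected = pool[:count]
        for component in selected:
            # Configure only if not already configured for this service.
            if self.configured.get(component) != service:
                self.configured[component] = service
                self.configure_calls.append(component)
        # Update the current allocation information.
        self.allocation[service] = selected
        return selected


mapper = ServiceMapper(["acc-1", "acc-2", "acc-3"])
mapper.configured["acc-3"] = "rank"   # hypothetical pre-existing configuration
selected = mapper.handle_trigger("rank", count=2)
```

In this example only one of the two selected components needs to be configured, because the other is already configured to perform the service.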
According to a nineteenth aspect, the method further includes: receiving an update regarding utilization of a hardware acceleration component within the data processing system; and updating the availability information in response to the above-referenced receiving of the update.
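A minimal sketch of updating the availability information upon receiving a utilization update follows. The threshold rule, the `threshold` parameter, and the 0-to-1 utilization scale are assumptions introduced for illustration; the description does not specify how utilization maps to availability.

```python
def update_availability(available, component, utilization, threshold=0.8):
    """Return a new pool of available components after a utilization update.

    Assumption: a component is treated as available while its utilization
    (a fraction between 0 and 1) is below the hypothetical threshold.
    """
    updated = set(available)
    if utilization < threshold:
        updated.add(component)
    else:
        updated.discard(component)
    return updated


pool = {"acc-1"}
pool = update_availability(pool, "acc-2", utilization=0.1)   # acc-2 joins the pool
pool = update_availability(pool, "acc-1", utilization=0.95)  # acc-1 leaves the pool
```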
According to a twentieth aspect, at least one device that implements a service mapping component is described that includes logic configured to receive an update regarding utilization of a hardware acceleration component within a data processing system. Each hardware acceleration component is locally coupled to at least one host component, and each host component uses one or more central processing units to execute machine-readable instructions. The device(s) also includes logic configured to update availability information that describes a pool of available hardware acceleration components in response to receiving the update regarding utilization. The device(s) also includes logic configured to receive a triggering event, and logic configured to determine, in direct or indirect response to the triggering event, an assignment of a service to at least one selected hardware acceleration component in the data processing system, based, at least in part, on the availability information. The device(s) also includes logic configured to configure one or more of the selected hardware acceleration component(s) to perform the service, providing that the selected hardware acceleration component(s) is/are not already configured to perform the service, and providing that the selected hardware acceleration component(s) is/are identified in the pool of available hardware acceleration components.
A twenty-first aspect corresponds to any combination (e.g., any permutation or subset) of the above-referenced first through twentieth aspects.
A twenty-second aspect corresponds to any method counterpart, device counterpart, system counterpart, means counterpart, computer readable storage medium counterpart, data structure counterpart, article of manufacture counterpart, graphical user interface presentation counterpart, etc. associated with the first through twenty-first aspects.
In closing, although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
This application claims the benefit of U.S. Provisional Application No. 62/149,488 (the '488 Application), filed Apr. 17, 2015. The '488 Application is incorporated by reference herein in its entirety.
Number | Date | Country
---|---|---
62149488 | Apr 2015 | US