A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever. The following notice applies to the software and data as described below and in the drawings that form a part of this document: Copyright 2011, BRUTESOFT, INC. All Rights Reserved.
The present application relates generally to the technical field of data processing and more particularly to computer processing and digital data processing systems.
The costs of operating large data centers are tremendous. Power consumption makes up a big part of this, since it affects the amount of electricity used, the amount of cooling needed, the maximum density of computation units, etc. If power can be reduced while performance is kept constant, we believe this may be a key strategic improvement for the future data center.
Despite various economic downturns, desktops and laptops still comprise the bulk of the devices deployed to empower employees in enterprises, with a predicted worldwide forecast of nearly 290 million desktops and laptops by 2xxx. Provisioning, deploying, and updating these devices remains a large support function of any enterprise, regardless of size. However, as desktop virtualization becomes more prevalent in these enterprises, the problem of downloading virtual desktop images may only escalate.
Deploying virtual devices, regardless of whether these are virtual desktops or virtual servers, may place additional demands at an increasing pace on a company's information technology (IT) group. One of the most time-consuming, mundane, and costly functions of the IT department would be to ensure that these virtual devices are correctly and efficiently enabled when required by the end users. Any solution that allows potential customers to reduce and streamline these activities, and at the same time reduce the cost of delivering these virtual devices, may fulfill a need that is likely to grow with the deployment of virtual devices. A more efficient means of delivery of virtual machine images may allow enterprises to use larger images, a greater variety of images, or permit more terminal/system sharing with a possibly smaller network infrastructure.
Ideally, virtual machine image management would be offered in an enterprise as part of a larger software-as-a-service solution that may enable small and medium companies to access world-class VM management solutions based on a subscription per device. This may allow more efficient and less costly services to these companies that often have limited access to qualified resources to manage their non-core activities.
Delivery of thousands of virtual machine images can potentially become a significant problem for enterprises without central control and management which are attempting to move toward a virtual desktop environment. Another important impact of this solution is the ability to reduce the energy consumption required for repeatedly delivering and running virtual machine images to the end users in the enterprise.
Some embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings in which:
Example methods and systems to manage and distribute virtual machine images over peer-to-peer networks are described. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of example embodiments. It will be evident, however, to one skilled in the art that at least aspects of the inventive subject matter may be practiced without these specific details.
Various example embodiments are discussed herein. For example, various aspects of the inventive subject matter include an application of peer-to-peer (P2P) network protocols to the virtual machine image distribution; dynamic throttling of P2P traffic to reduce bandwidth and power demands; and a mechanism to migrate modified images from one client to another.
This document discusses, with reference to some example embodiments, peer-to-peer network technologies in the virtual machine (VM) image deployment space, and provides some comparisons to traditional point-to-point methods in current use. We look not only at applying P2P algorithms to the space, but also at their effects on power consumption and on client-to-client transfers. Improving the efficiency of VM image deployment is paramount to the success of new desktop initiatives in the developing area of Virtual Desktop Infrastructure (VDI).
Finding faster ways to deploy large images may change the way virtual machines are managed. For example, if deployment can be done fast enough, a VM manager may rapidly migrate a VM from one physical machine to another without the need for shared storage to balance machine load, and thus save power and achieve better performance. The VM manager may reuse a VM slot or a physical slice with different images depending on end user roles. For example, a day-shift worker may require one image and a night-shift worker another on the same physical hardware. The VM manager may also allow late binding deployment of images; for example, when the identity of a person logging in is made known, the system can rapidly deploy the appropriate image.
Popular P2P algorithms focus on ‘read-only’ migration of data sets. There is some market advantage for adaptations of the algorithms to handle temporary non-persistent ‘writes’ to the distributed system. This is useful if you simply want to move an image from one place to another.
Underlying P2P systems is a mechanism to distribute file storage over a network of machines. If the machines are relatively near each other, then you can easily saturate your network bandwidth by attempting to obtain all parts of a file at once.
By actively monitoring available network bandwidth, one can throttle the amount of bandwidth used. By applying some level of throttling, combined with the well-established overall speed-up of P2P, one may achieve the same (low) performance as traditional point-to-point protocols, but at reduced power cost. The power savings come from the shorter distance that file blocks have to travel to reach their destination.
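By way of illustration only, the throttling described above could be sketched as a token-bucket limiter. The description does not specify a throttling mechanism, so the token bucket is an assumed technique, and the class name `Throttle`, its fields, and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Throttle:
    """Token-bucket limiter for outgoing P2P block traffic (illustrative only)."""

    line_capacity_bps: float  # link capacity in bits/s (assumed to be known)
    max_utilization: float    # forced utilization cap as a fraction, 0.0 to 1.0
    tokens: float = 0.0       # bits that may currently be sent

    def budget_bps(self, observed_other_bps: float) -> float:
        """Bits/s left for P2P traffic after other observed traffic is accounted for."""
        cap = self.line_capacity_bps * self.max_utilization
        return max(0.0, cap - observed_other_bps)

    def refill(self, elapsed_s: float, observed_other_bps: float) -> None:
        """Add tokens for the elapsed interval, banking at most one second's worth."""
        rate = self.budget_bps(observed_other_bps)
        self.tokens = min(rate, self.tokens + rate * elapsed_s)

    def try_send(self, block_bits: int) -> bool:
        """Consume tokens for one block; return False if the block must wait."""
        if block_bits <= self.tokens:
            self.tokens -= block_bits
            return True
        return False
```

In this sketch a peer would call `refill` at each monitoring interval with the bandwidth it observed from other sources, and call `try_send` before transmitting each block. A block larger than one second's budget would never be sent, so a real implementation would need to size blocks or the bucket accordingly.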
Finally, in scenarios where a running virtual machine image is to be migrated to another physical place, one can apply a variation of the algorithm that starts by breaking the image into blocks, hashing each block, and adding each hash/block pair to the distributed hash table (DHT). Next, the algorithm sends the list of hashes to the target machine. The algorithm concludes by ensuring that unique DHT entries are not discarded until the target has received its full payload.
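By way of illustration only, the migration steps described above can be sketched as follows. The sketch assumes a Python dictionary as a stand-in for the DHT, SHA-256 as the block hash, a deliberately tiny block size, and a `pinned` set to model entries that must not be discarded. None of these choices, nor the function names, come from the description itself.

```python
from __future__ import annotations

import hashlib

BLOCK_SIZE = 4  # bytes; tiny for illustration, real blocks would be far larger


def split_blocks(image: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Break the image into fixed-size blocks (the last block may be shorter)."""
    return [image[i:i + block_size] for i in range(0, len(image), block_size)]


def publish(image: bytes, dht: dict[str, bytes]) -> list[str]:
    """Source side: hash each block and add each hash/block pair to the DHT."""
    manifest = []
    for block in split_blocks(image):
        digest = hashlib.sha256(block).hexdigest()
        dht.setdefault(digest, block)  # identical blocks are stored once
        manifest.append(digest)
    return manifest


def fetch(manifest: list[str], dht: dict[str, bytes],
          local_cache: dict[str, bytes]) -> bytes:
    """Target side: rebuild the image, reading the DHT only for uncached blocks."""
    parts = []
    for digest in manifest:
        if digest not in local_cache:
            local_cache[digest] = dht[digest]
        parts.append(local_cache[digest])
    return b"".join(parts)


def migrate(image: bytes, dht: dict[str, bytes], pinned: set[str],
            target_cache: dict[str, bytes]) -> bytes:
    """Publish, send the hash list, and keep entries pinned until the target is done."""
    manifest = publish(image, dht)
    pinned.update(manifest)             # an eviction policy must skip pinned entries
    rebuilt = fetch(manifest, dht, target_cache)
    pinned.difference_update(manifest)  # released only after the full payload arrived
    return rebuilt
```

If `fetch` raises before completing, the entries stay pinned, which mirrors the requirement that unique entries survive until the target holds the full image.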
For larger images, the likelihood that the entire image has changed would be low, so the algorithm exploits the caching nature of a DHT and helps the target machine obtain its image faster. If, however, most of the image did change, this worst case may match the point-to-point copy case.
Therefore, applications exist in hardware resource sharing, e.g., using the same VM hardware to support multiple non-concurrent users and in dynamic migration of images for load balancing. Further benefits derived when hardware bandwidth is throttled back are lower power consumption and lower network infrastructure demands.
Example models may take as inputs the image size (in bytes), the number of clients, and, for P2P models, the bandwidth throttling (forced utilization as a percentage of line capacity) and the size of the DHT store (in bytes). For VM image migration, models may also take as input the percentage of data that was modified from the original.
For given inputs, example models may output an estimate of the end-to-end migration time of the image (e.g., how long it took for the image to be copied from the server to all the clients, in seconds), an estimate of the peak power consumed during migration (in Watts), and an estimate of the total energy consumed over the course of the migration (in Joules).
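By way of illustration only, the model inputs and outputs listed above could be captured in simple record types. The class names, field names, and validation rules below are assumptions made for this sketch and are not part of the described models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelInputs:
    """Inputs to one model run; optional fields apply only to some models."""

    image_size_bytes: int
    num_clients: int
    throttle_pct: Optional[float] = None   # P2P models: % of line capacity
    dht_store_bytes: Optional[int] = None  # P2P models: size of the DHT store
    modified_pct: Optional[float] = None   # VM migration: % of data modified

    def __post_init__(self) -> None:
        if self.image_size_bytes <= 0 or self.num_clients <= 0:
            raise ValueError("image size and client count must be positive")
        for name in ("throttle_pct", "modified_pct"):
            value = getattr(self, name)
            if value is not None and not 0.0 <= value <= 100.0:
                raise ValueError(f"{name} must be between 0 and 100")


@dataclass(frozen=True)
class ModelOutputs:
    """Estimates produced by one model run."""

    migration_time_s: float  # end-to-end migration time, seconds
    peak_power_w: float      # peak power during migration, Watts
    total_energy_j: float    # total energy over the migration, Joules
```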
We chose outputs that are material to data center and infrastructure design. Time to completion is a pure performance measurement. Peak power consumed tells us how much infrastructure we need to build for and total energy tells us how big our bills are going to be.
We may instrument each example model to monitor and log its local consumed network bandwidth in fixed time increments. These logs may be analyzed off-line to estimate the peak power and the total energy used.
Peak power would be the highest bandwidth entry in the log multiplied by a bandwidth-to-power factor (which is a constant).
Total energy would be the integral of the entire log over the run of the experiment multiplied by the same bandwidth-to-power factor.
The conversion factor between bandwidth (bits/s) and power (W) is a constant for the particular setup on which measurements are taken. If this information is not available (depending on equipment), a one-off experiment to measure power as a function of bandwidth can be performed. Keep in mind that this factor may include the cost of switches in the network path. The SI unit for this factor is Joules/bit.
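By way of illustration only, the peak power and total energy estimates described above can be computed from a bandwidth log as follows. The sketch assumes that each log entry is the average bandwidth (bits/s) over one fixed interval, approximates the integral with a rectangle rule, and estimates the factor from the one-off experiment as a least-squares slope; the function names and the choice of fitting method are assumptions.

```python
from typing import Sequence, Tuple


def peak_power_watts(log_bps: Sequence[float], joules_per_bit: float) -> float:
    """Highest logged bandwidth multiplied by the bandwidth-to-power factor."""
    return max(log_bps) * joules_per_bit


def total_energy_joules(log_bps: Sequence[float], interval_s: float,
                        joules_per_bit: float) -> float:
    """Integral of the log over the run (rectangle rule) times the same factor."""
    return sum(log_bps) * interval_s * joules_per_bit


def fit_joules_per_bit(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of measured power (W) against bandwidth (bits/s).

    Each sample is a (bandwidth_bps, power_watts) pair from the one-off
    experiment. The intercept (idle power) is fitted implicitly and dropped.
    """
    n = len(samples)
    mean_b = sum(b for b, _ in samples) / n
    mean_p = sum(p for _, p in samples) / n
    cov = sum((b - mean_b) * (p - mean_p) for b, p in samples)
    var = sum((b - mean_b) ** 2 for b, _ in samples)
    return cov / var
```

Bits/s multiplied by Joules/bit gives Joules/s, that is, Watts, which is why the same factor serves both estimates.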
A comprehensive study of the power and performance space in server-to-client and server-to-server environments is proposed. The study seeks to create the six models listed in the table below, take as many measurements as needed over the input space to map out its topology in performance, peak power, and total energy usage, and graph these results. A final step is to make a comparison between the approaches.
Models are developed where fundamentally there may be two code bases, one for traditional server-client models, and one for P2P models. Assumptions are compiled in for the three lesser variations of each.
The P2P models may be built on BruteSoft's existing BruteAPPS product (BruteSoft, Inc., 139 Fremont Avenue, Los Altos, Calif. 94022, United States of America). Instrumentation for measurements may be added. Furthermore, the models may include VM to VM propagation logic.
We may also create an experimentation framework that automates several aspects of the measurement process, such as initialization, running batches of tests, gathering of results, tabulation/storing thereof, clean-up, etc.
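By way of illustration only, the batch-running and tabulation aspects of such a framework could be sketched as below. The function names, the `grid` structure, and the `run_model` callable are hypothetical; initialization and clean-up are left out because the description does not detail them.

```python
import csv
import io
import itertools
from typing import Callable, Dict, List, Sequence


def run_batch(run_model: Callable[..., Dict[str, float]],
              grid: Dict[str, Sequence]) -> List[dict]:
    """Run the model once per combination of input values and gather results."""
    keys = list(grid)
    rows = []
    for values in itertools.product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        result = run_model(**params)      # one measurement run
        rows.append({**params, **result})  # inputs and outputs side by side
    return rows


def to_csv(rows: List[dict]) -> str:
    """Tabulate gathered results as CSV text for off-line analysis."""
    if not rows:
        return ""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A caller would pass a function that performs a single measurement and returns its outputs as a dictionary, together with the lists of input values to sweep.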
In running measurements, the input space to be considered is as follows:
Image size is chosen to be representative of popular VDI environments e.g., Windows XP®, Vista®, and Windows 7®. Based on known P2P scaling data, linear scaling may be reached before we reach 100 clients.
Depending on available time and success of this step, we may also use heterogeneous clients, in another example. We believe that there may be combinations of thin clients (e.g., storageless terminals with just enough RAM to store and run a single image) and traditional clients (with storage). Thin clients can only participate in the P2P process as pure recipients. We believe it is important to include these machines, since they are cheap and may likely become more predominant in the next few years.
Example Customers:
Enterprises that typically deploy more than 1000 desktops or laptops in distributed locations and that are in the process of switching to a virtualized desktop environment may be the prime customers. In addition, any ISP or service provider that needs to provide multiple virtual images, either to its own datacenter or to its client base, may also have a vested interest in this technology. There are currently more than half a million companies in the USA alone with more than 1000 desktops or laptops deployed in their organizations. These typically encompass various sectors such as retail, financial, educational, government, manufacturing, and health in the Small and Medium Business environments.
We propose offering virtual machine image management in the enterprise as part of a larger software-as-a-service solution that may enable small and medium companies to access world-class VM management solutions based on a subscription per device. This may allow more efficient and less costly services to these companies that often have limited access to qualified resources to manage their non-core activities.
We can expect to see a proliferation of virtualized devices in various configurations as such services take root in educational institutions and retail organizations, among others. Virtualized devices can solve many problems in certain organizations, but they can also consume a large part of the resources of an IT group if they cannot be effectively and efficiently provisioned each time this is required.
In addition, we expect to see distribution of virtual machine images increase significantly as enterprises move more towards virtual solutions. This may drive the need for efficient and effective virtual imaging solutions.
Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code embodied on a non-transitory machine-readable medium) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computer systems (e.g., a standalone, client, or server computer system) or one or more hardware modules of a computer system (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.
In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.
Accordingly, the term “hardware module” should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired) or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.
Hardware modules can provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation, and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and can operate on a resource (e.g., a collection of information).
The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.
Similarly, the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.
The one or more processors may also operate to support performance of the relevant operations in a “cloud computing” environment or as a “software as a service” (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., Application Program Interfaces (APIs)).
Example embodiments may be implemented in digital electronic circuitry, or in computer hardware, firmware, software, or in combinations of them. Example embodiments may be implemented using a computer program product, e.g., a computer program tangibly embodied in an information carrier, e.g., in a non-transitory machine-readable medium for execution by, or to control the operation of, data processing apparatus, e.g., a programmable processor, a computer, or multiple computers.
A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, subroutine, or other unit suitable for use in a computing environment. A computer program can be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.
In example embodiments, operations may be performed by one or more programmable processors executing a computer program to perform functions by operating on input data and generating output. Method operations can also be performed by, and apparatus of example embodiments may be implemented as, special purpose logic circuitry, e.g., a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC).
The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In embodiments deploying a programmable computing system, it will be appreciated that both hardware and software architectures require consideration. Specifically, it will be appreciated that the choice of whether to implement certain functionality in permanently configured hardware (e.g., an ASIC), in temporarily configured hardware (e.g., a combination of software and a programmable processor), or a combination of permanently and temporarily configured hardware may be a design choice. Below are set out hardware (e.g., machine) and software architectures that may be deployed, in various example embodiments.
The example computer system 400 includes a processor 402 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 404 and a static memory 406, which communicate with each other via a bus 408. The computer system 400 may further include a video display unit 410 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computer system 400 also includes an alphanumeric input device 412 (e.g., a keyboard), a user interface (UI) navigation device 414 (e.g., a mouse), a disk drive unit 416, a signal generation device 418 (e.g., a speaker) and a network interface device 420.
The disk drive unit 416 includes a non-transitory machine-readable medium 422 on which is stored one or more sets of instructions and data structures (e.g., software) 424 embodying or utilized by any one or more of the methodologies or functions described herein. The instructions 424 may also reside, completely or at least partially, within the main memory 404 and/or within the processor 402 during execution thereof by the computer system 400, the main memory 404 and the processor 402 also constituting non-transitory machine-readable media.
While the non-transitory machine-readable medium 422 is shown in an example embodiment to be a single medium, the term “non-transitory machine-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions or data structures. The term “non-transitory machine-readable medium” shall also be taken to include any tangible medium that is capable of storing, encoding or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention, or that is capable of storing, encoding or carrying data structures utilized by or associated with such instructions. The term “non-transitory machine-readable medium” shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of non-transitory machine-readable media include nonvolatile memory, including by way of example semiconductor memory devices, e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
Although an embodiment has been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. The accompanying drawings that form a part hereof, show by way of illustration, and not of limitation, specific embodiments in which the subject matter may be practiced. The embodiments illustrated are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed herein. Other embodiments may be utilized and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. This Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.
Such embodiments of the inventive subject matter may be referred to herein, individually and/or collectively, by the term “invention” merely for convenience and without intending to voluntarily limit the scope of this application to any single invention or inventive concept if more than one is in fact disclosed. Thus, although specific embodiments have been illustrated and described herein, it should be appreciated that any arrangement calculated to achieve the same purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the above description.
Thus, a method and system for applying cloud computing as a service for enterprise software and data provisioning have been described. Although the present invention has been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.
In addition, in the foregoing Detailed Description, it can be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.
This application claims the priority benefit of U.S. Provisional Application No. 61/297,520, filed Jan. 22, 2010 and titled “APPLYING PEER-TO-PEER NETWORKING PROTOCOLS TO VIRTUAL MACHINE (VM) IMAGE MANAGEMENT,” which is incorporated herein by reference in its entirety.
Number | Date | Country
---|---|---
61297520 | Jan 2010 | US