“Hadoop” is an open-source software framework for storage and large-scale data processing on clusters of homogeneous computers. “MapReduce” is a programming model that may be used in Hadoop clusters to process large data sets in parallel.
The present application may be more fully appreciated in connection with the following detailed description taken in conjunction with the accompanying drawings, in which like reference characters refer to like parts throughout, and in which:
It is appreciated that certain terms are used throughout the following description and claims to refer to particular system components. As one skilled in the art will appreciate, technology companies may refer to a component by different names. This document does not intend to distinguish between components that differ in name but not function. In the following discussion and in the claims, the terms “including” and “comprising” are used in an open-ended fashion, and thus should be interpreted to mean “including, but not limited to . . . .” Also, the term “couple” or “couples” is intended to mean either an indirect, direct, optical or wireless electrical connection. Thus, if a first device couples to a second device, that connection may be through a direct (wired or wireless) connection or through an indirect connection via other devices and connections.
The following discussion is generally directed to a job scheduler in a system that includes a multicore system. The multicore system described herein may be a heterogeneous multicore system meaning that the system includes at least two different types of cores. The job scheduler described herein takes advantage of the heterogeneous nature of the multicore system to more efficiently schedule and process jobs such as “MapReduce” jobs.
Data centers face diverse computing needs. Some jobs to be executed are time-sensitive meaning their completion deadline is mission critical. An example of a time-sensitive job is a job involving direct user interaction. If a user is directly interacting with software, the user is waiting for a response and the time lag for the software to provide the result should be as short as possible. Other jobs, such as batch processing jobs, may be less time-sensitive in that their completion deadline is not as critical as more time-sensitive jobs.
Systems that include homogeneous processor cores (i.e., all processor cores being identical) may not be the most effective at processing highly diverse jobs. A system-on-a-chip (SoC) may include one or more multicore processors. All of the cores may be the highest speed cores available at the time of manufacture. Such cores may be beneficial for processing time-sensitive jobs. However, such high speed cores consume more power than lower speed cores. A given power budget for an SoC, therefore, limits the number of high speed cores that can be included on a given SoC. Consequently, an SoC with high speed cores may be able to include only relatively few such cores. Conversely, an SoC may include lower speed cores. Such cores consume less power than their higher speed counterparts, thereby permitting an SoC to include a larger number of such lower speed cores for the same power budget. However, lower speed cores may not be satisfactory for processing time-sensitive jobs.
The SoC described herein includes heterogeneous multicore processors. A heterogeneous SoC includes two or more different types of processor cores. For example, a heterogeneous SoC may include one or more higher speed cores better suited for processing time-sensitive jobs and one or more lower speed cores that can be used to process less time-sensitive batch jobs. As such, a heterogeneous SoC is more effective at meeting diverse computing needs. Because the lower speed cores are used to process the less time-sensitive batch jobs, the higher speed cores remain available to process the more time-sensitive jobs. A job scheduler is described below that schedules different types of jobs among the different types of processor cores accordingly.
As noted above, in one embodiment, the heterogeneous SoC may include two types of processor cores—one or more higher speed (e.g., higher performance) cores and one or more lower speed (e.g., lower performance) cores. Higher performance cores generally consume more power than lower performance cores, but execute jobs faster than lower performance cores. Other implementations of the SoC may have more than two different types of cores, for example, cores of slow, medium, and high speeds.
To offer diverse computing capabilities, the SoC described herein is heterogeneous meaning, as noted above, that it has a mix of higher and lower performance cores that execute the same instruction set while exhibiting different power and performance characteristics. A higher performance core operates at a higher frequency than a lower performance core, but also consumes more power than its lower performance core counterpart. Because lower performance cores consume less power than higher performance cores, for a given power budget, the SoC can include a larger number of lower power cores. Given that an SoC has a predetermined number of high performance cores and a predetermined number of low performance cores, the disclosed embodiments optimally utilize each group of cores. The principles discussed herein apply to a system that includes cores of different types, whether such cores are on one integrated SoC or provided as separate parts.
A scheduler engine is described herein. The scheduler engine permits the cores of a heterogeneous multicore system to be virtually grouped based on their performance capabilities, with the higher performance (faster) cores included in one virtual pool and the lower performance (slower) cores included in a different virtual pool. As such, while all of the faster and slower cores may be physically provided on one SoC, cores of a common type are grouped together by software. Such groupings are referred to as “virtual” pools. As such, different types of jobs can be performed by different virtual pools of cores. For example, a first job type such as a completion-time sensitive job (e.g., a job in which a user is directly interacting with a machine) can be assigned to the virtual pool having faster cores, while a second job type such as a large batch job for which rapid completion time is less critical (e.g., a service-level job) can be assigned to the virtual pool having slower cores.
As shown in
The scheduler engine 102 may group some or all the faster cores 108 from the nodes 106 of the cluster 104 to create the virtual fast pool 130. Similarly, the scheduler engine 102 creates a virtual slow pool 132 to include some or all of the slower cores 110 from the nodes 106. In some implementations, the virtual fast pool 130 and virtual slow pool 132 are static, which means that, regardless of varying job queue requirements, the faster cores 108 and the slower cores 110 from each of the nodes 106 in the cluster 104 remain grouped into their respective virtual fast pool 130 and virtual slow pool 132.
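For illustration only, the following Python sketch shows one possible way a scheduler engine such as scheduler engine 102 might group the faster cores 108 and slower cores 110 of the nodes 106 into static virtual pools. The class and function names (Core, Node, build_virtual_pools) are hypothetical assumptions and are not defined by this disclosure.

```python
# Illustrative sketch only: grouping heterogeneous cores from cluster nodes
# into static virtual pools. All names here are hypothetical.
from dataclasses import dataclass

@dataclass
class Core:
    core_id: str
    speed: str          # "fast" or "slow"
    busy: bool = False  # whether the core is currently processing a task

@dataclass
class Node:
    node_id: str
    cores: list

def build_virtual_pools(nodes):
    """Collect faster cores into a virtual fast pool and slower cores into a
    virtual slow pool, regardless of which node physically hosts them."""
    fast_pool = [c for n in nodes for c in n.cores if c.speed == "fast"]
    slow_pool = [c for n in nodes for c in n.cores if c.speed == "slow"]
    return fast_pool, slow_pool

# Example cluster: two nodes, each with one fast core and two slow cores.
nodes = [
    Node("node-1", [Core("n1-f0", "fast"), Core("n1-s0", "slow"), Core("n1-s1", "slow")]),
    Node("node-2", [Core("n2-f0", "fast"), Core("n2-s0", "slow"), Core("n2-s1", "slow")]),
]
virtual_fast_pool, virtual_slow_pool = build_virtual_pools(nodes)
```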
While some or all faster cores 108 are included in the virtual fast pool 130, each of such cores may or may not be available—some faster cores may be available to process a job, while other faster cores in the virtual fast pool 130 are preoccupied currently processing a job and thus are unavailable. The same is true for the slower cores in the virtual slow pool 132—some of the slower cores in that pool may be available, while other slower cores are unavailable.
The first job queue 120 may be used to store jobs to be processed by the virtual fast pool 130 of faster cores 108, while the second job queue 122 may be used to store jobs to be processed by the virtual slow pool 132 of slower cores 110. A user may specify that a particular job is to be placed into a designated job queue 120, 122. The scheduler engine 102 assigns one of the virtual pools 130, 132 to process a given job from the various job queues 120, 122. For example, if a user determines that a particular job is completion-time sensitive, the user may cause that job to be placed into the first job queue 120. However, if the user considers a job not to be time sensitive (e.g., a batch job), the user may cause the job to be placed into the second job queue 122. The scheduler engine 102 causes jobs from the first job queue 120 to be processed by the virtual fast pool 130 and jobs from the second job queue 122 to be processed by the virtual slow pool 132. Jobs are thus assigned to the virtual pools 130, 132 based on the job queues from which they originate.
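A minimal sketch of this queue-to-pool dispatch, reusing the hypothetical Core objects from the sketch above, might look as follows; the function names and queue variables are illustrative assumptions only.

```python
# Illustrative sketch only: jobs are dispatched to a virtual pool based on the
# queue from which they originate (first queue -> fast pool, second -> slow pool).
from collections import deque

first_job_queue = deque()   # completion-time-sensitive jobs
second_job_queue = deque()  # less time-sensitive (e.g., batch) jobs

def submit_job(job, time_sensitive):
    """The user (or a client acting on the user's behalf) picks the queue."""
    (first_job_queue if time_sensitive else second_job_queue).append(job)

def schedule_once(virtual_fast_pool, virtual_slow_pool):
    """Dispatch one waiting job, if any, to an idle core of the matching pool."""
    if first_job_queue:
        core = next((c for c in virtual_fast_pool if not c.busy), None)
        if core is not None:
            core.busy = True
            return first_job_queue.popleft(), core
    if second_job_queue:
        core = next((c for c in virtual_slow_pool if not c.busy), None)
        if core is not None:
            core.busy = True
            return second_job_queue.popleft(), core
    return None  # nothing to schedule, or no matching core is idle
```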
In one example, the cluster 104 may store web server logs that track users' activities on various websites. It may be desirable to know how many times a particular word, such as the word “Christmas,” has been searched on the various websites during the past week. If the user determines this query is to be a time-sensitive job, the user submits the job to the cluster 104 and places the job in the first job queue 120 associated with the virtual fast pool 130. Upon recognizing that there is a new job in the first job queue 120, the scheduler engine 102 causes the job to be assigned to the virtual fast pool 130 for processing.
Referring still to
A map stage (running the map function) is partitioned into map tasks, and a reduce stage (running the reduce function) is partitioned into reduce tasks. Each map task processes a logical split of data that is stored across the nodes 106 in the cluster 104. Data may be divided into uniformly sized chunks, and a default chunk size may be 64 MB. The map task reads the data, applies the user-defined map function to the read data, and buffers the resulting output as intermediate data. The reduce task applies the user-defined reduce function to the intermediate data to generate output data, such as an answer to the problem the user originally set out to solve. The scheduler engine 102 manages the nodes 106 in the cluster 104. Each node 106 may have a fixed number of map slots and reduce slots, which can run map tasks and reduce tasks, respectively. In some implementations, each of the faster cores 108 and slower cores 110 in the nodes 106 performs a map task and/or a reduce task. In one example, four types of slots may be available to be assigned by the scheduler engine 102: a fast map slot, a fast reduce slot, a slow map slot, and a slow reduce slot. The fast map slot may run a map task using the faster cores 108, and the slow reduce slot may run a reduce task using the slower cores 110.
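For illustration, a minimal Python sketch of user-defined map and reduce functions for the word-count style query described above (counting searches for “Christmas”) is shown below. It only mirrors the MapReduce model in plain Python; in an actual cluster these functions would run as map tasks and reduce tasks in the map slots and reduce slots of the nodes 106, and the function names are hypothetical.

```python
# Illustrative sketch only: a word-count style map function and reduce function.
from itertools import groupby
from operator import itemgetter

def map_fn(log_line):
    """Map task: emit ("christmas", 1) for each matching search term in a line."""
    for term in log_line.split():
        if term.lower() == "christmas":
            yield ("christmas", 1)

def reduce_fn(key, values):
    """Reduce task: sum the counts emitted for one key."""
    return (key, sum(values))

def run_mapreduce(lines):
    intermediate = [pair for line in lines for pair in map_fn(line)]
    intermediate.sort(key=itemgetter(0))      # stand-in for the shuffle/sort phase
    return [reduce_fn(k, (v for _, v in grp))
            for k, grp in groupby(intermediate, key=itemgetter(0))]

print(run_mapreduce(["user1 searched christmas gifts",
                     "user2 searched flights",
                     "user3 searched Christmas recipes"]))  # [('christmas', 2)]
```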
Each of the nodes 106 in the cluster 104 includes a task tracker 103. The task tracker 103 in a given node is configured to perform operations such as: monitoring and reporting the availability of the faster cores and slower cores in the node to process a job, determining whether an available faster core may run a fast map task or a fast reduce task and whether an available slower core may run a slow map task or a slow reduce task, and sending information regarding the cores' availability to the scheduler engine 102. Based on the cores' availability information from each of the task trackers 103, the scheduler engine 102 interacts with the first job queue 120 and the second job queue 122 to assign available faster cores 108 from the virtual fast pool 130 to process jobs in the first job queue 120, and to assign available slower cores 110 from the virtual slow pool 132 to process jobs in the second job queue 122.
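A minimal sketch of such an availability report, again using the hypothetical Core and Node objects from the earlier sketch, might be as follows; the report format is an assumption made for illustration and is not part of the disclosure.

```python
# Illustrative sketch only: a task tracker reports idle fast/slow slots on its
# node, and the scheduler engine gathers reports from every node.
def task_tracker_report(node):
    """Summarize which fast/slow map and reduce slots this node can offer."""
    report = {"node_id": node.node_id,
              "fast_map": 0, "fast_reduce": 0,
              "slow_map": 0, "slow_reduce": 0}
    for core in node.cores:
        if core.busy:
            continue  # a busy core offers no slot
        prefix = "fast" if core.speed == "fast" else "slow"
        # Assumption: each idle core can offer either a map slot or a reduce slot.
        report[prefix + "_map"] += 1
        report[prefix + "_reduce"] += 1
    return report

def collect_availability(nodes):
    """Scheduler-engine side: gather the report from every task tracker."""
    return [task_tracker_report(n) for n in nodes]
```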
In some examples, the system may include a virtual shared pool of processor cores as is illustrated in
In
In one implementation, the virtual shared pool 240 is created by the scheduler engine 202 based on an unavailability of a faster core 108 or a slower core 110 in the virtual pools 130, 132 to process jobs from the dedicated job queue 120 or 122, respectively. For example, a job to be processed may be present in the first job queue 120, which is dedicated to be processed by the faster cores 108 in the virtual fast pool 130. However, due to an unavailability of any of the faster cores 108 in the virtual fast pool 130, the scheduler engine 202 may assign one or more of the slower cores 110 from the virtual slow pool 132 by way of the virtual shared pool 240 that it creates. As such, the virtual shared pool 240 may include at least one of the faster cores 108 from the virtual fast pool 130 and at least one of the slower cores 110 from the virtual slow pool 132. In another example, when there is no job present in the second job queue 122, the scheduler engine 202 may add available slower cores 110 from the virtual slow pool 132 to the virtual shared pool 240 so that the slower cores 110 added to the virtual shared pool 240 are able to process a job in the first job queue 120.
Further, since the job queue requirements and the availabilities of the faster cores 108 and the slower cores 110 may change during runtime, the scheduler engine 202 may change the configuration of the virtual shared pool 240 dynamically. For example, assume a first job is present in the first job queue 120. Initially, the scheduler engine 202 detects whether an available faster core 108 exists in the virtual fast pool 130. If one is present, the scheduler engine 202 assigns the job to be processed by the available faster core 108. However, if a faster core 108 is not available in the virtual fast pool 130, the scheduler engine 202 creates the virtual shared pool 240 and adds a slower core 110 from the virtual slow pool 132 to the virtual shared pool 240, enabling that slower core 110 (now in the virtual shared pool 240) to process the job in the first job queue 120.
Continuing with the above example, after the first job from the first job queue 120 is completed by the slower core 110 in the virtual shared pool 240, a second job may be placed in the first job queue 120 and a third job may be placed in the second job queue 122. The scheduler engine 202 may detect that a faster core 108 is now available in the virtual fast pool 130 and, if so, the scheduler engine 202 may return the slower core 110 from the virtual shared pool 240 to the virtual slow pool 132. The second job from the first job queue 120 then may be processed by the faster core 108 now available in the virtual fast pool 130. Further, the third job from the second job queue 122 may be processed by a slower core 110 in the virtual slow pool 132 (e.g., the slower core 110 just moved back from the virtual shared pool 240 to the virtual slow pool 132).
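The borrow-and-return behavior described in the two preceding paragraphs could be sketched as follows; the function names are hypothetical and the actual pool management may differ.

```python
# Illustrative sketch only: borrow an idle slower core into the virtual shared
# pool when no faster core is free, and return it once its job completes.
def assign_from_first_queue(job, fast_pool, slow_pool, shared_pool):
    """Prefer an idle faster core; otherwise borrow an idle slower core."""
    core = next((c for c in fast_pool if not c.busy), None)
    if core is None:
        core = next((c for c in slow_pool if not c.busy), None)
        if core is not None:
            slow_pool.remove(core)      # move the slower core ...
            shared_pool.append(core)    # ... into the virtual shared pool
    if core is not None:
        core.busy = True                # this core now processes the job
    return core                         # None if no core of either type is idle

def release_core(core, slow_pool, shared_pool):
    """On job completion, send a borrowed slower core back to the slow pool."""
    core.busy = False
    if core in shared_pool:
        shared_pool.remove(core)
        slow_pool.append(core)
```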
By creating the virtual shared pool 240, resources (e.g., faster cores 108 and slower cores 110) in the cluster 104 may be more efficiently utilized. Jobs in the first job queue 120 and the second job queue 122 can be processed by either a faster core 108 or a slower core 110 in the virtual shared pool 240. For example, a job in the first job queue 120 may be processed by an available slower core 110 in the virtual shared pool 240 until a faster core 108 becomes available (e.g., after completing its existing job), and similarly a job in the second job queue 122 may be processed by an available faster core 108 in the virtual shared pool 240 if no slower cores 110 are otherwise available.
As shown in
Referring to
At block 504, the scheduler engine 102, based on a user's decision, determines whether a job to be processed is in the first job queue 120 or in the second job queue 122. In some examples, the first job queue 120 may be a time-sensitive job queue (e.g., for interactive jobs) and the second job queue 122 may be a non-time-sensitive job queue (e.g., for batch jobs). Further, a user who requests the job may specify (e.g., by way of a flag as noted above) the job queue in which the job is to be placed.
The method 500 continues at block 506 and block 508 with executing the pool assignment module 312 to cause the scheduler engine 102 to choose which virtual pool should be used to process a job. If the scheduler engine 102 determines the presence of a job in the first job queue 120, at block 506 the scheduler engine 102 assigns the faster cores 108 in the virtual fast pool 130 to process the job. Analogously, if the scheduler engine 102 determines the presence of a job in the second job queue 122, at block 508 the scheduler engine 102 uses the slower cores 110 in the virtual slow pool 132 to process the job.
If the job is in the first job queue 120, the method 600 continues at block 604 to cause the scheduler engine 202 to determine whether a faster core 108 is available in the virtual fast pool 130. If a faster core 108 in the virtual fast pool 130 is available, the method 600 continues at block 608 with processing the job by the faster cores 108 in the virtual fast pool 130. However, if the scheduler engine 202 determines that a faster core 108 in the virtual fast pool 130 is not available, the processor 402 executes the virtual shared pool generation module 414 to create a virtual shared pool 240 (block 605) and to process the job by the virtual shared pool (block 612).
Returning to block 602, if the job is in the second job queue 122, the method 600 continues at block 606 to cause the scheduler engine 202 to determine whether a slower core 110 is available in the virtual slow pool 132. Similarly, if a slower core 110 in the virtual slow pool 132 is available, the method 600 continues at block 610 with processing the job by a slower core 110 in the virtual slow pool 132. However, if the scheduler engine 202 determines that a slower core 110 in the virtual slow pool 132 is not available, the processor 402 executes the virtual shared pool generation module 414 to create a virtual shared pool 240 (block 607) and to process the job by the virtual shared pool (block 612).
In some implementations, the virtual shared pool 240 includes all of the available (if any) faster cores 108 and all of the available (if any) slower cores 110 from the virtual fast pool 130 and the virtual slow pool 132.
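Under the assumptions of the earlier sketches (and the note above that the shared pool may hold all currently available cores of both types), one hypothetical rendering of blocks 602 through 612 in Python could be:

```python
# Illustrative sketch only: the decision flow of blocks 602-612, with the block
# numbers kept as comments. The function itself is hypothetical.
def method_600(job, queue_name, fast_pool, slow_pool, shared_pool):
    if queue_name == "first":                                    # block 602: first queue
        core = next((c for c in fast_pool if not c.busy), None)  # block 604
        if core is None:                                         # block 605: build shared pool
            shared_pool.extend(c for c in slow_pool if not c.busy)
            core = next((c for c in shared_pool if not c.busy), None)  # block 612
        # block 608: the job is processed by the selected core (if any)
    else:                                                        # second queue
        core = next((c for c in slow_pool if not c.busy), None)  # block 606
        if core is None:                                         # block 607: build shared pool
            shared_pool.extend(c for c in fast_pool if not c.busy)
            core = next((c for c in shared_pool if not c.busy), None)  # block 612
        # block 610: the job is processed by the selected core (if any)
    if core is not None:
        core.busy = True
    return core
```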
The above discussion is meant to be illustrative of the principles and various embodiments of the present disclosure. Numerous variations and modifications will become apparent to those skilled in the art once the above disclosure is fully appreciated. It is intended that the following claims be interpreted to embrace all such variations and modifications.