The field of this invention relates to an integrated circuit device, an asymmetric multi-core processing module, an electronic device and a method of managing execution of computer program code therefor.
Integrated circuit devices that are intended for use within, for example, mobile devices such as wireless communication devices are required to meet high performance requirements and strict power consumption constraints. In order to meet the high performance requirements, integrated circuit devices currently implement high-speed processing cores. Conversely, in order to meet the strict power consumption constraints, slower processing cores have to be used. In order to meet both of these opposing demands, i.e. the performance requirements and the power consumption constraints, such integrated circuit devices can implement asymmetric multi-core platforms comprising a mix of high performance cores and low power cores.
However, a problem with such asymmetric multi-core platforms is that they are not supported by conventional software.
The present invention provides an integrated circuit device, an asymmetric multi-core processing module, an electronic device, a method of managing execution of computer program code and a non-transitory computer program product as described in the accompanying claims.
Specific embodiments of the invention are set forth in the dependent claims.
These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
Further details, aspects and embodiments of the invention will be described, by way of example only, with reference to the drawings. In the drawings, like reference numbers are used to identify like or functionally similar elements. Elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale.
The present invention will now be described with reference to an integrated circuit device comprising at least one asymmetric multi-core processing module for use in a wireless communication unit, and a method of managing execution of computer program code therefor. However, it will be appreciated that the present invention is not limited solely to wireless communication applications, but may be equally applied to any integrated circuit device application that comprises one or more asymmetric multi-core platforms.
Because the illustrated embodiments of the present invention may, for the most part, be implemented using electronic components and circuits known to those skilled in the art, details will not be explained to any greater extent than that considered necessary for the understanding and appreciation of the underlying concepts of the present invention and in order not to obfuscate or distract from the teachings of the present invention.
Referring first to
In the example, the signal processing logic 108 is coupled to a memory element 116 that stores operating regimes, such as decoding/encoding functions and the like and may be realised in a variety of technologies, such as: random access memory (RAM) (volatile), (non-volatile) read only memory (ROM), Flash memory or any combination of these or other memory technologies. A timer 118 may be coupled to the signal processing logic 108 to control the timing of operations within the wireless communication unit 100.
Referring now to
The asymmetric multi-core processing module 205 still further comprises a core identifier configuration component 230 operably coupled to the processing cores 210, 220 and arranged to provide core identifier values to the processing cores 210, 220 coupled thereto. Such core identifier values may comprise, in some examples, integer values used to enable individual processing cores 210, 220 to be identified, such that software threads or tasks may be managed to execute on processing cores 210, 220 comprising specific core identifier values. The core identifier configuration component 230 is further arranged to enable dynamic configuration of a core identifier of at least one of the processing cores 210, 220. In this manner, the core identifier for such a processing core 210, 220 (e.g. a target core) may be configured such that, say, computer program code executing on another processing core 210, 220 (e.g. an initial core) of the asymmetric multi-core processing module 205 may be transparently switched to executing on the target core 210, 220; whereby the core identifier of the target core 210, 220 may be dynamically configured to comprise a core identifier value of the initial core 210, 220. In this manner, because the core identifier value of the processing core that is executing the computer program code remains constant, even after execution switches between processing cores 210, 220, the computer program code is not required to support switching of processor cores.
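Purely by way of illustration, the effect of dynamically configuring a core identifier may be modelled in software as sketched below. The sketch is not part of the described hardware: the names `Core`, `resolve` and `retarget`, the core labels, and the `active` flag used to park the initial core once its identifier value has been taken over are all assumptions introduced for this example only.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str            # hypothetical label for a physical core
    core_id: int         # identifier value currently presented to software
    active: bool = True  # assumption: a parked core no longer answers to its id


def resolve(cores, core_id):
    """Return the single active core that currently reports core_id."""
    matches = [c for c in cores if c.active and c.core_id == core_id]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one active core with id {core_id}")
    return matches[0]


def retarget(initial, target):
    """Give the target core the initial core's identifier value."""
    target.core_id = initial.core_id
    target.active = True
    initial.active = False  # assumption: the initial core is parked afterwards


# Software keeps addressing identifier value 0 before and after the switch.
cores = [Core("type_A_core", 0), Core("type_B_core", 1, active=False)]
before = resolve(cores, 0).name
retarget(cores[0], cores[1])
after = resolve(cores, 0).name
```

In this model, code that addresses identifier value 0 is served by a different physical core after `retarget` is called, without the identifier value it uses having changed.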
By way of example, a hypervisor program (not shown) or other core management software or hardware may be arranged to monitor a load of a processing core executing computer program code, as would be readily understood by a skilled artisan. A hypervisor program typically comprises a part of software that routes particular software threads or tasks to particular processing cores 210, 220 for execution. If the hypervisor program determines that the processing core is, say, overloaded or under-loaded, the hypervisor program may be arranged to initiate a switch of the execution of software code from a first (initial) processing core of a first type to another processing core of a second type. As part of initiating such a switch, the hypervisor program may be arranged to request, for example via control signals 235, the core identifier configuration component 230 to dynamically configure the core identifier of the second (target) processing core to be set to the core identifier value of the first processing core, as described in greater detail with reference to
In particular for the illustrated example, the core identifier configuration component 230 may be arranged to enable a core identifier for, say, a processing core of type ‘A’ 210 to be set to a core identifier value of a processing core of type ‘B’ 220. In this manner, by configuring the core identifier for the processing core of type ‘A’ to have the value of a processing core of type ‘B’, computer program code executing on the processing core of type ‘B’ may be transparently switched to executing on the processing core of type ‘A’. Similarly, the core identifier configuration component 230 may be additionally arranged to enable a core identifier for a processing core of type ‘B’ 220 to be configured to a core identifier value of a processing core of type ‘A’ 210, thereby enabling execution of program code to be transparently switched between processing core types in either direction.
For example, in a case where computer program code is being executed on a first processing core of type ‘A’ 210, which comprises a higher performance processing core, such a hypervisor program may determine that this first processing core 210 is under-loaded. As such, the hypervisor program may initiate a switch of the execution of the mentioned code to a second processing core of type ‘B’ 220, which comprises a lower power processing core. For example, where a software thread/task is configured to run on a higher performance processing core (e.g. core_#0), and where core_#1 is a lower power processing core:
In the above example, the decision to change core identifiers is taken based on system load. However, such a decision may additionally/alternatively be made depending on, for example, system power consumption, temperature, etc. Initiating the switch of the execution of the code may include requesting the core identifier configuration component 230 to dynamically configure the core identifier of the second processing core 220 to be set to the core identifier value of the first processing core 210 prior to, or substantially simultaneously with, switching execution of the computer program code from the first processing core 210 to the second processing core 220. In this manner, since the first, higher performance, processing core 210 was under-loaded when executing the computer program code, execution of the computer program code may be transparently switched to the second, lower power, processing core 220 without significantly impacting on the performance with regard to execution of that computer program code, whilst enabling power consumption to be reduced.
Conversely, in a case where computer program code is being executed on a first, lower power, processing core of type ‘B’ 220, a hypervisor program may determine that the first processing core 220 is overloaded. As such, the hypervisor program may initiate a switch to a second, higher performance, processing core of type ‘A’ 210, including requesting the core identifier configuration component 230 to dynamically configure the core identifier of the second processing core 210 to be set to a core identifier value of the first processing core 220 prior to, or substantially simultaneously with, switching execution of the computer program code from the first processing core 220 to the second processing core 210. In this manner, since the first, lower power, processing core 220 was overloaded when executing the computer program code, execution of the computer program code may be transparently switched to the second, higher performance, processing core 210, for example in order to improve performance with regard to execution of that computer program code.
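The two load-based cases described above may be summarised, by way of a hedged sketch only, as a simple decision function. The function name `choose_core_type` and the threshold values 0.2 and 0.9 are hypothetical; the description does not specify how a hypervisor program determines that a core is under-loaded or overloaded, and other criteria such as power consumption or temperature could be substituted.

```python
def choose_core_type(current_type, load, low=0.2, high=0.9):
    """Return the core type on which the code should run next.

    current_type: 'A' (higher performance) or 'B' (lower power).
    load: fraction of the current core's capacity in use, from 0.0 to 1.0.
    low, high: hypothetical thresholds for under-load and overload.
    """
    if current_type == "A" and load < low:
        return "B"  # under-loaded higher performance core: move to lower power
    if current_type == "B" and load > high:
        return "A"  # overloaded lower power core: move to higher performance
    return current_type  # otherwise leave execution where it is


decision_when_idle = choose_core_type("A", 0.05)
decision_when_busy = choose_core_type("B", 0.97)
```

A result that differs from `current_type` would correspond to the hypervisor program initiating a switch and requesting reconfiguration of the target core's identifier.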
In the above examples, only a single identifier is configured. However, it will be apparent that multiple identifiers may be configured simultaneously, either dependently on or independently of each other. For example, when the core identifier configuration component 230 configures the core identifier of a target processing core to comprise a core identifier value of an initial processing core, the core identifier configuration component 230 may be also arranged to simultaneously configure the core identifier of the initial processing core to comprise a core identifier value of the target processing core. In this manner, the core identifier configuration component 230 may be arranged to enable core identifier values for the initial and target processing cores to be swapped. Accordingly, code execution in the initial processing core can be transferred to the target core and vice versa in a transparent manner in a single configuration.
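As a minimal sketch of such a swap, assuming the identifier values are held in a simple mapping from core label to value (the function name `swap_core_ids` and the labels are hypothetical), both identifiers may be exchanged in one step so that no intermediate state exists in which two cores report the same value:

```python
def swap_core_ids(ids, initial, target):
    """Return a new mapping with the two cores' identifier values exchanged."""
    swapped = dict(ids)  # leave the caller's mapping untouched
    swapped[initial], swapped[target] = ids[target], ids[initial]
    return swapped


original = {"initial_core": 0, "target_core": 1}
after_swap = swap_core_ids(original, "initial_core", "target_core")
```

Applying the same swap a second time restores the original assignment, which mirrors the statement that execution can be transferred in either direction.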
Referring now to
Additionally, for the illustrated example, the core identifier configuration component 230 further comprises a second core identifier selector component 330, which again in the illustrated example comprises a multiplexer. The second core identifier selector component 330 comprises a first core identifier input 332 arranged to receive the second core identifier value 344, a second core identifier input 334 arranged to receive the first core identifier value 342, a control input 336 arranged to receive a selector signal 314, and an output 338 operably coupled to one of the processing cores, which in the illustrated example comprises a processing core of type ‘B’ 220. The second core identifier selector component 330 is arranged to output one of the received core identifier values 342, 344 in accordance with the received selector signal 314.
In the illustrated example, the core identifier configuration component 230 is illustrated as comprising two core identifier selector components 320, 330; one arranged to output a core identifier value to a first processing core of type ‘A’ 210 and one arranged to output a core identifier value to a second processing core of type ‘B’ 220. In this manner, processing cores 210, 220 and their respective core identifier selector components 320, 330 may be ‘paired’ to enable direct swapping of identifier values 342, 344 therebetween. In addition, the core identifier configuration component 230 may comprise multiple pairs of core identifier selector components 320, 330 arranged to provide core identifier values to corresponding pairs of processing cores 210, 220.
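The paired arrangement may be modelled, as a hedged sketch only, by two two-input multiplexers whose inputs are cross-wired. Two assumptions are made here that are not confirmed by the text above: that selector component 320 receives value 342 on its first input and value 344 on its second input (the mirror image of selector component 330), and that a selector signal of 0 selects the first input of each multiplexer. The function names are hypothetical.

```python
def mux2(first_input, second_input, select):
    """Two-input multiplexer: select 0 passes the first input, 1 the second."""
    return second_input if select else first_input


def paired_core_ids(id_342, id_344, select_314):
    """Return the identifier values seen by the type 'A' and type 'B' cores."""
    # Assumed wiring of selector component 320 (feeds the type 'A' core).
    core_a_id = mux2(id_342, id_344, select_314)
    # Wiring of selector component 330 (feeds the type 'B' core): inputs crossed.
    core_b_id = mux2(id_344, id_342, select_314)
    return core_a_id, core_b_id


default_ids = paired_core_ids(0, 1, select_314=0)
swapped_ids = paired_core_ids(0, 1, select_314=1)
```

Under these assumptions, toggling the single selector signal exchanges the two identifier values, which is the direct swapping behaviour that pairing is intended to provide.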
Alternatively, each core identifier selector component 320, 330 may be arranged to configure a core identifier of its respective core independently; accordingly each core identifier selector component 320, 330 may be arranged to receive any suitable number of core identifier values, for example one for each core within the asymmetric multi-core processing module 205, or a subset thereof, and to output one of the received core identifier values to the respective processing core 210, 220.
In the example illustrated in
In some examples, the core identifier control component 310 may be alternatively arranged to configure the core identifier values 342, 344 output by, in the illustrated example, the SCU 340. For example, the SCU 340 may be arranged to output, as the core identifier values 342, 344, values stored within programmable registers 343, 345 respectively. Accordingly, the core identifier control component 310 may be arranged to set new core identifier values 342, 344 for one or more processing cores 210, 220 by directly writing the new values to the appropriate programmable register 343, 345 within the SCU 340, for example via a register write signal illustrated generally at 346. In this manner, the core identifier values 342, 344 output by the SCU 340 may be provided directly to the respective processing cores 210, 220, substantially alleviating the need for the multiplexers 320, 330.
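The register-based alternative may be sketched as follows. The class name `ScuModel` and its methods are hypothetical and serve only to illustrate the idea of identifier values being taken directly from programmable registers; the register labels 343 and 345 follow the reference numerals used above.

```python
class ScuModel:
    """Toy model of an SCU whose outputs mirror two programmable registers."""

    def __init__(self, reg_343, reg_345):
        self.registers = {343: reg_343, 345: reg_345}

    def write(self, register, value):
        """Model a register write such as that illustrated generally at 346."""
        if register not in self.registers:
            raise KeyError(f"unknown register {register}")
        self.registers[register] = value

    def outputs(self):
        """Return the values output as core identifier values 342 and 344."""
        return self.registers[343], self.registers[345]


scu = ScuModel(reg_343=0, reg_345=1)
scu.write(343, 1)
scu.write(345, 0)
ids_after_writes = scu.outputs()
```

Note that in this simple model the two writes occur one after the other, so both outputs briefly hold the same value between them; whether and how a real implementation avoids such an intermediate state is not addressed by this sketch.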
Referring back to
Alternatively, a method of transferring a context such as disclosed in United States patent application number US 2011/022869 A1 may be implemented to transfer content between the processing cores. Content of the processing cores may be alternatively transferred by way of direct registers, by bus, scan chains, or retention flip-flop data transfer, etc.
In the illustrated example, the asymmetric multi-core processing module 205 comprises a shared L2 cache memory element 240 of which the content is accessible both by the processing cores of type ‘A’ 210 and by the processing cores of type ‘B’ 220. In this manner, L2 content may be preserved during a context transfer from one processing core of the asymmetric multi-core processing module to another processing core thereof. In particular, L2 content corresponding to computer program code being switched from executing on one processing core to another processing core may be preserved, and will be readily accessible from the new processing core. As a result, the latency of switching execution of computer program code from one processing core to another may be significantly reduced in comparison to, say, solutions in which separate L2 cache memory elements are implemented for the different types of processing cores. Accordingly, fast, dynamic switching of computer program code execution between different processing cores may be achieved.
Thus, for some examples of the present invention, the asymmetric multi-core processing module 205 illustrated in
Referring now to
More specifically, in the example illustrated in
For the illustrated examples hereinbefore described, processing cores have been described as comprising either higher performance processing cores or lower power processing cores in order to aid understanding of the present invention. However, it will be appreciated that the present invention is not limited to applications comprising just these two types of processing cores, and the present invention may be implemented within any asymmetric multi-core platforms comprising substantially any configuration of processing core types. For example, the present invention is not limited to being implemented within a multi-core platform comprising substantially equal numbers of higher performance and lower power processing cores, such as a quad-core module comprising two higher performance processing cores and two lower power processing cores. Furthermore, for example, the present invention may be implemented within a multi-core platform comprising different numbers of higher performance and lower power processing cores, such as a quad-core module comprising one higher performance processing core and three lower power processing cores, or a quad-core module comprising one higher performance processing core, two ‘regular’ processing cores, and one lower power processing core.
At least parts of the invention may also be implemented in a computer program for running on a computer system, at least including code portions for performing steps of a method according to the invention when run on a programmable apparatus, such as a computer system or enabling a programmable apparatus to perform functions of a device or system according to the invention.
A computer program is a list of instructions such as a particular application program and/or an operating system. The computer program may for instance include one or more of: a subroutine, a function, a procedure, an object method, an object implementation, an executable application, an applet, a servlet, a source code, an object code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system.
The computer program may be stored internally on a computer readable storage medium or transmitted to the computer system via a computer readable transmission medium. All or some of the computer program may be provided on computer readable media permanently, removably or remotely coupled to an information processing system. The computer readable media may include, for example and without limitation, any number of the following: magnetic storage media including disk and tape storage media; optical storage media such as compact disk media (e.g., CD-ROM, CD-R, etc.) and digital video disk storage media; non-volatile memory storage media including semiconductor-based memory units such as FLASH memory, EEPROM, EPROM, ROM; ferromagnetic digital memories; MRAM; volatile storage media including registers, buffers or caches, main memory, RAM, etc.; and data transmission media including computer networks, point-to-point telecommunication equipment, and carrier wave transmission media, just to name a few.
A computer process typically includes an executing (running) program or portion of a program, current program values and state information, and the resources used by the operating system to manage the execution of the process. An operating system (OS) is the software that manages the sharing of the resources of a computer and provides programmers with an interface used to access those resources. An operating system processes system data and user input, and responds by allocating and managing tasks and internal system resources as a service to users and programs of the system.
The computer system may for instance include at least one processing unit, associated memory and a number of input/output (I/O) devices. When executing the computer program, the computer system processes information according to the computer program and produces resultant output information via I/O devices.
In the foregoing specification, the invention has been described with reference to specific examples of embodiments of the invention. It will, however, be evident that various modifications and changes may be made therein without departing from the broader spirit and scope of the invention as set forth in the appended claims, and that accordingly these are not limited to the examples described.
The connections as discussed herein may be any type of connection suitable to transfer signals from or to the respective nodes, units or devices, for example via intermediate devices. Accordingly, unless implied or stated otherwise, the connections may for example be direct connections or indirect connections. The connections may be illustrated or described in reference to being a single connection, a plurality of connections, unidirectional connections, or bidirectional connections. However, different embodiments may vary the implementation of the connections. For example, separate unidirectional connections may be used rather than bidirectional connections and vice versa. Also, plurality of connections may be replaced with a single connection that transfers multiple signals serially or in a time multiplexed manner. Likewise, single connections carrying multiple signals may be separated out into various different connections carrying subsets of these signals. Therefore, many options exist for transferring signals.
Each signal described herein may be designed as positive or negative logic. In the case of a negative logic signal, the signal is active low where the logically true state corresponds to a logic level zero. In the case of a positive logic signal, the signal is active high where the logically true state corresponds to a logic level one. Note that any of the signals described herein can be designed as either negative or positive logic signals. Therefore, in alternate embodiments, those signals described as positive logic signals may be implemented as negative logic signals, and those signals described as negative logic signals may be implemented as positive logic signals.
Furthermore, the terms ‘assert’ or ‘set’ and ‘negate’ (or ‘de-assert’ or ‘clear’) are used herein when referring to the rendering of a signal, status bit, or similar apparatus into its logically true or logically false state, respectively. If the logically true state is a logic level one, the logically false state is a logic level zero. And if the logically true state is a logic level zero, the logically false state is a logic level one.
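The relationship between the logical state of a signal and its logic level under either convention may be illustrated by the following sketch. The function names `level_for` and `is_asserted` are hypothetical and are introduced only for this example.

```python
def level_for(asserted, active_low=False):
    """Return the logic level (0 or 1) that represents the requested state.

    asserted: True for the logically true state, False for logically false.
    active_low: True for a negative logic signal, False for positive logic.
    """
    return int(asserted != active_low)


def is_asserted(level, active_low=False):
    """Return True if the given logic level is the logically true state."""
    return bool(level) != active_low


positive_logic_true = level_for(True)                   # active high signal
negative_logic_true = level_for(True, active_low=True)  # active low signal
```

The two functions are inverses of one another for a given polarity, so converting a state to a level and back yields the original state under both conventions.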
Those skilled in the art will recognize that the boundaries between logic blocks are merely illustrative and that alternative embodiments may merge logic blocks or circuit elements or impose an alternate decomposition of functionality upon various logic blocks or circuit elements. Thus, it is to be understood that the architectures depicted herein are merely exemplary, and that in fact many other architectures can be implemented which achieve the same functionality. Any arrangement of components to achieve the same functionality is effectively ‘associated’ such that the desired functionality is achieved. Hence, any two components herein combined to achieve a particular functionality can be seen as ‘associated with’ each other such that the desired functionality is achieved, irrespective of architectures or intermediary components. Likewise, any two components so associated can also be viewed as being ‘operably connected,’ or ‘operably coupled,’ to each other to achieve the desired functionality.
Furthermore, those skilled in the art will recognize that boundaries between the above described operations are merely illustrative. The multiple operations may be combined into a single operation, a single operation may be distributed in additional operations and operations may be executed at least partially overlapping in time. Moreover, alternative embodiments may include multiple instances of a particular operation, and the order of operations may be altered in various other embodiments.
Also for example, the examples, or portions thereof, may be implemented as software or code representations of physical circuitry or of logical representations convertible into physical circuitry, such as in a hardware description language of any appropriate type.
Also, the invention is not limited to physical devices or units implemented in non-programmable hardware but can also be applied in programmable devices or units able to perform the desired device functions by operating in accordance with suitable program code, such as mainframes, minicomputers, servers, workstations, personal computers, notepads, personal digital assistants, electronic games, automotive and other embedded systems, cell phones and various other wireless devices, commonly denoted in this application as ‘computer systems’.
However, other modifications, variations and alternatives are also possible. The specifications and drawings are, accordingly, to be regarded in an illustrative rather than in a restrictive sense.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word ‘comprising’ does not exclude the presence of other elements or steps than those listed in a claim. Furthermore, the terms ‘a’ or ‘an,’ as used herein, are defined as one or more than one. Also, the use of introductory phrases such as ‘at least one’ and ‘one or more’ in the claims should not be construed to imply that the introduction of another claim element by the indefinite articles ‘a’ or ‘an’ limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases ‘one or more’ or ‘at least one’ and indefinite articles such as ‘a’ or ‘an.’ The same holds true for the use of definite articles. Unless stated otherwise, terms such as ‘first’ and ‘second’ are used to arbitrarily distinguish between the elements such terms describe. Thus, these terms are not necessarily intended to indicate temporal or other prioritization of such elements. The mere fact that certain measures are recited in mutually different claims does not indicate that a combination of these measures cannot be used to advantage.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/IB2011/055326 | 11/28/2011 | WO | 00 | 5/14/2014 |