The present disclosure relates generally to interrupts and exceptions for processors (e.g., systems-on-chips (SoCs)). More particularly, the present disclosure relates to clustering interrupts and exceptions in the processors using timing groups.
Interrupts and exceptions may typically be sent directly from an interrupt controller to processing cores for servicing. If these interrupts and/or exceptions are sent directly to the processing cores when generated by the interrupt controller, the interrupts and/or exceptions may occur aperiodically. While in some processors such timing may be suitable, some processors, such as those used in cellular phone and/or consumer Internet-of-Things (IoT) applications, may place processing cores in a sleep state to improve processing efficiency and/or power utilization. However, interrupts and exceptions continuously waking up the processing cores to service individual interrupts may consume a relatively high amount of power.
This section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present disclosure, which are described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present disclosure. Accordingly, it may be understood that these statements are to be read in this light, and not as admissions of prior art.
Various aspects of this disclosure may be better understood upon reading the following detailed description and upon reference to the drawings in which:
One or more specific embodiments will be described below. To provide a concise description of these embodiments, not all features of an actual implementation are described in the specification. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another. Moreover, it should be appreciated that such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
When introducing elements of various embodiments of the present disclosure, the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements. The terms “comprising,” “including,” and “having” are intended to be inclusive and mean that there may be additional elements other than the listed elements. Additionally, it should be understood that references to “one embodiment” or “an embodiment” of the present disclosure are not intended to be interpreted as excluding the existence of additional embodiments that also incorporate the recited features.
Write mask registers 14 may include m (e.g., 8) write mask registers (k0 through km), each having a number (e.g., 64) of bits. Additionally or alternatively, at least some of the write mask registers 14 may have a different size (e.g., 16 bits). At least some of the vector mask registers 12 (e.g., k0) are prohibited from being used as a write mask. When such vector mask registers are indicated, a hardwired write mask (e.g., 0xFFFF) is selected, effectively disabling write masking for that instruction.
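The write mask selection described above can be illustrated in software. The following is a minimal sketch, assuming a 16-element vector and merging behavior in which masked-off elements keep the destination value; the function name `apply_write_mask` and the list-based register model are hypothetical and are not part of the disclosure.

```python
HARDWIRED_MASK = 0xFFFF  # all 16 element positions enabled


def apply_write_mask(dest, result, mask_regs, k_index):
    """Merge `result` into `dest` under the write mask selected by `k_index`.

    Assumption: indicating k0 selects the hardwired mask instead of the
    contents of k0, so every element is written.
    """
    mask = HARDWIRED_MASK if k_index == 0 else mask_regs[k_index]
    return [r if (mask >> i) & 1 else d
            for i, (d, r) in enumerate(zip(dest, result))]


mask_regs = [0x0000, 0x00FF] + [0] * 6  # k0..k7; the contents of k0 are ignored
dest = [0] * 16
result = list(range(1, 17))

masked = apply_write_mask(dest, result, mask_regs, 1)    # lower 8 elements written
unmasked = apply_write_mask(dest, result, mask_regs, 0)  # every element written
```

In this sketch, selecting k1 writes only the lower eight elements, while selecting k0 writes all sixteen, which mirrors write masking being effectively disabled.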
General-purpose registers 16 may include a number (e.g., 16) of registers having corresponding bit sizes (e.g., 64) that are used along with x86 addressing modes to address memory operands. These registers may be referenced by the names RAX, RBX, RCX, RDX, RBP, RSI, RDI, RSP, and R8 through R15. Parts (e.g., 32 bits) of at least some of these registers may be used for modes (e.g., 32-bit mode) that are shorter than the complete length of the registers.
Scalar floating-point stack register file (x87 stack) 18 has an MMX packed integer flat register file 20 aliased onto it. The x87 stack 18 is an eight-element (or other number of elements) stack used to perform scalar floating-point operations on floating point data using the x87 instruction set extension. The floating-point data may have various levels of precision (e.g., 16, 32, 64, 80, or more bits). The MMX packed integer flat register files 20 are used to perform operations on 64-bit packed integer data, as well as to hold operands for some operations performed between the MMX packed integer flat register files 20 and the XMM registers.
Alternative embodiments may use wider or narrower registers. Additionally, alternative embodiments may use more, fewer, or different register files and registers.
Processor cores may be implemented in different ways, for different purposes, and in different processors. For instance, implementations of such cores may include: 1) a general purpose in-order core suitable for general-purpose computing; 2) a high performance general purpose out-of-order core suitable for general-purpose computing; 3) a special purpose core suitable primarily for graphics and/or scientific (throughput) computing. Implementations of different processors may include: 1) a CPU including one or more general purpose in-order cores suitable for general-purpose computing and/or one or more general purpose out-of-order cores suitable for general-purpose computing; and 2) a coprocessor including one or more special purpose cores primarily for graphics and/or scientific (throughput) computing. Such different processors lead to different computer system architectures, which may include: 1) the coprocessor on a separate chip from the CPU; 2) the coprocessor on a separate die in the same package as a CPU; 3) the coprocessor on the same die as a CPU (in which case, such a coprocessor is sometimes referred to as special purpose logic, such as integrated graphics and/or scientific (throughput) logic, or as special purpose cores); and 4) a system on a chip that may include on the same die the described CPU (sometimes referred to as the application core(s) or application processor(s)), the above described coprocessor, and additional functionality. Exemplary core architectures are described next, followed by descriptions of exemplary processors and computer architectures.
In
The front-end unit 56 includes a branch prediction unit 62 coupled to an instruction cache unit 64 that is coupled to an instruction translation lookaside buffer (TLB) 66. The TLB 66 is coupled to an instruction fetch unit 68. The instruction fetch unit 68 is coupled to a decode circuitry 70. The decode circuitry 70 (or decoder) may decode instructions and generate as an output one or more micro-operations, micro-code entry points, microinstructions, other instructions, or other control signals, which are decoded from, or which otherwise reflect, or are derived from, the original instructions. The decode circuitry 70 may be implemented using various different mechanisms. Examples of suitable mechanisms include, but are not limited to, look-up tables, hardware implementations, programmable logic arrays (PLAs), microcode read only memories (ROMs), etc. The processor core 54 may include a microcode ROM or other medium that stores microcode for macroinstructions (e.g., in decode circuitry 70 or otherwise within the front-end unit 56). The decode circuitry 70 is coupled to a rename/allocator unit 72 in the execution engine unit 58.
The execution engine unit 58 includes a rename/allocator unit 72 coupled to a retirement unit 74 and a set of one or more scheduler unit(s) 76. The scheduler unit(s) 76 represents any number of different schedulers, including reservation stations, central instruction window, etc. The scheduler unit(s) 76 is coupled to physical register file(s) unit(s) 78. Each of the physical register file(s) unit(s) 78 represents one or more physical register files storing one or more different data types, such as scalar integers, scalar floating points, packed integers, packed floating points, vector integers, vector floating points, statuses (e.g., an instruction pointer that is the address of the next instruction to be executed), etc. In one embodiment, the physical register file(s) unit(s) 78 includes the vector registers 12, the write mask registers 14, and/or the x87 stack 18. These register units may provide architectural vector registers, vector mask registers, and general-purpose registers. The physical register file(s) unit(s) 78 is overlapped by the retirement unit 74 to illustrate various ways in which register renaming and out-of-order execution may be implemented (e.g., using a reorder buffer(s) and a retirement register file(s); using a future file(s), a history buffer(s), and a retirement register file(s); using register maps and a pool of registers; etc.).
The retirement unit 74 and the physical register file(s) unit(s) 78 are coupled to an execution cluster(s) 80. The execution cluster(s) 80 includes a set of one or more execution units 82 and a set of one or more memory access circuitries 84. The execution units 82 may perform various operations (e.g., shifts, addition, subtraction, multiplication) on various types of data (e.g., scalar floating point, packed integer, packed floating point, vector integer, vector floating point). While some embodiments may include a number of execution units dedicated to specific functions or sets of functions, other embodiments may include only one execution unit or multiple execution units that all perform multiple different functions. The scheduler unit(s) 76, physical register file(s) unit(s) 78, and execution cluster(s) 80 are shown as being singular or plural because some processor cores 54 create separate pipelines for certain types of data/operations (e.g., a scalar integer pipeline, a scalar floating point/packed integer/packed floating point/vector integer/vector floating point pipeline, and/or a memory access pipeline that each have their own scheduler unit, physical register file(s) unit, and/or execution cluster; in the case of a separate memory access pipeline, only the execution cluster 80 of that pipeline has the memory access circuitry 84). It should also be understood that where separate pipelines are used, one or more of these pipelines may be out-of-order issue/execution and the rest perform in-order execution.
The set of memory access circuitry 84 is coupled to the memory unit 60. The memory unit 60 includes a data TLB unit 86 coupled to a data cache unit 88 coupled to a level 2 (L2) cache unit 90. The memory access circuitry 84 may include a load unit, a store address unit, and a store data unit, each of which is coupled to the data TLB unit 86 in the memory unit 60. The instruction cache unit 64 is further coupled to the level 2 (L2) cache unit 90 in the memory unit 60. The L2 cache unit 90 is coupled to one or more other levels of caches and/or to a main memory.
By way of example, the register renaming, out-of-order issue/execution core architecture may implement the pipeline 30 as follows: 1) the instruction fetch unit 68 performs the fetch and length decoding stages 32 and 34 of the pipeline 30; 2) the decode circuitry 70 performs the decode stage 36 of the pipeline 30; 3) the rename/allocator unit 72 performs the allocation stage 38 and renaming stage 40 of the pipeline 30; 4) the scheduler unit(s) 76 performs the schedule stage 42 of the pipeline 30; 5) the physical register file(s) unit(s) 78 and the memory unit 60 perform the register read/memory read stage 44 of the pipeline 30; 6) the execution cluster 80 performs the execute stage 46 of the pipeline 30; 7) the memory unit 60 and the physical register file(s) unit(s) 78 perform the write back/memory write stage 48 of the pipeline 30; 8) various units may be involved in the exception handling stage 50 of the pipeline 30; and/or 9) the retirement unit 74 and the physical register file(s) unit(s) 78 perform the commit stage 52 of the pipeline 30.
The processor core 54 may support one or more instruction sets, such as an x86 instruction set (with or without additional extensions for newer versions); a MIPS instruction set of MIPS Technologies of Sunnyvale, CA; an ARM instruction set (with optional additional extensions such as NEON) of ARM Holdings of Sunnyvale, CA. Additionally or alternatively, the processor core 54 includes logic to support a packed data instruction set extension (e.g., AVX1, AVX2), thereby allowing the operations used by multimedia applications to be performed using packed data.
It should be understood that the core may support multithreading (executing two or more parallel sets of operations or threads), and may do so in a variety of ways including time sliced multithreading, simultaneous multithreading (where a single physical core provides a logical core for each of the threads that physical core is simultaneously multithreading), or a combination thereof, such as a time-sliced fetching and decoding and simultaneous multithreading in INTEL® Hyperthreading technology.
While register renaming is described in the context of out-of-order execution, register renaming may be used in an in-order architecture. While the illustrated embodiment of the processor also includes a separate instruction cache unit 64, a separate data cache unit 88, and a shared L2 cache unit 90, some processors may have a single internal cache for both instructions and data, such as, for example, a Level 1 (L1) internal cache, or multiple levels of the internal cache. In some embodiments, the processor may include a combination of an internal cache and an external cache that is external to the processor core 54 and/or the processor. Alternatively, some processors may use a cache that is external to the processor core 54 and/or the processor.
The local subset of the L2 cache 104 is part of a global L2 cache unit 90 that is divided into separate local subsets, one per processor core. Each processor core 54 has a direct access path to its own local subset of the L2 cache 104. Data read by a processor core 54 is stored in its L2 cache 104 subset and can be accessed quickly, in parallel with other processor cores 54 accessing their own local L2 cache subsets. Data written by a processor core 54 is stored in its own L2 cache 104 subset and is flushed from other subsets, if necessary. The interconnection network 100 ensures coherency for shared data. The interconnection network 100 is bi-directional to allow agents such as processor cores, L2 caches, and other logic blocks to communicate with each other within the chip. Each data-path may have a number (e.g., 1012) of bits in width per direction.
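The read and write behavior of the local L2 subsets can be sketched with a toy model. This is only an illustration under stated assumptions: the class name `SlicedL2`, the dictionary-based storage, and the write-through policy to a backing memory are hypothetical simplifications, and the disclosure does not specify a write policy or coherency protocol.

```python
class SlicedL2:
    """Toy model of a global L2 cache split into one local subset per core."""

    def __init__(self, num_cores, memory):
        self.subsets = [dict() for _ in range(num_cores)]
        self.memory = memory  # backing store: address -> value

    def read(self, core, addr):
        subset = self.subsets[core]
        if addr not in subset:
            subset[addr] = self.memory[addr]  # fill the reader's own subset
        return subset[addr]

    def write(self, core, addr, value):
        # Flush stale copies held in the other cores' subsets.
        for other, subset in enumerate(self.subsets):
            if other != core:
                subset.pop(addr, None)
        self.subsets[core][addr] = value
        self.memory[addr] = value  # assumption: write-through, for simplicity


l2 = SlicedL2(num_cores=2, memory={0x40: 7})
l2.read(0, 0x40)
l2.read(1, 0x40)          # both subsets now hold address 0x40
l2.write(0, 0x40, 9)      # core 0 writes; core 1's copy is flushed
flushed = 0x40 not in l2.subsets[1]
reread = l2.read(1, 0x40)  # core 1 refills its subset with the new value
```

After the write by core 0, the copy in the subset of core 1 is dropped, so a later read by core 1 observes the updated value rather than a stale one.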
Thus, different implementations of the processor 130 may include: 1) a CPU with the special purpose logic 136 being integrated graphics and/or scientific (throughput) logic (which may include one or more cores), and the cores 54A-N being one or more general purpose cores (e.g., general purpose in-order cores, general purpose out-of-order cores, or a combination thereof); 2) a coprocessor with the cores 54A-N being a relatively large number of special purpose cores intended primarily for graphics and/or scientific (throughput) computing; and 3) a coprocessor with the cores 54A-N being a relatively large number of general purpose in-order cores. Thus, the processor 130 may be a general-purpose processor, coprocessor or special-purpose processor, such as, for example, a network or communication processor, compression engine, graphics processor, GPGPU (general purpose graphics processing unit), a high-throughput many integrated core (MIC) coprocessor (including 30 or more cores), an embedded processor, or the like. The processor 130 may be implemented on one or more chips. The processor 130 may be a part of and/or may be implemented on one or more substrates using any of a number of process technologies, such as, for example, BiCMOS, CMOS, or NMOS.
The memory hierarchy includes one or more levels of cache within the cores, a set of one or more shared cache units 140, and external memory (not shown) coupled to the set of integrated memory controller unit(s) 132. The set of shared cache units 140 may include one or more mid-level caches, such as level 2 (L2), level 3 (L3), level 4 (L4), or other levels of cache, a last level cache (LLC), and/or combinations thereof. While a ring-based interconnect network 100 may interconnect the integrated graphics logic 136 (integrated graphics logic 136 is an example of and is also referred to herein as special purpose logic 136), the set of shared cache units 140, and/or the system agent unit 134/integrated memory controller unit(s) 132, alternative embodiments may use any number of known techniques for interconnecting such units. For example, coherency may be maintained between one or more cache units 142A-N and cores 54A-N.
In some embodiments, one or more of the cores 54A-N are capable of multi-threading. The system agent unit 134 includes those components coordinating and operating cores 54A-N. The system agent unit 134 may include, for example, a power control unit (PCU) and a display unit. The PCU may be or may include logic and components used to regulate the power state of the cores 54A-N and the integrated graphics logic 136. The display unit is used to drive one or more externally connected displays.
The cores 54A-N may be homogenous or heterogeneous in terms of architecture instruction set. That is, two or more of the cores 54A-N may be capable of execution of the same instruction set, while others may be capable of executing only a subset of a single instruction set or a different instruction set.
Referring now to
The optional nature of an additional processor 130B is denoted in
The memory 158 may be, for example, dynamic random-access memory (DRAM), phase change memory (PCM), or a combination thereof. For at least one embodiment, the controller hub 152 communicates with the processor(s) 130A, 130B via a multi-drop bus, such as a frontside bus (FSB), point-to-point interface such as QuickPath Interconnect (QPI), or similar connection 162.
In one embodiment, the coprocessor 160 is a special-purpose processor, such as, for example, a high-throughput MIC processor, a network or communication processor, a compression engine, a graphics processor, a GPGPU, an embedded processor, or the like. In an embodiment, the controller hub 152 may include an integrated graphics accelerator.
There can be a variety of differences between the physical resources of the processors 130A, 130B in terms of a spectrum of metrics of merit including architectural, microarchitectural, thermal, power consumption characteristics, and the like.
In some embodiments, the processor 130A executes instructions that control data processing operations of a general type. Embedded within the instructions may be coprocessor instructions. The processor 130A recognizes these coprocessor instructions as being of a type that should be executed by the attached coprocessor 160. Accordingly, the processor 130A issues these coprocessor instructions (or control signals representing coprocessor instructions) on a coprocessor bus or other interconnect, to the coprocessor 160. The coprocessor 160 accepts and executes the received coprocessor instructions.
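The routing of coprocessor instructions described above can be sketched as a simple dispatch loop. This is a minimal illustration, assuming instructions are modeled as (opcode, operand) pairs; the opcode names, the `run` function, and the queue standing in for the coprocessor bus are all hypothetical.

```python
from collections import deque

COPROCESSOR_OPCODES = {"cp_matmul", "cp_compress"}  # hypothetical opcodes


def run(instructions, coprocessor_bus):
    """Execute general instructions locally and forward coprocessor ones."""
    executed_locally = []
    for opcode, operand in instructions:
        if opcode in COPROCESSOR_OPCODES:
            # Recognized as a coprocessor instruction: issue it on the bus.
            coprocessor_bus.append((opcode, operand))
        else:
            executed_locally.append((opcode, operand))
    return executed_locally


bus = deque()  # stands in for the coprocessor bus or other interconnect
program = [("add", 1), ("cp_matmul", 2), ("load", 3)]
local = run(program, bus)
```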
Referring now to
Processors 172 and 174 are shown including integrated memory controller (IMC) units 178 and 180, respectively. The processor 172 also includes point-to-point (P-P) interfaces 182 and 184 as part of its bus controller units. Similarly, the processor 174 includes P-P interfaces 186 and 188. The processors 172, 174 may exchange information via a point-to-point interface 190 using P-P interfaces 184, 188. As shown in
Processors 172, 174 may each exchange information with a chipset 194 via individual P-P interfaces 196, 198 using point-to-point interfaces 182, 200, 186, 202. Chipset 194 may optionally exchange information with the coprocessor 176 via a high-performance interface 204. In an embodiment, the coprocessor 176 is a special-purpose processor, such as, for example, a high-throughput MIC processor, a network or communication processor, a compression engine, a graphics processor, a GPGPU, an embedded processor, or the like.
A shared cache (not shown) may be included in either processor 172 or 174 or outside of both processors 172 or 174 that is connected with the processors 172, 174 via respective P-P interconnects such that either or both processors' local cache information may be stored in the shared cache if a respective processor is placed into a low power mode.
The chipset 194 may be coupled to a first bus 206 via an interface 208. In an embodiment, the first bus 206 may be a Peripheral Component Interconnect (PCI) bus or a bus such as a PCI Express bus or another third generation I/O interconnect bus, although the scope of the present disclosure is not so limited.
As shown in
Referring now to
Referring now to
Embodiments of the mechanisms disclosed herein may be implemented in hardware, software, firmware, or a combination of such implementation approaches. Embodiments of the disclosure may be implemented as computer programs and/or program code executing on programmable systems including at least one processor, a storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
Program code, such as data 224 illustrated in
The program code may be implemented in a high-level procedural or object-oriented programming language to communicate with a processing system. The program code may also be implemented in an assembly language or in a machine language. In fact, the mechanisms described herein are not limited in scope to any particular programming language. In any case, the language may be a compiled language or an interpreted language.
One or more aspects of at least one embodiment may be implemented by representative instructions stored on a machine-readable medium that represents various logic within the processor that, when read by a machine, causes the machine to fabricate logic to perform the techniques described herein. Such representations, known as “IP cores,” may be stored on a tangible, machine readable medium and supplied to various customers or manufacturing facilities to load into the fabrication machines that make the logic or processor.
Such machine-readable storage media may include, without limitation, non-transitory, tangible arrangements of articles manufactured or formed by a machine or device, including storage media such as hard disks, any other type of disk including floppy disks, optical disks, compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), phase change memory (PCM), magnetic cards, optical cards, or any other type of media suitable for storing electronic instructions.
Accordingly, embodiments of the disclosure include non-transitory, tangible machine-readable media containing instructions or containing design data, such as designs in Hardware Description Language (HDL) that may define structures, circuits, apparatuses, processors and/or system features described herein. Such embodiments may also be referred to as program products.
In some cases, an instruction converter may be used to convert an instruction from a source instruction set to a target instruction set. For example, the instruction converter may translate (e.g., using static binary translation, dynamic binary translation including dynamic compilation), morph, emulate, or otherwise convert instructions to one or more other instructions to be processed by the core. The instruction converter may be implemented in software, hardware, firmware, or a combination thereof. The instruction converter may be implemented on processor, off processor, or part on and part off processor.
Similarly,
To avoid such issues, interrupts 297 may be aggregated into clusters where the processor/SoC may place cores 298 in sleep states for longer durations without waking.
Although a processor/SoC utilizing the timing diagram 320 may cluster interrupts 297, the scheme relies upon creation and transmission of a dedicated waken signal that is transmitted to and received by the interrupt controller 296 over some connection using some protocol (e.g., peripheral component interconnect express (PCIe)). Thus, such processors/SoCs may require that the specific messaging mechanism (e.g., PCIe) exist in the design of the processor/SoC, putting extraneous requirements on the processor/SoC beyond some planned implementations.
To accommodate the aperiodic nature of the interrupts 297, the interrupts 297 may be organized into membership groups. The membership groups may include an association of interrupts that frequently occur together. The release of an interrupt 297 may be held until a release that is at least partially dependent upon receipt of other interrupts 297 in the membership group.
The interrupt controller 296 receives an interrupt request 300 (block 414). The interrupt controller 296 determines whether the interrupt 297 is associated with a membership group (block 416). If the interrupt 297 is not associated with the membership group, the interrupt controller 296 may transmit the interrupt 297 without further delay (block 418). For example, critical interrupts (e.g., time-critical interrupts) may be prevented from being associated with membership groups to ensure that critical interrupts are transmitted without additional delay due to holding the interrupt.
If the interrupt 297 is associated with a membership group, the interrupt controller 296 determines whether a threshold number of interrupts 297 in the membership group has been exceeded by receipt of the interrupt 297 (block 426). For example, in some embodiments, the threshold may be set such that all held interrupts 297 in the membership group are released only after a certain number (e.g., 2, 3, 4, or all) of interrupts have been requested for the membership group. For example, the number of interrupts 297 for the threshold may be determined using heuristics/empirical testing. If the threshold has not been exceeded, the interrupt controller 296 holds the interrupt 297 (block 428). For instance, the interrupt request 300 may remain asserted until the interrupt 297 is transmitted and/or processed by the respective core 298 after receiving another interrupt request 300 for the membership group.
If the threshold has been surpassed, the interrupt controller 296 transmits all held interrupts 297 in the membership group to one or more respective core(s) 298 (block 430). The interrupt controller 296 waits until the core(s) 298 have serviced any of the transmitted interrupts 297 (block 432). Upon processing of an interrupt, the interrupt controller 296 clears a respective interrupt request 300 of the membership group (block 434). For example, the interrupt controller 296 (and/or the core(s) 298) may transmit a message to the peripheral 299 requesting the interrupt 297 and/or may clear a latch latching the respective interrupt request 300.
Once all of the interrupt requests 300 of the membership group have been cleared (block 436), the interrupt controller 296 may restart the interrupt clustering for the membership group. For instance, the interrupt controller 296 may generate a detect restart signal clearing an interrupt release control and initiating control logic to cluster a next set of interrupt requests 300 in the membership group (block 438).
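The hold-and-release flow of blocks 414 through 438 can be sketched as a small software model. This is only an illustration under stated assumptions: the class name `ClusteringController` and the `deliver` callback are hypothetical, the threshold is treated as met once the number of held requests reaches it, and member requests that arrive while a release is in progress are assumed to pass through without being held.

```python
class ClusteringController:
    """Toy model of the hold/release flow for one membership group."""

    def __init__(self, members, threshold, deliver):
        self.members = set(members)  # interrupt IDs belonging to the group
        self.threshold = threshold   # held requests needed to trigger a release
        self.deliver = deliver       # hypothetical callback: send interrupt to a core
        self.held = []               # requests held back, in arrival order
        self.in_flight = set()       # released requests not yet cleared
        self.releasing = False       # models the interrupt release control

    def request(self, irq):                       # block 414
        if irq not in self.members:               # block 416
            self.deliver(irq)                     # block 418: no added delay
            return
        if self.releasing:                        # assumption: pass through
            self.in_flight.add(irq)
            self.deliver(irq)
            return
        if irq not in self.held:
            self.held.append(irq)                 # block 428: hold
        if len(self.held) >= self.threshold:      # block 426
            self.releasing = True
            for pending in self.held:             # block 430: release all held
                self.in_flight.add(pending)
                self.deliver(pending)
            self.held.clear()

    def serviced(self, irq):                      # blocks 432 and 434
        self.in_flight.discard(irq)
        if self.releasing and not self.in_flight:  # block 436
            self.releasing = False                # block 438: restart clustering


sent = []
ctrl = ClusteringController(members={1, 2, 3}, threshold=3, deliver=sent.append)
ctrl.request(9)               # not a member: transmitted immediately
ctrl.request(1)
ctrl.request(2)
sent_before_release = list(sent)
ctrl.request(3)               # third member request: held interrupts released
for irq in (1, 2, 3):
    ctrl.serviced(irq)        # clearing the last request restarts clustering
```

In this sketch, only the non-member interrupt reaches a core before the third member request arrives, and the group returns to holding once all three requests have been cleared.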
At time 468, the interrupt controller 296 receives the first interrupt request 300. At time 470, the interrupt controller 296 receives the second interrupt request 300. At time 472, the interrupt controller 296 receives the ith interrupt request 300. In the illustrated embodiment, all of the illustrated interrupts are in a same membership group. Furthermore, as illustrated, the ith interrupt request 300 causes the threshold for the membership group to be surpassed. Due to this threshold being surpassed at time 474, the release interrupt requests signal is asserted causing the first, second, and ith interrupts 297 (along with any other interrupts 297 in the membership group) to be transmitted from the interrupt controller 296. At time 476, the core(s) 298 completes servicing of the first interrupt 297, and the first interrupt request 300 is cleared. At time 478, the core(s) 298 completes servicing of the second interrupt 297, and the second interrupt request 300 is cleared. At time 480, the core(s) 298 completes servicing of the ith interrupt 297, and the ith interrupt request 300 is cleared. At time 482, the detect restart signal is asserted since all held interrupts 297 of the membership group have been cleared. At time 484, the release interrupt requests signal is deasserted based on the rising and/or falling edge of the detect restart signal. Due to the deassertion of the release interrupt requests signal, future incoming interrupts 297 corresponding to the membership group are held until the threshold has again been surpassed.
Each interrupt request 300 is transmitted to a respective OR gate 508 of the detection circuitry 502. The OR gate 508 may be used to enable/disable the detection in the detection circuitry 502 based on a disable signal 510, individually referenced as disable signals 510A, 510B, and 510C. The disable signals 510 are used to indicate whether the respective interrupt 297 is associated with the membership group handled by the clustering circuitry 500. The disable signal 510 may be controlled based on values in the register used to track member interrupts of the membership group. When a disable signal 510 is asserted, the respective OR gate 508A, 508B, or 508C outputs a high value regardless of the respective interrupt request 300, so that the interrupt request 300 makes no change at an AND gate 512 and does not cause the AND gate 512 to keep interrupts 297 held. Thus, the clustering circuitry 500 may be programmable within the interrupt controller 296 using respective disable signals 510 to control which interrupt requests 300 are included in the membership group. Upon each input to the AND gate 512 transitioning high via a respective received interrupt request 300 or an assertion of the respective disable signal 510, the AND gate 512 asserts a detection signal 514 indicating that all of the associated interrupts for the membership group have been requested. In some embodiments, the AND gate 512 may be replaced by a counting mechanism that counts a number of interrupt requests until a threshold is reached, after which the detection signal 514 is output. The detection signal 514 is transmitted to a pulse generator 516 that outputs the detect restart signal 518 upon receipt of an indication that the respective interrupts 297 have been processed (e.g., upon receipt of a clock signal after the indication).
The detection signal 514 is also transmitted to a one-detect circuit 520 that asserts a release interrupt requests signal 522 after a pulse is transmitted out of the AND gate 512 until the detect restart signal 518 resets the one-detect circuit 520.
The holding circuitry 503 for the membership group receives the release interrupt requests signal 522 at respective OR gates 524A, 524B, and 524C for each respective interrupt 297. The OR gates 524 also receive the respective disable signals 510. The respective interrupt 297 is held at a respective AND gate 526 (e.g., AND gate 526A, 526B, or 526C) until an output of the respective OR gate 524 transitions high. In other words, if the respective interrupt 297 is not included in the membership group (e.g., is a critical interrupt) or the detection circuitry 502 has indicated that held interrupts 297 are to be released, the AND gate 526 transmits the respective interrupt request 300 to a respective latch 528 to transmit the respective interrupt to the respective core(s) 298 (e.g., on a next clock cycle).
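The detection and holding logic can be sketched at the gate level with Boolean functions. This is a minimal model, assuming one Boolean per interrupt line and a disable signal that is high for lines outside the membership group; the function and class names are hypothetical, clocking and the pulse generator 516 are omitted, and the counting variant is only one possible reading of the counting mechanism mentioned above.

```python
def detection(requests, disables):
    """OR gates 508 feeding AND gate 512: high once every member has requested."""
    return all(req or dis for req, dis in zip(requests, disables))


def detection_by_count(requests, disables, threshold):
    """Assumed counting alternative: count member requests against a threshold."""
    count = sum(1 for req, dis in zip(requests, disables) if req and not dis)
    return count >= threshold


class OneDetect:
    """Models one-detect circuit 520: set by detection, reset by detect restart."""

    def __init__(self):
        self.release = False

    def step(self, detect, restart):
        if restart:
            self.release = False
        elif detect:
            self.release = True
        return self.release


def holding(requests, disables, release):
    """OR gates 524 feeding AND gates 526: pass a request if disabled or released."""
    return [bool(req and (dis or release)) for req, dis in zip(requests, disables)]


disables = [False, False, True]  # lines A and B are members; line C is not
latch = OneDetect()

reqs = [True, False, True]       # A and C request
rel = latch.step(detection(reqs, disables), restart=False)
first = holding(reqs, disables, rel)    # A is held, C passes straight through

reqs = [True, True, False]       # B arrives, so all members have requested
rel = latch.step(detection(reqs, disables), restart=False)
second = holding(reqs, disables, rel)   # A and B are released together

reqs = [False, False, False]     # requests cleared, detect restart asserted
rel_after_restart = latch.step(detection(reqs, disables), restart=True)
```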
As may be appreciated in light of the foregoing disclosure, in wireless, mobility, and IoT product spaces, power consumption may be a major attribute that enables product differentiation between manufacturers. To this point, the interrupt clustering herein may provide enhanced power efficiency in at least such devices including tablets, handhelds, laptops, and wearable devices, where power consumption may be critical. This is true because the interrupt clustering discussed herein reduces non-critical system activity to reduce power consumption. By combining as many interrupt events as possible and servicing those interrupts at the same time, system activity is reduced as opposed to halting current activity, switching to service a single interrupt, and then resuming with the current activity, as illustrated in
Specifically, in wireless and mobile devices, there may typically be a mixture of peripherals 299 which trigger interrupt requests 300 in the SoC. The interrupts 297 may have varying degrees of latency in which each interrupt 297 is to be serviced before information is lost if the interrupt 297 is not serviced in time. For example, in cell phone chip applications, there are low-latency interrupt sources (such as video, graphics, display, and the like) and high-latency interrupt sources (such as voice (microphone), keypad, touchpad, sensors, camera, and the like). Low-latency interrupt sources may be less than ideal candidates for interrupt clustering. However, high-latency interrupt sources may benefit more from interrupt clustering. For example, in voice processing on the cell phone chip, sampling (e.g., 8 kHz pulse-code modulation (PCM)) may be used. The tenths of milliseconds delay before a next voice sample is to be processed means that it may be unnecessary to interrupt the endpoint before the next voice sample arrives.
Additionally, in the cell phone chip applications, some endpoints may consume information from the interrupt sources using processing endpoints to process information, storage endpoints (e.g., flash, SD cards, SDRAM, USB, and the like) to retain the information, and/or radio endpoints (e.g., transmitter and/or receiver) to relay the information. Even though the endpoint may have multiple channels, the endpoint may be optimized for performing one action at a time. For instance, a flash NAND chip may have only one datapath I/O and stores only one channel at a time.
Endpoint activity may be used to cluster interrupt requests. Furthermore, endpoint activity (e.g., heuristics) may be used to control an external event indication to trigger the release of the clustered interrupts 297. As an example, in the case when the core 298 is the endpoint, a measure of the core 298 activity may be represented as a hardware signal. A comparator block compares the activity measurement against a programmed threshold. When the CPU activity drops below the threshold, the comparator generates a release signal to the interrupt controller 296. In such cases, the high-latency interrupts may be serviced later when the activity level of the core 298 drops below the threshold.
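As a non-limiting illustration, the comparator-based release described above may be sketched as follows. This Python model is an assumption for discussion only: the function name release_on_low_activity, the activity units, and the sample values are hypothetical, and an actual embodiment would use a hardware comparator block rather than software.

```python
def release_on_low_activity(activity_samples, threshold):
    """Return the index of the first sample below the threshold, or None.

    Each sample models one reading of the endpoint activity measurement.
    The returned index models the point at which the comparator would
    generate the release signal to the interrupt controller.
    """
    for index, activity in enumerate(activity_samples):
        if activity < threshold:
            return index
    return None


# Hypothetical core activity readings (percent busy) and threshold.
samples = [80, 75, 60, 35, 20]
release_index = release_on_low_activity(samples, threshold=40)

# If activity never drops below the threshold, no release is generated
# by this comparator alone.
never_released = release_on_low_activity([90, 85, 95], threshold=40)
```

The second case, in which activity never drops below the threshold, is one reason a time-based fail-safe may accompany an activity-based release.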
Similarly, in the case of storage and radio endpoints, activity to those endpoints may be monitored in a hardware block. High-latency interrupts that also use those endpoints may be clustered as a group and released when activity to those endpoints has dropped below a programmed threshold.
An alternative metric (e.g., heuristics) that may be used to determine the external event trigger release is prioritization of tasks by the endpoint. Priorities may be assigned to tasks (e.g., core processes, storage memory block transfers, radio packets) being performed by the endpoint. Clustered interrupts may be delayed from release until the higher priority task is completed when the clustered interrupts are related to the higher priority task being serviced. For example, a voice packet is to be processed and stored to SDRAM with video information retrieved from the same storage device. The audio interrupt may be clustered with other interrupts that also utilize access to the SDRAM until the higher priority video information retrieval is completed. Other suitable external events may be used to control release of lower priority interrupts.
Since the clustered interrupts do not have infinite latency before being processed, the external event trigger release may have a fail-safe mechanism to prevent loss of information. For a set of interrupts 297 clustered as a group, the minimum latency may be determined by the interrupt that has the lowest latency requirement. Additionally or alternatively, the minimum latency may be derived from empirical testing. The external event trigger release logic would track the minimum latency and release the clustered interrupts 297 if the expected heuristic event has not finished before the minimum latency time.
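As a non-limiting illustration, the fail-safe described above may be sketched as follows. The function names, the interrupt source names, and the latency budgets in this Python sketch are hypothetical values chosen for illustration and are not taken from any particular embodiment.

```python
def group_minimum_latency(latency_budgets):
    """Return the tightest latency budget among the clustered interrupts."""
    return min(latency_budgets.values())


def should_release(elapsed, event_seen, minimum_latency):
    """Release on the external event or once the fail-safe deadline is reached."""
    return event_seen or elapsed >= minimum_latency


# Hypothetical per-interrupt latency budgets, in microseconds.
budgets = {"microphone": 125, "keypad": 10_000, "sensor": 2_000}
deadline = group_minimum_latency(budgets)

held_before_deadline = should_release(50, False, deadline)
released_by_event = should_release(50, True, deadline)
released_by_failsafe = should_release(125, False, deadline)
```

Because the group is released together, the interrupt with the tightest budget sets the deadline for every interrupt clustered with it.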
At time 620, the first interrupt request 300 is received, but the first interrupt 297 is held as part of the external event group since the external event has not occurred (and the minimum latency has not been reached). At time 622, the second interrupt request 300 is received, but the second interrupt 297 is held as part of the external event group since the external event has not occurred (and the minimum latency has not been reached). At time 624, the ith interrupt request 300 is received, but the ith interrupt 297 is held as part of the external event group since the external event has not occurred (and the minimum latency has not been reached). At time 626, an indication of the external event is received. At time 628, the held interrupts 297 are transmitted to the respective core(s) 298. The held interrupts 297 may be released based on a rising edge of the indication, a falling edge of the indication, or after some propagation delay after either edge.
The first interrupt 297 is serviced at time 630, and the first interrupt request 300 is cleared. The second interrupt 297 is serviced at time 632, and the second interrupt request 300 is cleared. The ith interrupt 297 is serviced at time 634, and the ith interrupt request 300 is cleared. Based on all the held interrupts 297 being cleared, the detect restart signal 518 is pulsed at time 636. Any interrupts received after the detect restart signal 518 may be delayed until a next external event or until the minimum latency duration has elapsed.
As illustrated, the clustering circuitry 650 includes detection circuitry 651 and the holding circuitry 503. The detection circuitry 651 is used to control the holding circuitry 503 to ensure that the interrupt requests 300 are held until the external event and/or minimum latency occurs. The clustering circuitry 650 differs from the clustering circuitry 500 in that the detection circuitry 651 includes respective AND gates 652A, 652B, and 652C each configured to receive a respective interrupt request 300. When any AND gate 652 receives a respective assertion of a respective interrupt request 300 and a respective enable signal 654, the AND gate 652 outputs an assertion to a respective OR gate 656A, 656B, or 656C. The assertion propagates through the respective OR gate 656 to release trigger circuitry 658. The propagation of the assertion to the release trigger circuitry 658 indicates that at least one interrupt 297 may be held until an indication of an external event 660 is received. Upon receipt of the indication of the external event 660, the release interrupt requests signal 522 is toggled.
In some embodiments, the release trigger circuitry 658 may include tracking circuitry 662 to determine how long the interrupt has been held. For example, the tracking circuitry 662 may include a counter circuit configured to count a number of cycles (e.g., minimum latency) since the assertion was propagated from the OR gate 656. In some embodiments, the duration determined by the tracking circuitry 662 may be transmitted to a comparator 664 that compares the duration to a threshold duration. The threshold duration may be set for the external event group based on heuristics/empirical testing or other suitable techniques. If the threshold has been reached, the release trigger circuitry 658 may assert the release interrupt requests signal 522 even if the indication of the external event 660 has not been received.
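As a non-limiting illustration, the interaction of the external event 660, the tracking circuitry 662, and the comparator 664 may be modeled one clock cycle at a time. The following Python sketch is a behavioral model under stated assumptions: the class name ReleaseTrigger and its method names are hypothetical, an external event that arrives while no interrupt is held is assumed to have no effect, and the restart method stands in for the effect of the detect restart signal 518.

```python
class ReleaseTrigger:
    """Cycle-level behavioral model of release trigger circuitry 658."""

    def __init__(self, threshold_cycles):
        self.threshold_cycles = threshold_cycles
        self.held_cycles = 0   # models the count kept by tracking circuitry 662
        self.release = False   # models release interrupt requests signal 522

    def step(self, any_pending, external_event):
        """Advance one clock cycle and return the release signal."""
        if self.release:
            # Release stays asserted until the model is restarted.
            return True
        if any_pending:
            self.held_cycles += 1
            # Models comparator 664 checking the held duration.
            timed_out = self.held_cycles >= self.threshold_cycles
            if external_event or timed_out:
                self.release = True
        return self.release

    def restart(self):
        """Return to the initial state, as after detect restart signal 518."""
        self.held_cycles = 0
        self.release = False


trigger = ReleaseTrigger(threshold_cycles=4)

# The external event arrives on the third cycle, before the threshold.
by_event = [trigger.step(True, event) for event in (False, False, True)]

# After a restart, no external event arrives, so the threshold releases.
trigger.restart()
by_timeout = [trigger.step(True, False) for _ in range(5)]
```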
Also, once the detection circuitry 651 has determined that all held interrupts 297 are cleared, a detection signal 666 may be toggled to generate the detect restart signal 518 via the pulse generator 516. For instance, the detection signal 666 may transition from high to low after all of the interrupt requests 300 are cleared for the external event group. The assertion of the detect restart signal 518 may cause the release interrupt requests signal 522 to be deasserted and may place the release trigger circuitry 658 back in an initial state prepared to hold interrupt requests 300 until a next pulse is received on the indication of the external event 660. It should be noted that any aspect of the clustering circuitry 650 (e.g., the tracking circuitry 662 and the comparator 664) may be incorporated in the clustering circuitry 500 of
The holding circuitry 503 functions as discussed above in relation to the
In addition to or alternative to membership groups and/or external event groups, interrupts may be organized into timing groups where counters/timers may be started when an interrupt request 300 corresponding to the timing group has been received. Once the timing group counter/timer has begun, any incoming interrupt requests 300 for the timing group may be held until the counter/timer reaches a target value (e.g., 0) where the corresponding interrupts 297 are transmitted to respective endpoints. Once the interrupts 297 are transmitted and/or serviced, the counter/timer may be reset and/or set in an indeterminate state but without running the counter/timer until a next interrupt in the group is received. As may be appreciated, the timing groups may be a group of interrupts 297 that typically occur within a certain time window. The size of the timing window may be set using empirical testing or other suitable mechanisms. The duration of the counter/timer may be set to this time window with an optional additional cushion. Additionally or alternatively, the duration may be tested empirically to check for enhanced efficiency without causing unintended issues with timing. The timing group may be used to release interrupts within a time window of a first received interrupt request 300 for the group without perpetually running a periodic timer that may be used to release held interrupts as the periodic timer may consume power unnecessarily when no interrupts have been received.
However, if the timer/counter is already running, the counter/timer need not be started. While the counter/timer is running (or delayed when unready), the respective interrupt request 300 is held by the interrupt controller 296 (block 710). The interrupt controller 296 determines whether the counter/timer has reached a threshold (block 712). For example, the interrupt controller 296 may utilize a comparator to determine whether the counter/timer has reached a value (e.g., 0 when counting down or n when counting up). If the counter/timer has not reached the threshold, the interrupt controller 296 may determine whether other interrupt requests 300 in the timing group have been received (block 714). If other interrupt requests 300 are received while the timer is running, the interrupt controller 296 holds the other interrupt requests 300 (block 716).
Once the counter/timer has reached the threshold, the interrupt controller 296 transmits the held interrupt requests 300 to the endpoints, such as the core 298 (block 718). The interrupt controller 296 waits until the core(s) 298 have serviced any of the transmitted interrupts 297 for the timing group (block 720). Upon processing of an interrupt 297, the interrupt controller 296 (or other part of the processor/SoC) clears a respective interrupt request 300 of the timing group (block 722). For example, the interrupt controller 296 may transmit a message to the peripheral 299 that requested the interrupt 297 and/or may clear a latch latching the respective interrupt request 300.
Once all of the interrupt requests 300 of the timing group have been cleared (block 724), the interrupt controller 296 may restart the clustering of the timing group. For instance, the interrupt controller 296 may generate the detect restart signal 518 clearing an interrupt release control and initiating control logic to cluster a next set of interrupt requests 300 in the timing group (block 726). Additionally, in readying for the next set of interrupt requests 300 in the timing group, the interrupt controller 296 may utilize the detect restart signal 518 to reset the counter/timer (block 728).
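As a non-limiting illustration, the timing group flow of blocks 710 through 728 may be sketched as a small state machine. The following Python model is an assumption-based sketch: the class name TimingGroup, its method names, and the down-counting scheme are hypothetical, and the handling of a request that arrives between release and restart (holding it and starting a new window at restart) is an assumption not specified above.

```python
class TimingGroup:
    """Behavioral model of one timing group in the interrupt controller."""

    def __init__(self, window_cycles):
        self.window_cycles = window_cycles
        self.remaining = None     # None models an idle (not running) counter
        self.held = []            # requests held during the window
        self.outstanding = set()  # released interrupts not yet cleared

    def request(self, irq):
        """Blocks 710 and 716: hold the request; start the counter if idle."""
        if self.remaining is None:
            self.remaining = self.window_cycles
        self.held.append(irq)

    def tick(self):
        """Blocks 712 and 718: count down and release at the threshold (0)."""
        if self.remaining is None or self.remaining == 0:
            return []
        self.remaining -= 1
        if self.remaining > 0:
            return []
        released, self.held = self.held, []
        self.outstanding.update(released)
        return released

    def clear(self, irq):
        """Blocks 722 through 728: clear a request; restart when all are cleared."""
        self.outstanding.discard(irq)
        if self.outstanding:
            return False
        # Assumption: a request that arrived after release starts a new window.
        self.remaining = self.window_cycles if self.held else None
        return True  # models a pulse on the detect restart signal


group = TimingGroup(window_cycles=3)
group.request("irq1")          # the first request starts the counter
first = group.tick()
group.request("irq2")          # held; the counter keeps running
second = group.tick()
third = group.tick()           # the threshold is reached; both are released
restart_after_one = group.clear("irq1")
restart_after_all = group.clear("irq2")
```

In this model the counter runs only between the first request of a window and the release, which reflects the stated goal of avoiding a perpetually running periodic timer.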
At time 762, the first interrupt request 300 is received at the interrupt controller 296. The first interrupt request 300 is held, and the interrupt controller 296 toggles the counter enable signal and starts the counter/timer. At time 764, the second interrupt request 300 is received at the interrupt controller 296. Since the counter/timer is already running, the counter/timer keeps running while the second interrupt request 300 is held until the counter/timer has reached the threshold. At time 766, the ith interrupt request 300 is received. Since the counter/timer is already running, the counter/timer keeps running while the ith interrupt request 300 is held until the counter/timer has reached the threshold. At time 768, the counter/timer reaches the threshold. After reaching the threshold (e.g., with or without a propagation delay), the first, second, and ith interrupt requests 300 that have been held during the counting are transmitted as respective first, second, and ith interrupts 297. At time 770, the first interrupt 297 has been serviced and the respective interrupt request 300 has been cleared. At time 772, the second interrupt 297 has been serviced and the respective interrupt request 300 has been cleared. At time 774, the ith interrupt 297 has been serviced and the respective interrupt request 300 has been cleared. After all of the interrupts 297 have been serviced, the counter is disabled and reset at time 776. This reset and disable of the counter is based at least in part on a rising edge or a falling edge of a pulse of the detect restart signal 518.
As previously discussed, the processor/SoC may have multiple timing groups.
At time 796, the first interrupt request 300 is received at the interrupt controller 296. Since the first interrupt request 300 is configured as belonging to the core A timing group 778 and the counter/timer of the core A timing group 778 has not already started, the counter/timer of the core A timing group 778 is started. The first interrupt request 300 is also held in the interrupt controller 296 until the counter/timer has crossed a threshold. At time 798, the ith+2 interrupt request 300 is received at the same interrupt controller 296 or another interrupt controller 296 of the processor/SoC. Since the ith+2 interrupt request 300 is configured as belonging to the core B timing group 780 and the counter/timer of the core B timing group 780 has not already started, the counter/timer of the core B timing group 780 is started. The ith+2 interrupt request 300 is also held in the interrupt controller 296 until the counter/timer has crossed a threshold. At time 800, the ith interrupt request 300 and the ith+j interrupt request 300 are received. Since both corresponding counter/timers are already running, the ith interrupt request 300 and the ith+j interrupt request 300 are each held until the already running respective counter/timers reach respective thresholds (e.g., determined using heuristics). At time 802, the counter/timer of the core A timing group 778 has reached the threshold, and the first interrupt 297 and the ith interrupt 297 are released/transmitted from the interrupt controller 296. At time 804, the ith+1 interrupt request 300 is received. Since the corresponding counter/timer is already running, the ith+1 interrupt request 300 is held until the already running counter/timer reaches a threshold. At time 806, the counter/timer of the core B timing group 780 has reached its threshold, and the ith+1, ith+2, and ith+j interrupts 297 are released/transmitted from the interrupt controller 296.
The thresholds for the core A timing group 778 and the core B timing group 780 may be the same or different from each other. At time 808, a new cycle of the core A timing group 778 begins with a receipt of the second interrupt request 300. As indicated in the timing diagram 777, each of the timing groups 778 and 780 may operate independently with different counters/timers and interrupt requests 300 operating independently.
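As a non-limiting illustration, the independent operation of multiple timing groups may be sketched by computing a release time for each interrupt request from its arrival time and the window of its group. The following Python sketch rests on several assumptions: the function name release_schedule, the numeric arrival times, and the window durations are hypothetical, and servicing time is ignored so that a group is treated as ready for a new window as soon as its previous window has expired.

```python
def release_schedule(arrivals, windows):
    """Map each interrupt to the time at which its timing group releases it.

    arrivals: list of (time, irq, group) tuples.
    windows:  dict mapping each group to its counter/timer duration.
    """
    deadline = {}   # current release deadline per group
    schedule = {}
    for time, irq, group in sorted(arrivals):
        # The first request of a new window starts that group's counter.
        if group not in deadline or time >= deadline[group]:
            deadline[group] = time + windows[group]
        # Every request in the window is released at the shared deadline.
        schedule[irq] = deadline[group]
    return schedule


# Hypothetical windows and arrival times (arbitrary time units).
windows = {"core A": 6, "core B": 8}
arrivals = [
    (0, "first", "core A"),
    (2, "ith+2", "core B"),
    (4, "ith", "core A"),
    (4, "ith+j", "core B"),
    (7, "ith+1", "core B"),
    (12, "second", "core A"),   # starts a new cycle of the core A group
]
schedule = release_schedule(arrivals, windows)
```

In this sketch, a request arriving in one group never changes the deadline of the other group, which corresponds to each timing group having its own counter/timer.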
As illustrated, the clustering circuitry 820 includes detection circuitry 822 and the holding circuitry 503. The detection circuitry 822 is used to control the holding circuitry 503 to ensure that the interrupt requests 300 are held until the counter/timer meets the corresponding threshold. The clustering circuitry 820 differs from the clustering circuitry 500 in that the detection circuitry 822 includes the respective AND gates 652 each configured to receive a respective interrupt request 300. When any AND gate 652 receives a respective assertion of a respective interrupt request 300 and a respective enable signal 654, the AND gate 652 outputs a pulse to an OR gate 656. The pulse propagates through the OR gate 656 to a latch 824 that latches the assertion as a counter enable 826 upon receipt of a first interrupt request 300 after assertion of the detect restart signal 518. The counter enable 826 causes a timing group counter/timer 828 to begin counting and outputting a timing count 830 to a comparator 832. The comparator 832 compares the timing count 830 to a threshold level (e.g., 0) after which the comparator triggers the release interrupt requests signal 522.
Also, once the detection circuitry 822 has determined that all held interrupts 297 are cleared, the detection signal 666 may be toggled to generate the detect restart signal 518 via the pulse generator 516. For instance, the detection signal 666 may transition from high to low after all of the interrupt requests 300 are cleared for the timing group. The assertion of the detect restart signal 518 may cause the release interrupt requests signal 522 to be deasserted and may reset the timing group counter/timer 828 and/or the comparator 832.
The holding circuitry 503 functions as discussed above in relation to the
First, second, and ith may imply assigned number, priority order, or other nomenclature in various embodiments of the disclosure. For example, the priority order may control which interrupts 297 are serviced first when the clustered interrupts 297 are received together. First, second, and ith indexes as used herein may correspond to any number (e.g., 2, 3, 4, 10s, 100s, and the like) of interrupts, interrupt requests, and accompanying circuitry and signals.
As may be appreciated, the disclosure herein uses particular circuits for discussion with logic high and logic low values. However, some embodiments of the disclosure may include inverse logic with substitute logic elements. For example, AND gates may be replaced with NAND gates to produce inverted logic, signals may be inverted using inverting amplifiers, or the like.
While the embodiments set forth in the present disclosure may be susceptible to various modifications and alternative forms, specific embodiments have been shown by way of example in the drawings and have been described in detail herein. However, it should be understood that the disclosure is not intended to be limited to the particular forms disclosed. The disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the disclosure as defined by the following appended claims. For instance, some embodiments of the processor/SoC disclosed herein may utilize a combination of the grouping mechanisms (e.g., membership groups, external event groups, and/or timing groups) to perform interrupt clustering.
The techniques presented and claimed herein are referenced and applied to material objects and concrete examples of a practical nature that demonstrably improve the present technical field and, as such, are not abstract, intangible or purely theoretical. Further, if any claims appended to the end of this specification contain one or more elements designated as “means for [perform]ing [a function] . . . ” or “step for [perform]ing [a function] . . . ”, it is intended that such elements are to be interpreted under 35 U.S.C. 112(f). However, for any claims containing elements designated in any other manner, it is intended that such elements are not to be interpreted under 35 U.S.C. 112(f).
A first set of example embodiments including:
A system comprising: an interrupt controller comprising: an input terminal configured to receive an interrupt request; an output terminal configured to output an interrupt based on the interrupt request; detection circuitry configured to detect whether a threshold number of interrupt requests have been received by the interrupt controller for a membership group; and holding circuitry configured to hold release of the interrupt until the threshold number of interrupt requests has been received by the interrupt controller.
The system of example embodiment 1, comprising a system on a chip having a peripheral device, wherein the input terminal is configured to receive the interrupt request from the peripheral device.
The system of example embodiment 2, wherein the output terminal is configured to transmit the interrupt to a processor of the system on a chip.
The system of example embodiment 1, wherein the interrupt controller is configured to receive a disable signal based on a register tracking which interrupts of a plurality of interrupts are associated with the membership group, wherein the disable signal causes the respective interrupt to bypass holding in the holding circuitry regardless of whether the threshold number of interrupt requests has been received.
The system of example embodiment 4, wherein the respective interrupt comprises a time-critical interrupt.
The system of example embodiment 5, wherein membership group interrupts of the plurality of interrupts comprise non-time-critical interrupts.
The system of example embodiment 1, wherein the threshold number corresponds to all of the interrupts associated with the membership group having been requested.
The system of example embodiment 1, wherein the detection circuitry is configured to generate a reset detect signal once all held interrupt requests are serviced.
The system of example embodiment 8, wherein the detection circuitry is configured to reset detection of interrupt clustering in the membership group based at least in part on the reset detect signal.
A method comprising: receiving an interrupt request at an interrupt controller, wherein the interrupt request requests an interrupt to be transmitted to an endpoint; determining that the interrupt is associated with a membership group; holding transmission of the interrupt until a threshold number of interrupt requests have been received by the interrupt controller; determining that the threshold number of interrupt requests have been received by the interrupt controller; and based on the determination that the threshold number of interrupt requests have been received, transmitting all held interrupts for the membership group from the interrupt controller.
The method of example embodiment 10, wherein the interrupt request is generated in a system on a chip that includes the interrupt controller, and the endpoint is within the system on a chip.
The method of example embodiment 10 comprising: receiving an additional interrupt that is not in the membership group; and bypassing holding of the additional interrupt regardless of whether the threshold number of interrupt requests have been received.
The method of example embodiment 10 comprising: servicing the interrupt; and based at least in part on servicing the interrupt request, clearing the interrupt request.
The method of example embodiment 13 comprising: determining that all held interrupts for the membership group have been cleared; and based at least in part on all held interrupts for the membership group having been cleared, generating a detect restart signal.
The method of example embodiment 14 comprising, based on the detect restart signal, resetting detection circuitry configured to perform the operation of determining that the threshold number of interrupt requests have been received by the interrupt controller.
A system on a chip comprising:
one or more peripheral devices configured to generate interrupt requests to transmit interrupts; a programmable interrupt controller configured to:
track which interrupt requests are part of a membership group;
cluster interrupts that are part of the membership group, wherein clustering interrupts comprises holding interrupts until a threshold number of interrupt requests have been received; and
not cluster interrupts that are not part of the membership group; and
one or more processor cores configured to service the interrupts transmitted from the programmable interrupt controller.
The system on a chip of example embodiment 16 comprising a register used to send enable or disable signals to the programmable interrupt controller that uses the enable or disable signals to track which interrupt requests are part of the membership group.
The system on a chip of example embodiment 17, wherein the programmable interrupt controller comprises: two or more OR gates each configured to receive a respective interrupt requests and a respective enable or disable signal; and an AND gate configured to receive outputs of the two or more OR gates, wherein an interrupt release signal configured to release held interrupts is based at least in part on an output of the AND gate.
The system on a chip of example embodiment 18, wherein the programmable interrupt controller comprises a pulse generator configured to utilize the output of the AND gate to generate a detect restart signal indicating that all interrupts have been serviced.
The system on a chip of example embodiment 19, wherein the programmable interrupt controller is configured to reset interrupt clustering based at least in part on the detect restart signal.
A second set of embodiments including:
A system comprising: an interrupt controller comprising: input terminals configured to receive interrupt requests; output terminals configured to transmit interrupts based on the respective interrupt request; detection circuitry configured to detect whether any interrupt requests of an external event group have been received and whether an external event has occurred for the external event group of the interrupts; and holding circuitry configured to hold release of the interrupts of the external event group and to release the held interrupts upon receipt of the external event at the interrupt controller.
The system of example embodiment 1 comprising: additional detection circuitry configured to detect whether any interrupt requests of an additional external event group have been held and whether an additional external event has occurred for the additional external event group of the interrupts; and additional holding circuitry configured to hold release of the interrupts of the additional external event group and to release the additional held interrupts upon receipt of the external event at the interrupt controller.
The system of example embodiment 1, wherein the input terminals are configured to receive the interrupt requests from peripheral devices, and the output terminals are configured to transmit the interrupts to one or more endpoints.
The system of example embodiment 1, wherein the interrupt controller is configured to transmit the interrupts to processor endpoints configured to process information, to storage endpoints configured to store information, or radio endpoints used to transmit or receive information.
The system of example embodiment 1, wherein the interrupt controller is configured to enable interrupt clustering based at least in part on an activity level of an endpoint of the one or more endpoints exceeding a threshold.
The system of example embodiment 5, wherein the external event comprises an indication that the activity level of the endpoint has crossed to below the threshold.
The system of example embodiment 1, wherein the interrupt controller is configured to receive a disable signal based on a register tracking which interrupts of the interrupts are associated with the external event group, wherein the disable signal causes the respective interrupt to bypass holding in the holding circuitry regardless of whether the external event has been received.
The system of example embodiment 7, wherein the respective interrupt comprises a low-latency interrupt.
The system of example embodiment 1 comprising a counter configured to count a minimum latency after which the interrupt controller is configured to transmit the held interrupts regardless of whether the external event has been received.
The system of example embodiment 9 comprising a comparator configured to compare a count from the counter to the minimum latency to determine when to transmit the held interrupts based on the minimum latency.
The system of example embodiment 1, wherein the detection circuitry is configured to generate a reset detect signal once held interrupt requests are serviced.
The system of example embodiment 11, wherein the detection circuitry is configured to reset release trigger circuitry to reset interrupt clustering in the external event group based at least in part on the reset detect signal until a subsequent external event occurs while holding a subsequently received interrupt request for the external event group.
A method comprising: receiving an interrupt request at an interrupt controller, wherein the interrupt request requests an interrupt to be transmitted to an endpoint; determining that the interrupt is associated with an external event group; holding transmission of the interrupt until an external event has been received by the interrupt controller; after holding the transmission of the interrupt, determining that the external event has been received by the interrupt controller; and based on the determination that the external event has been received after holding the transmission of the interrupt, transmitting held interrupts for the external event group from the interrupt controller.
The method of example embodiment 13 comprising: servicing the interrupt; and based at least in part on servicing the interrupt request, clearing the interrupt request.
The method of example embodiment 14 comprising: determining that the held interrupts for the external event group have been cleared; and based at least in part on the held interrupts for the external event group having been cleared, generating a detect restart signal.
The method of example embodiment 15 comprising, based on the detect restart signal, resetting detection circuitry configured to perform the operation of determining that the external event has been received by the interrupt controller.
A system comprising: one or more peripheral devices configured to generate interrupt requests to transmit interrupts; a programmable interrupt controller configured to:
track which interrupt requests are part of an external event group;
cluster interrupts that are part of the external event group, wherein clustering interrupts comprises holding interrupts until a threshold number of interrupt requests have been received; and
not cluster interrupts that are not part of the external event group; and
one or more endpoints configured to service the interrupts transmitted from the programmable interrupt controller.
The system of example embodiment 17, wherein the interrupts have a priority scheme with the clustered interrupts having a priority level lower than non-clustered interrupts.
The system of example embodiment 18, wherein at least one interrupt that is not clustered comprises a high priority interrupt, and the external event comprises an indication that one of the non-clustered interrupts has been serviced.
The system of example embodiment 19, wherein a first interrupt of the clustered interrupts is associated with audio information from a storage device, and a second interrupt of the non-clustered interrupts is associated with video information from the storage device, and the first interrupt is held until the second interrupt is serviced.
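By way of non-limiting illustration, the priority arrangement of example embodiments 18 through 20 may be sketched as follows, with the servicing of a non-clustered video interrupt acting as the external event that releases a held audio interrupt. The identifiers AUDIO_IRQ and VIDEO_IRQ, the class name, and the method names are hypothetical.

```python
AUDIO_IRQ = "audio"  # hypothetical clustered, lower-priority interrupt
VIDEO_IRQ = "video"  # hypothetical non-clustered, higher-priority interrupt


class PriorityClusteringController:
    """Hold clustered interrupts until a chosen non-clustered one is serviced."""

    def __init__(self, clustered, release_on):
        self.clustered = set(clustered)
        self.release_on = release_on  # servicing this interrupt is the external event
        self.held = []                # clustered interrupts awaiting release
        self.delivered = []           # interrupts sent to the endpoint, in order

    def request(self, irq):
        if irq in self.clustered:
            self.held.append(irq)       # lower priority: hold
        else:
            self.delivered.append(irq)  # higher priority: deliver at once

    def serviced(self, irq):
        """Report that the endpoint has finished servicing irq."""
        if irq == self.release_on:
            self.delivered.extend(self.held)  # external event: release held interrupts
            self.held.clear()


ctrl = PriorityClusteringController(clustered={AUDIO_IRQ}, release_on=VIDEO_IRQ)
ctrl.request(AUDIO_IRQ)   # held
ctrl.request(VIDEO_IRQ)   # delivered immediately
ctrl.serviced(VIDEO_IRQ)  # audio interrupt is now released
```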
A third set of embodiments including:
A system comprising: an interrupt controller comprising: input terminals configured to receive a plurality of interrupt requests; an output terminal configured to output a plurality of interrupts based on the interrupt requests; detection circuitry configured to detect that an interrupt request of the plurality of interrupt requests has been received, to start a counter for a timing group based on receiving the interrupt request; and holding circuitry configured to hold release of an interrupt of the plurality of interrupts corresponding to the interrupt request until the counter reaches a threshold value.
The system of example embodiment 1, wherein the holding circuitry is configured to hold a subsequent interrupt of the timing group when a corresponding interrupt request of the plurality of interrupt requests is received after starting the counter but before the counter has reached the threshold value.
The system of example embodiment 1 comprising: additional detection circuitry configured to detect that an additional interrupt request of the plurality of interrupt requests has been received, to start an additional counter for an additional timing group based on receiving the additional interrupt request; and additional holding circuitry configured to hold release of an additional interrupt of the plurality of interrupts corresponding to the additional interrupt request until the additional counter reaches an additional threshold value.
The system of example embodiment 3, wherein the timing group corresponds to interrupts serviced using a first processor core, and the additional timing group corresponds to interrupts serviced using a second processor core.
The system of example embodiment 3, wherein the threshold value is equal to the additional threshold value.
The system of example embodiment 4, wherein the threshold value is 0.
The system of example embodiment 1, wherein the interrupt controller is configured to receive a disable signal based on a register tracking which interrupts of a plurality of interrupts are associated with the timing group, wherein the disable signal causes corresponding interrupts to bypass holding in the holding circuitry regardless of a status of the counter.
The system of example embodiment 7, wherein the corresponding interrupts comprise time-critical interrupts.
The system of example embodiment 1, wherein the detection circuitry is configured to generate a reset detect signal once held interrupt requests are serviced, and the detection circuitry is configured to reset the counter based at least in part on the reset detect signal.
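By way of non-limiting illustration, the counter-based holding of example embodiments 1, 2, 7, and 9 may be modeled as sketched below. The sketch assumes that the counter advances once per call to tick() and that a request arriving after the threshold is reached but before the counter is reset is released without delay; neither detail is specified above. All names are hypothetical.

```python
class TimingGroupController:
    """Behavioral model of one timing group with a hold counter."""

    def __init__(self, threshold, group_members):
        self.threshold = threshold
        self.group_members = set(group_members)
        self.bypass = set()   # members exempted from holding by a disable signal
        self.counter = None   # None means the counter is idle
        self.held = []        # interrupts being held
        self.pending = set()  # released group interrupts not yet serviced
        self.released = []    # interrupts output by the controller, in order

    def disable(self, irq):
        """Model the disable signal: irq bypasses holding from now on."""
        self.bypass.add(irq)

    def request(self, irq):
        if irq not in self.group_members or irq in self.bypass:
            self.released.append(irq)  # bypass regardless of counter status
            return
        self.held.append(irq)
        if self.counter is None:
            self.counter = 0           # first request starts the counter
        self._release_if_due()

    def tick(self):
        """Advance the counter by one cycle while it is running."""
        if self.counter is not None and self.counter < self.threshold:
            self.counter += 1
            self._release_if_due()

    def serviced(self, irq):
        """Return True when the reset detect condition resets the counter."""
        if irq not in self.pending:
            return False
        self.pending.remove(irq)
        if self.pending:
            return False
        self.counter = None            # all released interrupts serviced
        return True

    def _release_if_due(self):
        if self.counter >= self.threshold:
            self.released.extend(self.held)
            self.pending.update(self.held)
            self.held.clear()
```

With a threshold of 3, a first request starts the counter, a second request received one cycle later is held with it, and both are released together on the third cycle. With a threshold of 0, a request is released as soon as it is received.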
A method comprising: receiving an interrupt request at an interrupt controller, wherein the interrupt request requests an interrupt to be transmitted to an endpoint; determining that the interrupt is associated with a timing group; starting a counter corresponding to the timing group based at least in part on receiving the interrupt; holding transmission of the interrupt until the counter reaches a threshold number; and based on the counter reaching the threshold number, transmitting the held interrupts for the timing group from the interrupt controller.
The method of example embodiment 10 comprising: receiving a subsequent interrupt that is associated with the timing group while the counter is running but before the counter reaches the threshold number; and holding transmission of the subsequent interrupt until the counter reaches the threshold number.
The method of example embodiment 10 comprising: receiving an additional interrupt that is not in the timing group; and bypassing holding of the additional interrupt regardless of a status of the counter.
The method of example embodiment 10 comprising: servicing the interrupt; and based at least in part on servicing the interrupt, clearing the interrupt request.
The method of example embodiment 13 comprising: determining that held interrupts for the timing group have been cleared; and based at least in part on the held interrupts for the timing group having been cleared, generating a detect restart signal.
The method of example embodiment 14 comprising, based on the detect restart signal, resetting the counter.
The method of example embodiment 10, wherein the interrupt controller is configured to control clustering of interrupts for a plurality of timing groups including the timing group, wherein the plurality of timing groups correspond to different processor cores to service the respective interrupts of the plurality of timing groups.
The method of example embodiment 10, wherein the interrupt controller is configured to control clustering of interrupts for a plurality of timing groups including the timing group, wherein allocation of the interrupts to the plurality of timing groups is user-programmable.
The method of example embodiment 10, wherein interrupts allocated to the timing group are typically performed within the threshold number of cycles of each other.
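By way of non-limiting illustration, the use of several user-programmable timing groups, each associated with a different processor core, may be sketched as a calculation of when each interrupt is transmitted. The sketch assumes that time is measured in counter cycles and that held interrupts are serviced as soon as they are transmitted, so that the next request of a group starts a new count. The function name, the interrupt names, and the group names are hypothetical.

```python
def release_times(requests, allocation, thresholds):
    """Return (irq, release_time) pairs for a list of (time, irq) requests.

    allocation maps an interrupt to its timing group (user-programmable);
    thresholds maps a timing group to its threshold number of cycles.
    """
    window_end = {}  # per group: cycle at which the running counter expires
    out = []
    for t, irq in sorted(requests):
        group = allocation.get(irq)
        if group is None:
            out.append((irq, t))  # not in any timing group: no holding
            continue
        if group not in window_end or t > window_end[group]:
            # First request since the last release starts a new count.
            window_end[group] = t + thresholds[group]
        out.append((irq, window_end[group]))
    return out


allocation = {"uart": "core0", "spi": "core0", "gpio": "core1"}
thresholds = {"core0": 10, "core1": 4}
requests = [(0, "uart"), (3, "spi"), (5, "gpio"), (6, "timer"), (20, "uart")]
schedule = release_times(requests, allocation, thresholds)
```

In this example the three requests allocated to the core0 group are transmitted at two distinct times rather than three, so the corresponding core is woken twice instead of three times.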
A system comprising: one or more peripheral devices configured to generate interrupt requests to transmit interrupts; a programmable interrupt controller configured to:
track which interrupt requests are part of a timing group;
cluster interrupts that are part of the timing group, wherein clustering interrupts comprises starting a counter based on receipt of a first interrupt of the timing group and holding interrupts while the counter is running until the counter has reached a threshold number; and
not cluster interrupts that are not part of the timing group; and
one or more processor cores configured to service the interrupts transmitted from the programmable interrupt controller.
The system of example embodiment 19 comprising a register used to send enable or disable signals to the programmable interrupt controller that uses the enable or disable signals to track which interrupt requests are part of the timing group.
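By way of non-limiting illustration, the register of example embodiment 20 may be modeled as a bit mask in which each bit indicates whether the correspondingly numbered interrupt is part of the timing group. The one-bit-per-interrupt layout, the 32-bit width, and all names below are assumptions made for illustration.

```python
class GroupEnableRegister:
    """Bit mask tracking which interrupts belong to a timing group."""

    def __init__(self, width=32):
        self.width = width  # assumed number of interrupt lines
        self.value = 0      # bit n set: interrupt n is part of the group

    def _bit(self, irq):
        if not 0 <= irq < self.width:
            raise ValueError("interrupt number out of range")
        return 1 << irq

    def enable(self, irq):
        """Model the enable signal: add irq to the timing group."""
        self.value |= self._bit(irq)

    def disable(self, irq):
        """Model the disable signal: remove irq from the timing group."""
        self.value &= ~self._bit(irq)

    def in_group(self, irq):
        return bool(self.value & self._bit(irq))


def route(register, irq):
    """Decide whether a request is held for clustering or bypasses holding."""
    return "hold" if register.in_group(irq) else "bypass"


reg = GroupEnableRegister()
reg.enable(2)
reg.enable(5)
decisions = [route(reg, irq) for irq in (2, 3, 5)]
```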