Efficiency of a microprocessor increases with the number of instructions executed over multiple execution cycles. Executable program code, however, typically includes branches that can change the sequence of execution of instructions in the executable program code. As a result, in pipelined microprocessors, a branch represents a control hazard because the branch can cause the loss of processor cycles regardless of the type of branch (e.g., an unconditional branch or a conditional branch).
An approach to avoid losing processor cycles due to a branch is to use branch prediction with a branch target buffer. In that approach, a microprocessor predicts whether the branch is to be taken or not to be taken. Using such a prediction, the microprocessor obtains a speculative branch target address for the next instruction fetch. In cases where the branch prediction is correct, the microprocessor continues operation without loss of processor cycles. In cases where the branch prediction is incorrect, the microprocessor flushes the instruction that was speculatively fetched using the branch target address and fetches the appropriate instruction using an adjusted address that is correct. Flushing the speculatively fetched instructions causes the loss of processor cycles. The number of lost processor cycles per branch misprediction quantifies a branch misprediction penalty in the pipelined microprocessor.
Although there are various branch prediction mechanisms, branch misprediction may not be avoided. Therefore, improved technologies for the reduction of branch misprediction penalty may be desired.
This disclosure addresses the issue of branch misprediction penalty, providing branch mechanisms that can reduce or avoid branch misprediction penalty.
According to an embodiment, the disclosure provides a multi-thread microprocessor. The multi-thread microprocessor includes first stage circuitry that fetches a pair of consecutive instructions of a program executed in a thread of multiple threads. The multi-thread microprocessor also includes second stage circuitry that determines, during a clock cycle, that a first instruction in the pair of consecutive instructions is a branch instruction. The first stage circuitry fetches, during a second clock cycle after the clock cycle, a pair of branch target instructions of the program using a branch prediction, where the second clock cycle follows the clock cycle without interruption. The multi-thread microprocessor further includes third stage circuitry that determines that the branch prediction is a misprediction during the second clock cycle. The first stage circuitry sends the second instruction to the second stage circuitry during a third clock cycle after the second clock cycle, where the third clock cycle follows the second clock cycle without interruption. In addition, the second stage circuitry decodes the second instruction during the third clock cycle.
According to another embodiment, the disclosure provides another multi-thread microprocessor. The multi-thread microprocessor includes first stage circuitry that determines, during a clock cycle, that a first instruction of a program executed in a thread of multiple threads is a branch instruction. The multi-thread microprocessor also includes second stage circuitry that fetches, during a second clock cycle after the clock cycle, a pair of branch target instructions of the program using a branch prediction, where the second clock cycle follows the clock cycle without interruption. The multi-thread microprocessor further includes third stage circuitry that determines that the branch prediction is a misprediction during the second clock cycle. The first stage circuitry decodes a first instruction of the pair of branch target instructions during a third clock cycle after the second clock cycle, where the third clock cycle follows the second clock cycle without interruption. The second stage circuitry fetches, during a fourth clock cycle after the third clock cycle, a pair of consecutive instructions of the program. The second stage circuitry sends an instruction of the pair of consecutive instructions to the first stage circuitry during a fifth clock cycle after the fourth clock cycle, where the fifth clock cycle follows the fourth clock cycle without interruption. The first stage circuitry decodes the instruction of the pair of consecutive instructions during the fifth clock cycle.
According to yet another embodiment, the disclosure provides a microcontroller unit. The microcontroller unit comprises a multi-thread microprocessor, including first stage circuitry that fetches a pair of consecutive instructions of a program executed in a thread. The microcontroller unit also includes second stage circuitry that determines, during a clock cycle, that a first instruction in the pair of consecutive instructions is a branch instruction. The first stage circuitry fetches, during a second clock cycle after the clock cycle, a pair of branch target instructions of the program using a branch prediction, where the second clock cycle follows the clock cycle without interruption. The microcontroller unit also includes third stage circuitry that determines that the branch prediction is a misprediction during the second clock cycle. The first stage circuitry sends the second instruction to the second stage circuitry during a third clock cycle after the second clock cycle, where the third clock cycle follows the second clock cycle without interruption. The second stage circuitry decodes the second instruction during the third clock cycle.
According to still another embodiment, the disclosure provides a method. The method includes fetching, by first stage circuitry of a multi-thread microprocessor, a pair of consecutive instructions of a program executed in a thread of multiple threads. The method also includes determining, by second stage circuitry of the multi-thread microprocessor, during a clock cycle, that a first instruction in the pair of consecutive instructions is a branch instruction. The method further includes fetching, by the first stage circuitry, during a second clock cycle after the clock cycle, a pair of branch target instructions of the program using a branch prediction, where the second clock cycle follows the clock cycle without interruption. The method also includes determining, by third stage circuitry of the multi-thread microprocessor, during the second clock cycle, that the branch prediction is a misprediction. The method still further includes sending the second instruction from a fetch buffer to the second stage circuitry during a third clock cycle after the second clock cycle, where the third clock cycle follows the second clock cycle without interruption. The method also includes decoding the second instruction by the second stage circuitry during the third clock cycle.
According to a further embodiment, the disclosure provides another method. The method includes determining, by first stage circuitry of a multi-thread microprocessor, during a clock cycle, that a first instruction of a program executed in a thread of multiple threads is a branch instruction. The method also includes fetching, by second stage circuitry of the multi-thread microprocessor, during a second clock cycle after the clock cycle, a pair of branch target instructions of the program using a branch prediction, where the second clock cycle follows the clock cycle without interruption. The method further includes determining, by a third stage circuitry of the multi-thread microprocessor, during the second clock cycle, that the branch prediction is a misprediction. The method also includes decoding, by the first stage circuitry, a first instruction of the pair of branch target instructions during a third clock cycle after the second clock cycle, where the third clock cycle follows the second clock cycle without interruption. The method further includes fetching, by the second stage circuitry, during a fourth clock cycle after the third clock cycle, a pair of consecutive instructions of the program. The method further includes sending an instruction of the pair of consecutive instructions from a fetch buffer to the first stage circuitry during a fifth clock cycle after the fourth clock cycle, where the fifth clock cycle follows the fourth clock cycle without interruption. The method also includes decoding, by the first stage circuitry, the instruction of the pair of consecutive instructions during the fifth clock cycle.
There are many ways to apply the principles of this disclosure in an embodiment. The above elements and associated technical improvements of this disclosure are examples, in a simplified form, of the application of those principles. The above elements and technical improvements and other elements and technical improvements of this disclosure are clear from the following detailed description when considered in connection with the annexed drawings.
Embodiments of this disclosure address the issue of branch misprediction penalty. Branch misprediction penalty in a pipelined microprocessor indicates the number of lost processor cycles per branch misprediction. In a pipelined hardware multi-thread microprocessor, branch misprediction can occur during execution of executable program code in a particular thread of multiple threads. In this disclosure, executable program code can be referred to as a program. Branch misprediction penalty can increase with the number of stages in the pipeline of a microprocessor. Advanced microprocessors having a pipeline with more than five stages can exhibit greater branch misprediction penalty.
Embodiments of this disclosure address such an issue by providing branch mechanisms that can reduce or avoid branch misprediction penalty during execution of a program. To that end, in some embodiments, a pipelined hardware multi-thread microprocessor can fetch a doubleword instruction to obtain a pair of consecutive sequential instructions and can buffer those sequential instructions. Detection of a branch instruction in the pair of consecutive sequential instructions causes the hardware multi-thread microprocessor to fetch another doubleword instruction in order to obtain a pair of branch target instructions. The branch target instructions can be fetched using a branch prediction, and also can be buffered. In turn, a determination that the branch prediction is a misprediction causes the multi-thread microprocessor to pass the non-branch instruction in the pair of consecutive instructions to decode stage circuitry of the hardware multi-thread microprocessor. The multi-thread microprocessor then fetches a doubleword instruction to obtain a subsequent pair of consecutive instructions, continuing with execution of the program thereafter. As a result, branch misprediction penalty can be avoided.
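The recovery sequence summarized above can be pictured with a small trace generator. This is only an illustrative sketch: the function name `misprediction_recovery_trace`, the tuple layout, and the relative cycle numbering (0, 1, 2) are assumptions introduced here, and the instruction labels are placeholders rather than a description of the actual circuitry.

```python
from typing import List, Tuple

def misprediction_recovery_trace(
    pair: Tuple[str, str], targets: Tuple[str, str]
) -> List[Tuple[int, str, str]]:
    """Return (relative_cycle, stage, event) tuples for a mispredicted branch.

    pair    -- (branch instruction, next sequential instruction), fetched together
    targets -- the two speculatively fetched branch target instructions
    """
    branch, sequential = pair
    fetch_buffer = [sequential]  # the non-branch instruction stays buffered

    trace = []
    # Relative cycle 0: decode stage detects the branch.
    trace.append((0, "DEC", "decode " + branch + " (branch detected)"))
    # Relative cycle 1: speculative target fetch overlaps with branch execution.
    trace.append((1, "IF", "speculative fetch " + ", ".join(targets)))
    trace.append((1, "EX", "execute " + branch + " (misprediction)"))
    # Relative cycle 2: the buffered instruction is passed on and decoded,
    # so no additional fetch is needed before decoding resumes.
    recovered = fetch_buffer.pop(0)
    trace.append((2, "IF", "send " + recovered + " from fetch buffer"))
    trace.append((2, "DEC", "decode " + recovered))
    return trace

if __name__ == "__main__":
    for cycle, stage, event in misprediction_recovery_trace(("I3", "I4"), ("I9", "I10")):
        print(cycle, stage, event)
```

In this sketch, decoding of the sequential instruction happens in the cycle right after the misprediction is determined, because that instruction was already held in the buffer.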
Embodiments of this disclosure provide several technical improvements. For example, by reducing and, in some cases, avoiding branch misprediction penalty, embodiments of this disclosure deliver greater processing efficiency relative to conventional hardware multi-thread microprocessors. Hardware multi-thread microprocessors of this disclosure can have processing efficiency that is improved by 5% relative to those conventional microprocessors. These and other technical benefits can be realized through implementation of the embodiments described herein.
It is noted that for the sake of simplicity of explanation, the branch mechanisms of this disclosure are presented in connection with a particular thread T of multiple threads supported by the hardware multi-thread microprocessor 100. For instance, the multiple threads can include threads A and B, and T can be either A or B. Thus, in
With reference to the drawings,
The hardware multi-thread microprocessor 100 includes a five-stage pipeline having an instruction fetch (IF) stage 110, an instruction decode (DEC) stage 120, an execute (EX) stage 130, a memory access (MEM) stage 140, and a writeback (WB) stage 150. In some embodiments, the MEM stage 140 also can include execution circuitry and, thus, the MEM stage 140 represents a MEM/EX2 stage. Each of those stages is embodied in, or includes, processing circuitry.
The IF stage 110 obtains instructions to execute from a memory device. As such, the hardware multi-thread microprocessor 100 includes an instruction memory device 104 (referred to as instruction memory 104; an I-cache, for example) and multiple program counters (PCs). The multiple PCs include a first PC 102a and a second PC 102b. The IF stage 110 receives an address from a PC, such as PC 102a. The address points to a location in the instruction memory 104 that contains the instruction—a word having bits defining an opcode and operand data that constitute the instruction. In some embodiments, the word can span 32 bits. In other embodiments, the word can span 16 bits.
In some cases, the IF stage 110 can fetch a doubleword instruction defining a pair of consecutive instructions of a program executed in a particular thread of the hardware multi-thread microprocessor 100. As an example, the particular thread can be thread A. The doubleword instruction defines a first instruction and a second instruction of the program. The first and second instructions are consecutive instructions. In some embodiments, a doubleword can span 64 bits, where bits 0 to 31 constitute the low word and bits 32 to 63 constitute the high word. In other embodiments, the doubleword can span 32 bits, where bits 0 to 15 constitute the low word and bits 16 to 31 constitute the high word.
More specifically, the IF stage 110 can receive an address from a PC corresponding to the particular thread, e.g., PC 102a, and can generate a consecutive address by adding one to the received address. The IF stage 110 can then utilize the received address and the consecutive address to obtain a doubleword instruction from the instruction memory 104. The received address and the consecutive address form a doubleword address, wherein a first word of the doubleword address defines an address of the first instruction, and a second word of the doubleword address defines an address of the second instruction. Using the doubleword address, the IF stage 110 receives the doubleword instruction from the instruction memory 104. The low word of the doubleword instruction defines the first instruction and the high word of the doubleword instruction defines the second instruction.
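The address generation and word layout described above can be sketched in a few lines. The sketch assumes 32-bit words and a word-addressed instruction memory modeled as a Python list; the names `WORD_BITS`, `doubleword_address`, `fetch_doubleword`, and `split_doubleword` are hypothetical and used only for illustration.

```python
from typing import List, Tuple

WORD_BITS = 32                      # assumption: 32-bit words (16 bits is the other option mentioned)
WORD_MASK = (1 << WORD_BITS) - 1

def doubleword_address(pc: int) -> Tuple[int, int]:
    """Received address plus the consecutive address obtained by adding one."""
    return pc, pc + 1

def fetch_doubleword(instruction_memory: List[int], pc: int) -> int:
    """Read two consecutive words and pack them into one doubleword.

    The word at the received address becomes the low word (first instruction);
    the word at the consecutive address becomes the high word (second instruction).
    """
    first_addr, second_addr = doubleword_address(pc)
    low = instruction_memory[first_addr] & WORD_MASK
    high = instruction_memory[second_addr] & WORD_MASK
    return (high << WORD_BITS) | low

def split_doubleword(doubleword: int) -> Tuple[int, int]:
    """Return (first instruction, second instruction) from a doubleword."""
    first = doubleword & WORD_MASK
    second = (doubleword >> WORD_BITS) & WORD_MASK
    return first, second

if __name__ == "__main__":
    memory = [0x11111111, 0x22222222, 0x33333333]
    dw = fetch_doubleword(memory, 1)
    print(hex(dw), [hex(w) for w in split_doubleword(dw)])
```

With 16-bit words, only `WORD_BITS` would change in this sketch; the low word would then occupy bits 0 to 15 and the high word bits 16 to 31.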
The IF stage 110 also includes a fetch buffer 112 that can store a doubleword instruction that has been fetched before the constituent first instruction and second instruction are passed to the DEC stage 120. The first and second instructions are individually passed to the DEC stage 120, in respective consecutive clock cycles. It is noted that in some embodiments, instead of storing the doubleword instruction that has been fetched, the IF stage 110 can store one of constituent first or second instructions and can pass the other one of the constituent first or second instructions to the DEC stage 120. In other words, the IF stage 110 is not limited to storing an entire pair of instructions prior to passing an instruction in that pair to the DEC stage 120.
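As a rough software analogy of the two buffering options just described, the following sketch models a fetch buffer as a small queue. The class name `FetchBuffer` and its methods are assumptions made for illustration and do not correspond to named elements of the fetch buffer 112.

```python
from collections import deque

class FetchBuffer:
    """Toy model of a buffer that hands instructions to decode one per cycle."""

    def __init__(self):
        self._slots = deque()

    def store_pair(self, first, second):
        """Option 1: buffer the whole fetched pair before passing anything on."""
        self._slots.append(first)
        self._slots.append(second)

    def store_one_forward_other(self, first, second):
        """Option 2: forward the first instruction immediately, buffer only the second."""
        self._slots.append(second)
        return first

    def pass_to_decode(self):
        """Release the oldest buffered instruction, or None if the buffer is empty."""
        return self._slots.popleft() if self._slots else None

if __name__ == "__main__":
    buf = FetchBuffer()
    buf.store_pair("I1", "I2")
    print(buf.pass_to_decode())   # cycle k
    print(buf.pass_to_decode())   # cycle k + 1
```

Either option leaves the second instruction of the pair available locally, which is what allows it to be sent to decode without another memory access.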
DEC stage 120 identifies an instruction type and prepares operand data to execute. In some cases, the DEC stage 120 can determine that an instruction is a branch instruction. The branch instruction can be a conditional instruction or unconditional instruction.
EX stage 130 performs actual data operations based on the operand data received from the DEC stage 120. MEM stage 140 accesses memory if an instruction is of load type or store type. The memory address is typically determined at the EX stage 130. That memory can be embodied in a particular memory device of multiple memory devices 170. The particular memory device can be external to the hardware multi-thread microprocessor 100, in some cases. The particular memory device can be volatile memory or non-volatile memory, and can include program memory or data memory, or both.
WB stage 150 writes a result operand into a register file 180 and/or a control register within the hardware multi-thread microprocessor 100. The register file 180 can include 16, 32, or 64 registers, for example. Although a single register file 180 is shown, it is noted that the hardware multi-thread microprocessor 100 includes a register file 180 per thread T of the multiple threads supported by the hardware multi-thread microprocessor 100. The control register can pertain to a particular thread executed by the hardware multi-thread microprocessor 100. For instance, the control register can be one of a control register 166a pertaining to a first thread or a control register 166b pertaining to a second thread. The result operand can be embodied in, for example, load data from memory or executed data from the EX stage 130.
Each stage can process data during a clock cycle, which also can be referred to as stage cycle or processor cycle. The clock cycle is determined by a clock frequency f of the hardware multi-thread microprocessor 100. In one example, f can have a magnitude of 100 MHz. After being processed during a clock cycle in one stage, data can be sent from that stage to another stage down the pipeline on a next clock cycle. To that end, the hardware multi-thread microprocessor 100 includes registers functionally coupling those stages. Each one of the registers serves as an input element to the stage that receives the data. In particular, to pass data from a first stage to a second stage, the first stage writes the data to the register coupling the first and second stages during a clock cycle. The second stage then reads the data from that register during a second clock cycle immediately after the clock cycle. The register is embodied in a storage device, such as a latch, a flip flop, or similar device. As is illustrated in
The register 114, register 124, register 134, and register 144 also constitute the five-stage pipeline of the hardware multi-thread microprocessor 100. The five-stage pipeline forms a core of the hardware multi-thread microprocessor 100. Because instructions are processed in sequence, the hardware multi-thread microprocessor 100 can be referred to as an in-order issue, in-order completion pipeline.
In some embodiments, the hardware multi-thread microprocessor 100 supports two threads. In those embodiments, the multi-thread microprocessor 100 can execute two different programs concurrently within a single core by interleaving instructions. Interleaved execution allows parallel execution of two or more programs within a single core. In addition, overall execution speed can be improved because interleaved execution can hide some latency by allowing one thread to run even when the other thread is stalled. Interleaved execution also can save run time by reducing the overall stall time when both threads are stalled.
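The alternation between two programs can be illustrated with a short scheduling sketch. The function name `interleave` and the string labels are illustrative assumptions; the sketch models only the fetch order and does not model stalls or pipeline timing.

```python
from itertools import zip_longest
from typing import List, Sequence, Tuple

def interleave(thread_a: Sequence[str], thread_b: Sequence[str]) -> List[Tuple[str, str]]:
    """Return the fetch order obtained by alternating between two threads.

    Each entry is (thread name, instruction). If one program is shorter,
    the remaining instructions of the other program are issued back to back.
    """
    schedule = []
    for a, b in zip_longest(thread_a, thread_b):
        if a is not None:
            schedule.append(("A", a))
        if b is not None:
            schedule.append(("B", b))
    return schedule

if __name__ == "__main__":
    for slot, (thread, instr) in enumerate(interleave(["A1", "A2", "A3"], ["B1", "B2"])):
        print(slot, thread, instr)
```

Because consecutive instructions of one thread are separated by an instruction of the other thread, a delay affecting one thread can overlap with useful work from the other.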
As is illustrated in
The first control register 166a and second control register 166b can be written or read simultaneously by various stages, including DEC stage 120 for reading registers for multiply operations, EX stage 130 for reading register values for non-multiply operations, and WB stage 150 for writing results back to registers.
The control unit 160 allows thread A and thread B operations to occur simultaneously. This is important because the control unit 160 can simultaneously receive a request to write a particular register from the DEC stage 120 and a request to read that particular register from the EX stage 130. Likewise, there may be a request to write back a value in the WB stage 150 while there is a request to read a value in the EX stage 130. Data coherency requires that all of these reads and writes be handled concurrently, which requires that they all be on the same thread. The control unit 160 in this case provides the data value directly to the reading stage from the writing stage, simultaneously writing the new value into the required register.
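The behavior in which a value is handed directly from the writing stage to the reading stage resembles a register bypass. A minimal sketch of that idea follows, assuming a single thread's registers modeled as a list; the class `BypassRegisters` and its `cycle` method are hypothetical names, not elements of the control unit 160.

```python
from typing import Optional, Tuple

class BypassRegisters:
    """Toy register bank that resolves a same-cycle write and read of one register."""

    def __init__(self, size: int = 16):
        self.regs = [0] * size

    def cycle(self, write: Optional[Tuple[int, int]] = None,
              read_index: Optional[int] = None) -> Optional[int]:
        """Handle at most one write (index, value) and one read in the same cycle.

        If both target the same register, the reader receives the value being
        written, and the register is updated in that same cycle.
        """
        result = None
        if read_index is not None:
            if write is not None and write[0] == read_index:
                result = write[1]            # forwarded from the writing stage
            else:
                result = self.regs[read_index]
        if write is not None:
            self.regs[write[0]] = write[1]
        return result

if __name__ == "__main__":
    bank = BypassRegisters()
    bank.regs[3] = 7
    print(bank.cycle(write=(3, 42), read_index=3))
```

Without the forwarding branch, the reader in this sketch would observe the stale value for one cycle, which is the coherency problem the passage describes.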
An executable program corresponding to a thread A can have an ordered sequence of instructions {ATI1, ATI2, ATI3, ATI4, . . . }. In turn, another executable program corresponding to a thread B can have a sequence of instructions {BTI1, BTI2, BTI3, BTI4, . . . }. The instructions in those programs are executed in interleaving manner, meaning that the hardware multi-thread microprocessor 100 fetches instructions by alternating the executable programs. As is illustrated in
In some cases, a program executed in thread A can include one or several branch instructions. Similarly, another program executed in thread B also can include one or several branch instructions. Branch instructions are decoded at the DEC stage 120. Some branch instructions need a register value as a base. Also, branch conditions can be resolved at the end of the EX stage 130. Upon resolving a branch condition, the EX stage 130 can direct the control unit 160 to flush or accept branch instruction(s). If the flag values are not forwarded to the DEC stage 120, then the branch result can be found in the EX stage 130, since the earliest conditional flag result can be found at the MEM stage 140.
In response to identification of a branch instruction by the DEC stage 120, the control unit 160 can read a branch prediction from a branch predictor component 164 (referred to as branch predictor 164). In some embodiments, the branch predictor component 164 can be a flip-flop with appropriate logic. The branch prediction is a speculative assertion of whether the branch corresponding to the branch instruction is to be taken or not to be taken. The control unit 160 then causes a branch target buffer 162 to pass at least one address of respective branch target instructions to the IF stage 110. For instance, two such addresses can be passed, where each address being passed is represented by an arrow in
The instruction I3 is one clock cycle before an initial instruction in the branch corresponding to the instruction I3. It is noted that, in some cases, the initial instruction could be the only instruction in the branch. The IF stage 110 can fetch a branch target instruction speculatively during a clock cycle n0+2, while the instruction I3 is in the EX stage 130. That is, the branch target instruction can be fetched using a branch prediction as to whether the branch is taken or not taken. As an illustration, the branch target instruction is represented by instruction I9.
Execution of the instruction I3 provides an actual outcome as to whether the branch is taken or not taken. Thus, in one scenario, the EX stage 130 can determine that the branch prediction utilized to fetch the branch target instruction is a misprediction. In response, the fetched instruction I9 is cancelled due to branch prediction failure in a subsequent clock cycle n0+3. Cancellation of the instruction is represented by double-strikethrough lines in
In further response to the branch prediction being a misprediction, the IF stage 110 can fetch a sequential instruction during a clock cycle n0+4. In
Therefore, in the scenario depicted in
Because in dual-thread execution two different programs run on the same pipeline in an interleaving manner, the one clock-cycle penalty is half the penalty that is present in a single-thread microprocessor in case of branch misprediction. In other words, by executing programs in respective threads in the hardware multi-thread microprocessor 100, the branch misprediction penalty is reduced by a single cycle relative to the hardware single-thread microprocessors.
In some embodiments, the hardware multi-thread microprocessor 100 can mitigate branch penalty by using doubleword instruction fetch and the fetch buffer 112 in the IF stage 110.
For the sake of illustration, in
The particular first and second branch target instructions that are fetched can be dictated by a branch prediction. Such a fetch can thus be referred to as a speculative fetch. As an illustration, in
Accordingly, in response to determining that branch prediction has failed, instruction I4 can be passed to the DEC stage 120 in a following clock cycle n0+3. Because instruction I4 is prefetched and stored in the fetch buffer 112 (
The PC corresponding to the particular thread T can be updated accordingly. The IF stage 110 can then fetch a third doubleword instruction during a next clock cycle n0+4. The third doubleword instruction defines a third instruction and a fourth instruction of the program executed in the particular thread. For the sake of illustration, the third doubleword instruction that is fetched is shown as (I6,I5) in
The instruction I2 is one clock cycle before an initial instruction in the branch corresponding to the instruction I2. In cases when the branch includes two or more instructions, the IF stage 110 fetches a doubleword instruction defining a pair of target branch instructions including a first target branch instruction and a consecutive second target branch instruction. The doubleword instruction is fetched speculatively during a clock cycle n0+2, while the instruction I2 is in the EX stage 130. That is, the doubleword instruction can be fetched using a branch prediction. As an illustration, the first and second target branch instructions are represented by instruction I9 and instruction I10, respectively. It is noted that, in some cases, the initial instruction could be the only instruction in the branch. In those cases, the second word of the doubleword instruction can contain the target address of the target instruction in the first word of the doubleword. Other data padding also can be used in those cases.
Execution of the instruction I2 provides an actual outcome as to whether the branch is taken or not taken. Thus, in one scenario, the EX stage 130 can determine that the branch prediction utilized to fetch the doubleword instruction is a misprediction. In response, the fetched instructions I9 and I10 are cancelled due to branch prediction failure while the branch instruction I2 is in EX stage 130. Cancellation of the doubleword instruction is represented by double-strikethrough lines in
In further response to the branch prediction being a misprediction, the IF stage 110 can fetch a second doubleword instruction during a clock cycle n0+4. The second doubleword instruction defines a pair of consecutive instructions of the executable program code. In
During a clock cycle n0+5, an instruction (e.g., I3) of the pair of consecutive instructions is sent from the fetch buffer 112 (
Therefore, in the scenario depicted in
In view of the various aspects described herein, an example of the methods that can be implemented in accordance with this disclosure can be better appreciated with reference to
At act 710, first stage circuitry of the hardware multi-thread microprocessor fetches a pair of consecutive instructions of a program executed in a thread within multiple threads. The pair of instructions can be fetched using a doubleword address. A first word (e.g., low word) of the doubleword address defines an address of a first instruction in the pair of consecutive instructions. A second word (e.g., high word) of the doubleword address defines an address of a second instruction in the pair of consecutive instructions. The first stage circuitry can be embodied in the IF stage 110 (
At act 720, the first stage circuitry can store the pair of consecutive instructions in a memory device. The memory device can be included in the first stage circuitry or can be functionally coupled thereto. For instance, the memory device can be embodied in the fetch buffer 112 (
At act 730, the first stage circuitry can pass a first instruction of the pair of consecutive instructions to a second stage circuitry of the hardware multi-thread microprocessor. The first instruction can be, for example, a first-read instruction. More specifically, in an instance in which the pair of instructions is (Ik+1,Ik), the first instruction can be Ik. The second stage circuitry is referred to as decode stage circuitry and can be embodied in the DEC stage 120 (
It is noted that in some embodiments, instead of storing the pair of consecutive instructions in the memory device and then passing those instructions to the second stage circuitry, the first stage circuitry can store a first instruction in the pair of consecutive instructions and can pass a second instruction in the pair of consecutive instructions to the second stage circuitry.
At act 740, the second stage circuitry determines, during a clock cycle, that a first instruction in the pair of consecutive instructions is a branch instruction. To that end, the second stage circuitry can decode the first instruction.
At act 750, the first stage circuitry fetches, during a second clock cycle after the clock cycle, a pair of branch target instructions of the executable program code using a branch prediction. The second clock cycle follows the clock cycle without interruption. The branch prediction can be based on one of various branch predictor components, such as a 1-bit branch predictor or a 2-bit branch predictor. The pair of branch target instructions is fetched using a second doubleword address.
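For readers unfamiliar with the 2-bit branch predictor mentioned above, the following sketch shows the widely used saturating-counter form of that technique. It is a generic textbook formulation, not a description of the branch predictor 164; the class name `TwoBitPredictor` and the state encoding (0 and 1 predict not taken, 2 and 3 predict taken) are assumptions chosen for illustration.

```python
class TwoBitPredictor:
    """Generic 2-bit saturating counter branch predictor."""

    def __init__(self, state: int = 1):
        if not 0 <= state <= 3:
            raise ValueError("state must be in 0..3")
        self.state = state  # 0, 1: predict not taken; 2, 3: predict taken

    def predict(self) -> bool:
        """Return True if the branch is predicted taken."""
        return self.state >= 2

    def update(self, taken: bool) -> None:
        """Move the counter toward the actual outcome, saturating at 0 and 3."""
        if taken:
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)

if __name__ == "__main__":
    predictor = TwoBitPredictor(state=3)
    for actual in [True, False, True, False, False]:
        guess = predictor.predict()
        print("predicted", guess, "actual", actual, "mispredicted", guess != actual)
        predictor.update(actual)
```

A 1-bit predictor would flip its prediction after every misprediction; the 2-bit counter needs two consecutive mispredictions from a saturated state before its prediction changes.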
At act 760, third stage circuitry of the multi-thread microprocessor determines, during the second clock cycle, that the branch prediction is a misprediction. To that end, the third stage circuitry can execute the first instruction of the pair of consecutive instructions. The third stage circuitry is referred to as execute stage circuitry and can be embodied in the EX stage 130 (
At act 770, the first stage circuitry sends the second instruction from the memory device (e.g., fetch buffer 112 (
While not illustrated in
At act 810, first stage circuitry of the hardware multi-thread microprocessor determines, during a clock cycle, that a first instruction of a program executed in a thread within multiple threads is a branch instruction. The first instruction can be prefetched and stored in a memory device (e.g., fetch buffer 112 (
At act 820, second stage circuitry of the hardware multi-thread microprocessor fetches, during a second clock cycle after the clock cycle, a pair of branch target instructions of the executable program code using a branch prediction. The second clock cycle follows the clock cycle without interruption. The second stage circuitry is referred to as decode stage circuitry and can be embodied in the DEC stage 120 (
At act 830, the first stage circuitry fetches, during a second clock cycle after the clock cycle, a pair of branch target instructions of the program using a branch prediction. The second clock cycle follows the clock cycle without interruption. The branch prediction can be based on one of various branch predictor components, such as a 1-bit branch predictor or a 2-bit branch predictor. The pair of branch target instructions is fetched using a double-word address.
At act 840, the first stage circuitry decodes a first instruction of the pair of branch target instructions during a third clock cycle after the second clock cycle. The third clock cycle follows the second clock cycle without interruption.
At act 850, the second stage circuitry fetches, during a fourth clock cycle after the third clock cycle, a pair of consecutive instructions of the executable program code. The fourth clock cycle follows the third clock cycle without interruption. The pair of consecutive instructions is fetched using a second doubleword address. As mentioned, the first word (e.g., low word) of the doubleword address defines an address of an instruction in the pair, and a second word (high word) of the doubleword address defines an address of another instruction of the pair.
At act 860, an instruction of the pair of consecutive instructions can be sent from the fetch buffer to the first stage circuitry during a fifth clock cycle after the fourth clock cycle. The fifth clock cycle follows the fourth clock cycle without interruption. At act 870, the first stage circuitry decodes the instruction of the pair of consecutive instructions during the fifth clock cycle.
In addition to the hardware multi-thread microprocessor 100, the MCU 900 includes several memory devices. The memory devices include one or many non-volatile (NV) memory devices 910 (referred to as NV memory 910). The NV memory 910 includes program memory storing program instructions that constitute an executable program. The hardware multi-thread microprocessor 100 can execute the executable program in one or many of multiple threads. Multiple copies of the executable program need not be stored in the program memory in order to execute multiple threads of the executable program. Thus, size requirements of the program memory can be constrained. In some embodiments, the NV memory 910 also includes data memory. The NV memory 910 can include one or more of ROM, EPROM, EEPROM, flash memory, or another type of non-volatile solid-state memory.
The memory devices in the MCU 900 also include one or many volatile memory devices (referred to as volatile memory 920). The volatile memory 920 includes data memory storing data that is used for or results from execution of program instructions retained in the NV memory 910. The volatile memory 920 can include one or more of SRAM, DRAM, or another type of volatile solid-state memory.
The MCU 900 also includes several input/output (I/O) interfaces 930 that, individually or in a particular combination, permit sending data to and/or receiving data from a peripheral device. The I/O interfaces 930 can be addressed individually by the hardware multi-thread microprocessor 100. The I/O interfaces 930 can include serial ports, parallel ports, general-purpose I/O (GPIO) pins, or a combination of those.
The MCU 900 further includes a bus 940 that includes a data bus, an address bus, and a control bus. The bus 940 permits the exchange of data and/or control signals between two or more of the hardware multi-thread microprocessor 100, the NV memory 910, the volatile memory 920, and the I/O interfaces 930.
While the above disclosure has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from its scope. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the disclosure without departing from the essential scope thereof. Therefore, it is intended that the present disclosure not be limited to the particular embodiments disclosed, but will include all embodiments falling within the scope thereof.