The technical field of this invention is interprocessor communications.
In a multiprocessor system, a large amount of metadata, such as logical-to-physical address mappings and interprocessor communication data, is shared by multiple processor cores and may require more registers than are available in the processors. Handling this may require context saving of the registers, which creates problems in real-time operation.
A scratchpad register bank is implemented with a remap function that allows rapid saving and loading of the processor's registers into the temporary registers. The remap function allows remapping of the registers during both the store and load operations.
These and other aspects of this invention are illustrated in the drawings, in which:
RISC architectures have instructions (XIN/XOUT) to load/store registers or entire register files into scratch memory shared between cores in a single cycle. This operation also ensures data consistency of the scratch memory by assigning access priorities to the multiple cores that share it. The shift/remap function specifies an offset for the scratchpad register banks. The offset is currently expressed in 32-bit registers, but it can be byte or bit based depending on the implementation tradeoffs. With this enhancement, one can do a load/store from any register array to any register array in scratchpad memory or scratchpad registers, with one cycle of overhead for specifying the offset.
Effectively, one can load or store 16 registers (depending on data bus width) in 2+2 cycles, compared to 18+17 cycles from the internal memory of the core. This is a significant saving in the context of hard real-time requirements. The overhead of loading metadata keys for binary search is 2/32 cycles (6%) compared to 56% above. Storing the metadata's key information separately eases binary search operations, because RISC cores are register limited. This invention allows full utilization of the register scratch banks and helps to save auxiliary metadata in a different bank at some register index, effectively maintaining a linked list in the register banks and allowing real-time processors access to any of the registers reserved for processing.
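The separation of keys from auxiliary metadata described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the two Python lists stand in for two scratchpad register banks, the name `lookup` is hypothetical, and the convention that the auxiliary entry sits at the same index as its key is an assumption made for illustration, not something the text specifies.

```python
from bisect import bisect_left


def lookup(key_bank, aux_bank, key):
    """Binary-search the sorted key bank; return the auxiliary entry or None.

    Assumption: aux_bank[i] holds the metadata that belongs to key_bank[i].
    """
    i = bisect_left(key_bank, key)
    if i < len(key_bank) and key_bank[i] == key:
        return aux_bank[i]
    return None


# Hypothetical contents: sorted keys in one bank, metadata in another.
key_bank = [3, 8, 15, 42]
aux_bank = ["meta-a", "meta-b", "meta-c", "meta-d"]

found = lookup(key_bank, aux_bank, 15)
missing = lookup(key_bank, aux_bank, 9)
```

Only the compact key bank has to be resident in processor registers during the search; the auxiliary bank is touched once, after the index is known.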
In the real-time core, context save/restore overhead can be reduced to 2+2 cycles, compared to 35 cycles. This is a significant reduction, considering that the most critical tasks of a real-time core need to be invoked every 500 cycles. The overhead shown above is reduced from 7% to 0.8%.
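The percentages quoted above follow directly from the cycle counts. The short calculation below is a sketch that assumes the 35-cycle figure is the 18+17 cycle internal-memory path mentioned earlier; the function and constant names are hypothetical.

```python
def overhead_percent(cycles_spent, task_period):
    """Share of one task period consumed by context save/restore, in percent."""
    return 100.0 * cycles_spent / task_period


TASK_PERIOD = 500          # cycles between task invocations, from the text
BASELINE_CYCLES = 18 + 17  # save + restore through internal memory (35 cycles)
REMAPPED_CYCLES = 2 + 2    # save + restore through the scratchpad with remap

baseline = overhead_percent(BASELINE_CYCLES, TASK_PERIOD)
remapped = overhead_percent(REMAPPED_CYCLES, TASK_PERIOD)
print(f"baseline: {baseline:.1f}%  remapped: {remapped:.1f}%")
```

With these inputs the calculation reproduces the 7% and 0.8% figures stated in the text.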
As shown in
Scratchpad register bank 104 is further connected to register bank 106 in processor 107 through multiplexer/barrel shifter 105. Block 105 is operable to shift the registers with wrap around, both during writes from processor 107 to the scratchpad register bank 104, and reads by processor 107 from scratchpad register bank 104.
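The shift with wrap around performed by block 105 can be modeled in software as modular index arithmetic. The following is a minimal sketch, not a description of the hardware: the function names are hypothetical, and it assumes that the same offset is added to the register index on both the write path and the read path.

```python
def shift_with_wrap(index, offset, size):
    """Map a processor register index to a bank index, wrapping at the bank size."""
    return (index + offset) % size


def write_through_shifter(bank, regs, offset):
    """Write every processor register into the bank at its shifted position."""
    for i, value in enumerate(regs):
        bank[shift_with_wrap(i, offset, len(bank))] = value


def read_through_shifter(bank, offset):
    """Read the bank back so that entry i comes from its shifted position."""
    return [bank[shift_with_wrap(i, offset, len(bank))] for i in range(len(bank))]


# An 8-register toy bank keeps the rotation easy to follow.
regs = list(range(8))
bank = [None] * 8
write_through_shifter(bank, regs, offset=3)
restored = read_through_shifter(bank, offset=3)
```

In this model, reading with the offset that was used for writing returns the registers to their original positions, while a different read offset yields a rotated view of the same bank.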
Scratchpad register bank 104 may be implemented as multiple banks for added flexibility. One implementation shown in
Multiplexer/barrel shifters 103 and 105 may be implemented as a multiplexer, a barrel shifter or a crossbar switch, or any combination of these.
As an example, we can use register R1.b0 to hold the address remap offset.
Assuming each scratchpad register bank has 30 registers, R0:R29:
MOV R1.b0, 4 // Specify an offset of 4
XOUT 20, R4, 16 // Store R4:R7 to R8:R11 in scratchpad register bank 20. XOUT destination: 4+4=8
MOV R1.b0, 9 // Specify an offset of 9
XOUT 21, R25, 20 // Store R25:R29 to R4:R8 in scratchpad register bank 21. XOUT destination wraps around: 25+9-30=4
MOV R1.b0, 10 // Specify an offset of 10
XIN 21, R4, 12 // Load R14:R16 of scratchpad register bank 21 to R4:R6. XIN source: 4+10=14
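The three transfers in this example can be traced with a small software model. It is a sketch under several assumptions: the `Scratchpad` class and `run_example` function are hypothetical, the processor is modeled with 30 registers like the banks, the third operand is taken to be a byte count (so 16, 20 and 12 bytes move four, five and three 32-bit registers), and each transfer goes to the bank named in its instruction. The initial register and bank contents are arbitrary markers chosen so the moves are visible.

```python
NUM_REGS = 30       # each bank holds R0:R29, per the text
BYTES_PER_REG = 4   # 32-bit registers


class Scratchpad:
    """Toy model of scratchpad banks addressed through a remap offset."""

    def __init__(self, bank_ids):
        # Bank b starts with marker values b*100 + register index.
        self.banks = {b: [b * 100 + i for i in range(NUM_REGS)] for b in bank_ids}

    def xout(self, bank, regs, start, nbytes, offset):
        """Store regs[start...] into the bank at (index + offset) mod NUM_REGS."""
        for i in range(nbytes // BYTES_PER_REG):
            src = start + i
            self.banks[bank][(src + offset) % NUM_REGS] = regs[src]

    def xin(self, bank, regs, start, nbytes, offset):
        """Load regs[start...] from the bank at (index + offset) mod NUM_REGS."""
        for i in range(nbytes // BYTES_PER_REG):
            dst = start + i
            regs[dst] = self.banks[bank][(dst + offset) % NUM_REGS]


def run_example():
    regs = [1000 + i for i in range(NUM_REGS)]  # Rn initially holds 1000 + n
    pad = Scratchpad([20, 21])
    pad.xout(20, regs, start=4, nbytes=16, offset=4)    # R4:R7   -> bank 20 R8:R11
    pad.xout(21, regs, start=25, nbytes=20, offset=9)   # R25:R29 -> bank 21 R4:R8
    pad.xin(21, regs, start=4, nbytes=12, offset=10)    # bank 21 R14:R16 -> R4:R6
    return regs, pad


regs, pad = run_example()
```

After the run, bank 20 entries 8 through 11 hold the original R4:R7, bank 21 entries 4 through 8 hold the original R25:R29, and R4:R6 hold whatever bank 21 contained at entries 14 through 16.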
This application claims priority under 35 U.S.C. 119(e)(1) to Provisional Application No. 61/709,373 filed 4 Oct. 2012.
Number | Date | Country
---|---|---
20140101383 A1 | Apr 2014 | US

Number | Date | Country
---|---|---
61709373 | Oct 2012 | US