Aspects of the present disclosure relate generally to processor systems, and in particular, to a processor system including table-based memory protection (TMP) for improved performance for accessing shared memory.
A processor may be running separate workloads or worlds that may require substantial isolation between each other for security and/or reliability purposes. That is, a particular world should not have unintentional or purposeful access to code and/or data belonging to another world, unless the code and/or data is explicitly shared among the worlds. A processor may include a physical memory protection (PMP) to check memory access requests from different worlds, and to approve or deny the requests to ensure that the worlds operate substantially independently of each other for security and/or reliability purposes.
The following presents a simplified summary of one or more implementations in order to provide a basic understanding of such implementations. This summary is not an extensive overview of all contemplated implementations, and is intended to neither identify key or critical elements of all implementations nor delineate the scope of any or all implementations. Its sole purpose is to present some concepts of one or more implementations in a simplified form as a prelude to the more detailed description that is presented later.
An aspect of the disclosure relates to a processor system. The processor system includes: a first memory including a memory protection table including memory access permission information associated with a set of one or more worlds; and a processor including: an execution core configured to run a first world; and a table-based memory protection (TMP) configured to: receive a first request to access memory content at a first target address from the first world; access the memory access permission information from the memory protection table based on the first target address; and determine whether the first world is allowed to access the memory content at the first target address based on the accessed memory access permission information.
Another aspect of the disclosure relates to a method of providing memory protection. The method includes: receiving a first request to access memory content at a target address from a first world running on a processor execution core; accessing memory access permission information from a memory protection table stored in a first memory based on the target address; and determining whether the first world is allowed to access the memory content at the target address based on the accessed memory access permission information.
To the accomplishment of the foregoing and related ends, the one or more implementations include the features hereinafter fully described and particularly pointed out in the claims. The following description and the annexed drawings set forth in detail certain illustrative aspects of the one or more implementations. These aspects are indicative, however, of but a few of the various ways in which the principles of various implementations may be employed, and the described implementations are intended to include all such aspects and their equivalents.
The detailed description set forth below, in connection with the appended drawings, is intended as a description of various configurations and is not intended to represent the only configurations in which the concepts described herein may be practiced. The detailed description includes specific details for the purpose of providing a thorough understanding of the various concepts. However, it will be apparent to those skilled in the art that these concepts may be practiced without these specific details. In some instances, well-known structures and components are shown in block diagram form in order to avoid obscuring such concepts.
The processor 110, which may be implemented as a reduced instruction set computer (RISC) version 5 (or RISC-V) hardware thread (HART) or other type, may include a cache 115, a tightly coupled memory (TCM) 120, and a set of control/status registers (CSRs) 125. In this example, the processor 110 is shown to be running three (3) workloads (separate programs), also known as Worlds 1, 2, and 3. Further, in accordance with this example, World 1 is running one or more threads under control of a real time operating system (RTOS); World 2 may be bare metal (e.g., running no application or operating system, but may have associated firmware); and World 3 may also be running one or more threads under the control of an RTOS.
For security reasons, the Worlds 1, 2, and 3 typically run substantially independently of each other, where code or software running in one world does not accidentally or purposely affect code or software running in another world. Also, for reliability purposes, the worlds run substantially independently so that if one world crashes, the crash does not affect the other worlds. The independence of the Worlds 1, 2, and 3 also extends to other components of the processor 110 and the processor system 100. For example, contents or information in the cache 115, TCM 120, and CSRs 125 may pertain to World 1 (W1), World 2 (W2), and World 3 (W3). Similarly, contents, information, and code outside of the processor 110, such as in the interrupt controller 160, memory 170, and peripheral 180, may pertain to the different worlds W1, W2, and W3.
From a privileged access perspective, the various Worlds 0, 1, 2, and 3 may include software components or code operating in various levels of access privileges. An arrow is shown in the diagram pointing in a direction of lower access privilege. In this example, the access privilege levels or modes, from highest to lowest, include a machine mode (M-mode), a supervisory mode (S-mode), and a user mode (U-mode).
In M-mode, the secure monitor of World 0 is able to access any memory/code that the various software components of the S-mode and U-mode of the Worlds 1-3 are able to access. In S-mode, the various components (e.g., threads/RTOS, bare metal, and RTOS of Worlds 1-3, respectively) may access any memory/code that the various software components of the U-mode of the Worlds 1-3 (e.g., thread of World 3) are able to access, but not all the memory/code accessible by World 0 running in M-mode. Similarly, in U-mode, the various software components of the Worlds 1-3 (e.g., thread of World 3) may access their own privileged memory/code, but not all the memory/code accessible by software components running in M- and S-modes. Thus, the processor 200 implements access privileges ranging from M-mode (e.g., where the software components are accessing machine-specific code in registers) to S-mode (e.g., where the software components are operating systems in nature), to U-mode (e.g., where the software components are user applications).
In this example, the processor 200 runs a particular world at a time, and switches between worlds under the control of the secure monitor of World 0. In M-mode, the processor 200 includes a physical memory protection (PMP), which includes a set of hardware logic comparators that may be programmed with address ranges for allowed memory access permissions for the currently-running or active world. Thus, prior to activating a particular world, the secure monitor of World 0 programs the PMP for the address ranges for allowed memory access permissions for the active world. For instance, if the secure monitor is to make World 1 active, the secure monitor programs the PMP with the address ranges for allowed memory access permissions for World 1, then makes World 1 the active running world on the processor 200. When the active World 1 requests retrieval of memory content, the request first goes to the PMP. The PMP determines whether the memory address of the requested content is within the allowed range, and if it is, the memory access request is allowed. If the PMP determines that the memory address of the requested content is not within the allowed range, the memory access request may be denied. When the secure monitor is to switch to another world (e.g., World 3), the secure monitor reprograms the PMP with the address ranges for allowed memory access permissions for World 3, and then makes World 3 the active world of the processor 200.
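As a non-limiting illustration, the program-then-activate sequence described above may be sketched in software as follows. This is only a toy model: an actual PMP uses hardware comparators, and the class name `Pmp`, the function `switch_world`, and the address ranges in `WORLD_RANGES` are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Pmp:
    """Toy model of a range-based PMP (hypothetical, software-only)."""
    # (start, end) half-open address ranges allowed for the active world.
    ranges: list = field(default_factory=list)

    def program(self, ranges):
        """Secure monitor (re)programs the allowed ranges."""
        self.ranges = list(ranges)

    def allows(self, address):
        """True if the address falls inside any programmed range."""
        return any(start <= address < end for start, end in self.ranges)


# Hypothetical per-world ranges held by the secure monitor.
WORLD_RANGES = {
    1: [(0x0000_0000, 0x0000_2000)],
    3: [(0x0000_2000, 0x0000_4000)],
}

pmp = Pmp()
active_world = None


def switch_world(world_id):
    """Program the PMP for the world first, then make it active."""
    global active_world
    pmp.program(WORLD_RANGES[world_id])
    active_world = world_id


switch_world(1)
world1_ok = pmp.allows(0x0000_1000)      # inside World 1's range
world1_denied = pmp.allows(0x0000_3000)  # outside World 1's range
switch_world(3)
world3_ok = pmp.allows(0x0000_3000)      # inside World 3's range
```

In this sketch the same address is denied while World 1 is active and allowed after the switch to World 3, because the comparators only ever hold the ranges of the active world.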
As previously discussed, there are processing entities (e.g., DMA 140, interrupt controller 160, memory 170, and peripheral 180) outside of the processor 200 that operate in accordance with the active world. To inform such processing entities of the active world, the processor 200 further includes a multiplexer (MUX) that is configured to output the identifier of the active world (WID) based on the current privilege mode (e.g., U-mode, S-mode, and M-mode). Thus, the multiplexer informs the processing entities outside of the processor 200 of the active world and the corresponding privilege level for access (e.g., U-mode WID, S-mode WID, or M-mode WID).
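The selection performed by the multiplexer may be pictured, purely as an assumption-laden sketch, as a lookup keyed by the current privilege mode. The names `Mode`, `wid_mux`, and `wid_registers`, and the particular WID values, are hypothetical; the disclosure does not specify how the per-mode WIDs are stored.

```python
from enum import Enum


class Mode(Enum):
    M = "machine"
    S = "supervisor"
    U = "user"


def wid_mux(mode, wid_by_mode):
    """Select the WID presented to outside entities for the current mode."""
    return wid_by_mode[mode]


# Hypothetical contents while World 1 is active: M-mode belongs to the
# secure monitor (World 0); S-mode and U-mode belong to World 1.
wid_registers = {Mode.M: 0, Mode.S: 1, Mode.U: 1}

wid_in_s_mode = wid_mux(Mode.S, wid_registers)
wid_in_m_mode = wid_mux(Mode.M, wid_registers)
```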
The processor 310, in turn, includes a processor execution core 312, a physical memory protection (PMP) 314, a tightly coupled memory (TCM) 316, a cache 318, and an initiator 320. The processor execution core 312 may be configured to run a set of different worlds, such as Worlds 1-3 previously discussed with regard to processor 200. The PMP 314 may be programmed with address ranges for allowed memory access permissions for the current active world running on the processor execution core 312. The active world running on the processor execution core 312 may access at least a portion of the TCM 316 if allowed by the PMP 314. Similarly, the active world running on the processor execution core 312 may access the cache 318 if allowed by the PMP 314. The active world running on the processor execution core 312 may access at least a portion of the memory/peripherals 370 via the initiator 320 if allowed by the PMP 314. Accordingly, the PMP 314 polices the memory access on behalf of the processor 310.
The debug unit 330 includes a debug processor 332 and an initiator 334. The debug processor 332 may need to access the memory/peripherals 370 via the initiator 334 for performing a number of debug operations. The DMA 350 includes a DMA execution core 352 and an initiator 354. The DMA execution core 352 may access the TCM 316 and the cache 318 of the processor 310, and also the memory/peripherals 370 via the initiator 354 for performing DMA operations. However, as discussed above, the PMP 314 provides memory protection on behalf of the processor 310, but not the debug unit 330 and the DMA 350. Thus, there may be a memory access security (or lack of security) issue with regard to the debug unit 330 and the DMA 350.
The processor 410, in turn, includes a processor execution core 412, a physical memory protection (PMP) 414, a tightly coupled memory (TCM) 416, a cache 418, and an initiator 420. The processor execution core 412 may be configured to run a set of different worlds, such as Worlds 1-3 previously discussed with reference to processor 200. The PMP 414 may be programmed with address ranges for allowed memory access permissions for the current active world running on the processor execution core 412. The active world running on the processor execution core 412 may access at least a portion of the TCM 416 if allowed by the PMP 414. Similarly, the active world running on the processor execution core 412 may access the cache 418 if allowed by the PMP 414. In this case, the active world running on the processor execution core 412 may access at least a portion of the memory/peripherals 470 via the initiator 420 if allowed by both the PMP 414 and the IOPMP 422.
Thus, because of scalability-related limitations of the PMP 414, the PMP 414 may be programmed with a coarse allowable memory range, and the IOPMP 422 may be programmed with a finer allowable memory range for the processor 410 to access the memory/peripherals 470. Within the coarse allowable memory range specified in the PMP 414, there may be some forbidden memory subranges that are specified in the IOPMP 422. Thus, even though the PMP 414 allows the memory access by the processor execution core 412 via the coarse allowable memory range, the IOPMP 422 may deny the memory access request via its finer memory range check.
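The two-stage check may be illustrated with the following minimal sketch, in which an access reaches memory only if it passes the coarse range first and the finer ranges second. The address values in `PMP_COARSE` and `IOPMP_FINE`, and the function names, are hypothetical and chosen only so that one forbidden subrange lies inside the coarse range.

```python
def in_any_range(address, ranges):
    """True if the address lies in any half-open (start, end) range."""
    return any(start <= address < end for start, end in ranges)


# Hypothetical values: one coarse PMP range, and IOPMP ranges that leave
# the subrange 0x0000_2000-0x0000_2FFF forbidden inside it.
PMP_COARSE = [(0x0000_0000, 0x0000_4000)]
IOPMP_FINE = [(0x0000_0000, 0x0000_2000), (0x0000_3000, 0x0000_4000)]


def check_access(address):
    """Apply the coarse PMP check, then the finer IOPMP check."""
    if not in_any_range(address, PMP_COARSE):
        return "denied by PMP"
    if not in_any_range(address, IOPMP_FINE):
        return "denied by IOPMP"
    return "allowed"


inside_both = check_access(0x0000_1000)
in_forbidden_subrange = check_access(0x0000_2800)
outside_coarse = check_access(0x0000_5000)
```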
The debug unit 430 includes a debug processor 432 and an initiator 434. The debug processor 432 may need to access at least a portion of the memory/peripherals 470 via the initiator 434 for performing a number of debug operations if allowed by the IOPMP 436. The DMA 450 includes a DMA execution core 452 and an initiator 454. The DMA execution core 452 may access at least a portion of the TCM 416 and the cache 418 of the processor 410 via the initiator 454 for DMA operations if allowed by the IOPMP 456. Additionally, the DMA execution core 452 may access at least a portion of the memory/peripherals 470 via the initiator 454 for performing DMA operations if allowed by the IOPMP 456. Thus, compared to the processor system 300, the processor system 400 provides memory access protection for the debug unit 430 and the DMA 450.
As the memory access request is allowed, the memory content associated with the memory access request at the target address range is written into the cache 418 (block 488). The secure monitor of the processor 410 reprograms the PMP 414 for World 3, makes World 3 the active world running on the processor execution core 412, and then World 3 issues a request to access the same memory previously accessed by World 1 (block 490). As previously discussed, the content of such memory access request now resides in the cache 418; and from a performance perspective, it would be advantageous for World 3 to access the memory content from cache 418 rather than from the memory 470. However, although the PMP 414 reprogrammed for World 3 allows the memory access request, accessing the memory content from the cache 418 by World 3 has not been allowed by the IOPMP 422. Thus, according to the method 480, the memory content in the cache 418 is not available for World 3 (block 492). Accordingly, the memory access request goes out to the IOPMP 422, and if approved, the memory content is recopied into cache 418, which impacts memory access performance (block 494).
In particular, the processor system 500 includes a processor 510, a debug unit 530, a first IOPMP 536, a direct memory access (DMA) 550, a second IOPMP 556, and a memory 570. The processor 510, in turn, includes a processor execution core 512, a table-based memory protection (TMP) 514, a tightly coupled memory (TCM) 516, a cache 518, and an initiator 520. The debug unit 530 includes a debug processor 532 and an initiator 534. The DMA 550 includes a DMA execution core 552 and an initiator 554. The memory 570 includes a memory protection table 572, which, as discussed above, includes memory access permission information associated with a set of memory address ranges, respectively.
The processor execution core 512 may be configured to run a set of different worlds, such as Worlds 1-3 previously discussed with reference to processor 200. When an active world (e.g., World 1) issues a memory access request for memory content at a target address, the TMP 514 accesses the corresponding permission information associated with the target address from the memory protection table 572 in memory 570 (if it has not already done so). If the TMP 514 determines that the memory access request by World 1 is allowed based on the permission information, the memory access request is allowed to go to the memory 570, and the corresponding memory content is then written into the cache 518. If the active world running on the processor execution core 512 is switched (e.g., to World 3), and World 3 now issues a memory access request for the same memory content previously accessed by World 1, and the TMP 514 allows the memory access request based on the permission information it previously accessed, the TMP 514 allows World 3 to access the memory content from cache 518. This is a significant improvement over processor system 400, which required the active World 3 to re-access from the memory 470 the same content already residing in the cache 418.
The debug processor 532 may need to access at least a portion of the memory 570 via the initiator 534 for performing a number of debug operations if allowed by the IOPMP 536. The DMA execution core 552 may access at least a portion of the TCM 516 and the cache 518 of the processor 510 via the initiator 554 for DMA operations if allowed by the IOPMP 556. Similarly, the DMA execution core 552 may access at least a portion of the memory 570 via the initiator 554 for performing DMA operations if allowed by the IOPMP 556.
Each sub-table includes a first (left) column for a set of fixed size memory ranges (e.g., four (4) kilobytes (kB) in size or other size), with the entries in the column corresponding to the start addresses in hexadecimal (e.g., 0000_0000, 0000_1000, etc.) of the fixed size memory ranges, respectively. Each sub-table includes a second (right) column for a set of memory access permissions corresponding to the memory ranges in the same rows, respectively. Each permission entry includes a read (R) value, a write (W) value, and an execute (X) value. For example, if the read (R) value is a one (1), the memory access permission allows read access to the data in the corresponding memory range; otherwise, if it is zero (0), the read access is denied. Similarly, if the write (W) value is a one (1), the memory access permission allows writing data into memory at the corresponding memory range; otherwise, if it is zero (0), the write operation is denied. By the same token, if the execute (X) value is a one (1), the memory access permission allows code in the memory at the corresponding memory range to be accessed and executed; otherwise, if it is zero (0), the execute operation is denied.
Considering some examples, if the TMP 514 receives a memory access request to read content within the fixed address range starting with 0000_0000 from World 1 running on the processor execution core 512, the TMP 514 may copy the permissions information (R=1, W=0, and X=0) from the sub-table for World 1, and allow the read memory access request as R=1 for World 1. If, on the other hand, the memory access request is to execute code within the fixed address range starting with 0000_0000, the TMP 514 denies the request from World 1 as X=0.
Similarly, if the TMP 514 receives a memory access request to read content within the fixed address range starting with 0000_2000 from World 3 (where N=3) running on the processor execution core 512, the TMP 514 may copy the permissions information (R=1, W=0, and X=1) from the sub-table for World 3, and allow the read memory access request as R=1 for World 3. If, on the other hand, the memory access request is for writing data into memory within the fixed address range starting with 0000_2000, the TMP 514 denies the request from World 3 as W=0.
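The two examples above may be reproduced with a minimal sketch of per-world sub-tables keyed by the start address of each 4 kB range. Only the two entries quoted in the text are populated; the default-deny behavior for a missing entry, and the names `SUB_TABLES`, `lookup`, and `tmp_check`, are assumptions made for illustration.

```python
PAGE_SIZE = 0x1000  # 4 kB fixed-size ranges, as in the example

# SUB_TABLES[world][range_start] = (R, W, X); entries taken from the text.
SUB_TABLES = {
    1: {0x0000_0000: (1, 0, 0)},
    3: {0x0000_2000: (1, 0, 1)},
}


def lookup(world, address):
    """Return the (R, W, X) entry for the range containing the address."""
    range_start = address - (address % PAGE_SIZE)
    # Assumption: a missing entry is treated as no access at all.
    return SUB_TABLES.get(world, {}).get(range_start, (0, 0, 0))


def tmp_check(world, address, operation):
    """True if the operation ('read', 'write', 'execute') is permitted."""
    r, w, x = lookup(world, address)
    return {"read": r, "write": w, "execute": x}[operation] == 1


w1_read = tmp_check(1, 0x0000_0010, "read")        # R=1, allowed
w1_execute = tmp_check(1, 0x0000_0010, "execute")  # X=0, denied
w3_read = tmp_check(3, 0x0000_2ABC, "read")        # R=1, allowed
w3_write = tmp_check(3, 0x0000_2ABC, "write")      # W=0, denied
```

Because the ranges are of fixed size, the row is located by rounding the target address down to a multiple of the range size rather than by comparing against programmed bounds.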
In particular, the memory protection table 572B includes a first (left) column for a set of fixed size memory ranges (e.g., four (4) kilobytes (kB) in size or other size), with the entries in the column corresponding to the start addresses in hexadecimal (e.g., 0000_0000, 0000_1000, etc.) of the fixed size memory ranges, respectively. The memory protection table 572B further includes a set of columns for a set of memory access permissions for a set of worlds W1 to WN corresponding to the memory ranges in the same rows, respectively. Each permission entry for the set of worlds W1 to WN includes a read (R) value, a write (W) value, and an execute (X) value, as previously discussed with respect to memory protection table 572A.
Considering some examples, if the TMP 514 receives a memory access request to read content within the fixed address range starting with 0000_0000 from World 1 running on the processor execution core 512, the TMP 514 may copy all of the permissions information for the set of worlds 1-N associated with memory range starting with 0000_0000. Thus, the TMP 514 approves the memory access request from World 1 as the read (R) value is a one (1), and may copy the corresponding memory content into cache 518. If the active world running on the processor execution core 512 is changed to World N, and the TMP 514 receives another memory access request to read content within the fixed address range starting with 0000_0000 from World N, the TMP 514, which may already have the permission information from the previous access in response to the memory access request from World 1, approves the memory access request from World N as the read (R) value is a one (1), and memory content in the cache 518 may be provided to World N without accessing the same content in the memory 570.
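One way to picture the row-wise copy is the sketch below, where the TMP keeps a local copy of each row it has fetched and counts how many times it had to read the table. The class `Tmp`, the value `N = 4`, and every table entry other than R=1 for World 1 and World N at 0000_0000 are hypothetical.

```python
PAGE_SIZE = 0x1000
N = 4  # hypothetical number of worlds

# TABLE[range_start][world] = (R, W, X). Only R=1 for World 1 and World N
# at 0000_0000 comes from the text; the W and X values are made up.
TABLE = {
    0x0000_0000: {1: (1, 0, 0), N: (1, 0, 0)},
}


class Tmp:
    """Toy TMP that copies a whole row (all worlds) on first use."""

    def __init__(self, table):
        self.table = table
        self.local_rows = {}  # rows already copied from the table
        self.table_reads = 0  # number of fetches from the table in memory

    def _row(self, address):
        range_start = address & ~(PAGE_SIZE - 1)
        if range_start not in self.local_rows:
            self.table_reads += 1
            self.local_rows[range_start] = dict(self.table.get(range_start, {}))
        return self.local_rows[range_start]

    def allows_read(self, world, address):
        # Assumption: a world with no entry in the row is denied.
        return self._row(address).get(world, (0, 0, 0))[0] == 1


tmp = Tmp(TABLE)
world1_read = tmp.allows_read(1, 0x0000_0010)
reads_after_world1 = tmp.table_reads
world_n_read = tmp.allows_read(N, 0x0000_0020)
reads_after_world_n = tmp.table_reads
```

In this sketch the second request, from World N, is decided from the locally held row, so the count of table reads does not increase after the world switch.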
The method 580 further includes the TMP 514 allowing the memory access request based on the retrieved permission information (block 586). The memory access request is allowed to pass to the memory 570 (e.g., via the cache 518 if it is a miss), where the corresponding memory content is written into the cache 518 (block 588). The active world running on the processor execution core 512 is now switched to World 3, and World 3 sends a memory access request to the TMP 514 for the same memory content previously accessed by World 1 (block 590). Then, according to the method 580, the TMP 514 (which may locally have the permission information or may access it from the memory protection table 572) allows the access to the memory content stored in the cache 518 by World 3 (block 592).
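The flow of blocks 586 to 592 may be sketched, under stated assumptions, as a permission check followed by a shared cache lookup. The class `System`, the address, the stored value, and the permission entries are all hypothetical; the counter `memory_fetches` is included only to make the effect of the shared cache visible.

```python
PAGE_SIZE = 0x1000


class System:
    """Toy model: TMP check in front of a cache shared across worlds."""

    def __init__(self, permissions, memory):
        self.permissions = permissions  # {(world, range_start): (R, W, X)}
        self.memory = memory            # {address: value}
        self.cache = {}
        self.memory_fetches = 0

    def read(self, world, address):
        range_start = address & ~(PAGE_SIZE - 1)
        r, _, _ = self.permissions.get((world, range_start), (0, 0, 0))
        if not r:
            raise PermissionError(f"World {world} may not read {address:#x}")
        # Blocks 586/588: on a miss, fetch from memory and fill the cache.
        if address not in self.cache:
            self.memory_fetches += 1
            self.cache[address] = self.memory[address]
        # Block 592: on a hit, serve the content from the cache.
        return self.cache[address]


ADDRESS = 0x0000_0040  # hypothetical target address
system = System(
    permissions={(1, 0x0000_0000): (1, 0, 0), (3, 0x0000_0000): (1, 0, 0)},
    memory={ADDRESS: 0xAB},
)

value_for_world1 = system.read(1, ADDRESS)   # miss, goes to memory
fetches_after_world1 = system.memory_fetches
value_for_world3 = system.read(3, ADDRESS)   # block 590: same content
fetches_after_world3 = system.memory_fetches
```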
In particular, the processor system 600 includes a processor 610, a debug unit 630, a first input/output table-based memory protection (IOTMP) 636, a direct memory access (DMA) 650, a first input/output physical memory protection (IOPMP) 656, a bus (interconnect) 660, a second IOTMP 672, a first memory “A” 670, a second memory “B” 680, a second IOPMP 694, a third memory “C” 690, and a peripheral 692.
The processor 610, in turn, includes a processor execution core 612, a table-based memory protection (TMP) 614, a tightly coupled memory (TCM) 616, a cache 618, and an initiator 620. The debug unit 630 includes a debug processor 632 and an initiator 634. The DMA 650 includes a DMA execution core 652 and an initiator 654. The initiator 620 of the processor 610 is coupled to the bus 660. The first IOTMP 636 is coupled between the initiator 634 of the debug unit 630 and the bus 660. The first IOPMP 656 is coupled between the initiator 654 of the DMA 650 and the TCM 616 and cache 618 of the processor 610, as well as the bus 660.
The second IOTMP 672 is coupled between the bus 660 and the first memory “A” 670, and includes a feedback connection to the bus 660 for accessing a memory protection table 682 stored in the second memory “B” 680. As discussed, the second memory “B” 680 is coupled to the bus 660, and includes the memory protection table 682. As previously discussed, the memory protection table 682, which may be implemented per memory protection table 572A, 572B, or other, includes the memory access permission information for the various worlds running on the processor 610 and the other processing entities 630 and 650. The second IOPMP 694 is coupled between the bus 660 and the third memory “C” 690 and the peripheral 692, respectively.
The TMP 614 of the processor 610 operates in a similar manner as TMP 514 of processor 510 previously discussed in detail. That is, when an active world (e.g., World 1) sends a memory access request for memory content at a target range to the TMP 614, the TMP 614 accesses the corresponding permission information from the memory protection table 682 in the second memory “B” 680 via the bus 660 (if it has not already done so). If the TMP 614 determines that the memory access request by World 1 is allowed based on the permission information, the memory access request is allowed to pass to the second memory “B” 680 via the bus 660 (e.g., via the cache 618 if it is a miss), and the corresponding memory content is then written into the cache 618. If the active world running on the processor execution core 612 is switched (e.g., to World 3), and World 3 now sends a memory access request to the TMP 614 for the same memory content previously accessed by World 1, the TMP 614 may allow the memory access request based on the permission information it previously accessed, and World 3 is able to access the memory content from cache 618.
In processor system 600, the TMP is extended to processing entities, such as the debug unit 630 and the first memory “A” 670, in the form of first and second IOTMPs 636 and 672. For instance, if the debug processor 632 needs to access the first memory “A” 670 or second memory “B” 680 via the initiator 634, the IOTMP 636 accesses the corresponding permission information associated with the memory access request from the memory protection table 682 in the second memory “B” 680 via the bus 660 (if it has not already done so). If the IOTMP 636 determines that the memory access is allowed based on the permission information, the memory access request is allowed to go to the first memory “A” 670 or second memory “B” 680 via the bus 660, and the corresponding memory content is provided to the debug processor 632 via the bus 660.
The TMP/IOTMP and PMP/IOPMP in a processor system may coexist. In this regard, if the DMA execution core 652 needs to access the first memory “A” 670 via the initiator 654 and it is allowed by the IOPMP 656, the IOTMP 672 receives the memory access request from the DMA 650 via the bus 660, then accesses the corresponding permission information associated with the memory access request from the memory protection table 682 in the second memory “B” 680 via a request sent by way of the feedback connection to bus 660. If the IOTMP 672 determines that the memory access is allowed based on the permission information, the corresponding memory content in the first memory “A” 670 is provided to the DMA execution core 652 via the bus 660.
In processor system 700, the processor 710 further includes a PMP 713 coupled in parallel with the TMP 714, i.e., both coexist in the processor 710. As an example, the PMP 713 may be implemented to perform memory access permission checks for a certain memory address sub-range (e.g., an address sub-range that points to the TCM 716), and the TMP 714 may be implemented to perform memory access permission checks for another memory address sub-range (e.g., an address sub-range that points to the memory 770). Thus, based on the address of the memory access request, the PMP 713 or the TMP 714 will handle the permission check.
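The address-based selection between the two checkers may be sketched as a simple range test. The TCM address window in `TCM_RANGE` and the function name `select_checker` are hypothetical; the disclosure states only that the sub-range pointing to the TCM may go to the PMP and the sub-range pointing to the memory may go to the TMP.

```python
# Hypothetical address window that maps to the TCM.
TCM_RANGE = (0x8000_0000, 0x8001_0000)


def select_checker(address):
    """Pick which unit performs the permission check for this address."""
    start, end = TCM_RANGE
    # Addresses that point to the TCM go to the PMP; all others go to the
    # TMP (assumed here to cover the rest of the address space).
    return "PMP" if start <= address < end else "TMP"


checker_for_tcm = select_checker(0x8000_0100)
checker_for_memory = select_checker(0x0000_1000)
```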
Some of the components described herein, such as one or more of the subsystems, thermal controllers, and communication interfaces, may be implemented using a processor. A processor, as used herein, may be any dedicated circuit, processor-based hardware, a processing core of a system on chip (SOC), etc. Hardware examples of a processor may include microprocessors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), state machines, gated logic, discrete hardware circuits, and other suitable hardware configured to perform the various functionality described throughout this disclosure.
The processor may be coupled to memory (e.g., generally a computer-readable media or medium), such as a magnetic storage device (e.g., hard disk, floppy disk, magnetic strip), an optical disk (e.g., a compact disc (CD) or a digital versatile disc (DVD)), a smart card, a flash memory device (e.g., a card, a stick, or a key drive), a random access memory (RAM), a read only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), a register, a removable disk, and any other suitable medium for storing software and/or instructions that may be accessed and read by a computer. The memory may store computer-executable code (e.g., software). Software shall be construed broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures/processes, functions, etc., whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise.
The following provides an overview of aspects of the present disclosure:
The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
This application claims the benefit of the filing date of U.S. Provisional Application, 63/484,444, filed on Feb. 10, 2023, which is incorporated herein by reference.
Number | Date | Country | |
---|---|---|---|
63484444 | Feb 2023 | US |