1. Field
The present disclosure relates generally to memory devices.
2. Background
In general, a translation look-aside buffer (TLB) is used to reduce virtual address translation time. A TLB is a table in the processor's memory that contains information about the pages in memory the processor has accessed recently. The table cross-references a program's virtual addresses with the corresponding absolute addresses in physical memory that the program has most recently used. A TLB enables faster computing because it caches the virtual to physical address translations locally. A TLB may be implemented in a number of ways. For example, a TLB may be enabled in a fully associative content addressable memory (CAM) structure. A CAM is a type of storage device which includes comparison logic with each bit of storage. A data value may be broadcast to all words of storage and then compared with the values there. Words matching a data value may be flagged in some way. Subsequent operations can then work on flagged words, e.g. read them out one at a time or write to certain bit positions in all of them. Fully associative structures can therefore store the data in any location within the CAM structure. Comparison logic, however, requires comparison circuitry, which occupies physical space—physical space which, in other structures may be utilized to provide more memory. As such, CAM structures may not be as densely configured as other memory structures. Further, because of the comparison circuitry, CAM structures have relatively high power requirements.
In other examples, a TLB may be enabled in a set associative memory (SAM) structure, such as a random access memory (RAM) structure. SAM structures organize caches so that each block of memory maps to a small number of sets or indexes. Each set may then include a number of ways. A data value may return an index whereupon comparison circuitry determines whether a match exists over the number of ways. As such, only a fraction of comparison circuitry is required to search the structure. Thus, SAM structures provide higher densities of memory per unit area as compared with CAM structures. Further, because of reduced comparison circuitry, SAM structures have lower power requirements as compared with CAM structures.
As may be appreciated, both of the memory structures described above may provide specific advantages in a processing system. In general, however, designers must typically choose between memory structures when developing a system under an existing architecture. For example, the Microprocessor without Interlocked Pipeline Stages (MIPS) architecture, which is well-known in the art, specifies a fully associative TLB based translation mechanism. The mechanism utilizes the EntryHi, EntryLo1, EntryLo0 and Index architectural registers to perform functions such as reading, writing and probing the TLB. These mechanisms and functions assume that the TLB is a fully associative structure (i.e. a CAM structure) that is in compliance with the requirements of the MIPS architecture. Therefore, increasing the size of the TLB necessitates the addition of more fully associative CAM structures. An increase in CAM structures, in turn, requires a commensurate increase in space and power to accommodate the additional CAM structures. Currently, the MIPS architecture cannot utilize a more space and power efficient SAM structure.
It may therefore be desirable to provide a system which includes an extended TLB (eTLB) that utilizes both CAM structures and SAM structures so that the relative advantages of both structures may be realized. The invention is particularly useful in systems that utilize existing registers and mechanisms as specified by the MIPS architecture.
U.S. Pat. Nos. 7,797,509 and 8,082,416 disclose extended translation look-aside buffers (eTLB) for converting virtual addresses into physical addresses, in which the eTLB includes a physical memory address storage having a number of physical addresses, a virtual memory address storage configured to store a number of virtual memory addresses corresponding with the physical addresses, the virtual memory address storage including, a set associative memory structure (SAM), and a content addressable memory (CAM) structure; and comparison circuitry for determining whether a requested address is present in the virtual memory address storage, wherein the eTLB is configured to receive an index register for identifying the SAM structure and the CAM structure, and wherein the eTLB is configured to receive an entry register for providing a virtual page number corresponding with the plurality of virtual memory addresses.
With an eTLB structure having both a CAM and a SAM, there is a possibility that instructions will be issued to act upon entries in both (or either) the CAM and/or the SAM. For example, instructions may be issued to write entries to the eTLB, read entries from the eTLB, search for entries within the eTLB and/or flush entries from the eTLB. Despite the differences in the structures of the CAM and SAM mechanisms, it is nevertheless desirable for a system that implements the eTLB to efficiently process these instructions regardless of whether CAM entries or SAM entries are being processed.
The following presents a simplified summary of some embodiments in order to provide a basic understanding of the invention. This summary is not an extensive overview and is not intended to identify key/critical elements or to delineate the scope of the claims. Its sole purpose is to present some embodiments in a simplified form as a prelude to the more detailed description that is presented below.
According to some embodiments, a method is provided for operating an eTLB in which the same instruction is issued to perform the same task for both the CAM and the SAM. For example, the same instruction to perform a TLB flush can be provided to the eTLB that operates upon both the CAM and the SAM, which is handled differently by the underlying implementation system of the eTLB depending upon whether the CAM and/or SAM is being accessed. This approach allows instructions to be issued without the originator of the instruction being required to know the exact structure within the eTLB (either CAM or SAM) that is being accessed to implement the instruction.
As a result, it is possible to use the same instruction for operations on both the CAM and SAM. By way of example, a common usage of a TLB flush occurs when a CPU is performing context switching and needs to flush all of the TLB entries set up by the prior context. In accordance with an embodiment of the present invention, the TLB flush operation will look for any TLB entry with a particular address space identifier (ASID) for the prior context, and invalidate those entries.
Further details of aspects, objects, and advantages of various embodiments are described below in the detailed description, drawings, and claims. Both the foregoing general description and the following detailed description are exemplary and explanatory, and are not intended to be limiting as to the scope of the disclosure.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the relevant art to make and use the invention.
The present invention will now be described with reference to the accompanying drawings. In the drawings, generally, like reference numbers indicate identical or functionally similar elements. Additionally, generally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
The disclosure will provide a description in detail with reference to a few embodiments thereof as illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the embodiments. It will be apparent, however, to one skilled in the art, that the present approach may be practiced without some or all of these specific details. In other instances, well known process steps and/or structures have not been described in detail in order to not unnecessarily obscure the present embodiments.
Various embodiments are described hereinbelow, including methods and techniques. It should be kept in mind that the approach might also cover articles of manufacture that include a computer readable medium on which computer-readable instructions for carrying out embodiments of the inventive technique are stored. The computer readable medium may include, for example, semiconductor, magnetic, opto-magnetic, optical, or other forms of computer readable medium for storing computer readable code. Further, the invention may also cover apparatuses for practicing embodiments of the invention. Such apparatus may include circuits, dedicated and/or programmable, to carry out tasks pertaining to embodiments of the invention. Examples of such apparatus include a general-purpose computer and/or a dedicated computing device when appropriately programmed and may include a combination of a computer/computing device and dedicated/programmable circuits adapted for the various tasks pertaining to embodiments.
The present disclosure pertains to extended translation look-aside buffers (eTLB) for converting virtual addresses into physical addresses, in which the eTLB includes a physical memory address storage having a number of physical addresses, a virtual memory address storage configured to store a number of virtual memory addresses corresponding with the physical addresses, the virtual memory address storage including, a set associative memory structure (SAM), and a content addressable memory (CAM) structure; and comparison circuitry for determining whether a requested address is present in the virtual memory address storage, wherein the eTLB is configured to receive an index register for identifying the SAM structure and the CAM structure, and wherein the eTLB is configured to receive an entry register for providing a virtual page number corresponding with the plurality of virtual memory addresses.
According to some embodiments, a method is provided for operating an eTLB in which the same instruction is issued to perform the same task for both the CAM and the SAM. For example, the same write, read, search, or flush instruction can be provided to the eTLB that operates upon both the CAM and the SAM, which is handled differently by the underlying implementation system of the eTLB depending upon whether the CAM and/or SAM is being accessed. This approach allows instructions to be issued without the originator of the instruction being required to know the exact structure within the eTLB (either CAM or SAM) that is being accessed for the instruction.
Thus, CAM structure 130 provides a first memory structure for use with an eTLB in an embodiment of the present invention. As may be appreciated, CAM structures include comparison logic for each bit of storage. Comparison logic in turn requires comparison circuitry, which occupies physical space that, in other structures, may be utilized to provide more memory. As such, CAM structures may not be as densely configured as other memory structures. Further, because of the comparison circuitry, CAM structures have relatively high power requirements. However, one benefit of a CAM structure is that searches conducted over a CAM structure happen simultaneously over all bits. Thus, a complete search of a CAM structure occurs over a single clock cycle. Another benefit of a CAM structure is that an address may reside in any entry in the CAM; therefore, the CAM structure may be easily configured with a “wired” space, which is a protected memory space. This wired space can contain any address translation which the operating system wants to retain in the TLB. As may be appreciated, when a CAM structure is searched, a hit may occur. A CAM hit 132 results when a match with a virtual address (VA) 102 from an entryhi register occurs. A hit may result in a data retrieval from a data store. As may be appreciated, a ternary CAM (TCAM) may, in some embodiments, be utilized. A TCAM cell stores an extra state (i.e. a “don't care” state), which necessitates two independent bits of storage. When a “don't care” state is stored in the cell, a match occurs for that bit regardless of the search criteria.
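By way of non-limiting illustration, the “don't care” matching of a TCAM may be modeled in software as follows. This is a minimal sketch only: the names `TcamEntry`, `care_mask`, and `tcam_search` are hypothetical and do not appear in the disclosure, and the loop visits entries one at a time whereas the hardware compares all entries simultaneously.

```python
from dataclasses import dataclass


@dataclass
class TcamEntry:
    value: int          # stored tag bits
    care_mask: int      # 1 = compare this bit, 0 = "don't care" (assumed encoding)
    valid: bool = True


def tcam_search(entries, key):
    """Return the indices of all valid entries that match the key.

    A bit position matches when the stored bit equals the key bit,
    or when the entry marks that position as "don't care".
    """
    hits = []
    for i, entry in enumerate(entries):
        mismatched_bits = (entry.value ^ key) & entry.care_mask
        if entry.valid and mismatched_bits == 0:
            hits.append(i)
    return hits
```

In this sketch, an entry whose low bits are masked out matches every key that shares its high bits, which is one way a single entry can cover a larger page.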
SAM structure 120 provides a second memory structure for use with an eTLB in an embodiment of the present invention. SAM structures organize caches so that each block of memory maps to a small number of cache lines or indexes. Each index may then include a number of ways. Thus, an index register 108 may indicate a location across n-ways. In one embodiment, a 4-way set associative memory structure may be utilized. Comparison logic 122 compares the n-ways identified by index register 108 with tag compare bits from VA 102 and returns a SAM hit 124, if any. A hit may result in data retrieval from a data store. A SAM structure may be densely manufactured because it utilizes only a fraction of the comparison logic of a CAM structure having the same amount of memory. Additionally, because significantly less comparison logic is utilized, power requirements are also much lower than for a similarly sized CAM structure. As may be appreciated, in a single SAM structure page size is not easily variable. In some embodiments, a SAM structure page size is set to 4 KB. Page size for a SAM structure may be programmatically established by setting a PageMask field of an eTLB configuration register to a desired page size in embodiments of the present invention. CAM structures, on the other hand, support variable page sizes and may include a wired or protected space. Thus, in one embodiment, a SAM structure page size of an eTLB is 4 KB and CAM page size is a variable page size greater than 4 KB. In some embodiments, a hash table 104 may be utilized to hash tag index bits from VA 102 for use with a SAM structure. In other embodiments, MUX 106 may be utilized when accessing SAM 120.
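By way of non-limiting illustration, the index-then-compare behavior of a set associative structure may be sketched as follows. The sizes (`WAYS`, `INDEX_BITS`) and the names `SetAssociativeTlb` and `split_vpn` are assumptions chosen for the example, and the optional hashing of index bits is omitted.

```python
WAYS = 4          # 4-way set associative, as in one described embodiment
INDEX_BITS = 4    # 16 sets; an arbitrary size assumed for this sketch


def split_vpn(vpn):
    """Split a virtual page number into (set index, tag compare bits)."""
    index = vpn & ((1 << INDEX_BITS) - 1)
    tag = vpn >> INDEX_BITS
    return index, tag


class SetAssociativeTlb:
    def __init__(self):
        # Each set holds WAYS slots; a slot is None or a (tag, pfn) pair.
        self.sets = [[None] * WAYS for _ in range(1 << INDEX_BITS)]

    def write(self, vpn, pfn, way):
        index, tag = split_vpn(vpn)
        self.sets[index][way] = (tag, pfn)

    def lookup(self, vpn):
        """Compare the tag against only the ways of the indexed set."""
        index, tag = split_vpn(vpn)
        for way, slot in enumerate(self.sets[index]):
            if slot is not None and slot[0] == tag:
                return way, slot[1]   # hit: (way ID, physical frame number)
        return None                   # miss
```

Only `WAYS` comparisons are made per lookup in this model, regardless of the total number of entries, which reflects why comparatively little comparison logic is needed.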
If the method determines that an eTLB is enabled at a step 305, the method continues to determine whether a structure ID (STRID) is non-zero at a step 306. In this step, the method determines to which memory structure an index or VA is written. If the method determines that the structure ID is zero at a step 306, then the structure corresponding with the index register loaded at a step 302 is a CAM structure. The method then writes the contents of entryhi, entrylo0, and entrylo1 into the CAM entry identified by the index, which is a VA at a step 308, whereupon the method ends. If the method determines that the structure ID is non-zero at a step 306, then the structure corresponding with the index register loaded at a step 302 is a SAM. The method then selects the SAM structure corresponding with the STRID from the loaded index register at a step 310. The method then writes the contents of entryhi, entrylo0, and entrylo1 into the SAM entry identified by the index, and the way ID into the index register at a step 312, whereupon the method ends.
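By way of non-limiting illustration, the STRID-based selection for an indexed write (with the eTLB enabled) may be sketched as follows. The dictionary layout of the index register, the mapping of STRID n to the n-th SAM, and the function name `tlb_write_index` are assumptions made for the example; the sketch also assumes the SAM entry is addressed by the index and way ID held in the index register.

```python
def tlb_write_index(etlb, index_reg, entry):
    """Write `entry` (standing in for entryhi/entrylo0/entrylo1 contents)
    into the structure selected by the STRID field of the index register.

    Assumed layout:
      etlb["cam"]  : list of CAM entries
      etlb["sams"] : list of SAMs, each a list of sets, each a list of ways
      index_reg    : {"strid": int, "index": int, "way": int}
    """
    strid = index_reg["strid"]
    if strid == 0:
        # STRID of zero selects the CAM structure.
        etlb["cam"][index_reg["index"]] = entry
        return "CAM"
    # Non-zero STRID selects a SAM (assumption: STRID n -> sams[n - 1]).
    sam = etlb["sams"][strid - 1]
    sam[index_reg["index"]][index_reg["way"]] = entry
    return "SAM"
```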
In accordance with embodiments of the present invention, it is not necessary to specify a structure ID (STRID) in an implementation in order to select which SAMs are used for an operation, such as a write operation. By way of non-limiting example, a write operation may solely depend on a page size, with a SAM only holding translations for a single page size, with multiple SAMs used to support different page sizes. In accordance with a further example, a TLB write random operation can be implemented such that software does not need to specify a STRID, but the selection of the SAM is determined by the page size specified by the TLB write random operation. Alternatively, a TLB write index operation (which is a write operation that specifies a specific entry in the eTLB), or a TLB read operation, may specify the structure of the eTLB, including the number of SAMs and the number of “ways” for each SAM (i.e., n-way memory) in order to uniquely identify a specific entry, in accordance with a further embodiment of the present invention.
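By way of non-limiting illustration, selecting a structure from the page size alone may be sketched as follows. The particular page sizes, the table `SAM_STRID_BY_PAGE_SIZE`, and the fallback to the CAM for any other page size are assumptions made for the example and are not specified by the disclosure.

```python
# Hypothetical configuration: each SAM holds translations for one page size.
SAM_STRID_BY_PAGE_SIZE = {
    4 * 1024: 1,    # SAM 1 holds 4 KB pages
    64 * 1024: 2,   # SAM 2 holds 64 KB pages
}
CAM_STRID = 0       # assumed fallback: the CAM supports variable page sizes


def select_structure_for_write_random(page_size):
    """Pick the target structure from the page size, with no STRID supplied."""
    return SAM_STRID_BY_PAGE_SIZE.get(page_size, CAM_STRID)
```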
In accordance with an embodiment of the present invention, TLB probe, TLB flush, and TLB look-up operations need not know the structure of the eTLB in order for a match to be found, so the STRID and “way” fields (specifying the number of ways for each SAM) are not needed for these instructions. Moreover, in accordance with an additional embodiment of the present invention, the STRID and way information can be incorporated into a single index number. If the eTLB is organized in such a way that TLB entries of all memories (e.g., CAM or SAM) occupy sequential entries, then the value of the index number can identify which memory is being accessed for a TLB read index or TLB write index operation, in accordance with a further embodiment of the present invention.
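By way of non-limiting illustration, decoding a single sequential index number into a structure, set index, and way may be sketched as follows. The ordering (CAM entries first, then each SAM in turn, with the ways of a set stored contiguously) and the name `decode_flat_index` are assumptions made for the example.

```python
def decode_flat_index(flat_index, cam_entries, sam_shapes):
    """Map one sequential index onto (strid, index, way).

    cam_entries : number of CAM entries, assumed to occupy indices 0..cam_entries-1
    sam_shapes  : list of (num_sets, num_ways) per SAM, in STRID order 1, 2, ...
    Returns (0, index, None) for the CAM, or (strid, set_index, way) for a SAM.
    """
    if flat_index < 0:
        raise IndexError("index out of range")
    if flat_index < cam_entries:
        return 0, flat_index, None
    remaining = flat_index - cam_entries
    for strid, (num_sets, num_ways) in enumerate(sam_shapes, start=1):
        size = num_sets * num_ways
        if remaining < size:
            return strid, remaining // num_ways, remaining % num_ways
        remaining -= size
    raise IndexError("index out of range")
```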
If the method determines that an eTLB is not enabled at a step 405, the method continues to a step 408 to select a random entry in the CAM structure that is not in a wired space. As may be appreciated, a wired space represents protected memory. That is, memory that cannot be evicted from the TLB. The method then writes the contents of entryhi, entrylo0, and entrylo1 into the random CAM at a step 410, whereupon the method ends. If the method determines that an eTLB is enabled at a step 405, the method continues to determine whether a structure ID (STRID) is non-zero at a step 406. If the method determines that the structure ID is zero at a step 406, then the structure corresponding with the index register loaded at a step 402 is a CAM structure. The method then selects a random entry in the CAM structure that is not in a wired space at a step 408. As may be appreciated, a wired space represents protected memory. That is, memory that cannot be evicted from the eTLB. The method then writes the contents of entryhi, entrylo0, and entrylo1 into the random CAM at a step 410, whereupon the method ends.
If the method determines that the structure ID is non-zero at a step 406, then the structure corresponding with the index register loaded at a step 402 is a SAM structure. The method then selects a SAM structure corresponding with the STRID from the loaded index register at a step 412. The method then selects the index of the SAM based on the VA bits from the entryhi. The method then determines whether a random replacement is desired at a step 416. If the method determines at a step 416 that a random replacement is desired, the method selects a random way at a step 418. If the method determines at a step 416 that a random replacement is not desired, the method selects a way based on LRU or NRU at a step 420. The method then writes the contents of entryhi, entrylo0, and entrylo1 into the set associative entry identified by the index, and the way ID into the index register at a step 422, whereupon the method ends.
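By way of non-limiting illustration, the two replacement choices described above may be sketched as follows: a random CAM entry outside the wired space, and a SAM way chosen either randomly or by least-recent use. The function names, the convention that the wired space occupies the lowest-numbered CAM entries, and the use of per-way timestamps to model LRU are assumptions made for the example.

```python
import random


def random_cam_entry(cam_size, wired_count, rng=random):
    """Pick a random CAM entry outside the wired (protected) space.

    Assumes the wired space occupies entries 0..wired_count-1.
    """
    return rng.randrange(wired_count, cam_size)


def choose_way(last_used, use_random, rng=random):
    """Pick the way to replace within the indexed set.

    last_used  : per-way timestamp of the most recent access (assumed model)
    use_random : True for random replacement, False for LRU
    """
    if use_random:
        return rng.randrange(len(last_used))
    # LRU: the way with the oldest (smallest) timestamp is replaced.
    return min(range(len(last_used)), key=lambda way: last_used[way])
```

Passing a seeded `random.Random` instance as `rng` makes the selection reproducible for testing; an NRU policy would replace the timestamp comparison with a per-way "recently used" bit.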
Referring to
If the method determines that an eTLB is enabled at a step 503, the method then determines whether a structure ID (STRID) is non-zero at a step 504. If the method determines that the structure ID is zero at a step 504, then the structure corresponding with the index register loaded at a step 502 is a CAM structure. The method continues at a step 506, to read the CAM entry corresponding with the index register loaded at a step 502. The method then updates the entryhi, entrylo0, and entrylo1 with the contents of the data storage at a step 508, whereupon the method ends. If the method determines at a step 504 that the structure ID is non-zero, then the structure corresponding with the index register loaded at a step 502 is a SAM structure. The method then selects at a step 510, the SAM structure corresponding with the STRID in the loaded index register at a step 502. The method continues at a step 512, to read the entry corresponding with the index register loaded at a step 502. Data from the way corresponding with the index register is MUXed at a step 514. MUXing data is well-known in the art and may be utilized without limitation without departing from the present invention. The method then updates the entryhi, entrylo0, and entrylo1 with the contents of the data storage at a step 516, whereupon the method ends.
If the method determines at a step 605 that an eTLB is enabled, the method continues to a step 606, to search the CAM using a VA from the entryhi as a search key. The method substantially simultaneously continues to a step 608 to 1) use a subset of the VA bits [m:n] to index the SAM, and 2) read all indexed ways. The upper bits of the VA [MSB:m+1] are then compared with all the indexed ways at a step 610. As may be appreciated, any number of SAM structure banks may be searched without departing from the present invention. Thus, SAM structures 1 to n may be searched at steps 608 to 610. The method continues to a step 612 to determine whether a hit has occurred. If no hit (i.e. no match) occurs at a step 612, the method continues to a step 620 to write to a probe bit to indicate a “miss” whereupon the method ends. If the method determines a hit (or match) has occurred at a step 612, the method continues to a step 614 to write a zero in a probe bit to indicate a “hit.” A probe bit is illustrated in
A thread enable field 724 may be enabled to indicate the threads whose translations will reside in a selected SAM structure. Thus, a configuration may include or exclude a particular thread on a selected SAM structure. In one embodiment, a single thread may be configured to access a single SAM structure. In another embodiment, a single thread may be configured to access multiple SAM structures. In another embodiment, multiple threads may be configured to access a single SAM structure. In another embodiment, multiple threads may be configured to access several SAM structures. An eTLB enable field 730 may be utilized to indicate whether eTLB functionality is enabled for a particular memory structure. As may be appreciated, the representation provided is for illustrative purposes only and should not be construed as limiting. Any available bits in a configuration register may be utilized to enable eTLB features without departing from the present invention.
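By way of non-limiting illustration, one possible encoding of the thread enable field is a bitmask with one bit per thread. That encoding, and the names `thread_enabled` and `sams_for_thread`, are assumptions made for the example; the disclosure does not fix a bit layout.

```python
def thread_enabled(thread_enable_field, thread_id):
    """True if the bit for `thread_id` is set (assumed one bit per thread)."""
    return ((thread_enable_field >> thread_id) & 1) == 1


def sams_for_thread(thread_enable_by_strid, thread_id):
    """List the SAM STRIDs in which a thread's translations may reside.

    thread_enable_by_strid : {strid: thread enable field value} per SAM
    """
    return [
        strid
        for strid, field in sorted(thread_enable_by_strid.items())
        if thread_enabled(field, thread_id)
    ]
```

Under this encoding, the described configurations (one thread to one SAM, one thread to several SAMs, several threads to one SAM, and so on) correspond to different bit patterns in each SAM's field.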
If the method determines at a step 804 that an eTLB is enabled, the method continues to a step 806, to search the CAM using a VA from execution of the memory operation as a search key. The method substantially simultaneously continues to a step 808 to 1) use a subset of the VA bits [m:n] to index the SAM, and 2) read all indexed ways. The upper bits of the VA [MSB:m+1] are then compared with all the indexed ways at a step 810. As may be appreciated, any number of SAM structure banks may be searched without departing from the present invention. Thus, SAM structures 1 to n may be searched at steps 808/808′ to 812/812′. The method then continues to a step 814 to determine whether a hit has occurred. If no hit occurs at a step 814, the method ends. If the method determines a hit (or match) has occurred at a step 814, the method continues to a step 816 to read a physical frame number (PFN) corresponding with a “hit” VA along with other information relevant to the “hit.” As may be appreciated, a PFN corresponds with a physical address residing in memory. In this manner, a VA is translated to a PFN.
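By way of non-limiting illustration, the combined look-up over the CAM and one or more SAMs may be sketched as follows. The bit positions (m = 15, n = 12, i.e. a 4 KB page with 16 sets), the entry layouts, and the names `bits` and `etlb_lookup` are assumptions made for the example. The sketch searches the structures one after another, whereas the described method searches them substantially simultaneously.

```python
def bits(value, hi, lo):
    """Extract bit field value[hi:lo], inclusive at both ends."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def etlb_lookup(va, cam, sams, m=15, n=12):
    """Translate a virtual address to a PFN, or return None on a miss.

    cam  : list of (vpn, pfn) pairs or None, where vpn = va >> n
    sams : list of SAMs; each SAM is a list of sets, each set a list of
           (tag, pfn) pairs or None, where tag = VA bits [MSB:m+1]
    """
    # CAM search: the VA is compared against every entry.
    vpn = va >> n
    for entry in cam:
        if entry is not None and entry[0] == vpn:
            return entry[1]
    # SAM search: VA bits [m:n] select the set; upper bits are the tag.
    index = bits(va, m, n)
    tag = va >> (m + 1)
    for sam in sams:
        for slot in sam[index]:
            if slot is not None and slot[0] == tag:
                return slot[1]
    return None
```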
For example, an instruction may be received to perform an eTLB write index (e.g., as described in
At 902, the instruction is processed to identify the portions of the eTLB content that need to be operated upon by the instruction. Even though a common instruction is being provided, the actual processing of the instruction will need to understand which underlying type of memory structure is being accessed to process the instruction.
For example, if an instruction is received to perform an eTLB probe (e.g., as described in
Next, at 906, processing is performed to implement the task described by the instruction. Even though a common instruction was provided without regard for the underlying TLB memory structure, the actual processing of the instruction may differ depending upon whether the instruction is directed to a CAM entry or a SAM entry.
For example, if an instruction is received to perform an eTLB probe (e.g., as described in
At 1002, an instruction is received to flush the TLB. The same instruction is provided regardless of whether the flush operation is to be performed against the CAM or the SAM in the eTLB. The originator of the instruction will not need to be aware of the specific type of memory (e.g., CAM or SAM) within the eTLB that is to be accessed to flush the entries. Instead, the originator of the instruction will be able to provide a common instruction that works with any type of underlying memory structure within the eTLB, and will not be required to provide a CAM-specific or SAM-specific instruction to flush the eTLB.
One reason this is advantageous is that the typical set of operations needed to flush a SAM is different from (and much more elaborate than) the operations needed to flush a CAM. With CAM structures, a complete search (and clear) can be performed simultaneously over all of the entries within the CAM. In contrast, a SAM structure (e.g., as shown in
At 1004, a determination is made whether an eTLB has been enabled. As may be appreciated, methods described herein may be utilized with several system configurations without departing from the present invention. Thus, in one embodiment of the present invention, methods are provided that access a TLB (non-enabled eTLB) utilizing a CAM structure. If an eTLB is not enabled, the method continues to a step 1006 to search the CAM for the required entries and to then invalidate those CAM entries at 1008.
If it is determined that an eTLB has been enabled, the method continues to 1010 to search the CAM and/or SAM for the entries to be invalidated. The input for a flush instruction typically includes one or more virtual address(es) and ASID value(s). The values are used as the search keys to search the CAM and SAM structures for matching entries. One possible approach to perform this type of search is described above with respect to
At 1012, the identified entries in the CAM and/or SAM are invalidated. In some embodiments, eTLB entries can be invalidated by simply performing an operation to over-write a page's entry with an invalid value.
As discussed above, the search and invalidation of the CAM can actually be performed in a single clock cycle, and therefore does not need to be broken into separate steps as shown. In contrast, the search and invalidation of the SAM structure may correspond to multiple separate operations, depending upon the exact structure of the SAM and the number of comparison logic mechanisms implemented within the SAM.
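By way of non-limiting illustration, a common flush operation that invalidates matching entries in both the CAM and the SAM may be sketched as follows. The entry layout (dictionaries with `asid` and `valid` keys) and the name `flush_by_asid` are assumptions made for the example, and only ASID matching is modeled; matching on virtual addresses is omitted for brevity.

```python
def flush_by_asid(cam, sams, asid):
    """Invalidate every valid entry tagged with `asid`; return how many.

    cam  : list of entries, each {"asid": int, "valid": bool}
    sams : list of SAMs; each SAM is a list of sets, each set a list of entries

    The caller issues one flush regardless of structure. Internally the CAM
    is handled as a single match-and-clear pass (one cycle in hardware),
    while each SAM must be walked set by set.
    """
    invalidated = 0
    # CAM: every entry is compared against the key.
    for entry in cam:
        if entry["valid"] and entry["asid"] == asid:
            entry["valid"] = False
            invalidated += 1
    # SAM: iterate over each set, comparing only the ways of that set.
    for sam in sams:
        for ways in sam:
            for entry in ways:
                if entry["valid"] and entry["asid"] == asid:
                    entry["valid"] = False
                    invalidated += 1
    return invalidated
```

A context switch in this model would call `flush_by_asid` once with the ASID of the prior context, without the caller needing to know where the matching entries reside.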
The operations of flowchart 1000 of
Therefore, what has been described is an improved approach for operating an eTLB in which the same instruction is issued to perform the same task regardless of the exact underlying memory structure within the eTLB being accessed. For flush operations, the same instruction to perform a TLB flush can be provided to the eTLB that operates upon both the CAM and the SAM, which is then handled differently by the underlying implementation system of the eTLB depending upon whether the CAM and/or SAM is being accessed. This approach allows instructions to be issued without the originator of the instruction being required to know the exact structure within the eTLB (either CAM or SAM) that is being accessed to implement the instruction.
While this invention has been described in terms of several preferred embodiments, there are alterations, permutations, and equivalents, which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and apparatuses of the present invention. Although various examples are provided herein, it is intended that these examples be illustrative and not limiting with respect to the invention. Further, the Abstract is provided herein for convenience and should not be employed to construe or limit the overall invention, which is expressed in the claims. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations, and equivalents as fall within the true spirit and scope of the present invention.
The present application is related to U.S. application Ser. No. 13/330,662, which is a continuation of U.S. application Ser. No. 12/859,013 filed on Aug. 18, 2010, which is a continuation of U.S. application Ser. No. 11/652,827 filed on Jan. 11, 2007, all of which are hereby incorporated by reference in their entireties for all purposes.