The present disclosure relates generally to memory utilization, and more specifically, to one or more non-transactional pages for hardware transactional memory (HTM).
With hardware transactional memory (HTM), updates to memory made within a transaction by one thread might not be visible to other threads until the transaction is committed. The amount of hardware resources available to keep track of memory accesses within transactions is limited in HTM. As a result, an overflow condition may occur if resource utilization exceeds resource capacity.
Prior solutions are slow with respect to transactions that cause HTM overflow. For example, acceleration by HTM cannot be used after HTM overflow. In terms of development, there are large costs involved in using special machine instructions that do not consume HTM resources for specific memory accesses. Such costs may include the cost of modifying an instruction set architecture to include new instructions and the cost of developing or modifying software tools to use the instructions. In terms of execution, software implementations are typically slow as a result of the overhead incurred, such that it is not practical to utilize a software implementation.
According to one or more embodiments of the present disclosure, an apparatus comprises at least one processor, and memory having instructions stored thereon that, when executed by the at least one processor, cause the apparatus to allocate a page to put non-shared data to the page, set a transactional property for the page, the transactional property indicating that data in the page does not need tracking by hardware transactional memory (HTM), in response to detecting an access to the page during a transaction, determine whether the transactional property for the page is set, and in response to determining that the transactional property for the page is set, handle data loaded from the page in a cache as non-transactional data.
According to one or more embodiments of the present disclosure, a non-transitory computer program product comprises a computer readable storage medium having computer readable program code stored thereon that, when executed by a computer, performs a method for using resources in a computer with hardware transactional memory (HTM), the method comprising allocating a page to put non-shared data to the page, setting a transactional property for the page, the transactional property indicating that data in the page does not need tracking by the HTM, in response to detecting an access to the page during a transaction, determining whether the transactional property for the page is set, and in response to determining that the transactional property for the page is set, handling data loaded from the page in a cache as non-transactional data.
According to one or more embodiments of the present disclosure, a system comprises at least one processor configured to execute an application that requests an allocation of a non-transactional page by setting a transactional property that indicates that the page does not need tracking by hardware transactional memory (HTM).
According to one or more embodiments of the present disclosure, a method for using resources in a computer with hardware transactional memory (HTM) is described, the method comprising allocating a page to put non-shared data to the page, setting a transactional property for the page, the transactional property indicating that data in the page does not need tracking by the HTM, in response to detecting an access to the page during a transaction, determining whether the transactional property for the page is set, and in response to determining that the transactional property for the page is set, handling data loaded from the page in a cache as non-transactional data.
Additional features and advantages are realized through the techniques of the present disclosure. Other embodiments and aspects of the disclosure are described in detail herein. For a better understanding of the disclosure with the advantages and the features, refer to the description and to the drawings.
The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other features and advantages of the disclosure are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
In accordance with various aspects of the disclosure, the number of memory accesses that utilize HTM resources may be minimized. In some embodiments, a parameter or flag may be used to indicate when HTM resources should be used to track a memory access, such as a memory access associated with a transaction.
It is noted that various connections are set forth between elements in the following description and in the drawings (the contents of which are included in this disclosure by way of reference). It is noted that these connections, in general and unless specified otherwise, may be direct or indirect, and that this specification is not intended to be limiting in this respect.
Referring to
The threads 104a and 104b may be associated with a resource 108. For example, the resource 108 may include data, which may be organized as one or more blocks, objects, fields, or the like. The threads 104a and 104b may access the resource 108 concurrently (e.g., concurrently in terms of time or space), such that the resource 108 may be, or include, a shared resource. Embodiments of the disclosure may provide for a management of the resource 108. For example, the resource 108 may be managed in accordance with a memory management unit (MMU) as described below.
The core 202 may provide support for so-called “regular” memory accesses, which may occur with respect to an L1 memory 206 associated with the memory 204. The core 202 may provide support for “transactional” accesses, which may occur with respect to a transactional memory 208 associated with the memory 204. In some embodiments, the memory 206 and/or the memory 208 may include fields for a tag and data (e.g., old data and/or new data). In some embodiments, the memory 204, the memory 206, and/or the memory 208 may be associated with a cache. In some embodiments, the size or capacity of the memory 204, the memory 206, and/or the memory 208 may limit the number of memory accesses in HTM.
In some embodiments, in order to minimize the number of HTM accesses, one or more parameters or flags may be added to a memory management unit (MMU). For example, a “tx_disabled” bit may be added to one or more page table entries in the MMU. When memory (e.g., memory 102 of
In some embodiments, a program may specify an attribute when mapping a memory page. When a memory page is mapped using, e.g., a function or method such as mmap( ), the function or method may be called with an argument (e.g., “PROT_DISABLE_TX”) that establishes the value or state of the tx_disabled bit.
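The relationship between the mapping argument and the page table entry can be illustrated with a minimal sketch. It is a toy model, not an operating system interface: the `PageTable` and `PageTableEntry` classes, the `map_page` method, and the numeric value chosen for `PROT_DISABLE_TX` are all assumptions made for illustration, and no existing mmap( ) implementation is claimed to accept such a flag.

```python
from dataclasses import dataclass

PROT_READ = 0x1
PROT_WRITE = 0x2
PROT_DISABLE_TX = 0x100   # hypothetical flag value; not defined by any real OS


@dataclass
class PageTableEntry:
    frame: int                 # physical frame number (illustrative only)
    writable: bool
    tx_disabled: bool = False  # True: HTM need not track accesses to this page


class PageTable:
    """Toy page table keyed by virtual page number."""

    def __init__(self):
        self.entries = {}
        self._next_frame = 0

    def map_page(self, vpn, prot):
        """Model of an mmap()-style request that allocates one page."""
        pte = PageTableEntry(
            frame=self._next_frame,
            writable=bool(prot & PROT_WRITE),
            tx_disabled=bool(prot & PROT_DISABLE_TX),
        )
        self._next_frame += 1
        self.entries[vpn] = pte
        return pte


table = PageTable()
shared = table.map_page(0x10, PROT_READ | PROT_WRITE)
stack = table.map_page(0x20, PROT_READ | PROT_WRITE | PROT_DISABLE_TX)
```

In this sketch only the page mapped with the hypothetical flag ends up with its tx_disabled bit set; the other page keeps the default, tracked behavior.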
In a preliminary or initialization event (not shown in
In event 1, a program or an application 304 may call mmap( ) with the PROT_DISABLE_TX argument present or set to request an allocation of a non-transactional page. An example of a non-transactional page may be a stack of a thread that is not shared among speculative threads.
In event 2, mmap( ) may call a routine (e.g., a kernel service routine) to allocate a page with a PTE whose tx_disabled bit is set responsive to event 1.
In event 3, the application 304 (optionally as executed by a CPU core 306) may start a transaction. The transaction may be associated with the execution of one or more routines, threads, procedures, functions, etc.
In event 4, the application 304 may access memory a first time. The access may be based on a number of parameters, such as a dirty bit (e.g., an indication of whether a page has been modified), a read/write (R/W) status, etc. An address (addr) or page number associated with the first memory access may serve as an index to the page table 302 to facilitate a comparison or examination of the tx_disabled parameter for that memory access or page.
In event 5, the CPU core 306 may look up the PTE for the memory access of event 4 and determine that the tx_disabled parameter is cleared (e.g., equals zero).
In event 6, the HTM resource 308 may keep track of or log the memory access of event 4 as transactional data, optionally in response to an invocation or command provided by the CPU core 306 or the application 304. Data associated with the memory access of event 4 may be stored in, e.g., a cache 310.
In event 7, the application 304 may access memory a second time. An address or page number associated with the second memory access may serve as an index to the page table 302 to facilitate a comparison or examination of the tx_disabled parameter for that memory access or page.
In event 8, the CPU core 306 may look up the PTE for the memory access of event 7 and determine that the tx_disabled parameter is set (e.g., equals one).
In event 9, the HTM resource 308 might not keep track of or log the memory access of event 7, which may have the effect of logging the access as non-transactional data (e.g., as a “regular” memory access). Data associated with the memory access of event 7 may be stored in the cache 310.
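Events 3 through 9 can be walked through with a small simulation. This is a sketch under stated assumptions: the `Core` class, its `htm_log` list, the dictionary used as a page table, and the 4096-byte page size are hypothetical stand-ins for the CPU core 306, the HTM resource 308, the page table 302, and the cache 310, and they do not describe actual hardware.

```python
PAGE_SIZE = 4096  # assumed page size

# Toy page table: virtual page number -> tx_disabled bit.
page_table = {
    0x10: False,  # shared page, tracked by HTM
    0x20: True,   # non-shared page, e.g. a thread stack
}


class Core:
    """Toy CPU core with an HTM log and a cache."""

    def __init__(self, page_table):
        self.page_table = page_table
        self.in_transaction = False
        self.htm_log = []  # addresses tracked as transactional data
        self.cache = {}    # address -> value, holds both kinds of data

    def begin(self):
        # Event 3: start a transaction.
        self.in_transaction = True
        self.htm_log.clear()

    def store(self, addr, value):
        # Events 4 and 7: a memory access inside the transaction.
        # Events 5 and 8: look up the PTE and read tx_disabled.
        tx_disabled = self.page_table[addr // PAGE_SIZE]
        if self.in_transaction and not tx_disabled:
            # Event 6: the access is logged as transactional data.
            self.htm_log.append(addr)
        # Event 9: with tx_disabled set, nothing is logged.
        self.cache[addr] = value


core = Core(page_table)
core.begin()
first = 0x10 * PAGE_SIZE + 8
second = 0x20 * PAGE_SIZE + 8
core.store(first, 1)
core.store(second, 2)
```

Both values reach the cache, but only the first address appears in the log, which mirrors the distinction between events 6 and 9.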
Thus, as described above, the state of the tx_disabled parameter, as potentially set by the PROT_DISABLE_TX argument, determines whether a given memory access is a transactional memory access or a regular memory access. The state or value of the PROT_DISABLE_TX argument may be determined in a number of ways. For example, a tool may guide a programmer based on a run-time instance profile. In some embodiments, an Application Programming Interface (API) may be used to provide the state or value of the PROT_DISABLE_TX argument. In some embodiments, a Java Virtual Machine (JVM) may manage a heap, and based on that management, may know whether to set (or clear) the state or value of the PROT_DISABLE_TX argument. In some embodiments, the PROT_DISABLE_TX argument may be based on an identification of a region of memory.
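One way a profiling tool might guide the choice is sketched below. The disclosure does not specify how such a tool works, so the trace format, the function name `suggest_non_transactional_pages`, and the single-thread heuristic are assumptions made purely for illustration.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size


def suggest_non_transactional_pages(trace):
    """Return pages touched by exactly one thread in a recorded trace.

    trace: iterable of (thread_id, address) pairs from a profiling run.
    The result lists candidates only; a single run cannot prove that a
    page is never shared.
    """
    threads_by_page = defaultdict(set)
    for thread_id, addr in trace:
        threads_by_page[addr // PAGE_SIZE].add(thread_id)
    return sorted(p for p, t in threads_by_page.items() if len(t) == 1)


trace = [
    (1, 0x1000), (2, 0x1008),  # page 1 is touched by both threads
    (1, 0x2000), (1, 0x2040),  # page 2 is private to thread 1
    (2, 0x3000),               # page 3 is private to thread 2
]
candidates = suggest_non_transactional_pages(trace)
```

A programmer would still need to confirm that each candidate page holds only non-shared data before requesting it with PROT_DISABLE_TX, because marking a page that is in fact shared would remove the conflict detection the transaction relies on.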
A number of memory accesses (e.g., memory accesses denoted by 404a and 404b) may be indicative of transactional memory accesses, which may be shared among multiple threads and might be inconsistent if the threads update the shared memory without mutual exclusion control (e.g., a lock). Similarly, a number of memory accesses (e.g., memory accesses denoted by 406a and 406b) may be indicative of regular or non-transactional memory accesses, which may be or include non-shared data and may be consistent if the threads update the memory without mutual exclusion control. The HTM might not need to keep track of such non-shared data, as accesses to such data might never cause the HTM to abort a transaction. Transactional memory accesses and non-transactional memory accesses may be performed during a transaction.
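The effect on HTM resource usage can be sketched with a capacity-limited model. The function `run_transaction`, the `HtmOverflow` exception, the capacity of four entries, and the page names are hypothetical; real HTM capacity depends on the hardware and is not stated in the disclosure.

```python
class HtmOverflow(Exception):
    """Raised when a transaction needs more tracking entries than exist."""


def run_transaction(accesses, capacity, tx_disabled_pages=frozenset()):
    """Return the number of HTM entries used by one transaction.

    accesses: iterable of (page, line) pairs touched inside the transaction.
    capacity: number of distinct lines the HTM can track (assumed value).
    tx_disabled_pages: pages whose tx_disabled bit is set.
    """
    tracked = set()
    for page, line in accesses:
        if page in tx_disabled_pages:
            continue  # regular access: consumes no HTM resources
        tracked.add((page, line))
        if len(tracked) > capacity:
            raise HtmOverflow(f"needed more than {capacity} entries")
    return len(tracked)


# Two shared lines plus six lines of a private stack page.
accesses = [("heap", 0), ("heap", 1)] + [("stack", i) for i in range(6)]

try:
    run_transaction(accesses, capacity=4)
    overflowed = False
except HtmOverflow:
    overflowed = True

used = run_transaction(accesses, capacity=4, tx_disabled_pages={"stack"})
```

With every access tracked, the fifth distinct line exceeds the assumed capacity and the model overflows. With the stack page excluded, the same access sequence uses only two entries.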
As reflected via
Comparing
In block 502, a non-transactional page may be allocated. The non-transactional page may be used to put or store non-shared data.
In block 504, a transactional property for the page of block 502 may be set. The transactional property may indicate that data in the page does not need tracking by HTM.
In block 506, in response to detecting an access to the page during a transaction, a determination may be made as to whether the transactional property for the page is set or not.
In block 508, in response to detecting that the transactional property for the page is set, data loaded from the page in a cache may be handled as non-transactional data.
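Blocks 502 through 508 can be modeled from the point of view of the cache. The sketch below assumes a per-line `transactional` mark, a 64-byte line, and a 4096-byte page; the `Machine` and `CacheLine` names are hypothetical and the model is not a description of any particular cache design.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size
LINE_SIZE = 64    # assumed cache line size


@dataclass
class CacheLine:
    tag: int
    transactional: bool  # True: line belongs to the transaction's tracked set


@dataclass
class Machine:
    tx_property: dict = field(default_factory=dict)  # page number -> bool
    cache: dict = field(default_factory=dict)        # line tag -> CacheLine
    in_transaction: bool = False

    def allocate_page(self, page, non_transactional=False):
        # Blocks 502 and 504: allocate the page and set its property.
        self.tx_property[page] = non_transactional

    def load(self, addr):
        # Block 506: on an access during a transaction, check the property.
        page = addr // PAGE_SIZE
        tracked = self.in_transaction and not self.tx_property[page]
        # Block 508: a set property means the line is non-transactional.
        tag = addr // LINE_SIZE
        line = CacheLine(tag=tag, transactional=tracked)
        self.cache[tag] = line
        return line


m = Machine()
m.allocate_page(1)                          # ordinary page
m.allocate_page(2, non_transactional=True)  # page for non-shared data
m.in_transaction = True
a = m.load(1 * PAGE_SIZE)
b = m.load(2 * PAGE_SIZE + 128)
```

Both loads fill a cache line, but only the line from the ordinary page carries the transactional mark.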
It will be appreciated that the events of the state diagram of
Aspects of the disclosure may be implemented independent of a specific instruction set (e.g., CPU instruction set architecture), operating system, or programming language. Aspects of the disclosure may be implemented in conjunction with non-transactional machine instructions. Aspects of the disclosure may be implemented in connection with thread-level speculation, which may be similar to HTM.
In some embodiments various functions or acts may take place at a given location and/or in connection with the operation of one or more apparatuses or systems. In some embodiments, a portion of a given function or act may be performed at a first device or location, and the remainder of the function or act may be performed at one or more additional devices or locations.
As will be appreciated by one skilled in the art, aspects of this disclosure may be embodied as a system, method or computer program product. Accordingly, aspects of the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the disclosure may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
In some embodiments, an apparatus or system may comprise at least one processor, and memory storing instructions that, when executed by the at least one processor, cause the apparatus or system to perform one or more methodological acts as described herein. In some embodiments, the memory may store data, such as one or more data structures, metadata, etc.
Embodiments of the disclosure may be tied to particular machines. For example, in some embodiments one or more devices may allocate or manage resources, such as HTM resources. In some embodiments, the one or more devices may include a computing device, such as a personal computer, a laptop computer, a mobile device (e.g., a smartphone), a server, etc.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present disclosure has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. The embodiments were chosen and described in order to best explain the principles of the disclosure and the practical application, and to enable others of ordinary skill in the art to understand the disclosure for various embodiments with various modifications as are suited to the particular use contemplated.
The diagrams depicted herein are illustrative. There may be many variations to the diagram or the steps (or operations) described therein without departing from the spirit of the disclosure. For instance, the steps may be performed in a differing order or steps may be added, deleted or modified. All of these variations are considered a part of the disclosure.
It will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow.
This application is a continuation of U.S. Non-Provisional application Ser. No. 13/563,967, entitled “NON-TRANSACTIONAL PAGE IN MEMORY”, filed Aug. 1, 2012, which is incorporated herein by reference in its entirety.
| | Number | Date | Country |
|---|---|---|---|
| Parent | 13563967 | Aug 2012 | US |
| Child | 13568434 | | US |