Embodiments described herein facilitate the use of software transactional memory in non-managed language environments and with legacy codes without requiring a software programmer to change the programming paradigm they are currently used to. Embodiments combine the benefits of transactional memory, such as simpler concurrency protocols, with the familiarity of traditional programming languages. Transactional memory has been shown to often provide significant performance advantages over traditional locking protocols, particularly when code complexity forces programmers to use coarse grain locking. Embodiments allow the straightforward conversion of legacy code to an equivalent transactional memory version that realizes any concurrency benefits that may exist.
Embodiments described herein combine the benefits of transactional memory (e.g. deadlock elimination, higher concurrency when compared to coarse grain locking) without the need to introduce new language constructs or complicated library calls into existing program code. Furthermore, when combined with automatic lock detection, embodiments can be used to convert legacy codes into transactional memory equivalents without the need to rewrite them. This conversion can provide performance benefits when the original locking discipline was coarse due to program complexity and can be turned off at no cost if its benefits do not outweigh its costs.
In an embodiment, a software transactional memory (STM) adapter system includes a version of a binary rewriting tool (for example the PIN binary instrumentation tool, available from Intel™ Corporation). The STM adapter system provides a simple-to-use application programming interface (API) for legacy languages (e.g., C and C++) that allows the programmer to simply mark the block of code to be executed atomically; the STM adapter system automatically transforms all the binary code executed within that block (including pre-compiled libraries) to execute atomically (that is, to execute as a transaction).
In an embodiment, the STM adapter system automatically transforms lock-based critical sections in existing binary code to atomic blocks, for example by replacing locks with begin and end markers that mark the beginning and end of transactions. In an embodiment, the markers are interpreted as function calls. This allows adaptation of legacy programs to transactional memory versions, even in cases in which the effort to change the source code would be too large, or where the source code is not accessible
Embodiments can also be used in managed languages that already provide an atomic language construct (e.g., the HPCS languages Fortress, Chapel, and X10, or research languages such as Transactional Java and CILK) but need to call out to native code inside a transaction.
In an embodiment, the benefits of transactional memory are evaluated dynamically, and the appropriate codepath (transactional memory or traditional locking) can be chosen based on runtime statistics.
The chipset 104 may include a memory control hub (not shown) and/or an I/O control hub (not shown). The chipset 104 may be one or more integrated circuit chips that act as a hub or core for data transfer between the CPU 102 and other components of the system 100. Further, the system 100 may include additional components (not shown) such as other processors (e.g., in a multi-processor system), one or more co-processors, as well as other components.
For the purposes of the present description, the term “processor” or “CPU” refers to any machine that is capable of executing a sequence of instructions and should be taken to include, but not be limited to, general purpose microprocessors, special purpose microprocessors, application specific integrated circuits (ASICs), multi-media controllers, digital signal processors, and micro-controllers, etc
The CPU 102, the chipset 104, and the other components, access memory devices 110 via chipset 104. The chipset 104, for example, with the use of a memory control hub, may service memory transactions that target memory devices 110.
Memory devices 110 may include any memory device adapted to store digital information, such as static random access memory (SRAM), dynamic random access memory (DRAM), synchronous dynamic random access memory (SDRAM), and/or double data rate (DDR) SDRAM or DRAM, etc. Thus, in one embodiment, memory devices 110 include volatile memory. Further, memory devices 110 can also include non-volatile memory such as read-only memory (ROM).
Moreover, memory devices 110 may further include other storage devices such as hard disk drives, floppy disk drives, optical disk drives, etc., and appropriate interfaces.
Further, system 100 may include suitable interfaces 106 to interface with I/O devices 108 such as disk drives, monitors, keypads, a modem, a printer, or any other type of suitable I/O devices.
System 100 may also include a network interface 114 to interface with a network 116 such as a local area network (LAN), a wide area network (WAN), the Internet, etc.
In an embodiment, system 100 includes multiple cores in CPU 102 for multi-threaded processing, or parallel processing. In an embodiment, memory devices 110 store a software transactional memory (STM) adapter tool 112 and STM adapter tool library 113 as further described below. STM adapter tool 112 adapts all types of software applications to operate in the parallel processing system 100 without the use of locks, regardless of whether the applications were originally written to support parallel processing.
If there is a memory access, it is then determined at 206 whether the memory access is a private memory access. If the memory access is a private memory access, the application code is executed as it is at 212. If the memory access is not a private memory access, STM bookkeeping code is inserted in the application at 208. STM bookkeeping code, in an embodiment, includes the code for saving state and other code for allowing error-free execution with other execution threads in a multi-core system.
It is then determined at 210 whether the transaction was successful and should be committed. If it is determined that the transaction should not be committed, the process returns to 202. In an embodiment, the process returns to the beginning of the same transaction region. If it is determined that the transaction should be committed, it is determined at 214 whether there are any conflicts. If there are no conflicts, the transaction is committed at 216. If there are conflicts, the process returns to 202 at the beginning of the transaction region.
When the beginning of the transaction region is encountered, control is transferred to the STM adapter tool (shown by a left arrow) to the application. At 308, an instruction is fetched from the STM adapter tool, decoded, and instrumented. At the end of the transaction 312, it is determined whether the transaction succeeded or failed. If the transaction failed, control is transferred to the application (310) at the beginning of the failed transaction. State is also restored. If the transaction succeeded, control is transferred to the application after the transaction, and the application executes natively at 314 until the next transaction is encountered.
In an embodiment, the instrumenting of the instruction (at 304) allows collection of performance data during execution of the program. If performance is not improved as desired by the adaptation process, the transaction markers can be removed on a transaction-by-transaction basis.
Aspects of the methods and systems described herein may be implemented as functionality programmed into any of a variety of circuitry, including programmable logic devices (“PLDs”), such as field programmable gate arrays (“FPGAs”), programmable array logic (“PAL”) devices, electrically programmable logic and memory devices and standard cell-based devices, as well as application specific integrated circuits. Implementations may also include microcontrollers with memory (such as EEPROM), embedded microprocessors, firmware, software, etc. Furthermore, aspects may be embodied in microprocessors having software-based circuit emulation, discrete logic (sequential and combinatorial), custom devices, fuzzy (neural) logic, quantum devices, and hybrids of any of the above device types. Of course the underlying device technologies may be provided in a variety of component types, e.g., metal-oxide semiconductor field-effect transistor (“MOSFET”) technologies like complementary metal-oxide semiconductor (“CMOS”), bipolar technologies like emitter-coupled logic (“ECL”), polymer technologies (e.g., silicon-conjugated polymer and metal-conjugated polymer-metal structures), mixed analog and digital, etc.
The term “processor” as generally used herein refers to any logic processing unit, such as one or more central processing units (“CPU”), digital signal processors (“DSP”), application-specific integrated circuits (“ASIC”), etc. While the term “component” is generally used herein, it is understood that “component” includes circuitry, components, modules, and/or any combination of circuitry, components, and/or modules as the terms are known in the art.
The various components and/or functions disclosed herein may be described using any number of combinations of hardware, firmware, and/or as data and/or instructions embodied in various machine-readable or computer-readable media, in terms of their behavioral, register transfer, logic component, and/or other characteristics. Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, non-volatile storage media in various forms (e.g., optical, magnetic or semiconductor storage media) and carrier waves that may be used to transfer such formatted data and/or instructions through wireless, optical, or wired signaling media or any combination thereof. Examples of transfers of such formatted data and/or instructions by carrier waves include, but are not limited to, transfers (uploads, downloads, e-mail, etc.) over the Internet and/or other computer networks via one or more data transfer protocols.
Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in a sense of “including, but not limited to.” Words using the singular or plural number also include the plural or singular number respectively. Additionally, the words “herein,” “hereunder,” “above,” “below,” and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word “or” is used in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list; all of the items in the list; and any combination of the items in the list.
The above description of illustrated embodiments is not intended to be exhaustive or limited by the disclosure. While specific embodiments of, and examples for, the systems and methods are described herein for illustrative purposes, various equivalent modifications are possible, as those skilled in the relevant art will recognize. The teachings provided herein may be applied to other systems and methods, and not only for the systems and methods described above. The elements and acts of the various embodiments described above may be combined to provide further embodiments. These and other changes may be made to methods and systems in light of the above detailed description.
In general, in the following claims, the terms used should not be construed to be limited to the specific embodiments disclosed in the specification and the claims, but should be construed to include all systems and methods that operate under the claims. Accordingly, the method and systems are not limited by the disclosure, but instead the scope is to be determined entirely by the claims. While certain aspects are presented below in certain claim forms, the inventors contemplate the various aspects in any number of claim forms. Accordingly, the inventors reserve the right to add additional claims after filing the application to pursue such additional claim forms for other aspects as well.