Computer programs may be written to allow different portions (e.g., threads) of the program to be executed concurrently. In order to execute different portions of the program concurrently, the computer system or the program typically includes some mechanism to manage the memory accesses of the different portions to ensure that the parts access common memory locations in the desired order.
Transactional memory systems allow programmers to designate transactions in a program that may be executed as if the transactions are executing in isolation (i.e., independently of other transactions and other sequences of instructions in the program). Transactional memory systems manage the memory accesses of transactions by executing the transactions in such a way that the effects of the transaction may be rolled back or undone if two or more transactions attempt to access the same memory location in a conflicting manner. Transactional memory systems may be implemented using hardware and/or software components.
Many software transactional memory (STM) systems allow programmers to include both transactional and non-transactional code in their programs. In order to be practically efficient and pay-for-play, STM systems may provide weak atomicity where no general guarantee is made for interaction between transactional and non-transactional code. However, some commonly used code idioms, such as privatization, may behave incorrectly in STM systems with weak atomicity if privatization safety is not provided. Privatization safety, however, may introduce at least some cost or overhead that may impact the parallel scalability of STM systems.
This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
A software transactional memory system is provided that provides privatization safety. The system identifies situations where the completion of a transaction may be expedited because a privatization artifact will not occur. The system determines whether a privatization artifact may occur using a read and write set intersection test, transactional variables, pessimistic locks, or declared privatizing transactions. If a privatization artifact will not occur for a transaction, then the system may allow the transaction to complete prior to one or more earlier transactions.
The accompanying drawings are included to provide a further understanding of embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and together with the description serve to explain principles of embodiments. Other embodiments and many of the intended advantages of embodiments will be readily appreciated as they become better understood by reference to the following detailed description. The elements of the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding similar parts.
In the following Detailed Description, reference is made to the accompanying drawings, which form a part hereof, and in which is shown by way of illustration specific embodiments in which the invention may be practiced. In this regard, directional terminology, such as “top,” “bottom,” “front,” “back,” “leading,” “trailing,” etc., is used with reference to the orientation of the Figure(s) being described. Because components of embodiments can be positioned in a number of different orientations, the directional terminology is used for purposes of illustration and is in no way limiting. It is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the present invention. The following detailed description, therefore, is not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
It is to be understood that the features of the various exemplary embodiments described herein may be combined with each other, unless specifically noted otherwise.
STM system 10 includes STM code 12, an STM library 14, and a runtime environment 16. STM system 10 is configured to manage the execution of STM transactions 20 that form atomic blocks in STM code 12 to allow transactions 20 to be executed atomically and, if desired, to roll back or undo changes made by transactions 20. To do so, STM system 10 tracks memory accesses by transactions 20 to objects 30 using a log 34 for each executing transaction 20.
STM code 12 includes a set of one or more transactions 20. Each transaction 20 includes a sequence of instructions that is designed to execute atomically, i.e., as if the sequence is executing in isolation from other transactional and non-transactional code in STM code 12. Each transaction 20 includes an atomic block designator 22 that indicates that a corresponding portion of STM code 12 is a transaction 20. Each transaction 20 also includes zero or more memory accesses 24 that read from and/or write to one or more objects 30 as indicated by arrows 32. Transactions 20 also include invocations 26 of STM primitives, which may be added by a compiler such as a compiler 92 shown in
STM library 14 includes STM primitives and instructions executable by the computer system in conjunction with runtime environment 16 to implement STM system 10. The STM primitives of STM library 14 that are callable by transactions 20 include management primitives that implement start, commit, abort, and retry functions in STM library 14. A transaction 20 calls the start function to initiate the management of the transaction 20 by STM library 14. A transaction 20 calls the commit function to finalize the results of the transaction 20 in memory system 104, if successful. A transaction 20 calls the abort function to roll back or undo the results of the transaction 20 in memory system 104. A transaction 20 calls the retry function to retry the transaction 20. In other embodiments, some or all of the functions performed by STM library 14 may be included in runtime environment 16 or added to transactions 20 by a compiler such as compiler 92 shown in
The STM primitives of STM library 14 that are callable by transactions 20 also include memory access primitives that manage accesses to objects 30 that are written and/or read by a transaction 20. The memory access primitives access a set of one or more transactional locks 42 for each object 30. In one embodiment, STM system 10 uses the object header of objects 30 to store the corresponding transactional locks 42. Each transactional lock 42 indicates whether a corresponding object 30 or portion of a corresponding object 30 is locked or unlocked for writing and/or reading. When an object 30 is locked for writing, the corresponding transactional lock 42 includes an address or other reference that locates an entry for the object 30 in a write log 34W in one embodiment. When an object 30 is not locked for writing, the corresponding transactional lock 42 includes a version number of the object 30.
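The two states of a transactional lock 42 described above may be illustrated with a minimal sketch. The class names, field names, and the rule of incrementing the version number on a committed release are assumptions made for illustration only and are not taken from STM library 14.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WriteLogEntry:
    """Hypothetical write-log entry: the object and its version when locked."""
    obj_id: int
    version_at_lock: int


@dataclass
class TransactionalLock:
    """Holds a version number while unlocked, or a reference to a
    write-log entry while locked for writing."""
    version: int = 0
    owner_entry: Optional[WriteLogEntry] = None

    def is_write_locked(self) -> bool:
        return self.owner_entry is not None

    def acquire_for_write(self, obj_id: int) -> WriteLogEntry:
        # A lock that is already held signals a memory access conflict.
        if self.is_write_locked():
            raise RuntimeError("conflict: object is already write-locked")
        entry = WriteLogEntry(obj_id, self.version)
        self.owner_entry = entry  # the lock now locates the log entry
        return entry

    def release(self, committed: bool) -> None:
        entry = self.owner_entry
        if entry is None:
            raise RuntimeError("lock is not held")
        # Assumption: a committed write publishes the next version number,
        # while a rolled-back write restores the version that was saved.
        self.version = entry.version_at_lock + (1 if committed else 0)
        self.owner_entry = None
```

Saving the version number in the log entry is what allows the lock to hold a reference instead of the version while the object is locked for writing.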
For each non-array object 30, the memory access primitives may access a single transactional lock 42 that locks or unlocks the non-array object 30 for writing and/or reading. For each array object 30, the memory access primitives may access a set of one or more transactional locks 42 where each transactional lock 42 in the set locks or unlocks a corresponding portion of the array object 30 for writing and/or reading. Runtime environment 16 creates and manages the transactional lock(s) 42 for each object 30.
The memory access primitives of STM library 14 generate and manage a set of one or more STM logs 34 for each transaction currently being executed. Each set of STM logs 34 includes a write log 34W and a read log 34R in one embodiment. Each write log 34W includes an entry for each object 30 that is written by a transaction 20 where each entry includes an address of a corresponding object 30, the version number from the transactional lock 42 of the corresponding object 30, and an address or other reference that locates a shadow copy of the corresponding object 30. Each read log 34R includes an entry for each object 30 that is read by a transaction 20 where each entry includes a reference that locates the transactional lock 42 of a corresponding object 30.
Runtime environment 16 may be any suitable combination of runtime libraries, a virtual machine (VM), and operating system (OS) functions, such as functions provided by an OS 122 shown in
STM library 14 performs the following algorithm, or variations thereof, to execute each transaction 20. Each time a transaction 20 is started by a thread of execution, STM library 14 creates and initializes variables used to manage the transaction. STM library 14 then allows the transaction 20 to execute and perform any write and/or read memory accesses to objects 30 as follows.
To access an object 30 for writing, the transaction 20 invokes a memory access primitive that opens the object 30 for writing. STM library 14 acquires a transactional lock 42 corresponding to the object 30 for the transaction 20 if the lock is available. If the object 30 is not available (i.e., the object 30 is locked by another transaction 20), then STM library 14 detects a memory access conflict between the current transaction 20 and the other transaction 20 and may roll back and re-execute the current transaction 20. If the object 30 is locked by the current transaction 20, then STM library 14 has already acquired the transactional lock 42 corresponding to the object 30 for the transaction 20. Once a corresponding transactional lock 42 is acquired, STM library 14 causes each write access 32 to be made to either the object 30 itself or a shadow copy of a corresponding object 30 (not shown) and causes an entry corresponding to the write access 32 to be stored in log 34W. For non-array objects 30, the shadow copy, if used, may be stored in log 34W. For array objects 30, a shared shadow copy, if used, may be stored separately from log 34W.
To access an object 30 for reading, the transaction 20 invokes a memory access primitive that opens the object 30 for reading. If the object 30 is not write locked and does not exceed the maximum number of pessimistic readers supported by the pessimistic read lock, STM library 14 causes an entry corresponding to the read access to be stored in read log 34R. If the read access is a pessimistic read access, STM library 14 also acquires a transactional lock 42 for the object 30. If the object 30 is locked by another transaction 20, then STM library 14 detects a memory access conflict between the current transaction 20 and the other transaction 20 and may roll back and re-execute the current transaction 20. If the object 30 is locked by the current transaction 20, then STM library 14 may cause an entry corresponding to the read access to be stored in read log 34R or set a flag corresponding to the object 30 in write log 34W to indicate that the object 30 was also read. STM library 14 causes a read access 32 that occurs before a designated object 30 has been opened for writing by the transaction 20 to be made directly from the corresponding object 30. STM library 14 causes each read access 32 that occurs after a designated object 30 has been opened for writing by a transaction 20 to be made from either the corresponding object 30 directly or the corresponding shadow copy.
Subsequent to performing the memory accesses, STM library 14 allows the transaction 20 to begin commit processing to ensure that the memory accesses by the transaction 20 did not conflict with the memory accesses by any other transaction 20. The commit processing may include validating the read accesses of the transaction 20, updating any objects 30 that were modified by the transaction 20 with the shadow copies used to store the modifications, and/or storing an updated version number in any objects 30 that were modified by the transaction 20. If STM library 14 detects any memory access conflicts between the current transaction 20 and another transaction 20 during the commit processing, STM library 14 may roll back and re-execute the current transaction 20. Subsequent to performing the commit processing, STM library 14 allows the transaction 20 to complete subject to a completion order of transactions 20 as described in additional detail below. After a transaction 20 completes, STM library 14 allows the thread that caused the transaction 20 to be executed to resume and execute additional transactional or non-transactional code.
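The write, read, and commit steps above may be sketched for a buffered write embodiment with optimistic reads as follows. This is a single-threaded illustration under stated assumptions, not the implementation of STM library 14: the names Obj, Txn, and Conflict are hypothetical, pessimistic reads and array objects are omitted, and a conflict is simply raised as an exception rather than triggering re-execution.

```python
class Conflict(Exception):
    """Raised when a memory access conflict is detected."""


class Obj:
    """Hypothetical shared object with a simplified transactional lock."""
    def __init__(self, value):
        self.value = value
        self.version = 0   # version number, meaningful while unlocked
        self.owner = None  # transaction holding the write lock, if any


class Txn:
    def __init__(self):
        self.read_log = {}   # object -> version seen when opened for reading
        self.write_log = {}  # object -> shadow copy of the object's value

    def write(self, obj, value):
        if obj.owner is None:
            obj.owner = self                 # acquire the lock for writing
        elif obj.owner is not self:
            raise Conflict("object is write-locked by another transaction")
        self.write_log[obj] = value          # buffer the write in the shadow copy

    def read(self, obj):
        if obj.owner is self:
            return self.write_log[obj]       # read our own shadow copy
        if obj.owner is not None:
            raise Conflict("object is write-locked by another transaction")
        self.read_log.setdefault(obj, obj.version)
        return obj.value                     # read directly from the object

    def abort(self):
        for obj in self.write_log:
            obj.owner = None                 # release locks, discard shadow copies
        self.write_log.clear()
        self.read_log.clear()

    def commit(self):
        # Validate the read accesses: each object read must be unchanged
        # and must not be write-locked by another transaction.
        for obj, seen in self.read_log.items():
            locked_by_other = obj.owner is not None and obj.owner is not self
            if locked_by_other or obj.version != seen:
                self.abort()
                raise Conflict("read validation failed")
        # Update modified objects from the shadow copies and store an
        # updated version number while releasing each lock.
        for obj, shadow in self.write_log.items():
            obj.value = shadow
            obj.version += 1
            obj.owner = None
        self.write_log.clear()
        self.read_log.clear()
```

For example, if one transaction reads an object and a second transaction then writes and commits that object, the first transaction fails validation at commit because the version number it recorded no longer matches.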
In one embodiment, STM system 10 provides weak atomicity between transactional code (i.e., transactions 20) and non-transactional code in STM code 12. With weak atomicity, STM system 10 does not provide any general guarantees for interactions between transactional and non-transactional code when STM code 12 is executed. Without any guarantees for interactions between transactional and non-transactional code, an incorrect code behavior that creates a privatization artifact may occur in weak atomicity STM systems that do not provide privatization safety.
STM code 12 may include a programming idiom known as privatization. With privatization, STM code 12 stores a reference to an object 30 in a shared data structure (not shown) and maintains a convention that the object 30 is only accessible via the reference stored in the shared data structure in one embodiment. If a thread of STM code 12 removes the object 30 from the shared data structure, the thread has exclusive access to the object 30 and can access and modify it without synchronizing the access or modifications with other threads. In another privatization embodiment, STM code 12 stores a global Boolean flag to indicate whether an object 30 is privatized or not. A thread of STM code 12 privatizes an object 30 by setting the flag and other threads detect that the object 30 is privatized by checking the flag. In a further privatization embodiment, STM code 12 stores a global reference to indicate whether an object 30 is privatized or not. A thread of STM code 12 privatizes an object 30 by copying the global reference and setting the global reference to null. Other threads detect that an object 30 is privatized if the global reference of the object 30 is null.
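The global reference variant of the privatization idiom may be illustrated as follows. In this sketch an ordinary mutex stands in for an atomic block, so the code shows only the programming convention and not the behavior of an STM system; the names shared, privatize, and transactional_increment are hypothetical.

```python
import threading

# Stand-in for an atomic block. This is an ordinary mutex, NOT an STM.
_atomic = threading.Lock()

# Global reference to the object; None means the object is privatized.
shared = {"node": [1, 2, 3]}


def privatize():
    """Copy the global reference and set the global reference to null."""
    with _atomic:
        node = shared["node"]
        shared["node"] = None
    return node  # the caller now has exclusive access by convention


def transactional_increment():
    """Another thread's transaction: touch the object only if not privatized."""
    with _atomic:
        node = shared["node"]
        if node is None:
            return False  # object was privatized, leave it alone
        node[0] += 1
        return True
```

After privatize() returns, the calling thread may modify the returned object without synchronization, because every other thread observes a null global reference and skips the object.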
If weakly atomic embodiments of STM system 10 did not provide privatization safety, a privatization artifact may occur as shown in the example of
STM system 10 may provide privatization safety by ensuring that privatizing transactions 20 (e.g., transaction 20(1)) wait (i.e., quiesce) until all concurrently executing transactions 20 in other threads (e.g., transaction 20(2)) that may be damaging transactions have completed before allowing the privatizing transactions 20 to complete. Damaging transactions are those transactions 20 that may perform a damaging write to a privatized object 30 and thereby create a privatization artifact. For example, in embodiments that use shadow copies for write accesses (i.e., buffered write embodiments), STM system 10 may serialize the commit processing of transactions 20 using a commit ticket and prevent any given transaction 20 from completing until all transactions 20 that begin commit processing prior to the given transaction 20 have completed. This quiescence implementation, however, may inhibit parallel scalability by causing threads that have completed a transaction 20 to wait for other transactions 20 to complete because of the possibility that the transaction 20 may have privatized an object 30 and one or more of the concurrently executing transactions 20 may be a damaging transaction 20. As another example, in embodiments that perform write accesses directly to objects 30 (i.e., in-place write embodiments), STM system 10 may serialize the commit processing of transactions 20 to prevent any given transaction 20 from completing until all transactions 20 that began executing prior to the given transaction 20 have completed.
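The commit ticket approach for buffered write embodiments may be sketched as a strict ticket order. The class and method names are hypothetical, and the sketch is single-threaded for clarity; a real implementation would take tickets atomically and have waiting threads poll or block.

```python
import itertools


class CommitTickets:
    """Strict completion order: transactions complete in ticket order."""

    def __init__(self):
        self._next_ticket = itertools.count()
        self.now_serving = 0  # ticket of the next transaction allowed to complete

    def begin_commit(self) -> int:
        # A transaction takes a ticket when it begins commit processing.
        return next(self._next_ticket)

    def can_complete(self, ticket: int) -> bool:
        # True only when every earlier ticket holder has completed.
        return ticket == self.now_serving

    def complete(self, ticket: int) -> None:
        if not self.can_complete(ticket):
            raise RuntimeError("earlier transactions have not completed")
        self.now_serving += 1
```

Under this scheme a transaction that finishes commit processing quickly still waits for every earlier ticket holder, which is the scalability cost noted above.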
To enhance parallel scalability, STM system 10 identifies situations where a privatization artifact will not occur between a given transaction 20 that is ready to complete and one or more earlier transactions 20 in a completion order. As used herein, the term earlier transaction in a completion order refers to a transaction 20 that is scheduled by STM system 10 to complete prior to a given transaction 20. Likewise, the term later transaction in a completion order refers to a transaction 20 that is scheduled by STM system 10 to complete subsequent to a given transaction 20. In situations where STM system 10 can ensure that a privatization artifact will not occur for a given transaction 20, STM system 10 allows the given transaction 20 to complete prior to one or more earlier transactions 20 in the completion order.
STM system 10 effectively alters the completion order of transactions 20 in situations where a privatization artifact will not occur between a transaction 20 that is ready to complete and one or more earlier transactions 20. In buffered write embodiments, STM system 10 may determine an initial completion order of transactions 20 based on the order that the transactions 20 began commit processing. In in-place write embodiments, STM system 10 may determine an initial completion order of transactions 20 based on the order that the transactions 20 began executing. In other embodiments, STM system 10 may determine an initial completion order of transactions 20 based on other criteria.
STM system 10 uses the initial completion order as an actual completion order of transactions 20 unless STM system 10 can determine that a privatization artifact will not occur. If STM system 10 makes such a determination between a current transaction 20 and one or more earlier transactions 20, then STM system 10 implements a completion order that differs from the initial completion order by moving the current transaction 20 before the one or more earlier transactions 20 that, in conjunction with the current transaction 20, will not produce a privatization artifact.
If the current transaction 20 is ready to complete, then STM library 14 determines whether all earlier transactions 20 in the completion order have completed as indicated in a block 64. If all earlier transactions 20 in the completion order have completed, then STM library 14 allows the current transaction 20 to complete as indicated in a block 66. Because all earlier transactions 20 have completed in this case, the completion of the current transaction 20 will not cause a privatization artifact as long as no later transaction 20 (i.e., a transaction 20 that is scheduled to complete subsequent to the current transaction 20 in the completion order) that privatizes an object 30 is allowed to complete prior to the current transaction 20.
If one or more earlier transactions 20 in the completion order have not completed, then STM library 14 determines whether a privatization artifact may occur as indicated in a block 68. If STM library 14 ensures that a privatization artifact will not occur between the current transaction 20 and the earlier transaction or transactions 20, then STM library 14 allows the current transaction 20 to complete as indicated in block 66. By doing so, STM library 14 expedites the completion of the current transaction 20 by allowing the current transaction 20 to complete prior to one or more earlier transactions 20 in the completion order.
If STM library 14 cannot conclusively ensure that a privatization artifact will not occur between the current transaction 20 and the earlier transaction or transactions 20, then STM library 14 assumes that a privatization artifact may occur and repeats the functions of blocks 64 and 68 until all earlier transactions 20 have completed as determined in block 64 or until STM library 14 can conclusively determine that a privatization artifact will not occur between the current transaction 20 and the earlier transactions 20 that have not completed as determined in block 68.
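The decision made in blocks 64, 68, and 66 may be sketched as a single check that a waiting transaction repeats until it succeeds. The names below are hypothetical. The default predicate is only one assumed form of a read and write set intersection test, namely that no artifact can occur when the write set of the current transaction is disjoint from the read set of an earlier transaction; other tests can be supplied in its place.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction summary: the locations it read and wrote."""
    name: str
    read_set: FrozenSet[str] = frozenset()
    write_set: FrozenSet[str] = frozenset()


def no_artifact_by_intersection(current: Txn, earlier: Txn) -> bool:
    # Assumed test: if the current transaction wrote nothing that the
    # earlier transaction read, it cannot have privatized an object that
    # the earlier transaction still believes to be shared.
    return current.write_set.isdisjoint(earlier.read_set)


def may_complete_now(
    current: Txn,
    earlier_incomplete: Iterable[Txn],
    no_artifact: Callable[[Txn, Txn], bool] = no_artifact_by_intersection,
) -> bool:
    earlier = list(earlier_incomplete)
    if not earlier:
        return True  # block 64: all earlier transactions have completed
    # Block 68: complete early only if an artifact is ruled out for
    # every earlier transaction that has not yet completed.
    return all(no_artifact(current, e) for e in earlier)
```

A caller would invoke may_complete_now repeatedly, refreshing the list of incomplete earlier transactions each time, and complete the current transaction (block 66) once it returns True.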
STM library 14 determines whether a privatization artifact may occur using one or more of the embodiments described with reference to
In the embodiment of
In example of
Because transaction 20(n+2) is not at the top of queue 70 in the example of
By reordering queue 70 in the embodiment of
The problematic privatization case involves a concurrent modification to a privatized object 30 by a damaging transaction while the object 30 is being accessed by non-transactional code that follows a privatizing transaction 20 in a privatizing thread. Because a TVAR can only be accessed by transactions 20, any privatized object 30 that is accessed by non-transactional code cannot be a TVAR. Thus, no TVAR transaction 20 can be a damaging transaction.
Like the embodiment of
In one embodiment with TVAR transactions 20 shown in
In another embodiment with TVAR transactions 20 shown in
STM library 14 allows a transaction 20 to complete only when the completion number 82 of the transaction 20 is equal to completed counter 81. Any TVAR transactions 20 that begin commit processing after a given non-TVAR transaction 20 but before a next non-TVAR transaction 20 will be assigned the same completion number 82 by STM library 14. As a result, these TVAR transactions 20 can complete in any order when they are ready once completed counter 81 becomes equal to the completion number 82 of the TVAR transactions 20. In addition, the next non-TVAR transaction 20 will be assigned the same completion number 82 as any TVAR transactions 20 that begin commit processing after the previous non-TVAR transaction 20. Thus, the next non-TVAR transaction 20 can complete in any order with any TVAR transactions 20 that begin commit processing after the previous non-TVAR transaction 20.
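The sharing of completion numbers 82 described above is consistent with an assignment rule in which a non-TVAR transaction takes the current value of a counter and then increments it, while a TVAR transaction takes the current value without incrementing it. The sketch below illustrates only that assumed assignment rule and the equality check; the names are hypothetical, and how completed counter 81 is advanced is deliberately not modeled.

```python
class CompletionNumbers:
    """Assumed rule for assigning completion numbers at the start of commit."""

    def __init__(self):
        self.next_number = 0

    def assign(self, is_tvar_txn: bool) -> int:
        number = self.next_number
        if not is_tvar_txn:
            # Only non-TVAR transactions advance the number, so TVAR
            # transactions that follow share a number with the next
            # non-TVAR transaction.
            self.next_number += 1
        return number


def may_complete(completion_number: int, completed_counter: int) -> bool:
    # A transaction completes only when its number equals the counter.
    return completion_number == completed_counter
```

For the commit sequence non-TVAR, TVAR, TVAR, non-TVAR, TVAR, this rule yields the completion numbers 0, 1, 1, 1, 2, so the two TVAR transactions and the second non-TVAR transaction become eligible together once the completed counter reaches 1.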
With the embodiment of
In one embodiment with pessimistic transactions 20 shown in
In another embodiment with pessimistic transactions 20 shown in
STM library 14 allows a transaction 20 to complete only when the completion number 87 of the transaction 20 is equal to completed counter 86. STM library 14 allows any later transaction 20 to complete prior to any earlier pessimistic transaction 20 but prevents all later pessimistic and non-pessimistic transactions 20 from completing prior to any earlier non-pessimistic transaction 20. As a result, STM library 14 may implement a completion order that differs from an initial completion order that may otherwise be determined from the order that transactions 20 began commit processing or from other suitable completion order criteria while ensuring that privatization artifacts will not occur.
The embodiments of
In other embodiments, the techniques of
With the embodiment of
As another example using the embodiment of
In the embodiment of
By reordering queue 88 in the embodiment of
Compiler system 90 represents a compile mode of operation in a computer system, such as computer system 100 shown in
Code 94 includes a set of one or more STM transactions 20. Each STM transaction 20 includes an atomic block designator 22 that indicates to compiler 92 that a corresponding portion of code 94 is an STM transaction 20. Each STM transaction 20 may include zero or more memory accesses 24 that read from and/or write to an object 30. Code 94 may be any suitable source code written in a language such as Java or C# or any suitable bytecode such as Common Intermediate Language (CIL), Microsoft Intermediate Language (MSIL), or Java bytecode.
Compiler 92 accesses or otherwise receives code 94 with transactions 20 that include memory accesses 24. Compiler 92 identifies memory accesses 24 and compiles code 94 into STM code 12 with invocations 26 of STM array object primitives in STM library 14 for each memory access 24. Compiler 92 performs any desired conversion of the set of instructions of code 94 into a set of instructions that are executable by a designated computer system and includes the set of instructions in STM code 12.
Computer system 100 includes one or more processor packages 102, memory system 104, zero or more input/output devices 106, zero or more display devices 108, zero or more peripheral devices 110, and zero or more network devices 112. Processor packages 102, memory system 104, input/output devices 106, display devices 108, peripheral devices 110, and network devices 112 communicate using a set of interconnections 114 that includes any suitable type, number, and configuration of controllers, buses, interfaces, and/or other wired or wireless connections.
Computer system 100 represents any suitable processing device configured for a general purpose or a specific purpose. Examples of computer system 100 include a server, a personal computer, a laptop computer, a tablet computer, a personal digital assistant (PDA), a mobile telephone, and an audio/video device. The components of computer system 100 (i.e., processor packages 102, memory system 104, input/output devices 106, display devices 108, peripheral devices 110, network devices 112, and interconnections 114) may be contained in a common housing (not shown) or in any suitable number of separate housings (not shown).
Processor packages 102 each include one or more execution cores. Each execution core is configured to access and execute instructions stored in memory system 104. The instructions may include a basic input output system (BIOS) or firmware (not shown), OS 122, STM code 12, STM library 14, runtime environment 16, compiler 92, and code 94. Each execution core may execute the instructions in conjunction with or in response to information received from input/output devices 106, display devices 108, peripheral devices 110, and/or network devices 112.
Computer system 100 boots and executes OS 122. OS 122 includes instructions executable by execution cores to manage the components of computer system 100 and provide a set of functions that allow programs to access and use the components. OS 122 executes runtime environment 16 to allow STM code 12 and STM library 14 to be executed. In one embodiment, OS 122 is the Windows operating system. In other embodiments, OS 122 is another operating system suitable for use with computer system 100.
Computer system 100 executes compiler 92 to generate STM code 12 from code 94. Compiler 92 accesses or otherwise receives code 94 and transforms code 94 into STM code 12 for execution by computer system 100. Compiler 92 performs any desired conversion of the set of instructions of code 94 into a set of instructions that are executable by computer system 100 and includes the set of instructions in STM code 12. Compiler 92 also identifies transactions 20 in code 94 from atomic block designators 22 and modifies transactions 20 in STM code 12 to include invocations 26 of STM primitives.
In one embodiment, compiler 92 includes a just-in-time (JIT) compiler that operates in computer system 100 in conjunction with OS 122, runtime environment 16, and STM library 14. In another embodiment, compiler 92 includes a stand-alone compiler that produces STM code 12 for execution on computer system 100 or another computer system (not shown).
Computer system 100 executes runtime environment 16 and STM library 14 to allow STM code 12, and transactions 20 therein, to be executed in computer system 100 as described above.
Memory system 104 includes any suitable type, number, and configuration of volatile or non-volatile storage devices configured to store instructions and data. The storage devices of memory system 104 represent computer readable storage media that store computer-executable instructions including STM code 12, STM library 14, runtime environment 16, OS 122, compiler 92, and code 94. The instructions are executable by computer system 100 to perform the functions and methods of STM code 12, STM library 14, runtime environment 16, OS 122, compiler 92, and code 94 as described herein. Memory system 104 stores instructions and data received from processor packages 102, input/output devices 106, display devices 108, peripheral devices 110, and network devices 112. Memory system 104 provides stored instructions and data to processor packages 102, input/output devices 106, display devices 108, peripheral devices 110, and network devices 112. Examples of storage devices in memory system 104 include hard disk drives, random access memory (RAM), read only memory (ROM), flash memory drives and cards, and magnetic and optical disks.
Input/output devices 106 include any suitable type, number, and configuration of input/output devices configured to input instructions or data from a user to computer system 100 and output instructions or data from computer system 100 to the user. Examples of input/output devices 106 include a keyboard, a mouse, a touchpad, a touchscreen, buttons, dials, knobs, and switches.
Display devices 108 include any suitable type, number, and configuration of display devices configured to output textual and/or graphical information to a user of computer system 100. Examples of display devices 108 include a monitor, a display screen, and a projector.
Peripheral devices 110 include any suitable type, number, and configuration of peripheral devices configured to operate with one or more other components in computer system 100 to perform general or specific processing functions.
Network devices 112 include any suitable type, number, and configuration of network devices configured to allow computer system 100 to communicate across one or more networks (not shown). Network devices 112 may operate according to any suitable networking protocol and/or configuration to allow information to be transmitted by computer system 100 to a network or received by computer system 100 from a network.
Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that a variety of alternate and/or equivalent implementations may be substituted for the specific embodiments shown and described without departing from the scope of the present invention. This application is intended to cover any adaptations or variations of the specific embodiments discussed herein. Therefore, it is intended that this invention be limited only by the claims and the equivalents thereof.