A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
1. Technical Field
The claimed subject matter relates generally to computer systems and software programs, and more specifically to locks in software loops.
2. Description of the Related Art
In some software programs, it is necessary to acquire a lock on an object in order to perform certain operations that are dependent on the object being in a locked state. Subsequent to performing the operation(s), the lock is released. In some cases, such a lock is found within a software loop, and so lock acquisition on, and release of, an object may occur repeatedly, with each iteration of the loop. While placing the lock in the loop may be convenient and safe, repeated and unnecessary lock acquisition and release may significantly degrade the performance of the software program.
An improved method and system for acquiring and releasing locks within a software program is therefore desirable.
The present invention provides an improved method and system for acquisition and release of locks within a software program. In an exemplary embodiment, a lock within a software loop is transformed by relocating acquisition and release instructions from within the loop to positions outside the loop. In the present discussion, this process is sometimes referred to as a “lock coarsening transformation”. This transformation may significantly decrease unnecessary lock acquisition and release during execution of the software program. In order to avoid contention problems which may arise from acquiring and keeping a lock on an object over a relatively long period of time, a contention test may be inserted into the loop. Such a contention test may temporarily release the lock if another thread in the software program requires access to the locked object.
In an embodiment, in addition to the coarsening transformation, a loop may be transformed into a “strip-mine” configuration. Typically, a strip-mine configuration includes an inner loop and an outer loop, and the inner loop may be executed in “strip-lengths” of “S” iterations. The outer loop may now contain the lock acquisition and release instructions, which may also be executed every S iterations.
In an embodiment, the value of S may be dynamically adjusted based on the level of contention seen at the contention test.
In an aspect of the invention, there is provided a method of transforming a lock on an object in a loop of a computer program, said lock having a pair of lock and unlock operations applicable on said object at original points within said loop, said method comprising:
In an embodiment, in (ii), said contention test is inserted at the original point of said lock operation.
In an embodiment, said loop has N iterations, and said method further comprises:
In an embodiment, said method further comprises:
In an embodiment, (iv) comprises incrementally shrinking S where said contention test indicates contention.
In an embodiment, (iv) comprises incrementally growing S where said contention test indicates lack of contention.
In an embodiment, (iv) comprises resetting S to a predetermined minimum value where said contention test indicates contention.
In an embodiment, (iv) comprises growing S by a multiplicative value, to a maximum value of N, where said contention test indicates lack of contention.
In an embodiment, in (ii) said contention test is provided in each branch of a loop, such that said contention test is performed regardless of the branch of the loop accessed.
In another aspect of the invention, there is provided a system for transforming a lock on an object in a loop of a computer program, said lock having a pair of lock and unlock operations applicable on said object at original points within said loop, comprising:
In an embodiment, in (b), said contention test is placed at the original point of said lock operation.
In an embodiment, said loop has N iterations, and said system further comprises:
In an embodiment, said system further comprises:
In an embodiment, (d) comprises means for incrementally shrinking S where said contention test indicates contention.
In an embodiment, (d) comprises means for incrementally growing S where said contention test indicates lack of contention.
In an embodiment, (d) comprises means for resetting S to a predetermined minimum value where said contention test indicates contention.
In an embodiment, (d) comprises means for growing S by a multiplicative value, to a maximum value of N, where said contention test indicates lack of contention.
In an embodiment, in (b) said contention test is provided in each branch of a loop, such that said contention test is performed regardless of the branch of the loop accessed.
In another aspect of the invention, there is provided a system comprising a processor and computer readable memory, said memory storing code for transforming a lock on an object in a loop of a computer program, said lock having a pair of lock and unlock operations applicable on said object at original points within said loop, said code adapting said system to:
In another aspect of the invention, there is provided a computer readable medium having computer readable program code embedded in the medium for transforming a lock on an object in a loop of a computer program, said lock having a pair of lock and unlock operations applicable on said object at original points within said loop, the computer readable program code including:
In an embodiment, in (b), said code is configured to insert said contention test at the original point of said lock operation.
In an embodiment, said loop has N iterations, and said computer readable program code further comprises:
In an embodiment, said computer readable program code further comprises:
In an embodiment, (d) comprises code for incrementally shrinking S where said contention test indicates contention.
In an embodiment, (d) comprises code for incrementally growing S where said contention test indicates lack of contention.
In an embodiment, (d) comprises code for resetting S to a predetermined minimum value where said contention test indicates contention.
In an embodiment, (d) comprises code for growing S by a multiplicative value, to a maximum value of N, where said contention test indicates lack of contention.
In an embodiment, in (b) said code is configured to provide a contention test in each branch of a loop, such that said contention test is performed regardless of the branch of the loop accessed.
The foregoing and other aspects of the invention will be apparent from the following more particular descriptions of exemplary embodiments of the invention.
In the figures which illustrate exemplary embodiments of the invention:
a is the code of
b shows illustrative alternatives for shrinking and growing the strip-length S in the code of
Referring to
As shown, method 200 begins and first proceeds to decision block 202 at which method 200 determines whether a coarsening transformation of the subject lock is “legal”. In an embodiment, this determination at block 202 may comprise a series of tests performed by examining and analyzing the subject software program code (e.g. software program 103 of
If a lock coarsening transformation is determined to be legal at decision block 202, method 200 then proceeds to block 204 where the lock coarsening transformation is applied to the subject software program code. If not, method 200 ends. An illustrative example of this coarsening transformation is provided further below with reference to
Upon applying the lock coarsening transformation at block 204, a transformed version of the code is obtained. Method 200 then proceeds to block 206, at which one or more contention tests may be inserted into the transformed code. As will be explained, the contention tests ensure that a lock on an object is not held unduly when another thread in the same software program requires the locked object. An illustrative example is provided further below with reference to
Method 200 then proceeds to block 208 at which method 200 determines whether it is possible to further transform the code into a strip-mine configuration. If so, method 200 proceeds to block 210 at which the code is transformed into a strip-mine configuration with an inner loop having a strip-length S. If not, method 200 ends. An illustrative example of transformation to a strip-mine configuration is provided further below with reference to
From block 210, method 200 finally proceeds to block 212 at which method 200 may further transform the code by adding the ability to dynamically adjust strip-length S. Method 200 then ends. An illustrative example of code to dynamically adjust S is provided further below with reference to
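The sequence of decisions in method 200 can be summarized in a short C sketch. This is an illustration only: the `loop_info` structure, the flag names and `plan_transformations` are hypothetical and are not part of the described embodiment, and the optional steps at blocks 206 and 212 are assumed here to be always applied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what an analysis concluded about one loop. */
struct loop_info {
    bool coarsening_legal;     /* outcome of decision block 202 */
    bool strip_mine_possible;  /* outcome of decision block 208 */
};

/* Bit flags recording which transformations were applied. */
enum {
    T_COARSENED       = 1 << 0,  /* block 204 */
    T_CONTENTION_TEST = 1 << 1,  /* block 206 */
    T_STRIP_MINED     = 1 << 2,  /* block 210 */
    T_DYNAMIC_S       = 1 << 3   /* block 212 */
};

/* Walk the decision sequence and report which steps would be applied. */
unsigned plan_transformations(const struct loop_info *info)
{
    unsigned applied = 0;

    assert(info != NULL);
    if (!info->coarsening_legal)
        return applied;              /* method ends at block 202 */
    applied |= T_COARSENED;
    applied |= T_CONTENTION_TEST;    /* assumed always inserted in this sketch */
    if (!info->strip_mine_possible)
        return applied;              /* method ends at block 208 */
    applied |= T_STRIP_MINED;
    applied |= T_DYNAMIC_S;          /* assumed always added in this sketch */
    return applied;
}
```

A loop for which coarsening is not legal yields no transformations at all, regardless of whether strip-mining would otherwise be possible.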
A more specific description of the transformations in method 200 is now provided.
With reference to the determination at decision block 202 as to whether a lock coarsening transformation is legal, there are a variety of restrictions affecting when it is possible and correct to move a pair of lock and unlock operations from inside a loop. For example, a lock or unlock operation cannot be moved above or below certain types of software instructions because the semantics or correctness of the program might be altered.
Note that the lock-coarsening transformation may be affected by the programming language used. In the illustrative transformation examples provided further below, the “C” language is used, as it is widely understood. However, it is necessary to take care in adapting the transformations to other languages, such as Java, where there are memory coherence semantics associated with acquiring and releasing a lock. In such languages as Java, an instruction that locks or unlocks an object cannot simply be moved from one program location to another. Instead, the instruction must be divided into two parts: one which acquires or releases a lock, and another which accomplishes the memory coherence semantics. Only the first part of the instruction can be moved. Thus, in the present description, it will be appreciated that reference to “moving” of a lock or unlock operation applies to moving only the acquire or release aspect of the operation, and not the memory coherence aspect of the operation.
Generally, before the above described coarsening transformation can be applied, it must first be established that the lock acquisition operation can be moved to before the loop, and that the lock release or unlock operation can be moved to after the loop. Thus, a primary restriction on whether the lock and unlock operations can be moved outside the loop is that the lock and unlock operations must apply to the same object for the entire duration of the loop. Accordingly, if a different object can be locked or unlocked on different iterations of the loop, then the coarsening transformation at block 204 cannot be applied. In this case, method 200 simply ends, as shown in
If the same object is always locked and unlocked, then the software program code may be further examined above and below the original lock and unlock operations, respectively, for instructions that fall into one of the following four cases that prohibit moving the lock and unlock operations out of the loop:
Illustrative examples of the transformations described above are now provided with reference to
First consider the illustrative software program code 300 shown in
For the purposes of the present illustrative example, it is also assumed that the instructions above and below the original lock and unlock operations (lines 305 and 307) in code 300 do not fall into any one of the four previously enumerated cases which may prohibit moving the lock and unlock operations (lines 305 and 307) out of the loop 302-313.
Referring to
It will be apparent from this illustrative example that the “lock (L)” operation is now performed just once, before commencement of the loop 404-413, and the “unlock (L)” operation is performed just once, after completion of the loop 404-413. Thus, the lock has been coarsened in the sense that the number of times it is acquired and released has been substantially reduced.
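The reduction in lock traffic can be made concrete with a small single-threaded C sketch. It does not reproduce the figures; `counted_lock`, `lock_acquire` and `lock_release` are hypothetical stand-ins for “lock (L)” and “unlock (L)” that merely count operations, and the loop body is a placeholder for the locked code.

```c
#include <assert.h>

/* Hypothetical stand-in for a real lock: it only counts operations. */
struct counted_lock {
    int held;          /* 1 while the lock is held */
    int acquisitions;  /* number of acquire operations */
    int releases;      /* number of release operations */
};

static void lock_acquire(struct counted_lock *l)
{
    assert(!l->held);  /* the sketch models a non-recursive lock */
    l->held = 1;
    l->acquisitions++;
}

static void lock_release(struct counted_lock *l)
{
    assert(l->held);
    l->held = 0;
    l->releases++;
}

/* Placeholder for work that requires the lock to be held. */
static void locked_code(const struct counted_lock *l, long *sum, int i)
{
    assert(l->held);
    *sum += i;
}

/* Original shape: acquire and release on every iteration. */
long run_original(struct counted_lock *l, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++) {
        lock_acquire(l);
        locked_code(l, &sum, i);
        lock_release(l);
    }
    return sum;
}

/* Coarsened shape: acquire once before the loop, release once after it. */
long run_coarsened(struct counted_lock *l, int n)
{
    long sum = 0;
    lock_acquire(l);
    for (int i = 0; i < n; i++)
        locked_code(l, &sum, i);
    lock_release(l);
    return sum;
}
```

For a loop of 100 iterations, the original shape performs 100 acquisitions and the coarsened shape performs one, while both compute the same result.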
Now referring to
The actual code sequence generated for the “CONTENDED (L)” operation in code 500 may depend upon the source language and runtime environment. In this illustrative example, note that the contention test is performed at the same point as the original “lock (L)” operation (i.e. line 305 in
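A minimal C sketch of a coarsened loop with a contention test follows. Because the real “CONTENDED (L)” sequence depends on the source language and runtime, the sketch simulates it: `sim_lock`, its `waiters` field and the `arrivals` array (the number of other threads that begin waiting at each iteration) are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical simulated lock; waiters models threads blocked on it. */
struct sim_lock {
    int held;
    int waiters;
    int acquisitions;
};

static void lock_acquire(struct sim_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void lock_release(struct sim_lock *l)
{
    assert(l->held);
    l->held = 0;
    l->waiters = 0;  /* simulation: waiting threads get their turn */
}

/* Stand-in for the CONTENDED (L) test. */
static int contended(const struct sim_lock *l)
{
    return l->waiters > 0;
}

long run_with_contention_test(struct sim_lock *l, const int *arrivals, int n)
{
    long sum = 0;

    lock_acquire(l);
    for (int i = 0; i < n; i++) {
        l->waiters += arrivals[i];  /* simulation input only */
        /* Contention test at the point of the original lock operation. */
        if (contended(l)) {
            lock_release(l);        /* temporarily give up the lock */
            lock_acquire(l);
        }
        assert(l->held);
        sum += i;                   /* placeholder for the locked code */
    }
    lock_release(l);
    return sum;
}
```

With no waiting threads, the lock is acquired exactly once; each iteration at which contention is observed adds one release and re-acquisition.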
To further improve performance, in some cases, the code 500 shown in
As shown in
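One plausible shape for a strip-mine configuration is sketched below in C. It is not a reproduction of code 600: the simulated `sim_lock`, the `arrivals` input and the `tests_out` counter are hypothetical, and the sketch assumes that the contention test runs in the outer loop, once per strip of at most S iterations.

```c
#include <assert.h>

/* Hypothetical simulated lock; waiters models threads blocked on it. */
struct sim_lock {
    int held;
    int waiters;
    int acquisitions;
};

static void lock_acquire(struct sim_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void lock_release(struct sim_lock *l)
{
    assert(l->held);
    l->held = 0;
    l->waiters = 0;  /* simulation: waiting threads get their turn */
}

static int contended(const struct sim_lock *l) { return l->waiters > 0; }

static int min_int(int a, int b) { return a < b ? a : b; }

long run_strip_mined(struct sim_lock *l, const int *arrivals,
                     int n, int s, int *tests_out)
{
    long sum = 0;
    int i = 0;

    assert(s >= 1);
    *tests_out = 0;
    lock_acquire(l);
    while (i < n) {                            /* outer loop */
        int strip_length = min_int(n - i, s);  /* last strip may be short */
        for (int j = 0; j < strip_length; j++, i++) {  /* inner loop */
            l->waiters += arrivals[i];         /* simulation input only */
            sum += i;                          /* placeholder locked code */
        }
        (*tests_out)++;                        /* one test per strip */
        if (contended(l)) {
            lock_release(l);
            lock_acquire(l);
        }
    }
    lock_release(l);
    return sum;
}
```

For N = 10 and S = 4, the inner loop runs in strips of 4, 4 and 2 iterations, so the contention test executes three times rather than ten.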
It will be appreciated that selection of an initial value for “S” may sometimes be difficult. Consequently, rather than assigning a constant value to “S” as shown in code 600, in an alternative embodiment, the value of “S” may be dynamically adjusted. An illustrative example is shown in
In
In this illustrative example, the strip length “S” is stored between invocations of “method_with_loop ( )” (line 701) in the “method_with_loop_S” variable (line 727) so that the code 700A does not have to repeatedly learn an appropriate value for “S”.
The initial constant value stored in “method_with_loop_S” (line 702) depends on the expected likelihood of contention. For example, a reasonable initial choice for “S” might be any of 2-4 iterations, depending on the amount of code inside the inner loop 708-717. This range of “S” may provide a head start on reducing the number of contention checks, but the value of “S” can be quickly reduced, for example to 1, if the contention level is found to be high.
The “SHRINK (S)” and “GROW (S)” operations are expected to be short inlined code sequences that adjust the value of “S” to take into account the degree of contention experienced by the loop. Thus, the value of “S” may be shrunk after each instance of contention which indicates that other threads in the software program require access to the locked object. Correspondingly, the “GROW (S)” operation shown at line 723 is positioned to execute after each “S” iterations of the inner loop 708-717. In the present example, the value of “S” grows, unless there is an instance of contention which causes the value of “S” to shrink. Also, as will be noted, virtually any integer value of “S” may result since the code “int strip_length = min (N−i, S)” at line 707 accommodates any remainder after the total number of iterations “N” is notionally divided by “S”.
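The following C sketch illustrates a strip-mined loop whose strip length is adjusted dynamically and remembered between invocations. It is a sketch under stated assumptions, not the code of the figure: the lock is simulated, `arrivals` is a hypothetical input describing when other threads start waiting, the initial value of 4 is one arbitrary choice, and the shrink and grow steps are assumed to be simple decrements and increments.

```c
#include <assert.h>

#define S_MIN 1  /* assumed lower bound for the strip length */

/* Hypothetical simulated lock; waiters models threads blocked on it. */
struct sim_lock {
    int held;
    int waiters;
    int acquisitions;
};

static void lock_acquire(struct sim_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void lock_release(struct sim_lock *l)
{
    assert(l->held);
    l->held = 0;
    l->waiters = 0;  /* simulation: waiting threads get their turn */
}

static int contended(const struct sim_lock *l) { return l->waiters > 0; }
static int min_int(int a, int b) { return a < b ? a : b; }

/* Strip length kept between invocations; 4 is an assumed starting value. */
int method_with_loop_S = 4;

static void shrink(int *s)      { if (*s > S_MIN) (*s)--; }
static void grow(int *s, int n) { if (*s < n) (*s)++; }

long method_with_loop(struct sim_lock *l, const int *arrivals, int n)
{
    long sum = 0;
    int s = method_with_loop_S;  /* start from the remembered value */
    int i = 0;

    lock_acquire(l);
    while (i < n) {
        int strip_length = min_int(n - i, s);
        for (int j = 0; j < strip_length; j++, i++) {
            l->waiters += arrivals[i];  /* simulation input only */
            sum += i;                   /* placeholder locked code */
        }
        if (contended(l)) {
            lock_release(l);
            lock_acquire(l);
            shrink(&s);                 /* contention seen: shorter strips */
        } else {
            grow(&s, n);                /* no contention: longer strips */
        }
    }
    lock_release(l);
    method_with_loop_S = s;             /* remember for the next call */
    return sum;
}
```

Starting from S = 4 with N = 10 and no contention, the strips are 4, 5 and 1 iterations long and the stored value ends at 7; a subsequent invocation under constant contention starts from 7 and shrinks the value again.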
Those skilled in the art will appreciate that the above is but one particular example of how the “SHRINK (S)” and “GROW (S)” operations may be performed, and that various other methods may be used.
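By way of illustration, the alternatives mentioned in the summary (incremental shrinking and growing, resetting to a predetermined minimum, and multiplicative growth to a maximum of N) can each be written as a short C function. The function names, the explicit `s_min` parameter and the growth `factor` are hypothetical choices made for this sketch.

```c
#include <assert.h>

/* Shrink S by one step, never below the minimum. */
int shrink_incremental(int s, int s_min)
{
    return (s > s_min) ? s - 1 : s_min;
}

/* Grow S by one step, never above the iteration count N. */
int grow_incremental(int s, int n)
{
    return (s < n) ? s + 1 : n;
}

/* On contention, fall straight back to the predetermined minimum. */
int shrink_reset(int s_min)
{
    return s_min;
}

/* Without contention, multiply S by a factor, capped at N. */
int grow_multiplicative(int s, int factor, int n)
{
    assert(s >= 1 && factor >= 1 && n >= 1);
    if (s > n / factor)   /* s * factor would exceed n (also avoids overflow) */
        return n;
    return s * factor;
}
```

Pairing `shrink_reset` with `grow_multiplicative` reacts quickly in both directions, whereas the incremental pair changes S gradually; which pairing is preferable depends on how bursty the contention is expected to be.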
For the example shown in
In the examples discussed above, there is only one lock within a software loop that is transformed. However, in certain cases, it may be possible to apply similar transformations even if there is more than one lock. Specifically, there are two cases when the transformation can be safely applied even if more than one lock is present inside the loop: 1) when the locks are nested in the original code, and 2) when every possible path taken inside the loop encounters only one lock pair.
In the first case of nested locks, consider the code 800 shown in
It will be appreciated that a similar coarsening transformation may also be applied to the “L2” lock (i.e. “lock (L2)” at line 912 and “unlock (L2)” at line 914) to generate code 1000 of
As will be appreciated, in order to avoid a possible deadlock opportunity, care must be taken to release and then re-acquire the locks “L1” and “L2” in the proper order. This is illustrated in
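The ordering concern can be sketched in C as follows. The sketch assumes that “L1” is the outer lock and “L2” the inner lock, so that “L1” is always acquired first and released last; the simulated `sim_lock`, the `arrivals1` and `arrivals2` inputs and the combined contention test are hypothetical and may differ from the figure.

```c
#include <assert.h>

/* Hypothetical simulated lock; waiters models threads blocked on it. */
struct sim_lock {
    int held;
    int waiters;
    int acquisitions;
};

static int contended(const struct sim_lock *l) { return l->waiters > 0; }

/* Acquire the outer lock L1 first, then the inner lock L2. */
static void acquire_in_order(struct sim_lock *l1, struct sim_lock *l2)
{
    assert(!l1->held && !l2->held);
    l1->held = 1; l1->acquisitions++;
    l2->held = 1; l2->acquisitions++;
}

/* Release in the reverse order: inner lock L2 first, then L1. */
static void release_in_order(struct sim_lock *l1, struct sim_lock *l2)
{
    assert(l1->held && l2->held);
    l2->held = 0; l2->waiters = 0;
    l1->held = 0; l1->waiters = 0;
}

long run_nested_coarsened(struct sim_lock *l1, struct sim_lock *l2,
                          const int *arrivals1, const int *arrivals2, int n)
{
    long sum = 0;

    acquire_in_order(l1, l2);
    for (int i = 0; i < n; i++) {
        l1->waiters += arrivals1[i];  /* simulation input only */
        l2->waiters += arrivals2[i];
        if (contended(l1) || contended(l2)) {
            release_in_order(l1, l2); /* unlock L2, then unlock L1 */
            acquire_in_order(l1, l2); /* lock L1, then lock L2 */
        }
        sum += i;                     /* placeholder for the nested locked code */
    }
    release_in_order(l1, l2);
    return sum;
}
```

Because every thread that follows this discipline acquires “L1” before “L2”, no thread can hold “L2” while waiting for “L1”, which is the circular wait that would otherwise permit deadlock.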
In the second case, if there are multiple locks present in a loop but only one of the locks is held in each iteration, and there is profile information indicating which of the locked paths is most frequently executed, that one particular lock can be favoured to be placed outside of the loop, with suitable compensation codes inserted in the other paths. That is, the profile information may be used to identify which lock, when moved outside the loop, is more likely to improve the loop's performance.
For example, consider code 1100 shown in
In this case, it will be appreciated that the “CONTENDED (L2)” check in the “then” path is not needed, since the compensation code will release that lock anyway. It will also be noted that the compensation code to release and re-acquire the “L2” lock in the “then” path is placed as late as possible and as early as possible, respectively, along the path where the “L1” lock is locked, so that the hardware can schedule as much of the code inside the less frequently accessed path as possible. In particular, “UnlockedCodeBefore ( )” and “UnlockedCodeAfter ( )” can be executed in parallel with the code outside the “then” path. Since the “lock (L1)” and “unlock (L1)” may act as barriers to scheduling in any case, it will be appreciated that the adjacent guarded code sections “unlock (L2)” and “lock (L1)”, respectively, should not further impede the hardware's ability to schedule the code within the loop.
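The placement of the compensation code can be sketched in C as follows. This is an approximation rather than the code of the figure: the lock is simulated, and the `takes_then_path` array is a hypothetical input that marks the iterations taking the less frequent “then” path guarded by “L1”, while “L2” is the favoured lock held across the loop.

```c
#include <assert.h>

/* Hypothetical simulated lock; waiters models threads blocked on it. */
struct sim_lock {
    int held;
    int waiters;
    int acquisitions;
};

static void lock_acquire(struct sim_lock *l)
{
    assert(!l->held);
    l->held = 1;
    l->acquisitions++;
}

static void lock_release(struct sim_lock *l)
{
    assert(l->held);
    l->held = 0;
    l->waiters = 0;
}

static int contended(const struct sim_lock *l) { return l->waiters > 0; }

/* Returns the number of iterations that took the "then" path. */
int run_favoured_l2(struct sim_lock *l1, struct sim_lock *l2,
                    const int *takes_then_path, int n)
{
    int then_count = 0;

    lock_acquire(l2);                  /* favoured lock moved outside the loop */
    for (int i = 0; i < n; i++) {
        if (takes_then_path[i]) {
            /* UnlockedCodeBefore ( ) would run here. */
            lock_release(l2);          /* compensation, as late as possible */
            lock_acquire(l1);
            then_count++;              /* placeholder for code guarded by L1 */
            lock_release(l1);
            lock_acquire(l2);          /* compensation, as early as possible */
            /* UnlockedCodeAfter ( ) would run here. */
        } else {
            /* A contention test is needed only on this path. */
            if (contended(l2)) {
                lock_release(l2);
                lock_acquire(l2);
            }
            assert(l2->held);          /* placeholder for code guarded by L2 */
        }
    }
    lock_release(l2);
    return then_count;
}
```

With one “then” iteration out of six, “L2” is acquired twice in total instead of five times, at the cost of one extra release and re-acquisition of “L2” on the rarely taken path.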
Alternatively, the transformation could also be done in the absence of such profile information. However, as will be appreciated, picking the wrong lock may have a negative impact on performance due to the extra “lock ( )” and “unlock ( )” operations executed on the most commonly accessed path. Thus, selection based on profile information is preferable.
While exemplary embodiments of the invention have been described, it will be appreciated that various changes and modifications may be made without departing from the scope of the invention.
Therefore, the scope of the invention is defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
2442800 | Sep 2003 | CA | national |
The present application is a continuation and claims the benefit of the filing date of an application entitled, “Transforming Locks in Software Loops” Ser. No. 10/845,542, filed May 13, 2004, assigned to the assignee of the present application, and herein incorporated by reference.
 | Number | Date | Country |
---|---|---|---|
Parent | 10845542 | May 2004 | US |
Child | 12135311 | | US |