The present embodiment uses dual last-in, first-out (LIFO) memory stacks to implement a pseudo-FIFO memory configuration. One problem with LIFO memory configurations is starvation, i.e. the indefinite queuing of a work item. When a LIFO memory stack data structure is used in a consumer-producer application, starvation may occur if the consumer(s) cannot keep up with the producer(s). In such a situation, the first work item added to or “pushed” onto the LIFO stack will never be retrieved or “popped”. Previous starvation prevention techniques involved acquiring locks (to implement a true FIFO memory configuration), which reduces performance and/or requires keeping complicated statistics. The present embodiment prevents starvation while using a memory stack data structure whose LIFO property does not normally make such a guarantee. Under the LIFO property, the last data or work item inserted is also the first item retrieved. The converse property is the cause of starvation, i.e. the first item inserted is the last item retrieved. If the stack is never emptied because the rate of inserts exceeds the rate of removes, the item at the bottom of the stack (the first item inserted) will never be acted upon or, in other words, will be starved.
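The starvation effect can be illustrated with a small single-threaded sketch in C. It is only an illustration under stated assumptions: the array-backed `lifo_t`, the helper names, and the fixed ratio of two inserts per remove are all hypothetical and are not part of the embodiment.

```c
#include <stddef.h>

#define CAPACITY 64

/* Hypothetical array-backed LIFO stack of integer work-item ids. */
typedef struct {
    int items[CAPACITY];
    size_t depth;
} lifo_t;

static void lifo_push(lifo_t *s, int id) { s->items[s->depth++] = id; }
static int lifo_pop(lifo_t *s) { return s->items[--s->depth]; }

/* Each round the producer inserts two items and the consumer removes one,
 * so the stack never empties. Returns 1 if item 0 (the first item inserted)
 * was ever retrieved, 0 otherwise. */
int first_item_was_served(int rounds)
{
    lifo_t s = { .depth = 0 };
    int next_id = 0;
    int served_first = 0;

    if (rounds > CAPACITY - 1)      /* keep the sketch within the array */
        rounds = CAPACITY - 1;

    for (int r = 0; r < rounds; r++) {
        lifo_push(&s, next_id++);
        lifo_push(&s, next_id++);
        if (lifo_pop(&s) == 0)      /* the consumer always gets the newest item */
            served_first = 1;
    }
    return served_first;
}
```

However many rounds are run, the consumer only ever sees the most recently inserted item, so item 0 stays at the bottom of the stack.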
The present embodiment also uses atomic processor instructions in implementing the pseudo-FIFO memory configuration using the dual LIFO memory stacks. Typically, atomic instructions are not used for a structure that operates with FIFO (first in, first out) behavior. This is because an atomic instruction only provides consistency for a single memory address. If a queue (having the desired first in, first out property) is implemented as a list with head and tail pointers, atomic operations cannot be used to update both the head and tail pointers simultaneously. Both pointers must be updated when the list becomes empty because the last item was removed, or when the list was empty at the time an item was inserted (i.e. the first item is being inserted).
While the atomic processor instructions and predefined code macros used by the pseudo-FIFO memory embodiment are considered well known to those skilled in the pertinent art, they will be nonetheless described herein as they are the building blocks of the pseudo-FIFO memory embodiment.
The atomic instruction, compare-and-swap (also called compare-and-exchange), is a basic synchronization mechanism implemented in hardware in modern microprocessors, such as the Itanium® processor family manufactured by Intel Corporation, for example. The premise is that the processor reads the value of a memory location, computes a new value for the location based on this value, and then executes the compare-and-swap instruction. The instruction includes a value to be compared (the initial value read), a value to be written (the computed value), and a memory location.
During execution of the instruction, if the memory location still equals the compare value, the computed value is written by the processor into the designated memory location atomically (i.e. no other compare-and-swap or memory operation can interfere). On the other hand, if the value in the memory location is not the same as the compare value, the processor does not do a write to the location because the computed value is now invalid. The previous value of the memory location is returned to allow the processor to determine if the operation succeeded. If the return value and the compare value match, the instruction succeeded and the computed value was written.
Herebelow is pseudocode for the compare-and-swap instruction (executed atomically):
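A minimal sketch of these semantics in C, assuming a C11 compiler with `<stdatomic.h>`, is given here for illustration. The function names `compare_and_swap` and `atomic_add_one` are hypothetical. The standard `atomic_compare_exchange_strong` call reports success as a boolean and writes the observed value back into its `expected` argument, so the wrapper returns that observed value to match the "previous value is returned" behavior described above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically: if *location == compare, store new_value into *location.
 * In either case, return the value *location held before the operation. */
uint64_t compare_and_swap(_Atomic uint64_t *location,
                          uint64_t compare, uint64_t new_value)
{
    uint64_t expected = compare;
    /* On failure, 'expected' is overwritten with the current contents. */
    atomic_compare_exchange_strong(location, &expected, new_value);
    return expected;
}

/* Typical read / compute / compare-and-swap retry loop. */
uint64_t atomic_add_one(_Atomic uint64_t *counter)
{
    uint64_t old;
    do {
        old = atomic_load(counter);               /* read the initial value */
    } while (compare_and_swap(counter, old, old + 1) != old);
    return old + 1;                               /* the value that was written */
}
```

The caller decides whether the operation succeeded by comparing the return value with the compare value; if they differ, the computed value is stale and the read, compute, and swap sequence is retried.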
The list or stack operations operate by adding an element or removing (retrieving) an element by an atomic write of the head pointer to the list. A “steal” operation which moves an element from one list to another is included. The following pseudocode serves to demonstrate how this can be done. Note that these operations provide the LIFO stack behavior.
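One way these operations might look in C11 is sketched below. The names `node_t`, `lifo_t`, `lifo_push`, `lifo_pop` and `lifo_steal` are hypothetical. The sketch assumes that a steal moves the whole list, as the starvation-prevention process described later requires, and it does not address the ABA problem or safe reclamation of popped nodes, which a production implementation would need to handle.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct node {
    struct node *next;
    int value;
} node_t;

typedef struct {
    _Atomic(node_t *) head;   /* the only word that is updated atomically */
} lifo_t;

/* Add an element: link it in front of the current head, then swing the head. */
void lifo_push(lifo_t *s, node_t *item)
{
    node_t *old = atomic_load(&s->head);
    do {
        item->next = old;     /* 'old' is refreshed whenever the CAS fails */
    } while (!atomic_compare_exchange_weak(&s->head, &old, item));
}

/* Remove the most recently added element, or return NULL if the stack is empty. */
node_t *lifo_pop(lifo_t *s)
{
    node_t *old = atomic_load(&s->head);
    while (old != NULL &&
           !atomic_compare_exchange_weak(&s->head, &old, old->next))
        ;                     /* retry with the refreshed head */
    return old;
}

/* Detach the entire list from src and place it, order unchanged, on dst. */
void lifo_steal(lifo_t *src, lifo_t *dst)
{
    node_t *chain = atomic_exchange(&src->head, (node_t *)NULL);
    if (chain == NULL)
        return;               /* nothing to move */

    node_t *tail = chain;
    while (tail->next != NULL)
        tail = tail->next;

    node_t *old = atomic_load(&dst->head);
    do {
        tail->next = old;     /* keep anything already on dst below the chain */
    } while (!atomic_compare_exchange_weak(&dst->head, &old, chain));
}
```

Each operation changes only a single head pointer per atomic step, which is why a stack, unlike a two-pointer queue, can be maintained with compare-and-swap alone.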
Now that these basic building blocks or instructions have been described, an embodiment of a pseudo-FIFO memory stack using two LIFO memory stacks in a producer-consumer environment may now be described. Referring to the block diagram schematic of
The memory subsystem 12 may comprise random access memory (RAM), nonvolatile memory, processor cache, flash memory and the like, for example. In the present embodiment, the memory 12 comprises one or more portions 16 which store instructions, including atomic instructions, executable by the processor 10 or the multiple processors. In addition, the memory 12 may include portions 18 and 20 configured as two (2) LIFO memory stacks, LIFO 1 and LIFO 2, respectively. The LIFO memory stacks 18 and 20 operate together to store work or data items as a pseudo-FIFO memory stack, as will be better understood from the following description. The processor 10 may take in work items from a plurality of producers 22 and store them in the LIFO stacks 18 and 20 using a predetermined method which will be described in greater detail herebelow. The work items, which may be customer orders, for example, are removed or retrieved from the stacks 18 and 20 in a pseudo-FIFO order to be executed by the processor(s) 10 to carry out certain tasks thereof for a plurality of consumers 24 with starvation prevention.
The producers 22 and consumers 24 may be external to the processor 10, embedded in one or more programs stored in the memory subsystem 12 and operated on by the processor 10, or a combination thereof. If embedded, the producers and consumers may be multiple threads of execution in the same program or multiple processes running on the same processor or network system, for example. A producer/consumer application, in and of itself, is generally well known to those skilled in the pertinent art, and is used herein, by way of example, to form a working environment for the pseudo-FIFO memory configuration.
A process which is suitable for starvation prevention with multiple producers and multiple consumers will now be described. The basic concept is that each producer of the plurality 22 will produce its work items onto only one stack, like LIFO 1, which may be referred to as the primary or ‘default’ stack. If the LIFO 1 stack is empty when a work item is about to be added to the default stack 18, the processor stores the current time in a memory location designated as the ‘last_empty’ time so that starvation can be detected. The processor will remove items from the ‘default’ stack 18 for consumer tasks until the difference between the current time and the time value of the ‘last_empty’ memory location exceeds or crosses a starvation threshold. At this point, one of the consumers, via processor intervention, will “steal” the entire list of work items currently stored on the default stack 18 and move them to the other LIFO stack 20, LIFO 2, which may be referred to as the ‘backup’ stack. Thereafter, work items will be processed or retrieved for the consumers from the backup stack 20 until it is empty. This movement of the work items from the default stack 18 to the backup stack 20 ensures that no stored item starves. When the backup stack 20 becomes void of work items, i.e. in a null condition, consumer processing will begin on the default stack 18 again, along with checking for starvation of the work items thereof.
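The timing rule in this process can be sketched in C as follows. The type `starvation_clock_t` and the functions `note_push` and `is_starving` are hypothetical names. The sketch assumes the caller supplies the current time as a monotonically increasing count in the same units as the threshold. How the time is obtained is left open.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    _Atomic uint64_t last_empty;  /* time the default stack was last found empty */
    uint64_t threshold;           /* starvation threshold, same units as 'now' */
} starvation_clock_t;

/* Producer side: call just before adding a work item to the default stack. */
void note_push(starvation_clock_t *c, bool stack_was_empty, uint64_t now)
{
    if (stack_was_empty)
        atomic_store(&c->last_empty, now);
}

/* Consumer side: true once the default stack has gone longer than the
 * threshold without being emptied, i.e. its oldest item may be starving. */
bool is_starving(starvation_clock_t *c, uint64_t now)
{
    return now - atomic_load(&c->last_empty) > c->threshold;
}
```

Because `last_empty` is only refreshed when an item arrives on an empty stack, the measured interval approximates how long the oldest item has been waiting, without any per-item bookkeeping.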
A flowchart of an exemplary method for implementing the pseudo-FIFO memory operation using the two LIFO memory stacks 18 and 20 is shown in
After execution of block 34 or if the default stack 18 is not in a null state, execution continues at block 36 wherein the processor determines if a starvation condition exists in the default stack 18. The stack memory diagram of
If in block 36 it is determined that the time difference exceeds the predetermined threshold time, then starvation of a work item is determined to exist in the default memory 18 which causes the program flow to be diverted to block 42 wherein the back up flag is set. In the present embodiment, the setting of the back up flag is an indication to the program that the back up memory 20 is in use. Next, in block 44, with the back up flag set, one of the consumers 24 may steal or cause all of the work items of the default memory 18, i.e. items 1 through N, to be moved to the back up LIFO memory 20 as shown by way of example in the stack memory diagram of
While the program is in a state to permit the consumers to use the backup LIFO memory stack 20, it will be monitored by block 48 to detect a depleted or null condition, i.e. no work items stored therein. If the backup LIFO memory stack 20 contains memory items, then program execution will continue at block 30 waiting to receive a new work item from a producer. Upon reception of a new work item, the program will execute blocks 32-38 as described herein above. However, since the back up flag is set, program execution will be diverted from block 38 to block 46 to permit the consumers to continue to use the backup LIFO memory stack 20. When block 48 detects a null state in the LIFO memory stack 20 as shown by way of example in the memory stack diagram of
An example of pseudocode for the pseudo-FIFO memory embodiment in the producer and consumer application using atomic instructions is shown below:
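A compact C11 sketch of the dual-stack scheme described above follows. It is an illustration under stated assumptions and not the embodiment's own listing. All identifiers (`pfifo_t`, `pfifo_produce`, `pfifo_consume`, and so on) are hypothetical, and time is passed in by the caller. The sketch treats a non-empty backup stack as the "back up flag" rather than keeping a separate flag variable. It ignores the ABA problem and node reclamation.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct node { struct node *next; int id; } node_t;
typedef _Atomic(node_t *) head_t;

typedef struct {
    head_t default_head;          /* LIFO 1: producers push here */
    head_t backup_head;           /* LIFO 2: receives stolen lists */
    _Atomic uint64_t last_empty;  /* time default stack was last found empty */
    uint64_t threshold;           /* starvation threshold */
} pfifo_t;

void pfifo_init(pfifo_t *q, uint64_t threshold)
{
    atomic_init(&q->default_head, (node_t *)NULL);
    atomic_init(&q->backup_head, (node_t *)NULL);
    atomic_init(&q->last_empty, (uint64_t)0);
    q->threshold = threshold;
}

/* Push the chain first..last onto a stack with a compare-and-swap loop. */
static void splice(head_t *head, node_t *first, node_t *last)
{
    node_t *old = atomic_load(head);
    do {
        last->next = old;
    } while (!atomic_compare_exchange_weak(head, &old, first));
}

static node_t *pop(head_t *head)
{
    node_t *old = atomic_load(head);
    while (old != NULL &&
           !atomic_compare_exchange_weak(head, &old, old->next))
        ;
    return old;
}

/* Producer: record the time if the default stack is empty, then push. */
void pfifo_produce(pfifo_t *q, node_t *item, uint64_t now)
{
    if (atomic_load(&q->default_head) == NULL)
        atomic_store(&q->last_empty, now);
    splice(&q->default_head, item, item);
}

/* Consumer: drain the backup stack first; steal when starvation is detected. */
node_t *pfifo_consume(pfifo_t *q, uint64_t now)
{
    for (;;) {
        node_t *item = pop(&q->backup_head);
        if (item != NULL)
            return item;                       /* backup stack still in use */

        if (atomic_load(&q->default_head) != NULL &&
            now - atomic_load(&q->last_empty) > q->threshold) {
            /* Steal the whole default list and move it to the backup stack. */
            node_t *chain = atomic_exchange(&q->default_head, (node_t *)NULL);
            atomic_store(&q->last_empty, now); /* sketch choice: restart clock */
            if (chain != NULL) {
                node_t *tail = chain;
                while (tail->next != NULL)
                    tail = tail->next;
                splice(&q->backup_head, chain, tail);
                continue;                      /* retry from the backup stack */
            }
        }
        return pop(&q->default_head);          /* normal LIFO service */
    }
}
```

In this sketch, items within one stolen batch are still served newest-first, but every item in the batch is served before consumers return to the default stack, which bounds how long the oldest item can wait.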
The advantages of embodying starvation prevention into a dual LIFO memory data structure using atomic instructions are primarily performance speed and better parallelism, which translates to more throughput for the data structure. Higher throughput allows work to be performed more efficiently.
In the prior art, a spinlock or other mutex must be used to solve the consistency issue with a two-pointer FIFO memory data structure. The lock-free FIFO implementation of Doherty et al. referenced in the Background section of the instant application adds much more complexity than Applicants' solution. If strict ordering must be honored, a true FIFO memory stack implementation must be used. Otherwise, the latency for handling any particular work item may be managed by the starvation detection technique used by the producer to determine when to steal or transfer the list of work items from the default LIFO memory stack to the backup stack. Therefore, the wait times can be as predictable, on average, as in the true FIFO solution. The time to actually run the work items and the size of the consumer thread pool produce latency that is similar in both the dual LIFO (pseudo-FIFO) and true FIFO solutions.
While the present invention has been described hereinabove in connection with one or more embodiments, it is understood that this presentation is merely by way of example. Accordingly, the above presentation or any of its embodiments is in no way intended to limit the invention. Rather, the present invention should be construed in breadth and broad scope in accordance with the recitation of the claims appended hereto.