The present invention relates generally to sharing data structures in a computing environment.
Modern computing systems often make use of multitasking, multiprocessing, or multithreading. For example, a single central processing unit (CPU) may simulate concurrent execution of multiple tasks (sometimes referred to as threads) by switching between the different tasks to provide a fair allocation of CPU time to each task. Multiprocessing systems make use of multiple CPUs and each CPU can execute an assigned task. Both of these techniques may be used in combination in some systems.
In a multitasking system (used herein to refer to multitasking, multiprocessing, and multithreading systems), the multiple tasks may require access to common resources, such as a data structure. For example, tasks running under an operating system may require access to a data structure that has information about the operating system's file system.
Care must be taken to avoid conflicts between multiple tasks accessing the same data structure simultaneously. For example, if one task writes to the data structure while another task is reading from the data structure, then the results can be unpredictable. When multiple tasks attempt to simultaneously modify the data, this may result in the data becoming corrupted. For example, if one task deletes a data structure element that another task holds a pointer to, the pointer becomes invalid (a so-called “dangling pointer”).
To avoid such conflicts, various locks can be implemented to ensure that only a single task accesses the data structure at one time. For example, a mutual exclusion (“mutex”) lock will allow one task to access the data structure at a time. A first task that desires to access the data structure first obtains the mutex lock. When a second task is currently using the data structure, the first task waits until the second task has released the mutex lock. Waiting tasks may either be suspended (allowing other tasks to execute) or spinning (repeatedly checking until the lock becomes available). If the second task takes a long time to release the lock, the first task may be waiting for an extended time period. Various other kinds of locks are also known.
Locking of a data structure by an accessing task can degrade the performance of a computing system when the data structure is shared by many tasks. Because just one task at a time can access the data structure, tasks may spend long time intervals waiting for each other when the data structure is heavily used. Waiting times can be exacerbated when the data structure is large (e.g., a long linked list) and tasks take a long time to traverse the data structure. For example, a low priority task may begin a traverse of the data structure, and then be preempted by a higher priority task. If the higher priority task wishes to traverse the data structure, it waits until the low priority task is complete. Unfortunately, the low priority task may take a long time to complete, in part because it is preempted by other higher priority tasks. If many tasks wish to traverse a large data structure, locking the data structure during the traverse may thus result in unacceptable performance of the system.
The invention includes a system and method for providing efficient access to a data structure by multiple tasks in a computing environment. The data structure can include an ordered arrangement of elements. One operation of the method can include associating a spinlock with the data structure. Another operation may include defining a plurality of traversal iterators, each having a read pointer pointing to an element within the data structure. The traversal iterators can each be associated with a reader task. A further operation can include traversing the data structure using a reader task. The reader task traverses the data structure by using the read pointer of its associated traversal iterator while holding the spinlock. An additional operation may include deleting a selected element from the data structure using a writer task. The writer task adjusts the read pointer of traversal iterators whose read pointers point to the selected element to be deleted while holding the spinlock.
FIGS. 2a-2b are illustrations of deleting a selected element from a data structure in accordance with one embodiment;
FIGS. 3a-3b are illustrations of deleting a selected element from a data structure in accordance with another embodiment;
A spinlock 106 is associated with the data structure 102. A spinlock is a mechanism that allows only one task to hold it at a time. If a task attempts to obtain a spinlock already held by another task, it executes a wait loop until the spinlock is released. Once a task has obtained the spinlock, it is assured that no other task holds the spinlock. Various ways of implementing a spinlock are known in the art. For example, functions to create, hold, and release a spinlock are provided by some operating systems. The spinlock can be implemented by using a flag (or semaphore) that is accessed through atomic functions. Alternatively, the lock can be implemented with a mutex lock, where the blocked tasks are suspended or put to sleep.
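By way of illustration only, the flag-based approach might be sketched in Python as follows. The sketch assumes that a non-blocking `threading.Lock.acquire` can stand in for an atomic test-and-set on the flag; the `SpinLock` class, its `hold` and `release` methods, and the `add_many` helper are hypothetical names chosen for this example.

```python
import threading
import time


class SpinLock:
    """Illustrative spinlock: a flag that is tested and set atomically."""

    def __init__(self):
        # Assumption: a non-blocking Lock.acquire() stands in for an
        # atomic test-and-set instruction on the flag.
        self._flag = threading.Lock()

    def hold(self):
        # Wait loop: keep retrying until the flag is obtained.
        while not self._flag.acquire(blocking=False):
            time.sleep(0)  # yield so the current holder can run

    def release(self):
        self._flag.release()

    def __enter__(self):
        self.hold()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False


def add_many(lock, totals, n):
    # Every increment is performed while holding the spinlock.
    for _ in range(n):
        with lock:
            totals["count"] += 1


lock = SpinLock()
totals = {"count": 0}
workers = [threading.Thread(target=add_many, args=(lock, totals, 1000))
           for _ in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(totals["count"])
```

Because only one task can hold the spinlock at a time, no increment is lost even though four tasks update the shared counter.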
The reader tasks 112 access the data structure 102 using traversal iterators 108. Each reader task has an associated traversal iterator. The traversal iterators include a read pointer 110 which points at an element of the data structure. Hence, the reader tasks can traverse the data structure by advancing the read pointer from element to element 104. The reader task will hold the spinlock 106 while updating its read pointer. For example, to traverse the data structure, the reader task may perform the following operations to advance one element in the data structure:
The reader task helps to avoid corruption of the read pointer by holding the spinlock while updating the read pointer. Furthermore, because the reader task releases the spinlock after visiting each element during a long traverse, the reader task limits the amount of blocking of other tasks from accessing the data structure. For example, multiple reader tasks can simultaneously traverse the data structure, each taking turns holding the spinlock while advancing their read pointers and visiting elements of the data structure. Accordingly, efficiency in sharing the data structure can be enhanced relative to an approach where the spinlock is held for the entire time a reader task is traversing the data structure.
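A minimal sketch of such a per-element traverse is given below. It assumes a singly linked list, uses `threading.Lock` as a stand-in for the spinlock, and assumes the element is visited before the read pointer is advanced; the `Element`, `TraversalIterator`, and `step` names are hypothetical.

```python
import threading


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element


class TraversalIterator:
    def __init__(self, first_element):
        # Read pointer: the element this reader will visit next.
        self.read_pointer = first_element


def step(spinlock, iterator, visit):
    """Visit one element and advance; return False when the traverse is done."""
    with spinlock:                      # hold the spinlock
        element = iterator.read_pointer
        if element is None:             # null read pointer: traverse complete
            return False
        visit(element)                  # visit while the spinlock is held
        iterator.read_pointer = element.next
        return True                     # spinlock released on leaving "with"


# Build A -> B -> C and let two readers take turns, one element per hold.
c = Element("C")
b = Element("B", c)
a = Element("A", b)
spinlock = threading.Lock()             # assumption: stand-in for the spinlock
seen_1, seen_2 = [], []
it_1, it_2 = TraversalIterator(a), TraversalIterator(a)
more_1 = more_2 = True
while more_1 or more_2:
    more_1 = step(spinlock, it_1, lambda e: seen_1.append(e.value))
    more_2 = step(spinlock, it_2, lambda e: seen_2.append(e.value))
```

Each call to `step` holds the spinlock for a single element only, so the two readers interleave rather than one waiting for the other to finish its whole traverse.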
Of course, by releasing the spinlock 106 between visiting elements of the data structure 102, a reader task 112 may be interrupted by the writer task 114 wishing to make changes to the data structure. For example, the writer task may delete an element 104 from the data structure, including the element to which a reader task has its traversal iterator read pointer currently pointing. Hence, the writer task will delete a selected element of the data structure by holding the spinlock while adjusting the read pointer of those traversal iterators whose read pointers point to the selected element. For example, the writer task may perform the following operations:
By modifying the read pointers that point to the selected element that is to be deleted, dangling pointers can be avoided. For example,
New elements can also be inserted in the data structure by the writer task. While inserting a new element, the writer task will hold the spinlock. For example, where the data structure is a linked list, holding the spinlock helps to ensure that no reader task (or another writer task) is trying to use the pointers or links while they are being adjusted by the writer task. Various ways of inserting an element into a linked list are known in the art.
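The deletion and insertion behavior described above might be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the data structure is a singly linked list, `threading.Lock` stands in for the spinlock, the traversal iterators are kept in a plain Python list, and the `SharedList`, `insert_front`, and `delete` names are hypothetical.

```python
import threading


class Element:
    def __init__(self, value):
        self.value = value
        self.next = None


class TraversalIterator:
    def __init__(self, first_element):
        self.read_pointer = first_element


class SharedList:
    def __init__(self):
        self.head = None
        self.spinlock = threading.Lock()  # assumption: stand-in spinlock
        self.iterators = []               # all traversal iterators in use

    def insert_front(self, element):
        with self.spinlock:               # links change only under the lock
            element.next = self.head
            self.head = element

    def delete(self, selected):
        with self.spinlock:
            # Move every read pointer off the selected element first.
            for iterator in self.iterators:
                if iterator.read_pointer is selected:
                    iterator.read_pointer = selected.next  # None at the tail
            # Then unlink the selected element from the list.
            if self.head is selected:
                self.head = selected.next
            else:
                previous = self.head
                while previous is not None and previous.next is not selected:
                    previous = previous.next
                if previous is not None:
                    previous.next = selected.next
            selected.next = None


shared = SharedList()
a, b, c = Element("A"), Element("B"), Element("C")
for element in (c, b, a):                 # list becomes A -> B -> C
    shared.insert_front(element)
reader = TraversalIterator(b)             # a reader currently points at B
shared.iterators.append(reader)
shared.delete(b)                          # the reader now points at C
```

Because the read pointer is moved before the element is unlinked, the reader never holds a pointer to an element that is no longer in the list.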
In accordance with another embodiment, performance of the system can be further enhanced by including a back pointer in the data structure elements that points back to the traversal iterator (or iterators). For example,
The writer task uses the back pointer of the selected element (element B) being deleted to locate the traversal iterator(s) which point to the selected element. Those read pointers are modified by the writer task. For example, here element B is selected to be deleted, so the writer task follows element B's back pointer to traversal iterator 108a and advances the read pointer to point to the next element, element C. More specifically, the operations performed by the writer task may include the following:
In further detail, to modify the read pointers, the writer task may perform the following operations:
As discussed above, if there is no next element after the selected element being deleted, any read pointers that point to the selected element may be set to null, which will be understood by reader tasks as indicating the traversal is complete.
When back pointers are included, reader tasks may include additional steps to update the back pointer. For example, to traverse the data structure, a reader task may perform the following operations:
Various aspects of the back pointer warrant further detailed discussion. As can be seen from
Inclusion of back pointers can provide enhanced efficiency when deleting an element from the data structure. For example, the writer task can directly look up which traversal iterators are pointing to an element by following the element's back pointer(s) instead of examining all of the traversal iterators. This efficiency benefit increases as more reader tasks and associated traversal iterators are included in the system. Although there is some added overhead associated with maintaining the back pointers, this overhead is generally outweighed by the improved efficiency.
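The added bookkeeping on the reader side might look like the following sketch, in which the reader removes its traversal iterator from the back pointers of the element it leaves and adds it to those of the element it moves to. The list-of-iterators representation, the use of `threading.Lock` as the spinlock, and the `point_at` and `step` helpers are assumptions made for illustration.

```python
import threading


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element
        self.back_pointers = []   # traversal iterators pointing at this element


class TraversalIterator:
    def __init__(self):
        self.read_pointer = None


def point_at(iterator, element):
    """Move the read pointer and keep both elements' back pointers in step.

    The caller must hold the spinlock.
    """
    old = iterator.read_pointer
    if old is not None:
        old.back_pointers.remove(iterator)
    iterator.read_pointer = element
    if element is not None:
        element.back_pointers.append(iterator)


def step(spinlock, iterator, visit):
    """Visit the current element, then advance; False once the traverse ends."""
    with spinlock:
        element = iterator.read_pointer
        if element is None:
            return False
        visit(element)
        point_at(iterator, element.next)
        return True


c = Element("C")
b = Element("B", c)
a = Element("A", b)
spinlock = threading.Lock()   # assumption: stand-in for the spinlock
reader = TraversalIterator()
with spinlock:
    point_at(reader, a)       # start the traverse at the first element
visited = []
step(spinlock, reader, lambda e: visited.append(e.value))
```

The overhead per step is one removal and one append, which is the maintenance cost weighed above against the writer no longer having to examine every traversal iterator.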
In accordance with another embodiment, performance can be further enhanced by including per element locks.
For example, to access an element of the data structure, the reader task may include the following operations:
The use of per element locks allows the reader task to release the spinlock while visiting the element, allowing other reader tasks to obtain the spinlock. Hence, even while one reader task is visiting an element of the data structure, other reader tasks can continue with their traverses (unless they wish to access the same element, in which case they wait until the element lock is released). Shared access to the data structure is thus enhanced.
The writer task also obtains the element lock to delete an element. Hence, the writer task's operations may include:
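A reader using per element locks might be sketched as below. The sketch assumes the element lock is a simple flag guarded by the spinlock, uses `threading.Lock` as a stand-in for the spinlock, and retries when the element is busy; the `visit_next` function and the `locked` attribute are hypothetical names.

```python
import threading
import time


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element
        self.locked = False   # element lock: a flag guarded by the spinlock


class TraversalIterator:
    def __init__(self, first_element):
        self.read_pointer = first_element


def visit_next(spinlock, iterator, visit):
    """Visit one element under its element lock; False when the traverse ends."""
    while True:
        with spinlock:
            element = iterator.read_pointer
            if element is None:
                return False
            if not element.locked:
                element.locked = True         # obtain the element lock
                break                         # leaving "with" frees the spinlock
        time.sleep(0)                         # element busy: yield, then retry
    visit(element)                            # spinlock is NOT held here
    with spinlock:
        element.locked = False                # release the element lock
        iterator.read_pointer = element.next  # advance the read pointer
    return True


b = Element("B")
a = Element("A", b)
spinlock = threading.Lock()   # assumption: stand-in for the spinlock
reader = TraversalIterator(a)
visited = []
spinlock_free_during_visit = []


def visit(element):
    visited.append(element.value)
    spinlock_free_during_visit.append(not spinlock.locked())


while visit_next(spinlock, reader, visit):
    pass
```

The `spinlock_free_during_visit` list records that the spinlock is available to other tasks for the whole duration of each visit.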
Various implementation options for the element locks will occur to one of skill in the art. For example, the element lock may also be implemented using a spinlock. Alternatively, the element lock may be a flag which is set to lock the element and cleared to unlock the element. A flag may be used for the element lock since it is protected by the spinlock associated with the data structure. Additionally, element locks may optionally be dynamically allocated and associated with the elements as needed, and deleted when no longer needed.
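For the flag-based option, a writer that waits for the element lock before deleting might be sketched as follows. The sketch assumes `threading.Lock` as the spinlock, assumes the selected element is not the first element of the list, and scans a plain list of traversal iterators; all names are hypothetical.

```python
import threading
import time


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element
        self.locked = False   # element lock flag, guarded by the spinlock


class TraversalIterator:
    def __init__(self, element):
        self.read_pointer = element


def delete(spinlock, head, selected, iterators):
    """Wait for the element lock, then delete `selected` (assumed not head)."""
    while True:
        with spinlock:
            if not selected.locked:
                selected.locked = True    # obtain the element lock for good
                for iterator in iterators:
                    if iterator.read_pointer is selected:
                        iterator.read_pointer = selected.next
                previous = head
                while previous.next is not selected:
                    previous = previous.next
                previous.next = selected.next
                return
        time.sleep(0.001)   # a reader is visiting the element: retry later


c = Element("C")
b = Element("B", c)
a = Element("A", b)
spinlock = threading.Lock()   # assumption: stand-in for the spinlock
reader = TraversalIterator(b)
with spinlock:
    b.locked = True           # the reader starts visiting B
writer = threading.Thread(target=delete, args=(spinlock, a, b, [reader]))
writer.start()
time.sleep(0.05)
still_linked_during_visit = a.next is b   # the writer is still waiting
with spinlock:
    b.locked = False          # the reader finishes its visit
writer.join()
```

The flag needs no atomic operations of its own because it is only ever read or written while the spinlock is held.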
In another example embodiment, the element locks or holding node operation can use a reference counter that allows multiple reader processes to put a hold on the node being visited. The writer processes then wait for the reference count to reach zero before actually deleting the node. Additionally, the deletion operation may even be configured to fail when the reference count is not zero. In this way, the reference counters are another implementation option for the element locks. For example, this embodiment may be used in an operating system's file system.
Another similar way of looking at this counter model is that instead of using a spinlock or mutex, the per-element lock can be a reader/writer lock. Reader processes can obtain the per-element lock in read mode, and the writer process can acquire it in exclusive write mode in order to delete the node. The implication of the reference counter or reader/writer model is that multiple reader processes are allowed to visit each node at the same time but multiple writer processes are not. This is an embodiment that not all implementations will want to use because the basic per-element lock configuration is a better approach for many applications. However, the reference counter model can leverage additional opportunities for parallelism within the nodes (and additional synchronization with locks).
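The reference counter variant, with deletion configured to fail rather than wait, might be sketched as follows. The sketch assumes the counter is guarded by the spinlock (again modeled with `threading.Lock`), assumes the selected element is not the first element, and omits read pointer adjustment for brevity; `hold`, `release`, and `try_delete` are hypothetical names.

```python
import threading


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element
        self.ref_count = 0    # number of readers currently holding the element


def hold(spinlock, element):
    with spinlock:
        element.ref_count += 1


def release(spinlock, element):
    with spinlock:
        element.ref_count -= 1


def try_delete(spinlock, head, selected):
    """Unlink `selected` only if no reader holds it; report success."""
    with spinlock:
        if selected.ref_count != 0:
            return False              # configured to fail rather than wait
        previous = head
        while previous.next is not selected:
            previous = previous.next
        previous.next = selected.next
        return True


c = Element("C")
b = Element("B", c)
a = Element("A", b)
spinlock = threading.Lock()   # assumption: stand-in for the spinlock
hold(spinlock, b)             # two readers visit B at the same time
hold(spinlock, b)
first_attempt = try_delete(spinlock, a, b)
release(spinlock, b)
release(spinlock, b)
second_attempt = try_delete(spinlock, a, b)
```

Unlike the single-holder flag, the counter lets both readers visit element B concurrently, while the writer is still excluded until the count returns to zero.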
In accordance with another embodiment, both per element locks and back pointers may be used. In such a case, a reader task can traverse the data structure by performing the following operations:
The writer task can delete a selected element by performing the following operations:
The step of modifying the read pointer can include advancing the read pointer to point to a next element, and setting the next element's back pointer to point to the traversal iterator.
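A combined sketch using both per element locks and back pointers is given below. It assumes flag-based element locks guarded by the spinlock, `threading.Lock` as the spinlock, list-valued back pointers, iterators created before any concurrent access, and a selected element that is not the first element; all names are hypothetical.

```python
import threading
import time


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element
        self.locked = False       # element lock flag, guarded by the spinlock
        self.back_pointers = []   # iterators whose read pointer is this element


class TraversalIterator:
    def __init__(self, element):
        self.read_pointer = element
        element.back_pointers.append(self)


def visit_next(spinlock, iterator, visit):
    while True:
        with spinlock:
            element = iterator.read_pointer
            if element is None:
                return False
            if not element.locked:
                element.locked = True          # obtain the element lock
                break
        time.sleep(0)                          # element busy: retry
    visit(element)                             # no spinlock held while visiting
    with spinlock:
        element.locked = False
        element.back_pointers.remove(iterator)
        successor = element.next
        iterator.read_pointer = successor      # advance the read pointer
        if successor is not None:
            successor.back_pointers.append(iterator)
    return True


def delete(spinlock, head, selected):
    while True:
        with spinlock:
            if not selected.locked:
                selected.locked = True         # obtain the element lock
                successor = selected.next
                for iterator in selected.back_pointers:
                    iterator.read_pointer = successor
                    if successor is not None:
                        successor.back_pointers.append(iterator)
                selected.back_pointers = []
                previous = head
                while previous.next is not selected:
                    previous = previous.next
                previous.next = successor
                return
        time.sleep(0.001)                      # element is being visited: retry


c = Element("C")
b = Element("B", c)
a = Element("A", b)
spinlock = threading.Lock()   # assumption: stand-in for the spinlock
reader = TraversalIterator(a)
visited = []
visit_next(spinlock, reader, lambda e: visited.append(e.value))  # visits A
delete(spinlock, a, b)        # the reader is moved from B to C
while visit_next(spinlock, reader, lambda e: visited.append(e.value)):
    pass
```

The reader re-reads its read pointer on every retry, so a deletion that occurs while it waits for an element lock simply redirects it to the next element.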
Returning to a discussion of the traversal iterators, the traversal iterators may optionally be linked into a traversal iterator list which is associated with the data structure or associated with the spinlock. For example, a reader task desiring access to the data structure can create a new traversal iterator by holding the spinlock while inserting the new traversal iterator into the traversal iterator list. By including the traversal iterators in a list, the writer task can traverse the traversal iterator list, checking each of the read pointers to determine which are pointing to a selected element.
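A traversal iterator list of this kind might be sketched as follows. The sketch associates the list with the data structure, uses `threading.Lock` as a stand-in for the spinlock, and adds a `close_iterator` operation that is not described above but is assumed here so that finished readers can be removed; all names are hypothetical.

```python
import threading


class Element:
    def __init__(self, value, next_element=None):
        self.value = value
        self.next = next_element


class TraversalIterator:
    def __init__(self, element):
        self.read_pointer = element


class SharedStructure:
    def __init__(self, head):
        self.head = head
        self.spinlock = threading.Lock()   # assumption: stand-in spinlock
        self.iterator_list = []            # the traversal iterator list

    def open_iterator(self):
        # A new iterator is linked into the list while holding the spinlock.
        with self.spinlock:
            iterator = TraversalIterator(self.head)
            self.iterator_list.append(iterator)
            return iterator

    def close_iterator(self, iterator):
        # Assumed counterpart: unlink the iterator when its reader is done.
        with self.spinlock:
            self.iterator_list.remove(iterator)

    def iterators_pointing_at(self, element):
        # What a writer would do before deleting `element`.
        with self.spinlock:
            return [it for it in self.iterator_list
                    if it.read_pointer is element]


head = Element("A", Element("B"))
shared = SharedStructure(head)
reader_1 = shared.open_iterator()
reader_2 = shared.open_iterator()
reader_2.read_pointer = head.next          # reader 2 has moved on to B
found = shared.iterators_pointing_at(head)
shared.close_iterator(reader_2)
```

Here `found` contains only the first reader's iterator, which is the set of read pointers a writer would need to adjust before deleting the first element.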
Two exemplary method embodiments will now be described. A first method 500 for providing efficient access to a data structure by multiple tasks is illustrated in
The method may also include defining a plurality of traversal iterators, each traversal iterator associated with a reader task 504. Each traversal iterator includes a read pointer, which points to an element within the data structure. The traversal iterators may optionally be linked to the data structure or the spinlock, for example in a linked list, as discussed previously. Various implementations of an iterator are known in the art and discussed further above.
The method may also include traversing the data structure using a reader task 506. The reader task holds the spinlock while using the read pointer of the reader task's associated traversal iterator. As discussed above, the spinlock permits access by a single task at one time, helping to avoid dangling pointers.
Finally, the method may also include deleting a selected element from the data structure using a writer task 508. The writer task holds the spinlock while adjusting the read pointers of traversal iterators whose read pointers point to the selected element being deleted. As discussed above, by adjusting the read pointers of traversal iterators pointing to the selected element to be deleted, the reader tasks will be able to continue with their traverses of the data structure.
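The operations of the first method can be exercised together in a small threaded sketch such as the one below, in which three reader tasks traverse a list while one writer task deletes every even-valued element. It assumes a singly linked list, `threading.Lock` as a stand-in for the spinlock, and a plain list of traversal iterators; the class and function names are hypothetical.

```python
import threading


class Element:
    def __init__(self, value):
        self.value = value
        self.next = None


class TraversalIterator:
    def __init__(self, element):
        self.read_pointer = element


class SharedList:
    def __init__(self, values):
        self.spinlock = threading.Lock()   # assumption: stand-in spinlock
        self.iterators = []
        self.head = None
        for value in reversed(values):     # prepend so order matches `values`
            element = Element(value)
            element.next = self.head
            self.head = element

    def open_iterator(self):
        with self.spinlock:
            iterator = TraversalIterator(self.head)
            self.iterators.append(iterator)
            return iterator

    def step(self, iterator, visit):
        with self.spinlock:                # held for one element only
            element = iterator.read_pointer
            if element is None:
                return False
            visit(element)
            iterator.read_pointer = element.next
            return True

    def delete_value(self, value):
        with self.spinlock:
            previous, element = None, self.head
            while element is not None and element.value != value:
                previous, element = element, element.next
            if element is None:
                return
            for iterator in self.iterators:      # adjust read pointers
                if iterator.read_pointer is element:
                    iterator.read_pointer = element.next
            if previous is None:
                self.head = element.next
            else:
                previous.next = element.next


def reader_task(shared, seen):
    iterator = shared.open_iterator()
    while shared.step(iterator, lambda e: seen.append(e.value)):
        pass


def writer_task(shared):
    for value in range(0, 200, 2):         # delete every even value
        shared.delete_value(value)


shared = SharedList(list(range(200)))
results = [[] for _ in range(3)]
tasks = [threading.Thread(target=reader_task, args=(shared, seen))
         for seen in results]
tasks.append(threading.Thread(target=writer_task, args=(shared,)))
for task in tasks:
    task.start()
for task in tasks:
    task.join()
```

The exact values each reader sees depend on scheduling, but every reader visits elements in list order and visits every element that is never deleted, since its read pointer always refers to an element still in the list or is null.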
A second method 600 for providing efficient access to a data structure by multiple tasks is illustrated in
The method can also include traversing the data structure using a reader task 606, as discussed above. Finally, the method can include deleting a selected element from the data structure using a writer task 608. The writer task holds the spinlock while adjusting the read pointers of the traversal iterators pointed to by the back pointer of the selected element being deleted. As discussed above, the efficiency of the data structure access can be enhanced, since the writer task can follow the back pointer to those traversal iterators whose read pointers need to be adjusted, rather than searching through all of the traversal iterators to see which are pointing to the selected element to be deleted.
The various embodiments as described above can be implemented in a variety of forms, including for example software. As such, a computer usable medium, including flash memory, magnetic disk, compact disk, digital video disk, etc. may contain computer readable program code to implement the foregoing embodiments. Computer readable code can include executable machine language and/or source code in a high level language such as C or C++.
From the foregoing, it will be appreciated that the various embodiments provide efficient shared access to a data structure. Multiple reader tasks can concurrently traverse a data structure, and hold a spinlock associated with the data structure briefly upon visiting each element of the data structure. Hence, the system efficiency is enhanced since reader tasks do not wait for each other to complete their respective traverses. Additionally, a writer task can delete elements from (or add elements to) the data structure while the reader tasks are traversing the data structure. When deleting an element, the writer task adjusts read pointers pointing to the element being deleted so as to avoid the creation of dangling pointers. Hence, changes to the data structure can be made without disrupting the reader tasks' traversals. The performance of the system can thus be significantly enhanced under a variety of conditions, including when many reader tasks desire access to the same data structures and when traversing a data structure takes a long period of time.
While the foregoing examples are illustrative of the principles of the present invention in one or more particular applications, it will be apparent to those of ordinary skill in the art that numerous modifications in form, usage and details of implementation can be made without the exercise of inventive faculty, and without departing from the principles and concepts of the invention. Accordingly, it is not intended that the invention be limited, except as by the claims set forth below.