This invention relates generally to computer software and, more particularly, to multi-threaded computing environments.
Traditionally, computer programs operate in single-threaded computing environments. A single-threaded computing environment means that only one task can operate within the computing environment at a time. A single-threaded computing environment constrains both users and computer programs. For example, in a single-threaded computing environment, a user is able to run only one computer program at a time. Similarly, in a single-threaded computing environment, a computer program is able to run only one task at a time.
To overcome the limitations of single-threaded computing environments, multi-threaded computing environments have been developed. In a multi-threaded computing environment, a user typically is able to run more than one computer program at a time. For example, a user can simultaneously run both a word processing program and a spreadsheet program. Similarly, in a multi-threaded computing environment, a computer program is usually able to run multiple threads or tasks concurrently. For example, a spreadsheet program can calculate a complex formula that may take minutes to complete while concurrently permitting the user to continue editing the spreadsheet.
A problem arises, however, when two threads, of either the same or different computer programs, attempt to concurrently access the same data object or data structure that contains one or more data objects (hereinafter “data structure” will be used to refer to either a data object or a data structure). When exclusive access to the data structure is required by one or both of these threads, such concurrent access of the same data structure may result in corruption of the data structure, ultimately causing the computer hosting the data structure to crash. Therefore, when accessing a data structure, a thread generally is provided a lock associated with the data structure. Utilizing a lock ensures that other threads can acquire only limited rights to the data structure until the thread owning the lock is finished using the data structure.
Multiple threads may access a data structure to update the data structure with specific modifications. Conventionally, the maintenance of a data structure that can be updated by multiple threads generally employs one of two approaches: a dedicated processing thread approach or a blocking lock acquisition approach. The dedicated processing thread approach lets a single thread have sole access to the shared data structure. This single thread is also called the master thread. Other threads communicate with the master thread through, for example, message passing, about desired updates to the shared data structure. Because the master thread can only do one thing at a time, concurrent access to the shared data structure is limited; but the integrity of the data structure is maintained. On the other hand, maintaining a dedicated processing thread requires additional system resources such as run-time memory and registers. The use of a dedicated processing thread also requires costly context switches. Moreover, a computing environment may discourage the existence of threads that are not absolutely necessary. In such a computing environment, the creation and use of an additional thread as a dedicated processing thread to process updates by multiple threads on a shared data structure is considered a poor practice.
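The dedicated processing thread approach described above may be sketched as follows. This is a minimal illustrative example, not taken from the original text: all names (`shared_data`, `updates`, `master_thread`, `worker`) are hypothetical, and the message channel is modeled with a simple queue.

```python
import queue
import threading

# Sketch of the dedicated processing thread ("master thread") approach.
# Worker threads never touch the shared data structure directly; they
# pass messages describing desired updates to the master thread, which
# has sole access to the data structure.

shared_data = []                 # the shared data structure
updates = queue.Queue()          # message channel to the master thread

def master_thread():
    # Sole owner of shared_data: applies updates one at a time.
    while True:
        item = updates.get()
        if item is None:         # sentinel: shut down the master thread
            break
        shared_data.append(item)

def worker(value):
    # Workers merely describe the desired update via message passing.
    updates.put(value)

master = threading.Thread(target=master_thread)
master.start()
workers = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
updates.put(None)                # stop the master thread
master.join()
```

Note that the master thread exists for the sole purpose of applying updates, illustrating the extra thread, and the context switches into it, that this approach incurs.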
The blocking lock acquisition approach utilizes the lock associated with a data structure. A thread wishing to update the data structure can acquire the lock and update the data structure with modifications provided specifically by the thread. Upon completing the updating, the thread releases the lock so another thread can acquire the lock and update the data structure with modifications specifically provided by that other thread. The blocking lock acquisition approach serializes multiple threads' access to a data structure, thus impairing a computing system's scalability and performance. For example, when there are multiple threads wanting to update a data structure, a backlog can be induced. The backlog consists of threads waiting on the lock to be released before they can acquire the lock and modify the data structure. These threads cannot do anything else until they have updated the data structure. Such a backlog thus results in poor system performance.
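The blocking lock acquisition approach may be sketched as follows, again with purely illustrative names. Each thread blocks until it owns the lock, so access is fully serialized and waiting threads form the backlog described above.

```python
import threading

# Sketch of the blocking lock acquisition approach: each thread blocks
# until it owns the lock, then applies only its own update.

shared_data = []                 # the shared data structure
lock = threading.Lock()          # lock associated with the data structure

def update_thread(value):
    # Blocks until the lock is free; the thread can do nothing else
    # until its turn comes, so a backlog forms on this lock.
    with lock:
        shared_data.append(value)

threads = [threading.Thread(target=update_thread, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```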
As a result, the conventional approaches limit the performance, scalability, and resource usage of a computing system. Therefore, there exists a need for an approach that solves the shortcomings and disadvantages of the conventional approaches in updating a data structure that is shared by multiple threads. More specifically, there exists a need for an approach that creates no extra thread dedicated to processing updates for a data structure. There also exists a need for an approach that allows multiple threads to compete for the lock associated with the data structure, yet induces no backlog of threads wanting to update the data structure.
This invention addresses the above-identified needs by providing an update mechanism that enables any thread attempting to update a data structure to become a temporary master thread. The temporary master thread processes updates for the data structure, wherein the updates are introduced by the temporary master thread itself and/or by other threads. The invention thus allows updates for a data structure to be processed without maintaining a dedicated processing thread, involving costly context switches, or inducing a backlog of threads waiting to update the data structure.
In accordance with one aspect of the invention, a thread wanting to update a data structure (hereinafter “update thread”) becomes a temporary master thread for the data structure by acquiring a lock associated with the data structure. The temporary master thread can then process all pending updates for the data structure, wherein the pending updates are introduced by the temporary master thread itself or by other threads. Preferably, threads wanting to update the data structure write pending updates for the data structure to a shared memory. Thus, all pending updates for the data structure are visible to the threads. If one of the threads becomes a temporary master thread, the temporary master thread processes the pending updates for the data structure by reading from the shared memory.
In accordance with another aspect of the invention, the data structure is associated with an Updated flag, whose value indicates whether the data structure has any pending update. Once a thread writes any pending update to the shared memory, the thread also sets the Updated flag. Once the thread successfully acquires the lock associated with the data structure and has become the temporary master thread, it clears the Updated flag and proceeds to process all pending updates for the data structure.
In accordance with another aspect of the invention, once the temporary master thread finishes processing all pending updates for the data structure, the temporary master thread releases the lock and thereby relinquishes its role as the temporary master thread. Preferably, upon releasing the lock, the thread checks the value of the Updated flag to see if there are any additional pending updates accumulated during the thread's processing of pending updates that previously existed in the shared memory. If there are additional pending updates, the thread may try to acquire the lock again. If the thread successfully acquires the lock again, it becomes the temporary master thread again. If not, another thread wanting to update the data structure has already acquired the lock and has become the temporary master thread.
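The aspects above may be sketched in one routine. This is an illustrative, non-authoritative rendering: the shared memory is modeled as a deque guarded by a small auxiliary lock, the Updated flag as an event, and all names (`post_update`, `pending`, `master_lock`) are assumptions not taken from the original text.

```python
import threading
from collections import deque

# Sketch of the temporary master thread mechanism: an update thread
# publishes its update to shared memory, sets the Updated flag, and
# merely *tries* to acquire the lock. Whichever thread wins the lock
# becomes the temporary master and drains everyone's pending updates.

pending = deque()                # shared memory holding pending updates
pending_lock = threading.Lock()  # guards the shared memory itself
updated = threading.Event()      # the Updated flag
master_lock = threading.Lock()   # lock associated with the data structure
data_structure = []              # the data structure being updated

def post_update(value):
    with pending_lock:
        pending.append(value)            # write update to shared memory
    updated.set()                        # set the Updated flag
    while master_lock.acquire(blocking=False):  # try to become master
        updated.clear()                  # clear the Updated flag
        while True:                      # process ALL pending updates
            with pending_lock:
                if not pending:
                    break
                item = pending.popleft()
            data_structure.append(item)
        master_lock.release()            # relinquish the master role
        if not updated.is_set():         # no new updates accumulated
            break
    # If the acquire failed, another thread is the temporary master
    # and will process this thread's update; this thread moves on.

threads = [threading.Thread(target=post_update, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

A failed non-blocking acquire implies the current master still held the lock after this thread set the Updated flag; since the master re-tests the flag after releasing, no update can be stranded in the shared memory.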
The temporary master thread mechanism is used where multiple threads may want to update a data structure in a concurrent (typically interlocked) fashion, where some amount of processing needs to be performed in a serialized fashion, and where it does not matter exactly which thread performs the processing. The invention improves system performance by eliminating the need to maintain a dedicated processing thread and by allowing the updates to be processed without costly context switches. The invention thus improves the performance and scalability of a computing environment.
The invention includes systems, methods, and computers of varying scope. Besides the embodiments, advantages and aspects of the invention described here, the invention also includes other embodiments, advantages and aspects, as will become apparent by reading and studying the drawings and the following description.
The foregoing aspects and many of the attendant advantages of this invention will become more readily appreciated as the same become better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
In the following detailed description of exemplary embodiments of the invention, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific exemplary embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the spirit or scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined only by the appended claims.
The detailed description is divided into four sections. In the first section, the hardware and the operating environment in conjunction with which embodiments of the invention may be practiced are described. In the second section, a system of one embodiment of the invention is presented. In the third section, a computerized process, in accordance with an embodiment of the invention, is provided. Finally, in the fourth section, a conclusion of the detailed description is provided.
I. Hardware and Operating Environment
Although not required, the invention will be described in the context of computer-executable instructions, such as program modules, being executed by a personal computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. As noted above, the invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices. It should be further understood that the present invention may also be applied to much lower-end devices that may not have many of the components described in reference to
With reference to
The personal computer 120 further includes a hard disk drive 127 for reading from and writing to a hard disk 139, a magnetic disk drive 128 for reading from or writing to a removable magnetic disk 129, and an optical disk drive 130 for reading from or writing to a removable optical disk 131, such as a CD-ROM or other optical media. The hard disk drive 127, magnetic disk drive 128, and optical disk drive 130 are connected to the system bus 123 by a hard disk drive interface 132, a magnetic disk drive interface 133, and an optical drive interface 134, respectively. The drives and their associated computer-readable media provide nonvolatile storage of computer-readable instructions, data structures, program modules, and other data for the personal computer 120.
Although the exemplary environment described herein employs a hard disk 139, a removable magnetic disk 129, and a removable optical disk 131, it should be appreciated by those skilled in the art that other types of computer-readable media that can store data that is accessible by a computer, such as magnetic cassettes, flash memory cards, digital video disks, Bernoulli cartridges, random access memories (RAMs), read only memories (ROMs), and the like, may also be used in the exemplary operating environment.
A number of program modules may be stored on the hard disk 139, magnetic disk 129, optical disk 131, ROM 124, or RAM 125, including an operating system 135, one or more application programs 136, other program modules 137, and program data 138.
A user may enter commands and information into the personal computer 120 through input devices, such as a keyboard 140 and pointing device 142. Other input devices (not shown) may include a microphone, joystick, game pad, satellite dish, scanner, or the like. These and other input devices are often connected to the processing unit 121 through a serial port interface 146 that is coupled to the system bus, but may be connected by other interfaces, such as a parallel port, game port, or a universal serial bus (USB) port. A monitor 147 or other type of display device is also connected to the system bus 123 via an interface, such as a video adapter 148. In addition to the monitor, personal computers typically include other peripheral output devices (not shown), such as speakers and printers.
The personal computer 120 may operate in a networked environment using logical connections to one or more remote computers, such as a remote computer 149. The remote computer 149 may be another personal computer, a server, a router, a network PC, a peer device, or other common network node, and typically includes many or all of the elements described above relative to the personal computer 120, although only a memory storage device has been illustrated in
When used in a LAN networking environment, the personal computer 120 is connected to the local network 151 through a network interface or adapter 153. When used in a WAN networking environment, the personal computer 120 typically includes a modem 154 or other means for establishing communications over the wide area network 152, such as the Internet. The modem 154, which may be internal or external, is connected to the system bus 123 via the serial port interface 146. In a networked environment, program modules depicted relative to the personal computer 120, or portions thereof, may be stored in the remote memory storage device. It will be appreciated that the network connections shown are exemplary, and other means of establishing a communications link between the computers may be used.
II. System
In this section of the detailed description, a description of a computerized system according to an embodiment of the invention is provided. The description is provided by reference to
In embodiments of the invention, the multiple threads 202 are the set of threads that request to operate on the data structure 206 in a multi-threaded computing environment. Each of the multiple threads 202, such as representative thread A 212, thread B 214, and thread Z 216, is an executable task that is capable of updating the data structure 206. As those of ordinary skill in the art will appreciate, the multiple threads 202 may not be exclusively associated with the data structure 206. For example, one or more of the multiple threads 202 may also request to update other data structures existing in the multi-threaded computing environment.
The data structure 206 contains data that the multiple threads 202 may wish to modify. One exemplary data structure 206 is a timer queue that contains one or more timers used by a computing system to measure time intervals. The multiple threads 202 may request to set or clear one or more timers in the timer queue, for example.
In embodiments of the invention, each data structure 206 is associated with a lock 208. As known by those of ordinary skill in the art, each lock, such as the lock 208, is a software object utilized to lock a data structure, such as the data structure 206, to a thread, such as the thread A 212. The data structure 206 may be associated with a pointer. When the pointer points to the lock 208, the lock 208 is associated with the data structure 206.
The data structure 206 may also be associated with an Updated flag 210. The Updated flag 210 is used to indicate whether data in the data structure 206 needs to be modified. Any thread among the multiple threads 202 can access the Updated flag 210 and configure its value.
In some embodiments of the invention, the lock 208 may be implemented as a single bit and combined with the Updated flag 210. This implementation allows the lock 208 to be released and the Updated flag 210 to be tested as a single atomic operation.
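The single-word combination of the lock 208 and the Updated flag 210 may be sketched as follows. Python exposes no raw hardware atomics, so this sketch simulates a single atomic word with an internal guard lock; on real hardware each helper method would be one interlocked instruction (e.g., an atomic compare-and-swap or fetch-and-and). The class and method names are illustrative assumptions.

```python
import threading

LOCK_BIT = 0b01      # the lock implemented as a single bit
UPDATED_BIT = 0b10   # the Updated flag packed into the same word

class AtomicWord:
    """Simulates one machine word holding both the lock bit and the
    Updated flag, so release-and-test is a single atomic operation."""

    def __init__(self):
        self._value = 0
        self._guard = threading.Lock()  # stands in for hardware atomicity

    def try_lock(self):
        # Atomically acquire: set LOCK_BIT and clear UPDATED_BIT,
        # failing if the lock bit is already set.
        with self._guard:
            if self._value & LOCK_BIT:
                return False
            self._value = LOCK_BIT
            return True

    def set_updated(self):
        # Atomically set the Updated flag (pending update exists).
        with self._guard:
            self._value |= UPDATED_BIT

    def release_and_test(self):
        # Release the lock AND test the Updated flag as ONE atomic
        # operation, as described in the text above.
        with self._guard:
            was_updated = bool(self._value & UPDATED_BIT)
            self._value &= ~LOCK_BIT
            return was_updated

word = AtomicWord()
assert word.try_lock()             # become the temporary master thread
word.set_updated()                 # another thread posts a pending update
assert word.release_and_test()     # release observes the pending update
```

Performing the release and the test atomically closes the window in which a flag set between a separate release and a separate test could be missed by both the releasing thread and the next acquirer.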
In embodiments of the invention, the temporary master thread update mechanism 204 enables any one of the multiple threads 202, such as the thread A 212, to become a temporary master thread by acquiring the lock 208. If the thread A 212 succeeds in acquiring the lock 208, the thread A 212 becomes the temporary master thread for the data structure 206. The temporary master thread clears the Updated flag 210, processes pending updates on the data structure 206, and then releases the lock 208. The temporary master thread then checks the value of the Updated flag 210 to determine whether additional pending updates have been accumulated during the temporary master thread's processing of updates for the data structure 206. If the answer is YES, the temporary master thread will try to re-acquire the lock 208 to process the additional pending updates. If the temporary master thread cannot re-acquire the lock 208, another thread has acquired the lock 208 and has thus become the new temporary master thread.
The temporary master thread update mechanism 204 is used when multiple threads can update the data structure 206 in a concurrent fashion. The temporary master thread update mechanism 204 can be used when it does not matter exactly which thread processes updates for the data structure 206. Therefore, if the thread A 212 acquires the lock 208, the thread A 212 will process pending updates for the data structure 206, wherein the pending updates can be introduced by the thread A 212 itself, or by other threads among the multiple threads 202.
In an exemplary embodiment of the invention, the system 200 may operate in the following exemplary fashion. Assume that the thread A 212 among the multiple threads 202 desires to update data in the data structure 206. The thread A 212 first writes its desired updates to a shared memory so the updates are visible to other threads among the multiple threads 202. This visibility ensures that other threads may process the updates if the thread A 212 fails to acquire the lock 208 associated with the data structure 206. The thread A 212 then sets the Updated flag 210 to indicate that data in the data structure 206 needs to be updated. The thread A 212 then attempts to become the temporary master thread by trying to acquire the lock 208. If the thread A 212 fails to acquire the lock 208, some other thread has taken the lock 208 and thus has become the temporary master thread, and may process the updates made by the thread A 212. On the other hand, if the thread A 212 succeeds in acquiring the lock 208, the thread A 212 becomes the temporary master thread for the data structure 206. The thread A 212 then clears the Updated flag 210, processes all pending updates for the data structure 206, and releases the lock 208. The thread A 212 may also retest the Updated flag 210 to see if additional updates for the data structure 206 have been provided by other threads just before the lock 208 is released.
III. Process
In this section of the detailed description, a computerized process according to an embodiment of the invention is presented. This description is provided in reference to
Referring now to
Specifically, the process 300 starts by executing a routine 302 in which the update thread enters into a modify state to provide updates for data in the data structure.
Returning to
On the other hand, if the answer to the decision block 314 is YES, meaning that the update thread has acquired the lock associated with the data structure, the update thread thus becomes the temporary master thread for the data structure. The process 300 then enters into a routine 316 where the update thread enters into a process state to process any pending updates in the shared memory that are applicable to the data structure. As noted above, such pending updates could have been provided by the temporary master thread itself, or by any other thread that requests to update the data structure. That is, any thread among the multiple threads 202 illustrated in
Returning to
IV. Conclusion
Although specific embodiments have been illustrated and described herein, it will be appreciated by those of ordinary skill in the art that any arrangement which is calculated to achieve the same purpose may be substituted for the specific embodiments shown. This application is intended to cover any adaptations or variations of the present invention. Therefore, it is manifestly intended that this invention be limited only by the following claims and equivalents thereof.