The present disclosure relates generally to computer systems and information handling systems, and, more particularly, to a system and method for the execution of multithreaded software applications.
As the value and use of information continues to increase, individuals and businesses seek additional ways to process and store information. One option available to these users is an information handling system. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may vary with respect to the type of information handled; the methods for handling the information; the methods for processing, storing or communicating the information; the amount of information processed, stored, or communicated; and the speed and efficiency with which the information is processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include or comprise a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
A computer system or information handling system may include multiple processors and multiple front side buses (FSBs). Although each processor of the system is coupled to only one of the multiple front side buses, the processors may still contend for resources that must be shared across the system. One example of such a shared resource is the cache. If, for example, shared data resides in a cache associated with a first processor on a first front side bus, the operation of the system will be degraded by the access or invalidate operations that must be performed by processors residing on a different front side bus.
In accordance with the present disclosure, a system and method is disclosed for optimizing the execution of a software application or other code. A computing environment may include a number of processing elements, each of which is characterized by one or more processors coupled to a single front side bus. The software application is subdivided into a number of functionally independent processes. Each process is related to a functional task of the software. Each functional process is then further subdivided on a data parallelism basis into a number of threads that are each optimized to execute on separate blocks of data. The subdivided threads are then assigned for execution to a processing element such that all of the subdivided threads associated with a functional process are assigned to a single processing element, which includes a single front side bus.
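By way of illustration and not limitation, the following C sketch models the assignment described above. The names functional_process, NUM_PROCESSING_ELEMENTS, THREADS_PER_PROCESS, and the round-robin assignment are assumptions introduced only for this example and are not defined by the present disclosure.

/* Sketch: two-level decomposition and assignment of each functional
 * process, together with all of its data-parallel threads, to a single
 * processing element (one front side bus). Illustrative values only. */
#include <stdio.h>

#define NUM_PROCESSING_ELEMENTS 2   /* assumed: one front side bus each */
#define THREADS_PER_PROCESS     4   /* assumed degree of data parallelism */

struct functional_process {
    const char *name;           /* functionally independent task */
    int processing_element;     /* all of its data threads land here */
    int thread_count;           /* threads produced by data decomposition */
};

int main(void)
{
    /* Functional decomposition: independent tasks of the application. */
    struct functional_process procs[] = {
        { "render",  -1, THREADS_PER_PROCESS },
        { "physics", -1, THREADS_PER_PROCESS },
        { "audio",   -1, THREADS_PER_PROCESS },
    };
    int n = sizeof procs / sizeof procs[0];

    /* Assign every thread of a functional process to a single processing
     * element so that the threads share one front side bus. */
    for (int i = 0; i < n; i++) {
        procs[i].processing_element = i % NUM_PROCESSING_ELEMENTS;
        printf("%s: %d threads -> processing element %d\n",
               procs[i].name, procs[i].thread_count,
               procs[i].processing_element);
    }
    return 0;
}

The round-robin mapping shown here is only one possible policy; the disclosure requires only that all threads derived from one functional process be placed on one processing element.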
The system and method disclosed herein is technically advantageous because it reduces conflict and contention among the resources of the computing environment. Because the functionally distinct processes are separated among the processing elements, conflict among the processing elements is minimized, as the necessity for a processor of a first processing element to access the resources of a processor of a second processing element is reduced. The system and method disclosed herein is also technically advantageous because the decomposed data threads are distributed among the processors of a single processing element, thereby placing in one processing element all of the software code, and the data required by that code, that is likely to share the resources coupled to a single front side bus. Other technical advantages will be apparent to those of ordinary skill in the art in view of the following specification, claims, and drawings.
A more complete understanding of the present embodiments and advantages thereof may be acquired by referring to the following description taken in conjunction with the accompanying drawings, in which like reference numbers indicate like features, and wherein:
For purposes of this disclosure, an information handling system may include any instrumentality or aggregate of instrumentalities operable to compute, classify, process, transmit, receive, retrieve, originate, switch, store, display, manifest, detect, record, reproduce, handle, or utilize any form of information, intelligence, or data for business, scientific, control, or other purposes. For example, an information handling system may be a personal computer, a network storage device, or any other suitable device and may vary in size, shape, performance, functionality, and price. The information handling system may include random access memory (RAM), one or more processing resources such as a central processing unit (CPU) or hardware or software control logic, ROM, and/or other types of nonvolatile memory. Additional components of the information handling system may include one or more disk drives, one or more network ports for communication with external devices as well as various input and output (I/O) devices, such as a keyboard, a mouse, and a video display. The information handling system may also include one or more buses operable to transmit communications between the various hardware components.
An information handling system or computer system may include multiple processors and multiple front side buses. Software that executes on such a system may be parallelized across the multiple processors according to one of two models. In the data decomposition model, a single function is threaded so that it executes simultaneously and synchronously on two or more distinct blocks of data. The results of the simultaneous execution are later combined. Data decomposition is also known as data parallelism. The second model, known as functional decomposition, involves the execution of separate functional blocks on non-shared data in an asynchronous fashion. Functional decomposition is established and operates at a higher software level than data decomposition. Functional decomposition is also known as functional parallelism.
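By way of illustration and not limitation, the following C sketch contrasts the two models using POSIX threads. The worker names sum_block, log_task, and io_task are assumptions introduced only for this example: data decomposition runs one function on two blocks of data and combines the results, while functional decomposition runs distinct functions asynchronously on non-shared data.

/* Sketch: data decomposition vs. functional decomposition with pthreads.
 * Compile with: cc -pthread example.c */
#include <pthread.h>
#include <stdio.h>

#define N 8
static int data[N] = {1, 2, 3, 4, 5, 6, 7, 8};

struct slice { int start, end, sum; };

/* Data-parallel worker: sums one block of the shared task's data. */
static void *sum_block(void *arg)
{
    struct slice *s = arg;
    s->sum = 0;
    for (int i = s->start; i < s->end; i++)
        s->sum += data[i];
    return NULL;
}

/* Functionally distinct tasks operating on non-shared data. */
static void *log_task(void *arg) { (void)arg; puts("logging...");      return NULL; }
static void *io_task(void *arg)  { (void)arg; puts("handling I/O..."); return NULL; }

int main(void)
{
    /* Data decomposition: one function, two blocks of data, results combined. */
    pthread_t t1, t2;
    struct slice a = {0, N / 2, 0}, b = {N / 2, N, 0};
    pthread_create(&t1, NULL, sum_block, &a);
    pthread_create(&t2, NULL, sum_block, &b);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("combined sum: %d\n", a.sum + b.sum);

    /* Functional decomposition: separate functions, asynchronous, non-shared data. */
    pthread_t f1, f2;
    pthread_create(&f1, NULL, log_task, NULL);
    pthread_create(&f2, NULL, io_task, NULL);
    pthread_join(f1, NULL);
    pthread_join(f2, NULL);
    return 0;
}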
Shown in
A parallel application 12 executes in the computing environment 10. In operation, a compiler within the computing environment 10 separates the parallel application into multiple concurrent functional blocks, which are shown in
Following the decomposition of the application into multiple concurrent functional processes, the compiler next performs a data decomposition step to separate each functional process into multiple, parallel threads that each operate on different sets of data. As indicated in
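By way of illustration and not limitation, the following Linux-specific C sketch shows how the data-parallel threads of a single functional process might all be restricted to the processors of one processing element. The use of the GNU pthread_attr_setaffinity_np extension and the assumption that CPUs 0 and 1 sit behind the same front side bus are illustrative choices, not requirements of the present disclosure.

/* Sketch: pin every data thread of one functional process to the CPUs of
 * a single processing element. Compile with: cc -pthread example.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdio.h>

#define DATA_THREADS 4          /* assumed degree of data decomposition */

static void *data_thread(void *arg)
{
    long block = (long)arg;     /* the block of data this thread works on */
    printf("block %ld handled on CPU %d\n", block, sched_getcpu());
    return NULL;
}

int main(void)
{
    /* Assumed: CPUs 0 and 1 are the processors behind one front side bus. */
    cpu_set_t element0;
    CPU_ZERO(&element0);
    CPU_SET(0, &element0);
    CPU_SET(1, &element0);

    /* Every thread of this functional process inherits the same CPU set,
     * so all of its code and data stay on a single processing element. */
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setaffinity_np(&attr, sizeof element0, &element0);

    pthread_t tid[DATA_THREADS];
    for (long i = 0; i < DATA_THREADS; i++)
        pthread_create(&tid[i], &attr, data_thread, (void *)i);
    for (int i = 0; i < DATA_THREADS; i++)
        pthread_join(tid[i], NULL);
    pthread_attr_destroy(&attr);
    return 0;
}

Setting the affinity through the thread attributes before the threads start is assumed here so that every memory reference of the functional process is served by the cache and bus resources of a single processing element.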
Although
Shown in
Following the steps of
It should be recognized that the term software application is used herein to describe any form of software and should not be limited in its application to software code that executes on an operating system as a standalone application. Although the present disclosure has been described in detail, it should be understood that various changes, substitutions, and alterations can be made hereto without departing from the spirit and the scope of the invention as defined by the appended claims.