The present disclosure relates generally to compilers, and more particularly, to methods and apparatus to preemptively compile an application.
Typically for a processor to run an application, a number of methods (i.e., functions for manipulating data) associated with the application may need to be compiled into instructions that the processor can execute. When the application is started in a current managed runtime environment (MRTE), methods of the application are converted to native code on an as-needed basis. In particular, a method is just-in-time (JIT) compiled when the first attempt to invoke the method is detected. Accordingly, the initial start-up time of an application is a function of compilation of the methods associated with that application.
When a user invokes an application, the processor will attempt to execute the first method of the application (e.g., a “Main( )” method in a C# program or a Java program). Typically, there is no warning prior to the user starting the application. Thus, the processor is forced to JIT compile the “main” method while the user waits because the main method has not been compiled to the native code for the processor to execute. The initial start-up time is further increased because the methods called by the main method are not compiled either. That is, the processor waits until those methods are actually called before compiling those methods. Especially for applications that start and/or stop frequently, the compilation time of an application may negatively affect the performance of the application and/or the platform operating the application.
Although the following discloses example systems including, among other components, software or firmware executed on hardware, it should be noted that such systems are merely illustrative and should not be considered as limiting. For example, it is contemplated that any or all of the disclosed hardware, software, and/or firmware components could be embodied exclusively in hardware, exclusively in software, exclusively in firmware or in some combination of hardware, software, and/or firmware.
The processor system 100 illustrated in
As is conventional, the memory controller 112 performs functions that enable the processor 120 to access and communicate with a main memory 130 including a volatile memory 132 and a non-volatile memory 134 via a bus 140. The volatile memory 132 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device. The non-volatile memory 134 may be implemented by flash memory, Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), and/or any other desired type of memory device.
The processor system 100 also includes a conventional interface circuit 150 that is coupled to the bus 140. The interface circuit 150 may be implemented using any type of well known interface standard such as an Ethernet interface, a universal serial bus (USB), a third generation input/output (3GIO) interface, and/or any other suitable type of interface.
One or more input devices 160 are connected to the interface circuit 150. The input device(s) 160 permit a user to enter data and commands into the processor 120. For example, the input device(s) 160 may be implemented by a keyboard, a mouse, a touch-sensitive display, a track pad, a track ball, an isopoint, and/or a voice recognition system.
One or more output devices 170 are also connected to the interface circuit 150. For example, the output device(s) 170 may be implemented by display devices (e.g., a light emitting display (LED), a liquid crystal display (LCD), or a cathode ray tube (CRT) display), a printer, and/or speakers. The interface circuit 150, thus, typically includes, among other things, a graphics driver card.
The processor system 100 also includes one or more mass storage devices 180 configured to store software and data. Examples of such mass storage device(s) 180 include floppy disks and drives, hard disk drives, compact disks and drives, and digital versatile disks (DVD) and drives.
The interface circuit 150 also includes a communication device such as a modem or a network interface card to facilitate exchange of data with external computers via a network. The communication link between the processor system 100 and the network may be any type of network connection such as an Ethernet connection, a digital subscriber line (DSL), a telephone line, a cellular telephone system, a coaxial cable, etc.
Access to the input device(s) 160, the output device(s) 170, the mass storage device(s) 180 and/or the network is typically controlled by the I/O controller 114 in a conventional manner. In particular, the I/O controller 114 performs functions that enable the processor 120 to communicate with the input device(s) 160, the output device(s) 170, the mass storage device(s) 180 and/or the network via the bus 140 and the interface circuit 150.
While the components shown in
In the example of
The method identifier 215 is configured to identify data associated with methods 210 associated with one or more applications to be executed by the processor system 100. As used herein “method” refers to one or more functions, routines, or subroutines for manipulating data. The identified data may be an uncompiled method and/or an identifier associated with the uncompiled method indicating where the uncompiled method is stored. The PCPQ 220 is configured to store data associated with the methods 210. For example, the PCPQ 220 may include methods 210 that are likely to be invoked soon by an application but have not been compiled to native code (i.e., at the time they were inserted into the PCPQ 220) for the processor 120 to execute. In another example, the PCPQ 220 may include identifiers corresponding to the methods 210 so that the methods 210 may be retrieved.
The PCT generator 230 is configured to generate at least one PCT 240. For example, the PCT 240 may be a kernel-level thread such that the operating system (OS) of the processor system 100 may schedule the PCT 240 to run on the processor 120. The processor 120 may be implemented by multiple physical processors and/or a single processor with multiple cores or multiple hardware threads (e.g., hyperthreading). Each physical processor, core, or hyperthread represents a logical processor. The PCT generator 230 may generate a number of PCTs 240 based on the number of logical processors available in the processor system 100. To maximize the number of methods 210 that are preemptively compiled, the PCT generator 230 may generate a number of PCTs 240 that is equivalent to the number of logical processors available to the processor system 100. For example, the processor system 100 may include one (1) physical processor with three (3) processing cores, which equates to three (3) logical processors. Accordingly, the PCT generator 230 may generate as many as three (3) PCTs 240 to maximize the number of methods 210 for preemptive compilation. To avoid monopolizing all the resources available and affecting other processes in the processor system 100, however, the PCT generator 230 may generate a number of PCTs 240 less than the number of logical processors available to the processor system 100. For example, the PCT generator 230 may generate a number of PCTs 240 corresponding to a number of idle logical processors in the processor system 100 (i.e., logical processors that are not currently being used and/or will not be used in the near future).
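The sizing choice described above may be sketched as follows. This is a minimal illustration only: the class name PctSizing and its methods are hypothetical and do not appear in the disclosure, and the count of busy logical processors is assumed to be supplied by the caller, because the disclosure does not specify how idle logical processors are detected.

```java
// Hypothetical helper; names are illustrative and not part of the disclosure.
final class PctSizing {
    private PctSizing() {}

    /** Aggressive policy: one PCT per logical processor. */
    static int maxPcts(int logicalProcessors) {
        return Math.max(0, logicalProcessors);
    }

    /**
     * Conservative policy: one PCT per idle logical processor.
     * busyProcessors is an assumed input; how it is measured is not specified.
     */
    static int conservativePcts(int logicalProcessors, int busyProcessors) {
        int idle = logicalProcessors - busyProcessors;
        return Math.max(0, Math.min(idle, logicalProcessors));
    }

    /** Number of logical processors the Java runtime reports for this machine. */
    static int logicalProcessorsOnThisMachine() {
        return Runtime.getRuntime().availableProcessors();
    }
}
```

For the three-core example above, maxPcts(3) yields three PCTs, while conservativePcts(3, 1) yields two when one logical processor is busy.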
The compiler 250 may be implemented by a just-in-time (JIT) compiler or a dynamic compiler configured to convert byte code (i.e., platform-independent code that may be sent to and/or run by any platform) of a method into native code (i.e., instructions that may be sent to and executed by the processor 120). For example, a program may be compiled by a Java compiler into byte code, which the compiler 250 may then convert into native code used on a Microsoft Windows® platform and/or a Mac OS platform.
To reduce the initial start-up time of an application executing in a managed runtime environment (MRTE), methods 210 called by the main method of the application are stored in the PCPQ 220 while the main method is being JIT compiled (i.e., when the first attempt to invoke the main method is detected). The PCPQ 220 stores methods 210 that are likely to be called soon by the application but have not been compiled into native code for the processor 120 to execute. In the example of
The PCT 240 generated by the PCT generator 230 may pull Method A from the PCPQ 300 for the compiler 250 to compile as shown in the third state (330) of the PCPQ 300. While Method A is being compiled by the compiler 250, the processor 120 may detect approaching calls to Methods E, F, and G. Accordingly, the processor 120 adds those methods to the beginning of the PCPQ 300 as shown in the fourth state (340) of the PCPQ 300. In particular, Method A invokes Methods E, F, and G, shown as E(A), F(A), and G(A), respectively. The compiler 250 then extracts and compiles Method E as shown in the fifth state (350) of the PCPQ 300. If Method E calls additional methods, then those additional methods are added to the beginning of the PCPQ 300 in a similar manner.
In the absence of method basic block flow information to indicate when a PCPQ entry (e.g., a method) may be executed, the PCPQ entry may be moved to the end of the PCPQ 300. Further, the PCPQ entry is moved from the beginning of the PCPQ 300 to the end of the PCPQ 300 when a method waiting in the middle of the PCPQ 300 is invoked. For example, Methods F and G may be moved to the end of the PCPQ 300 as shown in the sixth state (360) of the PCPQ 300 so that the compiler 250 may extract and compile Method B as shown in the seventh state (370) of the PCPQ 300. The moved PCPQ entries (i.e., Methods F and G) are retained in the PCPQ 300 because the moved PCPQ entries may be invoked in the future by the method that added them to the PCPQ 300 (i.e., Method A).
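The queue manipulations described above (extracting from the head, adding callees at the head, and moving entries from the head to the end) may be sketched with a double-ended queue. The class Pcpq and its method names are hypothetical, methods are represented by plain strings, and any initial queue contents used with it are assumed for illustration only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a PCPQ; not the disclosed implementation.
final class Pcpq {
    private final Deque<String> entries = new ArrayDeque<>();

    /** Appends an entry at the end of the queue. */
    void addLast(String method) {
        entries.addLast(method);
    }

    /** Adds callees at the head, preserving their call order (first callee ends up first). */
    void addCalleesAtHead(List<String> callees) {
        for (int i = callees.size() - 1; i >= 0; i--) {
            entries.addFirst(callees.get(i));
        }
    }

    /** Removes and returns the head entry, or null if the queue is empty. */
    String extractHead() {
        return entries.pollFirst();
    }

    /** Moves up to n entries from the head to the end; they remain queued. */
    void moveHeadToTail(int n) {
        int count = Math.min(n, entries.size());
        for (int i = 0; i < count; i++) {
            entries.addLast(entries.pollFirst());
        }
    }

    /** Returns a copy of the current queue order, head first. */
    List<String> snapshot() {
        return new ArrayList<>(entries);
    }
}
```

Starting from an assumed queue of A, B, C: extracting A, adding callees E, F, G at the head, extracting E, and then calling moveHeadToTail(2) leaves Method B at the head with F and G retained at the end, which mirrors the progression from the third state through the seventh state.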
During the processing of the PCT 240 shown in
In addition to processing method calls as described above, the preemptive compilation system 200 may also process virtual method calls. A virtual method call targets the root of an inheritance tree and may be translated at runtime into a call to a specific version of a virtual method in any class of the inheritance tree. In fact, the virtual method may be overridden in every class of the inheritance tree. When the compiler 250 detects that a method extracted into the PCT 240 is virtually invoked, the compiler 250 adds a virtual call entry to the head of the PCPQ 220. The virtual call entry includes a list of versions of the overridden virtual method that may be invoked by the virtual method call. The size of the virtual call entry may be minimized by 1) excluding methods that are already compiled, 2) excluding abstract methods, and/or 3) excluding methods in classes that currently have no objects in existence. When the compiler 250 processes the PCT 240 with a virtual call entry from the PCPQ 220, the compiler 250 compiles all of the methods listed in the virtual call entry. If an application execution thread invokes one of the virtual methods on the virtual call entry before that method has been compiled, the compiler 250 may remove the entire virtual call entry from the PCPQ 220.
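The three exclusion rules for minimizing a virtual call entry may be sketched as a simple filter. The types below are hypothetical: VirtualCallEntry, Version, and the liveInstances field are illustrative names, and the disclosure does not state how the runtime tracks whether a class currently has objects in existence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of building a minimized virtual call entry.
final class VirtualCallEntry {
    private VirtualCallEntry() {}

    /** One version of a virtual method somewhere in the inheritance tree. */
    static final class Version {
        final String name;
        final boolean compiled;    // native code already exists
        final boolean isAbstract;  // no body to compile
        final int liveInstances;   // assumed count of live objects of the declaring class

        Version(String name, boolean compiled, boolean isAbstract, int liveInstances) {
            this.name = name;
            this.compiled = compiled;
            this.isAbstract = isAbstract;
            this.liveInstances = liveInstances;
        }
    }

    /** Returns the names of the versions that remain after the three exclusions. */
    static List<String> candidates(List<Version> versions) {
        List<String> result = new ArrayList<>();
        for (Version v : versions) {
            if (v.compiled) continue;            // 1) already compiled
            if (v.isAbstract) continue;          // 2) abstract method
            if (v.liveInstances == 0) continue;  // 3) class has no objects in existence
            result.add(v.name);
        }
        return result;
    }
}
```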
After a method is compiled by the compiler 250, the native code associated with the method is stored in the native code cache 260. Accordingly, the processor 120 may retrieve the native code associated with the method from the native code cache 260 and execute the method without having to JIT compile the method when the method is invoked. Thus, the initial start-up time of the application is reduced.
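The effect of the native code cache on an invocation may be sketched as a lookup that compiles only on a miss. This is an assumption-laden illustration: NativeCodeCache, precompile, and lookupOrCompile are hypothetical names, native code is modeled as a string, and the compiler is modeled as a function supplied by the caller.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical model of the native code cache; not the disclosed implementation.
final class NativeCodeCache {
    private final Map<String, String> nativeCode = new ConcurrentHashMap<>();
    private final AtomicInteger compilations = new AtomicInteger();
    private final Function<String, String> compiler;

    NativeCodeCache(Function<String, String> compiler) {
        this.compiler = compiler;
    }

    /** Called ahead of time (e.g., by a PCT) so that a later invocation hits the cache. */
    void precompile(String method) {
        lookupOrCompile(method);
    }

    /** Called on invocation: returns cached native code, compiling only on a miss. */
    String lookupOrCompile(String method) {
        return nativeCode.computeIfAbsent(method, m -> {
            compilations.incrementAndGet();
            return compiler.apply(m);
        });
    }

    /** Reports whether native code for the method is already cached. */
    boolean isCompiled(String method) {
        return nativeCode.containsKey(method);
    }

    /** Total number of compilations performed so far. */
    int compilationCount() {
        return compilations.get();
    }
}
```

In this sketch, a method that was precompiled incurs no additional compilation when it is later invoked, which is the source of the start-up time reduction described above.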
Machine readable instructions that may be executed by the processor system 100 (e.g., via the processor 120) are illustrated in
In the example of
The processor 120 invokes a method (i.e., Method X) from within the currently executed method (i.e., the first method of the application such as the “Main( )” method) (block 450). The processor 120 determines whether Method X has been compiled and its corresponding native code stored in the native code cache 260 (block 460). If the native code associated with Method X is stored in the native code cache 260, the processor 120 proceeds to execute Method X (block 470). If the native code associated with Method X is not stored in the native code cache 260 (block 460), the processor 120 compiles Method X as described in
Referring to flow chart 500 in
The processor 120 then determines whether the extracted method (i.e., Method X) is currently being compiled or has been compiled (block 540). For example, if Method X is currently being compiled or has been compiled, the processor 120 returns to block 510 to monitor for another PCPQ entry to extract and to preemptively compile into native code. That is, the processor 120 continues to extract methods with the highest priority (i.e., most likely to be executed next by the processor 120) from the PCPQ 220 until a method that has not been compiled is encountered (block 540). If Method X is not currently being compiled and has not been compiled, the processor 120 calls the routine shown in
Referring to flow chart 600 in
Referring back to
To further optimize resources available, the processor system 100 may include a plurality of PCPQs to perform preemptive compilation for additional application threads. Referring to
To perform preemptive compilation of the methods associated with applications (i.e., Methods M, N, and P), the plurality of PCTs 720 may access the plurality of PCPQs 710 in a round-robin manner. That is, the plurality of PCTs 720 may begin by extracting a method to compile starting from the PCPQ #1 712, then from PCPQ #2 714, and finally from PCPQ #3 716. After extracting a method to compile from each of the plurality of PCPQs 710 (i.e., a “round”), the plurality of PCTs 720 return to the first PCPQ (i.e., PCPQ #1 712) to extract the next method to compile. Accordingly, the plurality of PCTs 720 extract other methods to compile from PCPQ #2 714 and then from PCPQ #3 716.
To illustrate this concept, PCPQ #1 712 may store methods associated with Method M, generally shown as Method A 732, Method B 734, Method C 736, and Method D 738. PCPQ #2 714 may store methods associated with Method N, generally shown as Method K 742, and Method L 744. PCPQ #3 716 may store methods associated with Method P, generally shown as Method R 752, Method S 754, and Method T 756. The plurality of PCTs 720 may extract methods associated with Method M, Method N, and Method P. In the example of
In the illustrated example of
The next PCT to complete compiling its corresponding method extracts and compiles the next available method in the plurality of PCPQs 710, which, in the illustrated example, is Method C 736 from PCPQ #1 712. For example, if PCT #3 726 completes compiling its corresponding method (i.e., Method R 752) before the other PCTs, then PCT #3 726 extracts and compiles Method C 736 from PCPQ #1 712 (7).
Again, the next PCT to complete compiling its corresponding method extracts and compiles the next available method in the plurality of PCPQs 710. For example, PCT #2 724 may finish compiling its corresponding method (i.e., Method K 742) before the other PCTs. Accordingly, PCT #2 724 may extract the next available method from the plurality of PCPQs 710. Because, in the illustrated example, all the methods stored in PCPQ #2 714 have been extracted by the plurality of PCTs 720, PCT #2 724 extracts the next available method from PCPQ #3 716, which is Method T 756 (8). Thus, the illustrated processor system 100 reduces the initial start-up time of applications by distributing uncompiled methods 210 of applications in a round-robin manner and preemptively compiling those methods before they are invoked.
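The round-robin extraction across multiple PCPQs, including the skipping of an exhausted queue, may be sketched as follows. The class RoundRobinPcpqs and its methods are hypothetical, methods are represented by their letter names, and the sketch models only the order in which methods are handed out, not which PCT finishes first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical round-robin selector over several PCPQs; not the disclosed implementation.
final class RoundRobinPcpqs {
    private final List<Deque<String>> queues = new ArrayList<>();
    private int next = 0; // index of the PCPQ to try on the next extraction

    /** Registers a PCPQ holding the given methods, head first. */
    synchronized void addQueue(List<String> methods) {
        queues.add(new ArrayDeque<>(methods));
    }

    /**
     * Returns the next method in round-robin order, skipping empty PCPQs.
     * Returns null once every PCPQ is empty. Synchronized so that several
     * PCTs may call it concurrently.
     */
    synchronized String extractNext() {
        for (int tried = 0; tried < queues.size(); tried++) {
            Deque<String> queue = queues.get(next);
            next = (next + 1) % queues.size();
            String method = queue.pollFirst();
            if (method != null) {
                return method;
            }
        }
        return null;
    }
}
```

With queues holding A, B, C, D; K, L; and R, S, T, successive calls hand out A, K, R, B, L, S, C, T, D. The seventh and eighth extractions (Method C and Method T) correspond to steps (7) and (8) above, where the empty PCPQ #2 is skipped.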
The methods and apparatus disclosed herein are particularly well suited for use in an MRTE. However, persons of ordinary skill in the art will appreciate that the teachings of the disclosure may be applied to perform preemptive compilation in other suitable environments.
Although certain example methods, apparatus, and articles of manufacture have been described herein, the scope of coverage of this patent is not limited thereto. On the contrary, this patent covers all methods, apparatus, and articles of manufacture fairly falling within the scope of the appended claims either literally or under the doctrine of equivalents.