Virtual function tables are mechanisms used in programming languages to support run-time method binding, which associates a value (a sequence of bits) with an identifier (often tokens or symbols). A virtual function table for an object may contain the addresses of all dynamically bound methods associated with the object. Method calls may be performed by retrieving the method's address from the object's virtual function table. In some programming languages, a virtual function table may be created when source code is converted into object code.
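By way of illustration only, the table lookup described above can be sketched in C++ with a hand-rolled table. The names VTable, Animal, kCatVTable, call_speak, and call_legs are hypothetical and do not appear in the embodiments; a real compiler generates an equivalent structure automatically.

```cpp
#include <cassert>
#include <string>

struct Animal;

// Hypothetical hand-rolled virtual function table: one slot per dynamically
// bound method, each slot holding the address of the method.
struct VTable {
    std::string (*speak)(const Animal&);
    int (*legs)(const Animal&);
};

// Each object carries a pointer to the table of its class.
struct Animal {
    const VTable* vptr;
};

std::string cat_speak(const Animal&) { return "meow"; }
int four_legs(const Animal&) { return 4; }

// The table is a global constant, so it would reside in a data section.
const VTable kCatVTable = {&cat_speak, &four_legs};

// A method call retrieves the method's address from the object's table
// and then calls through that address.
std::string call_speak(const Animal& a) {
    assert(a.vptr != nullptr);
    return a.vptr->speak(a);
}

int call_legs(const Animal& a) {
    assert(a.vptr != nullptr);
    return a.vptr->legs(a);
}
```

In this sketch the caller never names cat_speak directly; the binding is resolved at run time through the table.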
At times, duplicate virtual function tables may be created. For example, when source code contains more than one class in a hierarchy (e.g., the class “animal” has the subclasses “cat,” “dog,” and “horse”), then the virtual function table that is created for each member of the class in the hierarchy may be identical. When object files (also known as object codes or objects) are linked into an executable binary file, each corresponding virtual function table, even if it is a duplicate of another virtual function table, may be linked into the executable binary file.
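One way duplicates may arise can be sketched as follows, under the assumption that a table is modeled as an ordered list of method symbol names and that a subclass table is built by copying the parent's slots and applying overrides. The function derive and the symbol names used with it are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Simplified model: slot index -> symbol name of the method bound to that slot.
using VTable = std::vector<std::string>;

// Builds the table of a subclass by copying the parent's slots and then
// replacing the slots that the subclass overrides.
VTable derive(const VTable& parent,
              const std::map<std::size_t, std::string>& overrides) {
    VTable table = parent;
    for (const auto& entry : overrides) {
        assert(entry.first < table.size());  // an override must target an existing slot
        table[entry.first] = entry.second;
    }
    return table;
}
```

Under this model, subclasses that override nothing (for example, "cat" and "dog") receive tables whose contents equal each other and the parent's, while a subclass that overrides a slot receives a different table.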
One or more embodiments of the present invention relate to a method for duplicate virtual function table removal. The method includes identifying, using a processor of a computer, a first virtual function table formed when a first source code is compiled into a first object code. The method further includes identifying, using the processor, a second virtual function table formed when a second source code is compiled into a second object code. The method further includes, independent of linking the first object code to a first executable binary code and the second object code to a second executable binary code, identifying, using the processor, that the first virtual function table and the second virtual function table are identical and, using the processor, deleting the second virtual function table.
One or more embodiments of the present invention relate to a system for duplicate virtual function table removal. The system includes a compiler configured to generate a first virtual function table within a first object code from a first source code and generate a second virtual function table within a second object code from a second source code. The system also includes an optimizer configured to identify that the first virtual function table and the second virtual function table are identical and delete the second virtual function table.
Other aspects of the invention will be apparent from the following description and the appended claims.
Specific embodiments of the invention will now be described in detail with reference to the accompanying figures. Like elements in the various figures are denoted by like reference numerals for consistency.
In the following detailed description of embodiments of the invention, numerous specific details are set forth in order to provide a more thorough understanding of the invention. However, it will be apparent to one of ordinary skill in the art that the invention may be practiced without these specific details. In other instances, well-known features have not been described in detail to avoid unnecessarily complicating the description.
For purposes of clarification, a user may be a computer programmer, a customer, a client, another computer program, or any other entity capable of using and/or creating a computer program associated with the removal of duplicate virtual function tables. Also, as shown in
In general, embodiments of the invention provide a method and system for the removal of duplicate virtual function tables. More specifically, one or more embodiments of the invention provide a method and system for finding duplicate virtual function tables during linking and removing the duplicate tables.
The programming architecture (145) may be a standalone program, a subprogram, or a program that operates in conjunction with other programs. The system on which the programming architecture (145) operates may be a plug-in of another system, an internet-based system, a network, a system that resides on a standalone desktop computer, or some other suitable system. One of ordinary skill in the art will appreciate that embodiments of the invention are not limited to the configuration shown in
In one or more embodiments of the invention, the programming architecture (145) includes a number of source codes (e.g., source code 1 (120), source code N (122)). Source codes (e.g., source code 1 (120), source code N (122)) may be a collection of statements and/or declarations written in a human-readable and/or object-oriented computer programming language. The source codes (e.g., source code 1 (120), source code N (122)) may be written in a complex programming language, including but not limited to D, Visual Basic®, Delphi®, C++, C#, Java®, and Perl®. (Visual Basic is a registered trademark of Microsoft Corporation of Redmond, Wash.; Delphi is a registered trademark of Embarcadero Technologies, Inc., of San Francisco, Calif.; Java is a registered trademark of Sun Microsystems of Santa Clara, Calif.; Perl is a registered trademark of The Perl Foundation of Ann Arbor, Mich.) Alternatively, the source codes (e.g., source code 1 (120), source code N (122)) may be written in a lower level programming language, including but not limited to machine code and assembly code. In one or more embodiments of the invention, the source codes (e.g., source code 1 (120), source code N (122)) enable the programmer to communicate with and instruct a computer. Each of the source codes (e.g., source code 1 (120), source code N (122)) may be a file or part of a file.
In one or more embodiments of the invention, the compiler (110) is configured to transform the source codes (e.g., source code 1 (120), source code N (122)) into object codes (e.g., object code 1 (124), object code N (126)). Each of the object codes (e.g., object code 1 (124), object code N (126)) may be written in a lower level programming language relative to the programming language in which its corresponding source code (e.g., source code 1 (120), source code N (122)) is written. Alternatively, each of the object codes (e.g., object code 1 (124), object code N (126)) may be written in a complex programming language relative to the programming language in which its corresponding source code (e.g., source code 1 (120), source code N (122)) is written. In one or more embodiments of the invention, the computer programming language in which the source codes (e.g., source code 1 (120), source code N (122)) are written is different than the computer programming language in which the object codes (e.g., object code 1 (124), object code N (126)) are written. The object codes (e.g., object code 1 (124), object code N (126)) may be in a binary format. Each of the object codes (e.g., object code 1 (124), object code N (126)) may be a file or part of a file.
In one or more embodiments of the invention, the compiler (110) is configured to transform the source codes (e.g., source code 1 (120), source code N (122)) into object codes (e.g., object code 1 (124), object code N (126)) to create an executable computer program. Each of the object codes (e.g., object code 1 (124), object code N (126)) may include a text section and a data section. In one or more embodiments of the invention, the text section of an object code (e.g., object code 1 (124), object code N (126)) contains executable instructions. Those skilled in the art will appreciate that the text section of an object code (e.g., object code 1 (124), object code N (126)) may also be known by other names, including but not limited to a code segment, a text segment, and text. The data section of an object code (e.g., object code 1 (124), object code N (126)) may include one or more global variables, such as virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)) (described below) that are determined by the user, by default, or by a suitable combination thereof.
The compiler (110) may be a program or set of programs. The compiler (110) may be a decompiler if the source codes (e.g., source code 1 (120), source code N (122)) are written in a lower level language and the object codes (e.g., object code 1 (124), object code N (126)) are written in a complex programming language. In one or more embodiments of the invention, each time that the compiler (110) transforms the source codes (e.g., source code 1 (120), source code N (122)) into object codes (e.g., object code 1 (124), object code N (126)), the compiler (110) also creates a corresponding virtual function table (e.g., virtual function table 1 (134), virtual function table N (136)). For a source code (e.g., source code 1 (120), source code N (122)) with multiple objects, the compiler (110) may generate a single virtual function table (e.g., virtual function table 1 (134), virtual function table N (136)) for all objects or one virtual function table (e.g., virtual function table 1 (134), virtual function table N (136)) for each object. In one or more embodiments of the invention, each virtual function table (e.g., virtual function table 1 (134), virtual function table N (136)) is embedded into the data section of the corresponding object code (e.g., object code 1 (124), object code N (126)).
The virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)) may be used by the compiler (110) to support run-time method binding. In one or more embodiments of the invention, the virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)) contain each address of a dynamically bound method within the object code (e.g., object code 1 (124), object code N (126)).
In one or more embodiments of the invention, the source codes (e.g., source code 1 (120), source code N (122)), the object codes (e.g., object code 1 (124), object code N (126)), and the virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)) may be located in the same location. Alternatively, the source codes (e.g., source code 1 (120), source code N (122)), the object codes (e.g., object code 1 (124), object code N (126)), and the virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)) may be located in different locations. A location may be defined as a computer, cache memory, a file, or some other suitable storage location.
In one or more embodiments of the invention, the linker (114) is configured to combine or link one or more object codes (e.g., object code 1 (124), object code N (126)) into a single executable program code. The linker (114) may be a computer program. The linker (114) may be configured to combine or link object codes (e.g., object code 1 (124), object code N (126)) that were generated by the compiler (110). Those skilled in the art will appreciate that the linker (114) may also be known by other names, including but not limited to a link editor, a loader, and a linkage editor.
In one or more embodiments of the invention, the optimizer (112) is configured to optimize object files (e.g., object code 1 (124), object code N (126)). Specifically, the optimizer (112) may be configured to identify and eliminate duplicate object codes (e.g., object code 1 (124), object code N (126)) before the linker (114) creates the single executable computer program. The optimizer (112) may perform one or more optimizations to identify and eliminate duplicate object codes (e.g., object code 1 (124), object code N (126)), including but not limited to analyzing the naming schemes of the virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)), analyzing certain characteristics (e.g., size, contents, symbols referenced) of the virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)), and distinguishing similar patterns in data sections of the virtual function tables (e.g., virtual function table 1 (134), virtual function table N (136)). The optimizer (112) may also read in all object codes (e.g., object code 1 (124), object code N (126)) to identify the virtual function table (e.g., virtual function table 1 (134), virtual function table N (136)) in each corresponding object code (e.g., object code 1 (124), object code N (126)). In one or more embodiments of the invention, the optimizer (112) may have access to every virtual function table (e.g., virtual function table 1 (134), virtual function table N (136)) associated with a computer program.
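As a non-limiting sketch, reading in all object codes and locating virtual function tables by naming scheme might look like the following. The types Symbol and ObjectCode and the functions shown are hypothetical simplifications. The "_ZTV" prefix is assumed here as one example of a naming scheme (it is the prefix used for virtual table symbols under the Itanium C++ ABI name mangling); other toolchains use other schemes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of a symbol in the data section of an object code.
struct Symbol {
    std::string name;
    std::size_t size;  // size in bytes
};

// Hypothetical, simplified view of one object code.
struct ObjectCode {
    std::string file;
    std::vector<Symbol> data_section;
};

// Assumed naming scheme: virtual function table symbols begin with "_ZTV".
bool has_vtable_name(const std::string& name) {
    return name.compare(0, 4, "_ZTV") == 0;
}

// Strips the assumed prefix, leaving the encoded class name.
std::string encoded_class(const std::string& name) {
    assert(has_vtable_name(name));  // only meaningful for table symbols
    return name.substr(4);
}

// Reads every object code and collects the symbols whose names match the scheme.
std::vector<Symbol> collect_vtables(const std::vector<ObjectCode>& objects) {
    std::vector<Symbol> found;
    for (const ObjectCode& object : objects) {
        for (const Symbol& symbol : object.data_section) {
            if (has_vtable_name(symbol.name)) {
                found.push_back(symbol);
            }
        }
    }
    return found;
}
```

A name match only nominates candidates; whether two candidates are duplicates would still be decided from their characteristics, such as size and contents.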
In one or more embodiments of the invention, the instructions (116) are configured to coordinate the components of the programming architecture (145) in removing duplicate virtual function tables. Specifically, the instructions (116) may be configured to select the computer programming language for the source codes (e.g., source code 1 (120), source code N (122)) and for the object codes (e.g., object code 1 (124), object code N (126)). The instructions (116) may be configured to interpret and translate each of the source codes (e.g., source code 1 (120), source code N (122)) in the computer programming language of the source codes (e.g., source code 1 (120), source code N (122)) into the computer programming language of the object codes (e.g., object code 1 (124), object code N (126)) to create the object codes (e.g., object code 1 (124), object code N (126)). The instructions (116) may be configured to direct the timing and scope of operation of the compiler (110), the optimizer (112), and the linker (114). For example, the instructions (116) may initiate the linker (114) to instantaneously link the first one hundred object codes (e.g., object code 1 (124), object code N (126)) in the programming architecture (145). In one or more embodiments of the invention, the instructions (116) are set or modified by the user.
Referring to
Object code (or, simply, an object) may be a sequence of computer instructions in a programming format. In a computer language where each object is created from a class, an object is called an instance of that class. Each object has a virtual function table associated with the class, and two objects with the same class have a duplication of at least a portion of the virtual function table for each object. Creating an instance of a class is sometimes referred to as instantiating the class. In other words, the virtual function table that is created for each member of the class in the hierarchy, or the portion(s) of the single virtual function table associated with each member of the class in the hierarchy, is identical. In one or more embodiments of the invention, a computer, as described with respect to
In Step 210, a second virtual function table is identified when a second source code is compiled into a second object code. The second source code may be compiled into the second object code by the compiler. In one or more embodiments of the invention, compiling the second source code into the second object code includes creating and/or identifying the second virtual function table. The second virtual function table may be located in a data section of the second object code. In one or more embodiments of the invention, a computer, as described with respect to
In Step 215, the first virtual function table and the second virtual function table are identified as being identical at link time. In one or more embodiments of the invention, only specific portions of a virtual function table are considered when identifying that the first virtual function table and the second virtual function table are identical. Such specific portions of a virtual function table may include, but are not limited to, a naming scheme in a symbol table, characteristics of the virtual function table (e.g., size of the virtual function table, contents of the virtual function table, symbols that the virtual function table references), and patterns in the rest of the data section. Such specific portions of the virtual function table may be determined by default, by a user, or by a suitable combination thereof. Alternatively, all portions of a virtual function table may be considered when identifying that the first virtual function table and the second virtual function table are identical.
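The comparison of Step 215 over selected portions can be sketched as below. This is a minimal illustration under stated assumptions, not the claimed implementation: VTableInfo, Criteria, and identified_as_identical are hypothetical names, and only two of the listed characteristics (size and contents) are modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical summary of one virtual function table.
struct VTableInfo {
    std::string symbol_name;            // name in the symbol table
    std::size_t size;                   // size of the table in bytes
    std::vector<std::string> contents;  // method symbols referenced, in slot order
};

// Which portions to consider; set by default, by a user, or by a combination.
struct Criteria {
    bool compare_size = true;
    bool compare_contents = true;
};

// Returns true when the selected portions of the two tables match.
bool identified_as_identical(const VTableInfo& first, const VTableInfo& second,
                             const Criteria& criteria) {
    // At least one portion must be selected, or every pair would match.
    assert(criteria.compare_size || criteria.compare_contents);
    if (criteria.compare_size && first.size != second.size) {
        return false;
    }
    if (criteria.compare_contents && first.contents != second.contents) {
        return false;
    }
    return true;
}
```

With the default criteria, two tables match only if both size and contents agree. Relaxing the criteria widens what is treated as identical, which is why the choice of portions matters before any table is deleted.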
Continuing with Step 215, in one or more embodiments of the invention, the first virtual function table and the second virtual function table are identified as identical at a point in time when the first object code and the second object code are linked into an executable binary file. The first virtual function table and the second virtual function table may be identified as identical at a point in time preceding linking the first object code and the second object code. In one or more embodiments of the invention, the first virtual function table and the second virtual function table, or specific portions thereof, are similar in order to be identified as identical. Specifically, similarity between the first virtual function table and the second virtual function table may be based on user-defined criteria, default settings, or a suitable combination thereof. In one or more embodiments of the invention, a computer, as described with respect to
In Step 220, the second virtual function table is deleted. In one or more embodiments of the invention, a computer, as described with respect to
The following scenario describes a method to remove duplicate virtual function tables in accordance with one or more embodiments described above. In this example, a hierarchy is established where a parent class is Shape, and the classes Square and Circle are subclasses that are derived from parent class Shape.
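For orientation, a hierarchy of this kind might be written in C++ as sketched below. The method names describe and dimensions, the helper describe_all, and the assumption that neither subclass overrides a method are illustrative only and are not taken from the scenario.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Parent class with dynamically bound (virtual) methods.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string describe() const { return "generic shape"; }
    virtual int dimensions() const { return 2; }
};

// Assumption for illustration: the subclasses override nothing, so every
// slot in their tables would bind to the same Shape methods.
struct Square : Shape {};
struct Circle : Shape {};

// Calls describe() through the base class; each call is resolved at run
// time through the table of the object's class.
std::vector<std::string> describe_all(
        const std::vector<std::unique_ptr<Shape>>& shapes) {
    std::vector<std::string> out;
    for (const auto& shape : shapes) {
        assert(shape != nullptr);
        out.push_back(shape->describe());
    }
    return out;
}
```

Whether a particular toolchain emits byte-for-byte identical tables for Square and Circle depends on what else the table holds (for example, type information), which is one reason only selected portions of a table may be compared.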
As mentioned previously, a virtual function table may not need to be an exact match with another virtual function table to be considered a duplicate. For example, if the virtual function table in
The optimizer may operate concurrently with the linking process, as in this example. The optimizer may also operate prior to the formation of the executable binary file. The operation of the optimizer may be triggered by a linking process. The operation of the optimizer may also be triggered by some other event or a passage of time, whether defined by the user or set by default. As with
In one or more embodiments of the invention, implementation of removing duplicate virtual function tables reduces the amount of storage space needed on a storage medium and/or in cache memory. In addition, since the compiler and/or linker may have a limited queue, removing duplicate virtual function tables may create more room in the queue of the compiler and/or linker. Further, removing duplicate virtual function tables prior to execution of the executable binary may reduce the execution time.
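The storage effect of deleting duplicates can be sketched as follows. The sketch assumes that a table is modeled as an ordered list of method symbol names and that each class using a deleted table is redirected to the retained copy; the redirection step is an assumption of this illustration. Program and remove_duplicates are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Simplified model: a table is the ordered list of method symbols it holds.
using VTable = std::vector<std::string>;

struct Program {
    std::vector<VTable> tables;               // tables that would be linked
    std::map<std::string, std::size_t> uses;  // class name -> index into tables
};

// Keeps the first copy of each distinct table, deletes later duplicates,
// and redirects every class to the retained copy.
Program remove_duplicates(const Program& in) {
    Program out;
    std::map<VTable, std::size_t> kept;  // table contents -> index in out.tables
    std::vector<std::size_t> remap(in.tables.size());
    for (std::size_t i = 0; i < in.tables.size(); ++i) {
        auto it = kept.find(in.tables[i]);
        if (it == kept.end()) {
            it = kept.emplace(in.tables[i], out.tables.size()).first;
            out.tables.push_back(in.tables[i]);
        }
        remap[i] = it->second;
    }
    for (const auto& use : in.uses) {
        assert(use.second < remap.size());  // every use must name an existing table
        out.uses[use.first] = remap[use.second];
    }
    return out;
}
```

In this model, a program with three tables of which two are identical is reduced to two tables, and the two classes that shared identical tables refer to the same retained copy.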
Embodiments of the invention may be implemented on virtually any type of computer regardless of the platform being used. For example, as shown in
Further, those skilled in the art will appreciate that one or more elements of the aforementioned computer system (700) may be located at a remote location and connected to the other elements over a network. Further, embodiments of the invention may be implemented on a distributed system having a plurality of nodes, where each portion of the invention (e.g., data compression module, data decompression module) may be located on a different node within the distributed system. In one embodiment of the invention, the node corresponds to a computer system. Alternatively, the node may correspond to a processor with associated physical memory. The node may alternatively correspond to a processor with shared memory and/or resources. Further, software instructions to perform embodiments of the invention may be stored on a computer readable medium such as a compact disc (CD), a diskette, a tape, or any other physical computer readable storage device.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this disclosure, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.