This application is the U.S. National Stage of PCT/FR2015/050183, filed Jan. 27, 2015, which in turn claims priority to French Patent Application No. 1450647, filed Jan. 27, 2014; the entire contents of both applications are incorporated herein by reference.
The invention relates to a method for disambiguating code and particularly executable code.
Executable code for the purposes of this application refers to any sequence of bytes that can be loaded for execution by an operating system. This relates particularly, but not exclusively, to the ELF (Executable and Linkable Format) and PE (Portable Executable) formats. In general, it also relates to all so-called executable files and so-called library files that form executable code.
Some applications require several libraries to be used in parallel. Each library exports functions that can be used by other libraries or functions that reference it. These referencing libraries are then said to be dependent on the library. Each exported function has a name that identifies it.
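As an illustration of the two formats named above, the following minimal sketch distinguishes an ELF file from a PE file by its leading signature bytes. The function name `detect_format` is hypothetical and is not part of the described method; only the signature values come from the public format specifications.

```python
import struct

def detect_format(data: bytes) -> str:
    """Return 'ELF', 'PE' or 'unknown' for the leading bytes of a file."""
    # An ELF file starts with the four bytes 0x7F 'E' 'L' 'F'.
    if data[:4] == b"\x7fELF":
        return "ELF"
    # A PE file starts with a DOS stub ("MZ"); the 32-bit little-endian
    # value at offset 0x3C gives the position of the "PE\0\0" signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\0\0":
            return "PE"
    return "unknown"
```

A real tool would read only the first bytes of the file from disk before deciding how to parse the remainder.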
Problems arise in at least the following cases:
When such a problem is detected in the state of the art, the source code will be “refactored” if possible. This means that another development cycle will be performed to solve naming ambiguities. In the example mentioned, the source code of one of the versions of library libC will have to be changed to rename function fc. This requires that:
A new refactorisation is also an expensive operation because it involves another validation of all impacted code, in other words all unit and/or functional tests have to be repeated to check that there is no impact on the behaviour of the application using the modified code.
This therefore involves a large workload that is difficult to automate.
The invention is intended to overcome all or some of the disadvantages of the prior art identified above, and in particular to propose means of adapting versions of a library without needing to access or modify its source code.
The invention uses a configuration file that specifies symbol renaming instructions. An instruction is used to select one or several symbols in a symbols table and to specify a renaming rule for the selected symbols. For example, symbols are selected as a function of their nature (exported or imported) and/or as a function of their name. In the same way, the configuration file contains dependency renaming instructions.
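One possible in-memory representation of such a configuration file is sketched below. The line-oriented syntax (`target nature mask rule`), the field names and the `{name}` placeholder are all assumptions made for illustration; the source does not fix a concrete file format.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class RenamingInstruction:
    target: str  # "symbol" or "dependency" (hypothetical designation codes)
    nature: str  # "exported", "imported" or "any"
    mask: str    # selection mask applied to the name
    rule: str    # renaming rule, written here as a str.format template

def parse_config(text: str) -> List[RenamingInstruction]:
    """Parse one instruction per non-empty, non-comment line."""
    instructions = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        target, nature, mask, rule = line.split()
        instructions.append(RenamingInstruction(target, nature, mask, rule))
    return instructions

config = parse_config("""
# target    nature    mask  rule
symbol      exported  add   {name}_v2
dependency  any       libC  {name}.2
""")
```

Here the first instruction selects the exported symbol `add`, and the second selects the dependency `libC`.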
One aspect of the invention to achieve this purpose is related to a method of disambiguating a file of executable code comprising a symbols table, characterised in that it includes the following steps:
Apart from the main characteristics that were mentioned in the above paragraph, the method/device according to the invention can have one or several additional characteristics among the following, considered individually or in any possible technical combination
The invention also relates to a digital storage device storing a program that can be executed by a machine and composed of instructions that execute the method according to a combination of the previously mentioned characteristics.
Other characteristics and advantages of the invention will become clear after reading the description given below with reference to the appended figures that illustrate:
Identical or similar elements are assigned the same reference marks on all figures, to make them easier to understand.
The invention will be better understood after reading the following description and studying the accompanying figures. All the figures are given for guidance and in no way limit the scope of the invention.
The executable file 101 comprises a header zone 101. The header describes the file structure. The header 101 contains several fields including offsets of different sections of the executable file 101. In particular, the first field 102 in the header 101 gives the offset of a symbols table section 103. An offset is conventionally a distance in bytes from the beginning of the file or the end of the header.
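The role of the header offset can be sketched as follows. The layout used here (a single 4-byte little-endian offset as the first header field, counted from the beginning of the file) is a simplifying assumption and does not correspond to the actual ELF or PE header layout.

```python
import struct

def symbols_table_offset(file_bytes: bytes) -> int:
    """Read the symbols table offset from the first header field."""
    # Assumption: the field is an unsigned 32-bit little-endian integer
    # measured from the beginning of the file.
    (offset,) = struct.unpack_from("<I", file_bytes, 0)
    return offset

# Toy file image: a 16-byte header followed by a marker for the table.
image = struct.pack("<I", 16) + b"\0" * 12 + b"SYMS"
start = symbols_table_offset(image)
table = image[start:]
```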
A symbol is a reference to a code or data area in an executable file. A symbol is a means of describing an area to be read or written.
The symbols table section 103 is usually structured as a table, each row of the table corresponding to one symbol. A row comprises at least three columns:
The symbol name is used as an entry to the symbols table, in other words as an identifier of the symbol when the executable file 101 is executed or when the executable file 101 is used for link editing in an executable code production process.
The type specifies whether the symbol is imported or exported. The symbol is exported if it can be used by an executable file other than the file that contains the symbols table. The symbol is imported if the corresponding code or data zone has to be fetched from an executable file other than the file that contains the symbols table.
The symbol offset column 106 contains an offset if the symbol is exported. If the symbol is imported, the offset column comprises:
In a simplified manner, a dependencies table is an indexed list of executable file names. Knowledge of the dependencies table offset, for example through a header field 101, and knowledge of an index through the symbols table offset column 106, provides access to the name of the executable element.
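The relationship between a symbols table row and the dependencies table can be sketched as follows. The class and function names are hypothetical, and the sketch assumes that the offset column of an imported symbol holds an index into the dependencies table.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Symbol:
    name: str    # identifier used at execution or link editing
    kind: str    # "exported" or "imported"
    offset: int  # code/data offset if exported; dependency index if imported

def providing_file(symbol: Symbol, dependencies: List[str]) -> Optional[str]:
    """Return the name of the file expected to provide an imported symbol."""
    if symbol.kind == "imported":
        return dependencies[symbol.offset]
    return None  # exported symbols are defined in the file itself

dependencies = ["libA.so", "libB.so"]
imported_add = Symbol("add", "imported", 1)
exported_main = Symbol("main", "exported", 0x400)
```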
The method according to the invention is implemented by a processing computer 301 (shown in
When an action is done by a microprocessor or a computer, the action is performed by the microprocessor controlled by instruction codes loaded into a working memory.
A disambiguation configuration file comprises:
A renaming instruction applies either to a symbol or to a dependency. Thus, a renaming instruction has the following structure:
Designation codes are arbitrary values depending on whether it is required to rename a symbol or a dependency, for example:
A mask may for example comply with one syntax among the following syntaxes, for all renaming functions:
This list of example syntaxes is not limiting.
The case of the regular expression is interesting because it can be used to capture parts of a string that can be used to construct a new name for the symbol.
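A minimal sketch of this capture mechanism, using Python's `re` module, is given below. The mask, the group name `base` and the `_impl`/`_v2` suffixes are illustrative assumptions, not values taken from the source.

```python
import re

# Mask: capture everything before a trailing "_impl" in a group named "base".
mask = re.compile(r"^(?P<base>\w+)_impl$")
# Rule: rebuild the name from the captured part plus a version suffix.
rule = r"\g<base>_v2"

def rename(name: str) -> str:
    """Apply the rule if the mask selects the name, else keep the name."""
    return mask.sub(rule, name) if mask.match(name) else name

new_name = rename("add_impl")  # "add_v2"
unchanged = rename("sub")      # "sub", not selected by the mask
```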
If it is required to use several selection mask syntaxes in the same disambiguation configuration file, then information has to be added to the structure of the renaming instruction, this added information being code that specifies how the mask should be interpreted.
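Such an interpretation code can be handled by a simple dispatch, as in the sketch below. The three codes `exact`, `glob` and `regex` are hypothetical labels chosen for this example.

```python
import fnmatch
import re

def mask_matches(kind: str, mask: str, name: str) -> bool:
    """Interpret the mask according to the code carried by the instruction."""
    if kind == "exact":
        return name == mask
    if kind == "glob":
        # Shell-style wildcards, case-sensitive.
        return fnmatch.fnmatchcase(name, mask)
    if kind == "regex":
        return re.fullmatch(mask, name) is not None
    raise ValueError(f"unknown mask kind: {kind!r}")
```

Raising an error on an unknown code, rather than silently skipping the instruction, makes mistakes in the configuration file visible early.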
For example, a branch rule may be:
In step 201, the computer 301 loads a disambiguation configuration file. The computer 301 uses this file to know:
The processing computer goes on from step 201 to a step 202 to load the symbols list. The case in which the symbols table is also a dependencies table is considered, to simplify the description. In the loading step 202, the computer 301 accesses the symbols table of the file 306 to be processed through the header of this file.
Once the symbols table has been loaded, the computer 301 performs a step 203 to analyse each symbol to determine if one of the renaming instructions applies to the symbol. The analysis proceeds as follows:
These pseudo-instructions are executed for each symbol in the symbols table, and if applicable for each dependency in the dependencies table.
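The per-symbol analysis can be sketched as a loop over the table. This sketch assumes glob-style masks, a `{name}` template as the renaming rule, and a policy in which the first matching instruction wins; none of these choices is dictated by the source.

```python
import fnmatch

def analyse(symbols, instructions):
    """Return the new names; instructions are (mask, template) pairs."""
    renamed = []
    for name in symbols:
        for mask, template in instructions:
            if fnmatch.fnmatchcase(name, mask):
                name = template.format(name=name)
                break  # assumption: the first matching instruction wins
        renamed.append(name)
    return renamed

instructions = [("add", "{name}_v1"), ("s*", "{name}_v1")]
result = analyse(["add", "sub", "main"], instructions)
# result == ['add_v1', 'sub_v1', 'main']
```

The same loop can be applied to the entries of a dependencies table with a separate list of dependency instructions.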
The computer 301 goes on from step 203, in which each symbol is analysed, to a step 204 in which the file, modified according to the disambiguation configuration loaded in step 201, is saved under a file name that was also loaded in step 201.
In one variant of the invention, symbols to be processed are selected depending on whether they are imported or exported. In this case, the structure of a renaming instruction contains a code specifying whether it applies to exported symbols, imported symbols, or to both.
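This filter can be sketched in a few lines. The scope codes `exported`, `imported` and `both` are hypothetical spellings of the code described above.

```python
def instruction_applies(scope: str, nature: str) -> bool:
    """scope: 'exported', 'imported' or 'both'; nature: 'exported' or 'imported'."""
    return scope in ("both", nature)

symbols = [("add", "exported"), ("printf", "imported")]
# Keep only the symbols that an instruction limited to exports would process.
selected = [name for name, nature in symbols
            if instruction_applies("exported", nature)]
```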
One example considers a branch instruction of the following type:
This means for example that the new symbol name will be the old name+v+first renaming parameter. The first renaming parameter supplied in the disambiguation configuration file may for example be a version number of the file being processed.
In one integrated variant of the invention, only a single renaming instruction is used for all exported symbols. This rule is to rename all exported symbols, taking account of the version of the processed file. This version is obtained as follows depending on the user's choice, without being limiting.
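The construction "old name + v + first renaming parameter" can be sketched as a template expansion. The placeholder syntax (`{name}`, `{p1}`, `{p2}`, and so on) is an assumption made for this example and is not the notation used in the configuration file of the source.

```python
from typing import List

def expand(template: str, old_name: str, params: List[str]) -> str:
    """Build the new name from the old name and the renaming parameters."""
    # Parameters are exposed to the template as p1, p2, ...
    fields = {f"p{i}": p for i, p in enumerate(params, start=1)}
    return template.format(name=old_name, **fields)

new_name = expand("{name}v{p1}", "fc", ["2"])  # "fcv2"
```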
In this integrated variant, the file resulting from the processing is a file named according to a version number. For example, this name is Filename.version.extension, in which:
If this name is the same as the name of the file to be processed, then either the file resulting from the processing replaces the original file, or the file resulting from the processing is saved in another location, in other words in another directory.
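The naming scheme and the collision case can be sketched with `pathlib`. The function name, its parameters and the choice of saving to another directory (rather than overwriting the original) are assumptions for illustration.

```python
from pathlib import Path

def output_path(source: Path, base: str, version: str, other_dir: Path) -> Path:
    """Return where to save the processed file, named base.version.extension."""
    target = source.with_name(f"{base}.{version}{source.suffix}")
    if target == source:
        # Same name as the file to be processed: this sketch saves the
        # result in another directory instead of replacing the original.
        return other_dir / target.name
    return target

first = output_path(Path("lib/libC.so"), "libC", "2", Path("out"))
second = output_path(Path("lib/libC.2.so"), "libC", "2", Path("out"))
```

In the first call the result is saved next to the original as `libC.2.so`; in the second call the computed name collides with the input, so the result goes to the `out` directory.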
In the integrated variant, renaming instructions can be integrated into the executable code corresponding to the invention. In this case they are loaded at the same time as the executable code corresponding to the invention.
Thus, considering a file E that imports:
The following disambiguation operations are performed:
In the illustrative example given above, the fact that the names of LibA and LibB are changed and therefore that E is processed, is optional. It is described to help illustrate the invention.
Thus, as seen from file E, ambiguities related to calls to the “add” function have been removed. Therefore the objectives of the invention have been achieved.
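The effect on file E can be sketched with plain dictionaries. The version numbers, the `_v<version>` symbol suffix and the `<library>.<version>` dependency names used here are assumptions chosen for illustration and are not the exact names used in the example above.

```python
def disambiguate(exports, imports, version_of):
    """exports: {lib: [symbol]}; imports: [(symbol, lib)]; version_of: {lib: version}."""
    # Rename every exported symbol and every library according to its version.
    new_exports = {
        f"{lib}.{version_of[lib]}": [f"{s}_v{version_of[lib]}" for s in symbols]
        for lib, symbols in exports.items()
    }
    # Rewrite the importing file so that it refers to the new names.
    new_imports = [
        (f"{s}_v{version_of[lib]}", f"{lib}.{version_of[lib]}")
        for s, lib in imports
    ]
    return new_exports, new_imports

exports = {"LibA": ["add"], "LibB": ["add"]}
imports_of_E = [("add", "LibA"), ("add", "LibB")]
new_exports, new_imports = disambiguate(
    exports, imports_of_E, {"LibA": "1", "LibB": "2"}
)
```

After processing, the two imports of E carry distinct symbol names, so a call to one can no longer be resolved against the other.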
The digital storage device storing a program that can be executed by a machine and is composed of instructions for performing the method according to the invention has been described as a hard disk, but it could obviously be any other removable or non-removable medium. For example, it could be a USB key, a CD, a DVD or a “Blu-ray Disc”. This list is not limiting.
Number | Date | Country | Kind |
---|---|---|---|
14 50647 | Jan 2014 | FR | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/FR2015/050183 | 1/27/2015 | WO | 00 |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2015/110771 | 7/30/2015 | WO | A |
Number | Date | Country |
---|---|---|
20160350092 A1 | Dec 2016 | US |