The present specification relates to protection of integrated circuits, and more particularly, to fast FPGA reverse engineering for hardware metering and fingerprinting.
Field programmable gate arrays (FPGAs) are an integral part of many computing and data processing systems. FPGAs may be used for prototyping and for fielding integrated circuit (IC) systems. In order to lower design cost and decrease time to market, FPGA designers often rely on third party intellectual property that can be precompiled. In particular, third party firmware may be used to program an FPGA to operate in a particular manner.
However, by receiving third party firmware in binary form, it may be difficult for a user to detect any malicious functionality that may be hidden inside. For example, potentially hidden Trojan circuitry may leak sensitive information, allow control of the system to be forfeited, or disable the system. Accordingly, it is desirable to be able to quickly and easily reverse engineer firmware to ensure that the firmware does what it is supposed to do, and does not include any additional malicious or unexpected functionality.
In an embodiment, an apparatus may include a processor configured to synthesize a first configuration file associated with a target field-programmable gate array (FPGA), and a second configuration file associated with the target FPGA, wherein first look-up-table (LUT) bits of the first configuration file are the logical inverse of second LUT bits of the second configuration file, and first non-LUT bits of the first configuration file are the same as second non-LUT bits of the second configuration file; and generate a LUT mask indicating which bits of the first configuration file and the second configuration file correspond to the first LUT bits and the second LUT bits by performing a bit-wise exclusive OR operation between the first configuration file and the second configuration file.
In another embodiment, a method may include synthesizing a first configuration file associated with a target field-programmable gate array (FPGA), and a second configuration file associated with the target FPGA, wherein first look-up-table (LUT) bits of the first configuration file are the logical inverse of second LUT bits of the second configuration file, and first non-LUT bits of the first configuration file are the same as second non-LUT bits of the second configuration file; and generating a LUT mask indicating which bits of the first configuration file and the second configuration file correspond to the first LUT bits and the second LUT bits by performing a bit-wise exclusive OR operation between the first configuration file and the second configuration file.
The embodiments set forth in the drawings are illustrative and exemplary in nature and not intended to limit the disclosure. The following detailed description of the illustrative embodiments can be understood when read in conjunction with the following drawings, where like structure is indicated with like reference numerals and in which:
The embodiments disclosed herein are directed to FPGA reverse engineering for hardware metering and fingerprinting. An FPGA may be programmed with firmware comprising a bit file. In embodiments, this file may be referred to as firmware, a bit file, or a configuration file. If a bit file is provided by a third party, it can be difficult to ensure that the bit file will program the FPGA to operate as desired. Even if the programmed FPGA performs its specified functions, it may also perform additional unexpected or malicious functions. Accordingly, it is desirable to be able to reverse engineer an FPGA programming bit file prior to loading it onto an FPGA in order to ensure that it will not perform any unexpected or malicious functions.
Standard reverse engineering of FPGA programming bit files requires knowledge of how the bits in the bitstream map to the configurable logic for specified FPGAs. There are several known methods to obtain this mapping, such as Project X-Ray, which targets 7 Series Xilinx FPGAs. However, the known methods of reverse engineering FPGA bit files are limited either to a specific FPGA or to a specific manufacturer's toolchain. In embodiments disclosed herein, a method is provided for quickly and easily reverse engineering FPGA programming bit files to locate FPGA LUT functionality that is not tied to a specific FPGA or toolchain.
Turning now to the figures,
In the example of
To program an FPGA, a firmware or bit file is loaded onto the FPGA. In particular, the bit file loads 1's or 0's into each memory cell of each LUT of the FPGA. The specific configuration of 1's and 0's loaded into the memory cells of the LUTs defines the operation of the FPGA. Accordingly, embodiments disclosed herein allow for the bit file to be analyzed to determine how it will program each of the memory cells of the LUTs on an FPGA when the FPGA is programmed using the bit file. This may allow for analysis of the functionality of the FPGA after being programmed with the bit file to ensure that no unexpected or malicious operation will occur.
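As a simplified illustration of the paragraph above, a LUT can be modeled as a small memory indexed by its input signals. The following Python sketch is not part of the disclosed embodiments; the function name `lut_output` and the most-significant-bit-first input ordering are assumptions made for illustration only.

```python
def lut_output(cells: list[int], inputs: list[int]) -> int:
    """Return the memory cell selected by the LUT inputs.

    cells  : 2**K programmed values (0 or 1), indexed by address
    inputs : K input bits, most significant first (assumed ordering)
    """
    if len(cells) != 1 << len(inputs):
        raise ValueError("a K-input LUT needs exactly 2**K memory cells")
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return cells[address]


# A 2-input LUT whose cells are programmed as a logical AND.
and_cells = [0, 0, 0, 1]
and_result = lut_output(and_cells, [1, 1])
```

Changing only the values stored in `cells` changes the function the LUT computes, which is why locating those values in a bit file is sufficient to recover the LUT functionality.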
In embodiments disclosed herein, the bits of a bit file used to program an FPGA are mapped to FPGA hardware.
In the example of
In embodiments, a bitstream LUT mask is created for a particular FPGA that identifies LUT bits for an FPGA. For example, a LUT mask may indicate that bits in the positions of bitstreams 202, 204, 206, and 208 are LUT bits. Accordingly, after a LUT mask is created for an FPGA, when a third party configuration file is to be used to program the FPGA, the LUT mask may be applied to the third party configuration file to identify which bits of the configuration file are LUT bits. This may be used to determine the functionality of the FPGA programmed with the configuration file to ensure that no malicious or unexpected functionality will be implemented. These techniques are discussed in further detail below.
Turning now to
As shown in
The processor 302 may be any device capable of executing machine readable and executable instructions. Accordingly, the processor 302 may be a controller, an integrated circuit, a microchip, a computer, or any other computing device. The processor 302 is coupled to a communication path 304 that provides signal interconnectivity between various modules of the computing device 300. Accordingly, the communication path 304 may allow the modules coupled to the communication path 304 to operate in a distributed computing environment. Specifically, each of the modules may operate as a node that may send and/or receive data. As used herein, the term “communicatively coupled” means that coupled components are capable of exchanging data signals with one another such as, for example, electrical signals via conductive medium, electromagnetic signals via air, optical signals via optical waveguides, and the like.
Accordingly, the communication path 304 may be formed from any medium that is capable of transmitting a signal such as, for example, conductive wires, conductive traces, optical waveguides, or the like. In some embodiments, the communication path 304 may facilitate the transmission of wireless signals, such as Wi-Fi, Bluetooth®, Near Field Communication (NFC) and the like. Moreover, the communication path 304 may be formed from a combination of mediums capable of transmitting signals. In one embodiment, the communication path 304 comprises a combination of conductive traces, conductive wires, connectors, and buses that cooperate to permit the transmission of electrical data signals to components such as processors, memories, sensors, input devices, output devices, and communication devices. Accordingly, the communication path 304 may comprise a CAN bus, a VAN bus, and the like. Additionally, it is noted that the term “signal” means a waveform (e.g., electrical, optical, magnetic, mechanical or electromagnetic), such as DC, AC, sinusoidal-wave, triangular-wave, square-wave, vibration, and the like, capable of traveling through a medium.
The computing device 300 includes one or more memory modules 306 coupled to the communication path 304. The one or more memory modules 306 may comprise RAM, ROM, flash memories, hard drives, or any device capable of storing machine readable and executable instructions such that the machine readable and executable instructions can be accessed by the processor 302. The machine readable and executable instructions may comprise logic or algorithm(s) written in any programming language of any generation (e.g., 1GL, 2GL, 3GL, 4GL, or 5GL) such as, for example, machine language that may be directly executed by the processor, or assembly language, object-oriented programming (OOP), scripting languages, microcode, etc., that may be compiled or assembled into machine readable and executable instructions and stored on the one or more memory modules 306. Alternatively, the machine readable and executable instructions may be written in a hardware description language (HDL), such as logic implemented via either a field-programmable gate array (FPGA) configuration or an application-specific integrated circuit (ASIC), or their equivalents. Accordingly, the methods described herein may be implemented in any conventional computer programming language, as pre-programmed hardware elements, or as a combination of hardware and software components.
The computing device 300 comprises a data storage component 308. The data storage component 308 may store data used by various components of the computing device 300. In addition, the data storage component 308 may store FPGA firmware bit files to be analyzed by the computing device 300.
Now referring to
The configuration file synthesis module 400 may be used to synthesize two FPGA configuration files for an FPGA that may be used to generate a LUT mask for an FPGA, as disclosed herein. In particular, the configuration file synthesis module 400 may generate a first configuration file B, and a second configuration file BI, in which all of the LUT bits in B are the inverse of the LUT bits in BI. That is, every LUT bit that is a ‘1’ in configuration file B will be a ‘0’ in configuration file BI, and every LUT bit that is a ‘0’ in configuration file B will be a ‘1’ in configuration file BI.
Furthermore, the configuration file synthesis module 400 may generate the two configuration files such that the non-LUT bits are the same in both B and BI. As such, the bits that are different between the two synthesized configuration files B and BI indicate positions of the LUT bits of a configuration file associated with the FPGA. Accordingly, a bit-wise exclusive or (XOR) function may be performed on the bits of the two configuration files B and BI to generate the LUT mask associated with the FPGA, as disclosed herein.
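The bit-wise XOR described above may be sketched as follows. This is a minimal Python illustration, not a definitive implementation; the function name `lut_mask` and the three-byte toy files are assumptions, and real configuration files would be read from disk.

```python
def lut_mask(config_b: bytes, config_bi: bytes) -> bytes:
    """Bit-wise XOR of two equal-length configuration files.

    A '1' in the result marks a position where B and BI differ, which by
    construction is a LUT bit; a '0' marks a non-LUT bit.
    """
    if len(config_b) != len(config_bi):
        raise ValueError("configuration files must have the same length")
    return bytes(x ^ y for x, y in zip(config_b, config_bi))


# Toy files: the middle byte holds LUT bits (inverted in BI), while the
# outer bytes hold non-LUT bits (identical in both files).
B = bytes([0xA5, 0b10010110, 0x3C])
BI = bytes([0xA5, 0b01101001, 0x3C])
mask = lut_mask(B, BI)  # bytes([0x00, 0xFF, 0x00])
```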
In embodiments, hardware description language (HDL) code is used to generate the two configuration files B and BI, as disclosed herein. A computer-aided design (CAD) program may utilize the HDL code to synthesize the configuration files. In order to generate the two configuration files B and BI without requiring any FPGA specific components or configuration files, a set of connected HDL CASE statements is used, as disclosed herein. There is one CASE statement for each LUT in the target FPGA, and by connecting the CASE statements to each other using special patterns, it can be ensured with a high degree of probability that the desired function is mapped to the same LUT location for both B and BI.
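For illustration only, one CASE statement per LUT could be produced by a small script such as the Python sketch below, which emits a Verilog-style fragment. The signal names, the choice of Verilog, and the helper `case_statement` are hypothetical and are not taken from the disclosure; the emitted text is a fragment that would still need module and signal declarations.

```python
def case_statement(name: str, inputs: list[str], cells: list[int]) -> str:
    """Emit a Verilog-style case statement assigning `name` from `cells`.

    inputs : K signal names, most significant address bit first (assumed)
    cells  : 2**K memory cell values, indexed by address
    """
    k = len(inputs)
    if len(cells) != 1 << k:
        raise ValueError("expected 2**K memory cell values")
    selector = "{" + ", ".join(inputs) + "}"
    lines = ["always @(*) begin", f"  case ({selector})"]
    for address, value in enumerate(cells):
        lines.append(f"    {k}'b{address:0{k}b}: {name} = 1'b{value};")
    lines.append("  endcase")
    lines.append("end")
    return "\n".join(lines)


# XOR of the address bits for a 4-input LUT (hypothetical signal names).
xor_cells = [bin(a).count("1") & 1 for a in range(16)]
print(case_statement("lut_r0_c0", ["in3", "in2", "in1", "in0"], xor_cells))
```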
In order to generate the two configuration files B and BI such that only the LUT bits are different between the two configuration files, an arbitrary function is consistently placed in a specific LUT location while the two configuration files are generated. By maintaining consistent placement, the same LUT memory cell functions are consistently mapped to the same bits in a configuration file. In particular, input bits to LUTs are mapped to output bits of other LUTs, as disclosed herein. It should be understood that explicit locations are not needed, rather relative locations can be used to build a sea of interconnected LUTs or functions to determine what is mapped onto the FPGA.
In a first example, a row/column approach is used. In this example, we assume that the target FPGA has a 2-dimensional array of LUTs. Most FPGA layouts can be broken into subsections having a consistent number of rows and columns. As such, this assumption holds for most FPGAs. In this example, it is assumed that a target FPGA has R rows and C columns of LUTs. In some examples, the number of rows and columns in the target FPGA may be determined from the data book associated with the target FPGA.
In the example of
In the example of
The input signal connections in
The example input signal connections described above and shown in
In the second example, only knowledge of the total number of LUTs in the target FPGA is required, which is readily obtainable (e.g., from the data book associated with the target FPGA).
In the example of
Once the input connections to the LUTs of the target FPGA are mapped as discussed above as shown in either the example of
In embodiments, the computing device 300 may use a CAD program to compile HDL code to generate configuration files for a target FPGA, as disclosed herein. The HDL code may specify input signal connections, as discussed above. However, many compilers may reorder the LUT input pin assignments to improve timing or reduce delays, even with logic optimization turned off. Such input reordering changes the addresses of the bits used to program the LUTs. As such, if input pin reordering occurs, it may interfere with LUT mask generation. In particular, the LUT bits in the configuration files B and BI, discussed above, may not line up, and the LUT mask may not be able to be generated using these two configuration files. In embodiments, this problem may be overcome by using Hamming functions of the LUT input address, as discussed in further detail below. This prevents input address bits from being transposed during generation of the configuration files B and BI.
Another potential issue is that some compilers perform logic reduction due to input pin elimination. In particular, even with logic optimization disabled, some compilers still perform optimizations on any LUT equations that can be reduced to fewer than K inputs. For example, the K=4 input LUT function “0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1” can be reduced to a function of just its least significant address bit. Similar to pin reordering discussed above, this may also disrupt mask creation since the LUT bits of B and BI may not line up. To overcome this issue, the only functions that are used are XOR, XNOR, and Hamming, which require all input address bits as discussed in further detail below.
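Whether a candidate LUT function is exposed to input pin elimination can be checked before synthesis. The following Python sketch is offered as one possible check under the assumption that memory cells are indexed by address with the least significant address bit as bit 0; the helper name `depends_on_all_inputs` is hypothetical.

```python
def depends_on_all_inputs(cells: list[int], k: int) -> bool:
    """Return True if every one of the K address bits affects the output.

    An address bit is removable when flipping it never changes the
    selected memory cell value, for any address.
    """
    if len(cells) != 1 << k:
        raise ValueError("expected 2**K memory cell values")
    for bit in range(k):
        if all(cells[a] == cells[a ^ (1 << bit)] for a in range(1 << k)):
            return False  # this input could be eliminated by a compiler
    return True


K = 4
alternating = [a & 1 for a in range(1 << K)]                # 0 1 0 1 ...
xor_cells = [bin(a).count("1") & 1 for a in range(1 << K)]  # XOR of inputs

reducible = not depends_on_all_inputs(alternating, K)  # True
robust = depends_on_all_inputs(xor_cells, K)           # True
```

The alternating pattern depends only on the least significant address bit, whereas the XOR pattern changes whenever any single address bit is flipped.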
Another potential issue associated with the row and column and snake patterns discussed above is that external input pins are needed to drive the LUT inputs at the start of the patterns. In the snake pattern, the initial LUTs next to each other in the chain share all but one of their inputs. In this instance, some compilers merge those LUTs into a single LUT by decomposing the shared inputs to drive several smaller LUTs whose outputs are connected to multiplexers driven by the non-shared inputs to the two separate LUTs. Accordingly, in embodiments, the initial LUTs in the snake pattern are connected such that none of the initial LUTs share multiple inputs to avoid this merging.
Turning now to
The function LUT of
In addition to the LUT function shown in
In embodiments, Hamming functions of a LUT's input address bits are functions that place logic ‘1’ only in LUT memory cells whose input addresses have the same Hamming weight. For example, a K=4 input LUT has the addresses A=0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110 and 1111. The address with a Hamming weight of 0 is A=0000. The addresses with Hamming weight of 1 are 0001, 0010, 0100, and 1000. The addresses with Hamming weight of 2 are 0011, 0101, 0110, 1001, 1010, and 1100. The addresses with Hamming weight of 3 are 0111, 1011, 1101, and 1110. The only address with Hamming weight of 4 is 1111. Given a K input LUT with even K, Hamming weight of K/2 works best.
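The grouping of addresses by Hamming weight, and a Hamming function built from one weight, can be enumerated with a short script. The Python sketch below is illustrative only; the helper names `addresses_by_weight` and `hamming_cells` are assumptions.

```python
from collections import defaultdict


def addresses_by_weight(k: int) -> dict[int, list[str]]:
    """Group all K-bit addresses by the number of '1' bits they contain."""
    groups = defaultdict(list)
    for address in range(1 << k):
        groups[bin(address).count("1")].append(format(address, f"0{k}b"))
    return dict(groups)


def hamming_cells(k: int, weight: int) -> list[int]:
    """Memory cell values that are '1' only at addresses of the given weight."""
    return [1 if bin(a).count("1") == weight else 0 for a in range(1 << k)]


groups = addresses_by_weight(4)
# groups[2] lists the six addresses with exactly two '1' bits.
cells = hamming_cells(4, 2)  # Hamming function for weight K/2 with K=4
```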
Two functions that satisfy these Hamming weight requirements are XOR and XNOR. Accordingly, in one example, the LUT memory cell values may be assigned by performing an XOR operation on the input address bits. In another example, the LUT memory cell values may be assigned by performing an XNOR operation on the input address bits.
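The following Python sketch illustrates two properties of the XOR and XNOR assignments for an assumed K=4 input LUT: the value of each memory cell depends only on the Hamming weight of its address, and the XNOR cells are the bit-wise inverse of the XOR cells. The helper names are hypothetical.

```python
K = 4  # assumed number of LUT inputs


def weight(address: int) -> int:
    """Hamming weight (number of '1' bits) of an address."""
    return bin(address).count("1")


def xor_cells(k: int) -> list[int]:
    """Cell value is the XOR of the address bits (1 for odd weight)."""
    return [weight(a) & 1 for a in range(1 << k)]


def xnor_cells(k: int) -> list[int]:
    """Cell value is the XNOR of the address bits (1 for even weight)."""
    return [1 - (weight(a) & 1) for a in range(1 << k)]


xor4 = xor_cells(K)
xnor4 = xnor_cells(K)

# Collect the cell values seen for each Hamming weight under XOR.
values_by_weight: dict[int, set[int]] = {}
for address, value in enumerate(xor4):
    values_by_weight.setdefault(weight(address), set()).add(value)

# Every cell differs between the two assignments.
all_inverted = all(x ^ y == 1 for x, y in zip(xor4, xnor4))
```

Because every cell is inverted between the two assignments, a bit-wise XOR of the corresponding configuration bits would be ‘1’ at every LUT memory cell.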
In embodiments, the configuration file synthesis module 400 uses a CAD program to generate the configuration file B using the input signal connections of the row/column pattern shown in
Referring back to
Referring still to
While the LUT mask determined by the LUT mask generation module 402 identifies which configuration file bits are LUT bits, it does not determine which LUT bits correspond to specific LUTs in the target FPGA. Accordingly, this may be determined by the memory cell mapping module 404. In particular, the memory cell mapping module 404 may use a CAD program to generate a new configuration file in which the memory cells of one LUT are defined based on an XOR operation performed on the input address bits and the memory cells of every other LUT are defined based on an XNOR operation performed on the input address bits.
Using XOR and XNOR operations ensures that the CAD program will not perform pin reordering, logic reduction, or LUT merging, as discussed above. In addition, by only defining one LUT using XOR and the other LUTs using XNOR, the LUT mask generated by the LUT mask generation module 402 may be applied to the new configuration file generated by the memory cell mapping module 404 to identify the LUT bits. The identified LUT bits may then be analyzed to identify the bits defined by XOR rather than XNOR, and the location of the identified bits in the configuration file may be identified as the LUT bits associated with the particular LUT that was defined by the XOR operation. This procedure may be repeated for each LUT of the FPGA to identify the bit locations in the configuration file associated with each LUT. In some examples, the memory cell mapping module 404 may generate the new configuration file using XNOR to define the memory cell values of one LUT and using XOR to define the memory cell values of every other LUT.
The procedure above may be performed N times, where N is the number of LUTs in the FPGA, to identify the bit locations in the configuration file of each LUT in the FPGA. However, in another example, a log based binary search algorithm may be used to reduce the number of iterations needed to be performed. In particular, on the first iteration, a single LUT may be identified. Then on the next iteration, two LUTs may be identified. On subsequent iterations, four LUTs, then eight LUTs, and so on may be identified. As such, the number of iterations may be reduced from N to O(log N) to identify the locations of the bit values of every LUT of the FPGA in the configuration file.
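One way to reach a logarithmic number of configurations is sketched below in Python. This is an assumption-laden illustration and not necessarily the exact search procedure of the embodiments: in configuration i, each LUT whose index has bit i set is defined by XOR and every other LUT by XNOR, and each LUT bit position is compared against an all-XNOR reference. The toy device, the hidden `layout`, and the `synthesize` function standing in for the CAD program are all hypothetical.

```python
import random

K = 4            # inputs per LUT (assumed)
N_LUTS = 6       # LUTs in the toy device (assumed)
CELLS = 1 << K   # memory cells per LUT


def parity(address: int) -> int:
    return bin(address).count("1") & 1


def make_toy_device(seed: int = 0):
    """Hidden ground truth: which file position holds which LUT cell."""
    rng = random.Random(seed)
    size = N_LUTS * CELLS + 40  # 40 positions are non-LUT bits
    positions = rng.sample(range(size), N_LUTS * CELLS)
    cells = [(lut, addr) for lut in range(N_LUTS) for addr in range(CELLS)]
    layout = dict(zip(positions, cells))
    background = [rng.randint(0, 1) for _ in range(size)]  # non-LUT values
    return layout, background


def synthesize(layout, background, use_xor):
    """Stand-in for the CAD tool: LUT l holds XOR if use_xor[l], else XNOR."""
    bits = list(background)
    for pos, (lut, addr) in layout.items():
        bits[pos] = parity(addr) if use_xor[lut] else 1 - parity(addr)
    return bits


def map_lut_positions(layout, background):
    reference = synthesize(layout, background, [False] * N_LUTS)  # all XNOR
    inverse = synthesize(layout, background, [True] * N_LUTS)     # all XOR
    mask = [x ^ y for x, y in zip(reference, inverse)]
    rounds = max(1, (N_LUTS - 1).bit_length())  # ceil(log2(N_LUTS))
    owner = {pos: 0 for pos, m in enumerate(mask) if m}
    for i in range(rounds):
        use_xor = [bool((lut >> i) & 1) for lut in range(N_LUTS)]
        config = synthesize(layout, background, use_xor)
        for pos in owner:
            if config[pos] != reference[pos]:  # cell switched to XOR
                owner[pos] |= 1 << i           # so bit i of its LUT index is 1
    return owner, rounds


layout, background = make_toy_device()
owner, rounds = map_lut_positions(layout, background)
```

In this toy model, three additional configurations recover the owning LUT of all 96 LUT bit positions, compared with six configurations for the one-LUT-at-a-time procedure.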
Referring back to
Referring still to
Referring still to
One such fingerprint may comprise the initial LUT memory cell values of an FPGA before the FPGA is programmed. Because the LUT mask disclosed herein can identify the locations of LUT bits, the LUT mask may be used to determine a fingerprint for the FPGA. In embodiments, to determine a fingerprint for an FPGA, the FPGA can be powered on before it is programmed and the programming file can be read out. Because the FPGA has not been programmed, the values of the memory cells can be used as a unique memory PUF fingerprint for the FPGA. Accordingly, the FPGA fingerprint determination module 410 may apply the LUT mask associated with the FPGA to the read out programming file to identify the values of the uncommitted FPGA LUT memory cells at power up. The values of these uncommitted FPGA LUT memory cells may be used as a fingerprint for the FPGA.
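Extraction of such a fingerprint from a read out file may be sketched as follows. The Python code is illustrative only: the most-significant-bit-first ordering within each byte, the toy readback values, and the use of a Hamming distance to compare two fingerprints are assumptions that are not specified in the disclosure.

```python
def fingerprint(readback: bytes, mask: bytes) -> str:
    """Concatenate the read out bits selected by the LUT mask."""
    if len(readback) != len(mask):
        raise ValueError("readback and mask must have the same length")
    selected = []
    for value, mask_byte in zip(readback, mask):
        for shift in range(7, -1, -1):  # MSB first (assumed bit order)
            if (mask_byte >> shift) & 1:
                selected.append(str((value >> shift) & 1))
    return "".join(selected)


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


mask = bytes([0x0F, 0xF0])                       # toy LUT mask
device_a = fingerprint(bytes([0x3A, 0x5C]), mask)
device_a_again = fingerprint(bytes([0xFA, 0x53]), mask)  # non-LUT bits differ
device_b = fingerprint(bytes([0x36, 0x9C]), mask)
distance = hamming_distance(device_a, device_b)
```

In this toy example, the two read outs of device A yield the same fingerprint even though their non-LUT bits differ, because only the masked positions contribute.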
The techniques disclosed herein were tested on a variety of Xilinx and Intel/Altera FPGAs. The disclosed techniques were used to generate LUT mask files to identify LUT programming bits in each of the FPGAs that were tested. Once the LUT mask was found for each device, it took O(log N) additional configurations to determine the location of each specific programming bit in the FPGA configuration file. The testing results are summarized in the table shown in
At step 1102, the LUT mask generation module 402 generates a LUT mask for the target FPGA as described above. In particular, the LUT mask generation module 402 performs a bit-wise XOR operation between the two configuration files generated by the configuration file synthesis module 400 to generate the LUT mask. At step 1104, the memory cell mapping module 404 maps the LUT bits identified by the LUT mask to specific LUTs of the target FPGA, as described above.
At step 1106, the configuration file reception module 406 receives a configuration file to be used to program the target FPGA. At step 1108, the LUT mask application module 408 applies the LUT mask to the received configuration file to identify the bits of the configuration file to be loaded onto the memory cells of the LUT of the target FPGA.
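Step 1108 may be illustrated with the following Python sketch, which lists the position and value of every bit of a received configuration file that the LUT mask marks as a LUT bit. The function name `lut_bits`, the most-significant-bit-first numbering, and the toy byte values are assumptions for illustration.

```python
def lut_bits(config: bytes, mask: bytes) -> list[tuple[int, int]]:
    """Return (bit position, value) for each LUT bit in `config`.

    Bit positions count from the start of the file, with the most
    significant bit of each byte first (assumed ordering).
    """
    if len(config) != len(mask):
        raise ValueError("configuration file and mask must match in length")
    found = []
    for byte_index, (value, mask_byte) in enumerate(zip(config, mask)):
        for bit in range(8):
            shift = 7 - bit
            if (mask_byte >> shift) & 1:
                found.append((byte_index * 8 + bit, (value >> shift) & 1))
    return found


mask = bytes([0x00, 0xF0, 0x00])      # toy mask: four LUT bits in byte 1
received = bytes([0xAA, 0x9C, 0x55])  # toy third party configuration file
bits = lut_bits(received, mask)       # [(8, 1), (9, 0), (10, 0), (11, 1)]
```

Combined with a mapping from bit positions to specific LUTs, the extracted values give the memory cell contents, and therefore the function, of each LUT that the received file would program.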
It should now be understood that embodiments described herein are directed to fast FPGA reverse engineering for hardware metering and fingerprinting. The techniques disclosed herein allow for the identification of the LUT bits of a configuration file associated with a target FPGA without the need for any special knowledge about the FPGA. As such, the bits to be loaded onto LUTs of an FPGA can be quickly and easily determined. The techniques disclosed herein can also be used to easily identify a fingerprint associated with an FPGA.
This application claims priority to U.S. Provisional Application No. 63/419,098 filed on Oct. 25, 2022 and U.S. Provisional Application No. 63/510,723 filed on Jun. 28, 2023, each of which is incorporated herein by reference in its entirety.
This invention was made with Government support under Contract No. 1916722 awarded by the National Science Foundation. The Government has certain rights in the invention.
Number | Date | Country
---|---|---
20240135077 A1 | Apr 2024 | US

Number | Date | Country
---|---|---
63419098 | Oct 2022 | US
63510723 | Jun 2023 | US