Various embodiments of the present disclosure relate to cloud-based hardware acceleration, and more particularly to secure cloud deployment of field-programmable gate array (FPGA) assets.
Field-programmable gate arrays (FPGAs) may excel in handling large-scale search optimization, acceleration, and signal processing tasks when compared to CPUs in terms of power, latency, and processing speed. Cloud service providers can provide FPGA-based cloud acceleration services, allowing for customized acceleration with minimal power consumption. While providing FPGA computing as a cloud resource may provide several advantages over hardware FPGAs, several security concerns may arise. In particular, sharing of a power distribution network across the entire fabric makes hardware acceleration via cloud computing particularly vulnerable to remote physical side-channel attacks and voltage attacks by malicious actors. Furthermore, hardware assets deployed by tenants to cloud FPGA providers may contain sensitive information and intellectual property belonging to third-party vendors or the tenants themselves. For example, a malicious cloud provider may steal contents of a hardware design deployed on a cloud FPGA and launch intellectual property counterfeiting, cloning, and/or piracy. Additionally, a malicious cloud provider may induce bitstream tampering that may result in information leakage and denial-of-service. Thus, there is a need for providing secure cloud FPGA deployment between tenants and cloud FPGA providers.
Various embodiments described herein relate to methods, apparatuses, and systems for protecting hardware intellectual property, particularly of cloud field-programmable gate array (FPGA) deployment.
According to one embodiment, the method comprises providing, by one or more processors, a bootable image (BI) to a tenant computing entity; receiving, by the one or more processors, a certificate chain originating from the tenant computing entity via the BI; validating, by the one or more processors, the certificate chain with a certificate authority; providing, by the one or more processors, a design rule data object to the tenant computing entity via the BI; receiving, by the one or more processors, a secure design rule result data object and an encrypted bitstream; determining, by the one or more processors, a match of the secure design rule result data object with the design rule data object; and programming, by the one or more processors, a cloud programmable logic instance with a decryption of the encrypted bitstream based at least in part on the match.
In some embodiments, the BI comprises an executable and one or more design compiling and design checking tools. In some embodiments, the tenant computing entity comprises a tenant-side root of trust (TSRoT). In some embodiments, the BI comprises a plurality of acceptable hash states that are associated with the TSRoT. In some embodiments, the TSRoT is configured to terminate execution of the BI based at least in part on the TSRoT comprising a hash state that is not in the plurality of acceptable hash states. In some embodiments, the cloud programmable logic instance comprises a cloud field-programmable gate array. In some embodiments, the method further comprises receiving a TSRoT certificate and a BI certificate; and validating the TSRoT certificate and the BI certificate with a certificate authority.
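By way of illustration only, the hash-state check described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: the measurement scheme (a SHA-256 hash chain over component blobs) and the names `measure_state`, `run_bootable_image`, and `UntrustedStateError` are hypothetical and do not appear in the disclosure.

```python
import hashlib
import hmac


class UntrustedStateError(RuntimeError):
    """Raised when the measured hash state is not in the acceptable set."""


def measure_state(component_blobs):
    """Fold component measurements into one digest (hypothetical scheme)."""
    state = b"\x00" * 32
    for blob in component_blobs:
        state = hashlib.sha256(state + hashlib.sha256(blob).digest()).digest()
    return state


def state_is_acceptable(state, acceptable_states):
    # compare_digest avoids leaking how many leading bytes of a digest match.
    return any(hmac.compare_digest(state, ok) for ok in acceptable_states)


def run_bootable_image(component_blobs, acceptable_states):
    """Terminate (here: raise) unless the measured state is acceptable."""
    state = measure_state(component_blobs)
    if not state_is_acceptable(state, acceptable_states):
        raise UntrustedStateError("hash state not acceptable; terminating BI")
    return "BI running"
```

In this sketch the set of acceptable hash states would be shipped inside the BI, and any change to a measured component produces a different digest and stops execution.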
In some embodiments, the method further comprises generating the design rule data object by generating an encrypted design rule (EDR) nonce; and symmetrically encrypting a design rule with the EDR nonce. In some embodiments, the method further comprises generating an EDR key by asymmetrically encrypting the EDR nonce with a BI public key and a TSRoT public key. In some embodiments, the BI is configured to generate the encrypted bitstream by: determining the EDR nonce by asymmetrically decrypting the EDR key with a BI private key and a TSRoT private key; determining a design rule by symmetrically decrypting the design rule data object with the EDR key; extracting one or more design rule components from the design rule; generating a design rule result based at least in part on the one or more design rule components; generating a raw bitstream; and generating a hash of the raw bitstream. In some embodiments, the BI is configured to generate the secure design rule result data object by determining an exclusive or (XOR) of a hash of the design rule result with a hash of the raw bitstream; and asymmetrically encrypting the XOR with a cloud programmable logic provider public key. In some embodiments, the BI is configured to generate the encrypted bitstream by applying a first symmetric encryption on the raw bitstream with the hash of the raw bitstream. In some embodiments, the BI is configured to generate the encrypted bitstream by applying a second symmetric encryption on the first symmetrically encrypted raw bitstream with a programmable logic-side root of trust nonce.
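The symmetric portions of the operations above may be illustrated with the following minimal sketch. It rests on several assumptions that are not part of the disclosure: SHA-256 is used as the hash, `toy_cipher` is a simple keystream XOR standing in for whatever symmetric cipher an embodiment would actually use (it is not suitable for production), and the asymmetric steps (wrapping the EDR nonce into the EDR key, and encrypting the XOR with the provider public key) are omitted. All function names are hypothetical.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (illustrative stand-in only).

    Applying the function twice with the same key returns the input.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_design_rule_data_object(design_rule: bytes):
    """Provider side: draw a fresh EDR nonce and encrypt the design rule."""
    edr_nonce = secrets.token_bytes(32)
    return edr_nonce, toy_cipher(edr_nonce, design_rule)


def bi_package(raw_bitstream: bytes, design_rule_result: bytes, rot_nonce: bytes):
    """BI side: mask the result hash and apply both encryption layers."""
    bitstream_hash = hashlib.sha256(raw_bitstream).digest()
    result_hash = hashlib.sha256(design_rule_result).digest()
    # XOR of the design rule result hash with the raw bitstream hash.
    masked_result = xor_bytes(result_hash, bitstream_hash)
    # First layer keyed by the bitstream hash, second by the RoT nonce.
    first_layer = toy_cipher(bitstream_hash, raw_bitstream)
    encrypted_bitstream = toy_cipher(rot_nonce, first_layer)
    return masked_result, encrypted_bitstream
```

In a fuller embodiment, `masked_result` would additionally be asymmetrically encrypted with the cloud programmable logic provider public key before transmission.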
In some embodiments, the method further comprises determining the design rule result based at least in part on the secure design rule result data object by asymmetrically decrypting the secure design rule result data object with a cloud programmable logic provider private key. In some embodiments, the method further comprises recovering an estimate hash of the raw bitstream by generating an XOR of the design rule result and a hash of an expected design rule result. In some embodiments, the method further comprises recovering, using the programmable logic-side root of trust nonce, the first symmetrically encrypted raw bitstream by symmetrically decrypting the second symmetrically encrypted raw bitstream with the programmable logic-side root of trust nonce. In some embodiments, the method further comprises recovering, using the programmable logic-side root of trust nonce, the raw bitstream by symmetrically decrypting the recovered first symmetrically encrypted raw bitstream with the estimate hash of the raw bitstream. In some embodiments, determining the match of the secure design rule result data object with the design rule data object comprises comparing a hash of the recovered raw bitstream with the estimate hash of the raw bitstream.
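The recovery and match operations above can be sketched in the same spirit. This is an illustrative sketch only, assuming SHA-256 as the hash and a keystream-XOR `toy_cipher` as a stand-in for the symmetric cipher; the asymmetric decryption with the provider private key is assumed to have already happened, so the function receives the XOR value directly. The name `recover_and_match` and the sample values are hypothetical.

```python
import hashlib
import hmac


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (illustrative stand-in only)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def recover_and_match(masked_result, encrypted_bitstream, expected_result, rot_nonce):
    """Return (match, recovered_bitstream) for a received package."""
    # Unmask with the hash of the expected design rule result.
    estimate_hash = xor_bytes(masked_result, hashlib.sha256(expected_result).digest())
    # Peel the outer layer (RoT nonce), then the inner layer (estimate hash).
    first_layer = toy_cipher(rot_nonce, encrypted_bitstream)
    recovered = toy_cipher(estimate_hash, first_layer)
    # The bitstream is accepted only if its hash equals the estimate hash.
    match = hmac.compare_digest(hashlib.sha256(recovered).digest(), estimate_hash)
    return match, recovered


# Usage: build a package the way the tenant side would, then verify it.
raw = b"example bitstream bytes" * 4
rot_nonce = b"root-of-trust-nonce"
raw_hash = hashlib.sha256(raw).digest()
masked = xor_bytes(hashlib.sha256(b"PASS").digest(), raw_hash)
encrypted = toy_cipher(rot_nonce, toy_cipher(raw_hash, raw))
print(recover_and_match(masked, encrypted, b"PASS", rot_nonce)[0])  # True
print(recover_and_match(masked, encrypted, b"FAIL", rot_nonce)[0])  # False
```

The design point the sketch makes visible is that the provider never receives the bitstream hash directly: it can only reconstruct the correct inner key when the tenant's design rule result equals the result the provider expects.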
In some embodiments, a system comprises one or more processors and at least one memory storing processor-executable instructions that, when executed by any of the one or more processors, cause the one or more processors to perform operations comprising providing a bootable image (BI) to a tenant computing entity; receiving a certificate chain originating from the tenant computing entity via the BI; validating the certificate chain with a certificate authority; providing a design rule data object to the tenant computing entity via the BI; receiving a secure design rule result data object and an encrypted bitstream; determining a match of the secure design rule result data object with the design rule data object; and programming a cloud programmable logic instance with a decryption of the encrypted bitstream based at least in part on the match.
In some embodiments, one or more non-transitory computer-readable storage media include instructions that, when executed by one or more processors, cause the one or more processors to perform operations comprising providing a bootable image (BI) to a tenant computing entity; receiving a certificate chain originating from the tenant computing entity via the BI; validating the certificate chain with a certificate authority; providing a design rule data object to the tenant computing entity via the BI; receiving a secure design rule result data object and an encrypted bitstream; determining the secure design rule result data object matches with the design rule data object; and programming a cloud programmable logic instance with a decryption of the encrypted bitstream based at least in part on the determination.
Embodiments incorporating teachings of the present disclosure are shown and described with respect to the figures presented herein.
Various embodiments of the present disclosure now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the disclosure are shown. Indeed, the disclosure may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. The term “or” is used herein in both the alternative and conjunctive sense, unless otherwise indicated. The terms “illustrative,” “example,” and “exemplary” are used to be examples with no indication of quality level. Like numbers refer to like elements throughout.
The present disclosure provides systems and methods for safeguarding against different attacks on intellectual property (IP) confidentiality and integrity, particularly of hardware designs in cloud field-programmable gate array (FPGA) deployment.
FPGAs are well-known for their ability to accelerate many current computational loads, such as for search requests and deep learning. As a result, cloud users, or tenants, may integrate FPGA solutions into their architectures via cloud FPGA providers (CFPs). CFPs may offer FPGA-acceleration through the cloud, e.g., FPGA as a service, allowing users to program instances of cloud FPGA by uploading their designs. However, as FPGA designs create and connect physical hardware, they have unique security challenges associated with them in addition to those of traditional cloud computing.
For example, in a remote side channel analysis attack, a malicious user may upload to an FPGA a design that contains malicious circuitry, such as ring oscillators and time-to-digital converters, that may indirectly measure voltage on a circuit board. Such voltage data may be collected by the malicious user, who can then analyze the voltage data to recover sensitive information from other circuits on the FPGA, such as keys used in Rivest-Shamir-Adleman (RSA) encryption. Furthermore, deep learning may greatly increase the capabilities of such side channel attacks.
To mitigate risks associated with remote side channel analysis and other threats, CFPs may implement design rules and virus checkers to evaluate hardware designs to ensure no such malicious circuitry is made available to tenants. Such checks are typically executed on designs created on CFP programs provided to tenants through the cloud, which naturally raises concerns about the confidentiality of designs that run on a system controlled by the CFP. Solutions to such concerns typically rely on Trusted Execution Environments (TEEs), such as TrustZone, to ensure confidentiality. However, TEEs often have limits on their throughput, with checks and compilation taking significantly longer on them.
According to various embodiments of the present disclosure, systems and methods for tenant-side checking of designs for cloud FPGA deployment are provided. The present application discloses two separate roots of trust (RoTs) and dually leveraged deployment (DLD) that serve as a set of checks and balances to (i) allow tenants to check their own designs and ensure that bitstreams compiled by tenants based at least in part on the designs are safe to be used in the cloud, without compromising the confidentiality of the design and (ii) enable CFPs to authenticate design rule results of bitstreams checked by the tenants.
The two separate RoTs may comprise (i) a tenant-side RoT (TSRoT) and (ii) an FPGA-side RoT (hereinafter referred to as a DecryptStrapper) associated with a cloud FPGA instance (e.g., an FPGA to program with a design). As such, a CFP may leverage a TSRoT to bootstrap trust onto a tenant's device and a tenant may leverage a DecryptStrapper on a target FPGA owned by the CFP to securely implement a design on the FPGA. The DecryptStrapper may authenticate results of an inspection for potential malicious circuitry performed by the tenant without the need for a dedicated TEE.
Embodiments of the present disclosure may be implemented in various ways, including as computer program products that comprise articles of manufacture. Such computer program products may include one or more software components including, for example, software objects, methods, data structures, and/or the like. A software component may be coded in any of a variety of programming languages. An illustrative programming language may be a lower-level programming language such as an assembly language associated with a particular hardware architecture and/or operating system platform. A software component comprising assembly language instructions may require conversion into executable machine code by an assembler prior to execution by the hardware architecture and/or platform. Another example programming language may be a higher-level programming language that may be portable across multiple architectures. A software component comprising higher-level programming language instructions may require conversion to an intermediate representation by an interpreter or a compiler prior to execution.
Other examples of programming languages include, but are not limited to, a macro language, a shell or command language, a job control language, a script language, a database query or search language, and/or a report writing language. In one or more example embodiments, a software component comprising instructions in one of the foregoing examples of programming languages may be executed directly by an operating system or other software component without having to be first transformed into another form. A software component may be stored as a file or other data storage construct. Software components of a similar type or functionally related may be stored together such as, for example, in a particular directory, folder, or library. Software components may be static (e.g., pre-established, or fixed) or dynamic (e.g., created or modified at the time of execution).
A computer program product may include a non-transitory computer-readable storage medium storing applications, programs, program modules, scripts, source code, program code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like (also referred to herein as executable instructions, instructions for execution, computer program products, program code, and/or similar terms used herein interchangeably). Such non-transitory computer-readable storage media include all computer-readable media (including volatile and non-volatile media).
In one embodiment, a non-volatile computer-readable storage medium may include a floppy disk, flexible disk, hard disk, solid-state storage (SSS) (e.g., a solid-state drive (SSD), solid-state card (SSC), solid-state module (SSM)), enterprise flash drive, magnetic tape, or any other non-transitory magnetic medium, and/or the like. A non-volatile computer-readable storage medium may also include a punch card, paper tape, optical mark sheet (or any other physical medium with patterns of holes or other optically recognizable indicia), compact disc read only memory (CD-ROM), compact disc-rewritable (CD-RW), digital versatile disc (DVD), Blu-ray disc (BD), any other non-transitory optical medium, and/or the like. Such a non-volatile computer-readable storage medium may also include read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), flash memory (e.g., Serial, NAND, NOR, and/or the like), multimedia memory cards (MMC), secure digital (SD) memory cards, SmartMedia cards, CompactFlash (CF) cards, Memory Sticks, and/or the like. Further, a non-volatile computer-readable storage medium may also include conductive-bridging random access memory (CBRAM), phase-change random access memory (PRAM), ferroelectric random-access memory (FeRAM), non-volatile random-access memory (NVRAM), magnetoresistive random-access memory (MRAM), resistive random-access memory (RRAM), Silicon-Oxide-Nitride-Oxide-Silicon memory (SONOS), floating junction gate random access memory (FJG RAM), Millipede memory, racetrack memory, and/or the like.
In one embodiment, a volatile computer-readable storage medium may include random access memory (RAM), dynamic random access memory (DRAM), static random access memory (SRAM), fast page mode dynamic random access memory (FPM DRAM), extended data-out dynamic random access memory (EDO DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), double data rate type two synchronous dynamic random access memory (DDR2 SDRAM), double data rate type three synchronous dynamic random access memory (DDR3 SDRAM), Rambus dynamic random access memory (RDRAM), Twin Transistor RAM (TTRAM), Thyristor RAM (T-RAM), Zero-capacitor (Z-RAM), Rambus in-line memory module (RIMM), dual in-line memory module (DIMM), single in-line memory module (SIMM), video random access memory (VRAM), cache memory (including various levels), flash memory, register memory, and/or the like. It will be appreciated that where embodiments are described to use a computer-readable storage medium, other types of computer-readable storage media may be substituted for or used in addition to the computer-readable storage media described above.
As should be appreciated, various embodiments of the present disclosure may also be implemented as methods, apparatus, systems, computing devices, computing entities, and/or the like. As such, embodiments of the present disclosure may take the form of a data structure, apparatus, system, computing device, computing entity, and/or the like executing instructions stored on a computer-readable storage medium to perform certain steps or operations. Thus, embodiments of the present disclosure may also take the form of an entirely hardware embodiment, an entirely computer program product embodiment, and/or an embodiment that comprises a combination of computer program products and hardware performing certain steps or operations.
Embodiments of the present disclosure are described with reference to example operations, steps, processes, blocks, and/or the like. Thus, it should be understood that each operation, step, process, block, and/or the like may be implemented in the form of a computer program product, an entirely hardware embodiment, a combination of hardware and computer program products, and/or apparatus, systems, computing devices, computing entities, and/or the like carrying out instructions, operations, steps, and similar words used interchangeably (e.g., the executable instructions, instructions for execution, program code, and/or the like) on a computer-readable storage medium for execution. For example, retrieval, loading, and execution of code may be performed sequentially such that one instruction is retrieved, loaded, and executed at a time. In some example embodiments, retrieval, loading, and/or execution may be performed in parallel such that multiple instructions are retrieved, loaded, and/or executed together. Thus, such embodiments can produce specifically configured machines performing the steps or operations specified in the block diagrams and flowchart illustrations. Accordingly, the block diagrams and flowchart illustrations support various combinations of embodiments for performing the specified instructions, operations, or steps.
In some embodiments, CFP system 101 may communicate with at least one of the tenant computing entities 102 using one or more communication networks. Examples of communication networks include any wired or wireless communication network including, for example, a wired or wireless local area network (LAN), personal area network (PAN), metropolitan area network (MAN), wide area network (WAN), or the like, as well as any hardware, software and/or firmware required to implement it (such as, e.g., network routers, and/or the like).
The CFP system 101 may include a CFP computing entity 106 and a storage subsystem 108. The CFP computing entity 106 may be configured to receive FPGA programming requests from tenant computing entities 102 and process the FPGA programming requests to facilitate programming of cloud FPGA instances by the tenant computing entities 102. The CFP computing entity 106 may generate a BI that comprises a minimal executable and one or more design compiling and design checking tools and provide the BI to the requesting one of the tenant computing entities 102. The CFP computing entity 106 may be further configured to (i) receive a certificate chain from the one of the tenant computing entities 102 via the BI, (ii) validate the certificate chain with a CA, (iii) provide design rules to the one of the tenant computing entities 102 via the BI, (iv) receive a secure design rule result and an encrypted bitstream, and (v) program a cloud FPGA instance with a decryption of the encrypted bitstream based at least in part on the secure design rule result matching the design rules.
The storage subsystem 108 may be configured to store input data used by the CFP computing entity 106 to perform checking and securing of hardware designs provided by tenant computing entities 102. The storage subsystem 108 may include one or more storage units, such as multiple distributed storage units that are connected through a computer network. Each storage unit in the storage subsystem 108 may store one or more data assets and/or data about the computed properties of one or more data assets. Moreover, each storage unit in the storage subsystem 108 may include one or more non-volatile storage or memory media including, but not limited to, hard disks, ROM, PROM, EPROM, EEPROM, flash memory, MMCs, SD memory cards, Memory Sticks, CBRAM, PRAM, FeRAM, NVRAM, MRAM, RRAM, SONOS, FJG RAM, Millipede memory, racetrack memory, and/or the like.
As indicated, in one embodiment, the CFP computing entity 106 may also include one or more network interfaces 220 for communicating with various computing entities, such as tenant computing entities 102.
As shown in
For example, the processing elements 205 may be embodied as one or more complex programmable logic devices (CPLDs), microprocessors, multi-core processors, coprocessing entities, application-specific instruction-set processors (ASIPs), microcontrollers, and/or controllers. Further, the processing elements 205 may be embodied as one or more other processing devices or circuitry. The term circuitry may refer to an entirely hardware embodiment or a combination of hardware and computer program products. Thus, the processing elements 205 may be embodied as integrated circuits, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), programmable logic arrays (PLAs), hardware accelerators, other circuitry, and/or the like.
As will therefore be understood, the processing elements 205 may be configured for a particular use or configured to execute instructions stored in volatile or non-volatile media or otherwise accessible to the processing elements 205. As such, whether configured by hardware or computer program products, or by a combination thereof, the processing elements 205 may be capable of performing steps or operations according to embodiments of the present disclosure when configured accordingly.
In one embodiment, the CFP computing entity 106 may further include, or be in communication with, non-volatile media (also referred to as non-volatile storage, memory, memory storage, memory circuitry and/or similar terms used herein interchangeably). In one embodiment, the non-volatile storage or memory may include one or more non-volatile storage or memory media 210, including, but not limited to, hard disks, ROM, PROM, EPROM, EEPROM, flash memory, MMCs, SD memory cards, Memory Sticks, CBRAM, PRAM, FeRAM, NVRAM, MRAM, RRAM, SONOS, FJG RAM, Millipede memory, racetrack memory, and/or the like.
As will be recognized, the non-volatile storage or memory media may store databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like. The term database, database instance, database management system, and/or similar terms used herein interchangeably may refer to a collection of records or data that is stored in a computer-readable storage medium using one or more database models, such as a hierarchical database model, network model, relational model, entity-relationship model, object model, document model, semantic model, graph model, and/or the like.
In one embodiment, the CFP computing entity 106 may further include, or be in communication with, volatile media (also referred to as volatile storage, memory, memory storage, memory circuitry and/or similar terms used herein interchangeably). In one embodiment, the volatile storage or memory may also include one or more volatile storage or memory media 215, including, but not limited to, RAM, DRAM, SRAM, FPM DRAM, EDO DRAM, SDRAM, DDR SDRAM, DDR2 SDRAM, DDR3 SDRAM, RDRAM, TTRAM, T-RAM, Z-RAM, RIMM, DIMM, SIMM, VRAM, cache memory, register memory, and/or the like.
As will be recognized, the volatile storage or memory media may be used to store at least portions of the databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like being executed by, for example, the processing elements 205. Thus, the databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like may be used to control certain aspects of the operation of the CFP computing entity 106 with the assistance of the processing elements 205 and operating system.
As indicated, in one embodiment, the CFP computing entity 106 may also include one or more network interfaces 220 for communicating with various computing entities, such as by communicating data, content, information, and/or similar terms used herein interchangeably that can be transmitted, received, operated on, processed, displayed, stored, and/or the like. Such communication may be executed using a wired data transmission protocol, such as fiber distributed data interface (FDDI), digital subscriber line (DSL), Ethernet, asynchronous transfer mode (ATM), frame relay, data over cable service interface specification (DOCSIS), or any other wired transmission protocol. Similarly, the CFP computing entity 106 may be configured to communicate via wireless external communication networks using any of a variety of protocols, such as general packet radio service (GPRS), Universal Mobile Telecommunications System (UMTS), Code Division Multiple Access 2000 (CDMA2000), CDMA2000 1× (1×RTT), Wideband Code Division Multiple Access (WCDMA), Global System for Mobile Communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), Evolved Universal Terrestrial Radio Access Network (E-UTRAN), Evolution-Data Optimized (EVDO), High Speed Packet Access (HSPA), High-Speed Downlink Packet Access (HSDPA), IEEE 802.11 (Wi-Fi), Wi-Fi Direct, 802.16 (WiMAX), ultra-wideband (UWB), infrared (IR) protocols, near field communication (NFC) protocols, Wibree, Bluetooth protocols, wireless universal serial bus (USB) protocols, and/or any other wireless protocol.
Although not shown, the CFP computing entity 106 may include, or be in communication with, one or more input elements, such as a keyboard input, a mouse input, a touch screen/display input, motion input, movement input, audio input, pointing device input, joystick input, keypad input, and/or the like. The CFP computing entity 106 may also include, or be in communication with, one or more output elements (not shown), such as audio output, video output, screen/display output, motion output, movement output, and/or the like.
The signals provided to and received from the transmitter 304 and the receiver 306, correspondingly, may include signaling information/data in accordance with air interface standards of applicable wireless systems. In this regard, the tenant computing entity 102 may be capable of operating with one or more air interface standards, communication protocols, modulation types, and access types. More particularly, the tenant computing entity 102 may operate in accordance with any of a number of wireless communication standards and protocols, such as those described above. In a particular embodiment, the tenant computing entity 102 may operate in accordance with multiple wireless communication standards and protocols, such as UMTS, CDMA2000, 1×RTT, WCDMA, GSM, EDGE, TD-SCDMA, LTE, E-UTRAN, EVDO, HSPA, HSDPA, Wi-Fi, Wi-Fi Direct, WiMAX, UWB, IR, NFC, Bluetooth, USB, and/or the like. Similarly, the tenant computing entity 102 may operate in accordance with multiple wired communication standards and protocols, such as those described above via a network interface 320.
Via these communication standards and protocols, the tenant computing entity 102 can communicate with various other entities using concepts such as Unstructured Supplementary Service Data (USSD), Short Message Service (SMS), Multimedia Messaging Service (MMS), Dual-Tone Multi-Frequency Signaling (DTMF), and/or Subscriber Identity Module Dialer (SIM dialer). The tenant computing entity 102 can also download changes, add-ons, and updates, for instance, to its firmware, software (e.g., including executable instructions, applications, program modules), and operating system.
According to one embodiment, the tenant computing entity 102 may include location determining aspects, devices, modules, functionalities, and/or similar words used herein interchangeably. For example, the tenant computing entity 102 may include outdoor positioning aspects, such as a location module adapted to acquire, for example, latitude, longitude, altitude, geocode, course, direction, heading, speed, universal time (UTC), date, and/or various other information/data. In one embodiment, the location module can acquire data, sometimes known as ephemeris data, by identifying the number of satellites in view and the relative positions of those satellites (e.g., using global positioning systems (GPS)). The satellites may be a variety of different satellites, including Low Earth Orbit (LEO) satellite systems, Department of Defense (DOD) satellite systems, the European Union Galileo positioning systems, the Chinese Compass navigation systems, Indian Regional Navigational satellite systems, and/or the like. This data can be collected using a variety of coordinate systems, such as the Decimal Degrees (DD); Degrees, Minutes, Seconds (DMS); Universal Transverse Mercator (UTM); Universal Polar Stereographic (UPS) coordinate systems; and/or the like. Alternatively, the location information/data can be determined by triangulating the position of the tenant computing entity 102 in connection with a variety of other systems, including cellular towers, Wi-Fi access points, and/or the like. Similarly, the tenant computing entity 102 may include indoor positioning aspects, such as a location module adapted to acquire, for example, latitude, longitude, altitude, geocode, course, direction, heading, speed, time, date, and/or various other information/data.
Some of the indoor systems may use various position or location technologies including RFID tags, indoor beacons or transmitters, Wi-Fi access points, cellular towers, nearby computing devices (e.g., smartphones, laptops) and/or the like. For instance, such technologies may include iBeacons, Gimbal proximity beacons, Bluetooth Low Energy (BLE) transmitters, NFC transmitters, and/or the like. These indoor positioning aspects can be used in a variety of settings to determine the location of someone or something to within inches or centimeters.
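As a small illustration of two of the coordinate notations mentioned above, a latitude or longitude in decimal degrees can be converted to degrees, minutes, and seconds and back. The sketch below is illustrative only; the function names and the choice to carry the hemisphere as a separate sign value are assumptions, not part of the disclosure.

```python
def dd_to_dms(value: float):
    """Split decimal degrees into (sign, degrees, minutes, seconds)."""
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    degrees = int(magnitude)
    minutes_full = (magnitude - degrees) * 60
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60
    return sign, degrees, minutes, seconds


def dms_to_dd(sign: int, degrees: int, minutes: int, seconds: float) -> float:
    """Recombine degrees, minutes, and seconds into decimal degrees."""
    return sign * (degrees + minutes / 60 + seconds / 3600)
```

For example, a value of -33.75 decimal degrees corresponds to 33 degrees and 45 minutes in the negative (south or west) direction.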
The tenant computing entity 102 may also comprise a user interface (that can include a display 316 coupled to a processing element 308) and/or a user input interface (coupled to a processing element 308). For example, the user interface may be a user application, browser, user interface, and/or similar words used herein interchangeably executing on and/or accessible via the tenant computing entity 102 to interact with, access, and/or cause display or retrieval of information/data from the CFP computing entity 106, as described herein. The user input interface can comprise any of a number of devices or interfaces allowing the tenant computing entity 102 to receive data, such as a keypad 318 (hard or soft), a touch display, voice/speech or motion interfaces, or other input device. In embodiments including a keypad 318, the keypad 318 can include (or cause display of) the conventional numeric (0-9) and related keys (#, *), and other keys used for operating the tenant computing entity 102 and may include a full set of alphabetic keys or set of keys that may be activated to provide a full set of alphanumeric keys. In addition to providing input, the user input interface can be used, for example, to activate or deactivate certain functions, such as screen savers and/or sleep modes.
The tenant computing entity 102 can also include volatile storage or memory 322 and/or non-volatile storage or memory 324, which can be embedded and/or may be removable. For example, the non-volatile memory may be ROM, PROM, EPROM, EEPROM, flash memory, MMCs, SD memory cards, Memory Sticks, CBRAM, PRAM, FeRAM, NVRAM, MRAM, RRAM, SONOS, FJG RAM, Millipede memory, racetrack memory, and/or the like. The volatile memory may be RAM, DRAM, SRAM, FPM DRAM, EDO DRAM, SDRAM, DDR SDRAM, DDR2 SDRAM, DDR3 SDRAM, RDRAM, TTRAM, T-RAM, Z-RAM, RIMM, DIMM, SIMM, VRAM, cache memory, register memory, and/or the like. The volatile and non-volatile storage or memory can store databases, database instances, database management systems, data, applications, programs, program modules, scripts, source code, object code, byte code, compiled code, interpreted code, machine code, executable instructions, and/or the like to implement the functions of the tenant computing entity 102. As indicated, this may include a user application that is resident on the tenant computing entity 102 or accessible through a browser or other user interface for communicating with the CFP computing entity 106 and/or various other computing entities.
In another embodiment, the tenant computing entity 102 may include one or more components or functionality that are the same or similar to those of the CFP computing entity 106, as described in greater detail above. As will be recognized, these architectures and descriptions are provided for exemplary purposes only and are not limiting to the various embodiments.
In various embodiments, the tenant computing entity 102 may be embodied as an artificial intelligence (AI) computing entity. Accordingly, the tenant computing entity 102 may be configured to provide and/or receive information/data from a user via an input/output mechanism, such as a display, a camera, a speaker, a voice-activated input, and/or the like. In certain embodiments, an AI computing entity may comprise one or more predefined and executable program algorithms stored within an onboard memory storage module, and/or accessible over a network. In various embodiments, the AI computing entity may be configured to retrieve and/or execute one or more of the predefined program algorithms upon the occurrence of a predefined trigger event.
Various embodiments of the present disclosure describe steps, operations, processes, methods, functions, and/or the like for securing hardware designs deployed in cloud FPGA environments. For example, CFPs may comprise cloud computing providers that offer FPGA computing resources as a cloud service to tenant computing entities. As such, tenants may conveniently apply FPGA-based hardware acceleration for specific applications by leveraging FPGA as a cloud service. Tenants may program cloud FPGA instances to execute their specific applications through the cloud by deploying hardware designs to the cloud FPGA instances.
Despite the many benefits provided by deploying FPGA computing through the cloud, threats to both tenants and CFPs may exist. For example, malicious circuitry/programming may be inserted by tenants into cloud FPGA instances allocated to the tenants, either knowingly or unknowingly, to carry out remote side channel or denial-of-service attacks. That is, tenants may introduce malicious circuitry in their designs that causes physical damage or breaches to the integrity or confidentiality of other designs that may also be deployed on a same cloud FPGA instance. To prevent this, CFPs may comprise design rule checks and virus scanners that may examine designs to ensure that malicious circuits are not included. However, if given foreknowledge of such checks that designs will be subjected to, malicious actors may be able to operate around those restrictions.
Furthermore, in order to run design rule checks, CFPs will traditionally require access to raw designs, which may raise concerns of confidentiality from the perspective of tenants. For example, CFPs may steal tenant designs to reuse or resell them (i.e., IP piracy). The reasons for preventing designs from being seen by CFPs may go beyond confidentiality concerns. In particular, CFPs may add malicious circuitry to designs that affects performance at critical junctures. As such, tenants may desire that their designs be placed on an FPGA unmodified and unseen. Balancing design confidentiality with the CFP's need to detect malicious circuitry poses security and trust issues.
According to various embodiments of the present disclosure, systems and methods for establishing mutual trust between CFPs and tenants during cloud FPGA deployment are provided. The disclosed systems and methods may ensure that designs uploaded by tenants are safe to be deployed to cloud FPGA instances without compromising the confidentiality of the design. In some embodiments, CFPs may require tenants to inspect designs (e.g., for deployment to cloud FPGA instances) based at least in part on design rules which may detect malicious circuitry in a design along with other violations. To enable CFPs to authenticate design rule results provided by tenants, the CFPs may leverage TSRoTs to bootstrap trust onto the tenants' devices. At the same time, tenants may leverage DecryptStrappers associated with cloud FPGA instances that are allocated to the tenants by the CFPs to securely implement designs on the cloud FPGA instances. As such, tenants may be prevented from uploading falsified design rule results while avoiding exposing an unencrypted design to the CFP.
According to various embodiments of the present disclosure, a secure design rule result (SDRR) is generated by a tenant computing entity combining a hash of a compiled bitstream of a design with a hash of a design rule result or virus scan result. In some embodiments, programming a cloud FPGA instance comprises the tenant computing entity transmitting, to a CFP computing entity, the SDRR along with a version of the bitstream that has been encrypted by the tenant computing entity. In some embodiments, the CFP computing entity recovers the hash chain of the bitstream and forwards the encrypted bitstream to a DecryptStrapper on the cloud FPGA instance. The DecryptStrapper checks the recovered hash chain against a recovered bitstream (from the encrypted bitstream) to confirm they match before programming the cloud FPGA instance with the bitstream.
In some embodiments, TSRoTs and DecryptStrappers comprise RoTs implemented on tenant computing entities (e.g., tenant computing entities 102) and cloud FPGA instances, respectively. In some embodiments, a RoT comprises a hardware, firmware, or software component that is configured to establish trust of all other components within a system. A RoT may also provide various security functions, such as secure key storage, attestation, secure communication, and cryptographic operations.
In some embodiments, a hash chain describes a cryptographic state of data that is generated based at least in part on a sequence of hash values. Each hash value may be computed by combining a previous hash value with a new hash value. Hash chains may be used in securing system operations. For example, a hash state may be compared to a list of known values stored in a secure location in a system. If the hash state matches a known value, the system may continue. If the hash state does not match a known value, it is assumed that the process has been compromised, and the system may be halted.
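By way of non-limiting illustration, the hash chain check described above may be sketched as follows. The sketch assumes SHA-256 as the hash function and models each measured component as a byte string; the function names (`extend`, `hash_state`, `may_continue`) and the example measurements are hypothetical and are not taken from the disclosure.

```python
import hashlib


def extend(state: bytes, measurement: bytes) -> bytes:
    """Combine the previous hash value with the hash of a new measurement."""
    new_hash = hashlib.sha256(measurement).digest()
    return hashlib.sha256(state + new_hash).digest()


def hash_state(measurements, initial: bytes = b"\x00" * 32) -> bytes:
    """Fold a sequence of measurements into a single hash state."""
    state = initial
    for measurement in measurements:
        state = extend(state, measurement)
    return state


def may_continue(state: bytes, acceptable_states: set) -> bool:
    """Return True if the state is a known value; otherwise the system halts."""
    return state in acceptable_states


# Hypothetical measurements of components loaded in sequence.
boot = [b"bootloader-v1", b"kernel-v1", b"bi-tools-v1"]

# List of known values: the state after each expected step.
acceptable = {hash_state(boot[:i]) for i in range(1, len(boot) + 1)}

good = hash_state(boot)
tampered = hash_state([b"bootloader-v1", b"kernel-evil", b"bi-tools-v1"])

print(may_continue(good, acceptable))      # True
print(may_continue(tampered, acceptable))  # False
```

Because each state depends on every earlier measurement, a change to any one component alters all later states, so a single lookup against the list of known values is sufficient to detect the change.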
One of the one or more tenant computing entities 102 may request one or more cloud FPGA instances from CFP system 101. CFP system 101 may be configured to allocate a given one of the one or more cloud FPGA instances to the one or more tenant computing entities 102. The one or more tenant computing entities 102 may be used by tenants to place a design on a given one of the one or more cloud FPGA instances. The given one of the one or more cloud FPGA instances may comprise programmable logic 444 and a DecryptStrapper 460 that is configured to authenticate design rule results from one of the tenant computing entities 102 allocated with the given cloud FPGA instance and allow for programming of programmable logic 444 (e.g., cloud FPGA instance) with a bitstream based at least in part on the design rule results. The CFP system 101 may also generate and provide a BI 450 to the tenant computing entity 102. The BI 450 may be used by the tenant computing entity 102 to compile bitstreams and check designs.
The tenant computing entity 102 comprises a TSRoT 404. The BI 450 may comprise an acceptable hash states (AHS) list encrypted for the TSRoT 404. In some embodiments, the TSRoT 404 is configured to shut down or terminate execution of the BI 450 should a hash state of the TSRoT 404 exit the list of allowed hash state values. In some embodiments, the BI 450 may require authentication of the TSRoT 404 in order to execute one or more functions. The BI 450 may generate a new private key at runtime, which is then registered with the TSRoT 404 to form a certificate chain. The signed certificate may then be provided to the CFP system 101, which uses it to create EDR 406 after verifying the creator is the BI 450.
In some embodiments, the BI 450 generates a BI public key (BIPub), a BI private key (BIPriv), and a BI certificate (BICert), and sends the BIPub, BIPriv, and BICert to TSRoT 404 to sign, thereby generating a TSRoT certificate (TSRoTCert) and a signed BICert, which may be provided by the BI 450 to the CFP system 101. In some embodiments, the CFP system 101 is configured to receive the TSRoTCert, the BICert, an encrypted design rule request (EDRREQ), and capabilities of the TSRoT 404. The CFP system 101 may validate the TSRoTCert and the BICert with a certification authority (CA) 402 to mitigate “man-in-the-middle” attacks between the CFP system 101 and TSRoT 404, particularly of design rules. In some embodiments, the CFP system 101 is configured to maintain a list of design rules that are used to determine design compliance and detect malicious circuitry. For example, should the tenant computing entity 102 recover the plaintext of the design rules, design rule results may be extracted, and passing design rule results may be falsified for arbitrary bitstreams. Such falsification may be prevented by the CFP system 101 confirming the TSRoTCert and the BICert with the CA 402.
In some embodiments, the CFP system 101 may provide a TSRoT public key (TSRoTPub) to the DecryptStrapper 460. In some embodiments, the DecryptStrapper 460 generates a DecryptStrapper nonce (NDS). In some embodiments, an encrypted DecryptStrapper nonce NDS_Full is generated based at least in part on the NDS. In some embodiments, NDS_Full comprises a vector including an NDS_Safe and an NDS_Auth. NDS_Safe may be generated based at least in part on an asymmetric encryption of a DecryptStrapper nonce (NDS) with the TSRoTPub. NDS_Auth may be generated based at least in part on an asymmetric encryption of NDS_Safe with a DecryptStrapper private key (DSPriv).
In some embodiments, the CFP system 101 generates encrypted design rules (EDR) 406, which the BI 450 may decrypt to recover design rules. In some embodiments, the EDR 406 is generated by (i) generating an EDR nonce (NEDR) and (ii) symmetrically encrypting one or more design rules with the NEDR. Along with the EDR 406, the CFP system 101 may also return an EDR key (EDRKey) and NDS_Full to the BI 450. In some embodiments, the EDRKey is generated by asymmetrically encrypting the NEDR with the BIPub and further asymmetrically encrypting with the TSRoTPub.
The EDR 406, the EDRKey, and the encrypted DecryptStrapper nonce (NDS_Full) may be provided to the BI 450 to facilitate generating a raw bitstream B0 414. In some embodiments, the raw bitstream B0 414 is generated by the BI 450 based at least in part on a bitstream compilation 410 of a design 408. The BI 450 may also perform design rules check 412 on the design 408. A bitstream compilation 410 may be performed based at least in part on a design rule (DR), a register transfer language (RTL) of the design 408, and the EDRKey. In some embodiments, bitstream compilation 410 comprises instructions for (i) determining the NEDR by asymmetrically decrypting the EDRKey with the BIPriv and a TSRoT private key (TSRoTPriv), (ii) determining DR by symmetrically decrypting the EDR 406 with the EDRKey, (iii) extracting DRRTL, DRSyn, DRImpl, and DRBit from the DR, (iv) generating a design rule result (DRR) 416 based at least in part on the DRRTL, the DRSyn, the DRImpl, and the DRBit, (v) generating the raw bitstream B0 414, (vi) generating a hash H0 418 of the raw bitstream B0 414, (vii) generating a secure design rule result (SDRR) 426, and (viii) generating a first encrypted bitstream B1 422.
DRR 416 may comprise a vector including DRR1, DRR2, DRR3, and DRR4. In some embodiments, generating DRR 416 comprises (i) generating DRR1 by performing design rules check 412 on the RTL based at least in part on the DRRTL, (ii) generating a synthesis based at least in part on the RTL, (iii) generating DRR2 by performing design rules check 412 on the synthesis of the RTL based at least in part on the DRSyn, (iv) generating an implementation based at least in part on the synthesis, (v) generating DRR3 by performing design rules check 412 on the implementation of the synthesis based at least in part on the DRImpl, (vi) generating raw bitstream B0 414 based at least in part on the implementation, and (vii) generating DRR4 by performing design rules check 412 on the raw bitstream B0 414 based at least in part on the DRBit.
In some embodiments, generating the SDRR 426 comprises (a) determining an exclusive or (XOR) 424 of a hash HDRR 420 of DRR 416 with hash H0 418 of the raw bitstream B0 414 and (b) asymmetrically encrypting the XOR 424 with a CFP public key (CFPPubKey) 428. In some embodiments, generating the first encrypted bitstream B1 422 comprises symmetrically encrypting the raw bitstream B0 414 with the hash H0 418.
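By way of non-limiting illustration, the XOR masking underlying the SDRR 426 may be sketched as follows. The sketch assumes SHA-256 as the hash function and omits the asymmetric encryption with the CFPPubKey 428, so only the XOR step and its inverse on the CFP side are shown. The function names and the encoding of a design rule result as a comma-separated byte string are hypothetical.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def mask_bitstream_hash(raw_bitstream: bytes, design_rule_result: bytes) -> bytes:
    """Tenant side: XOR of the bitstream hash H0 with the hash of the DRR."""
    return xor_bytes(sha256(raw_bitstream), sha256(design_rule_result))


def recover_bitstream_hash(masked: bytes, expected_result: bytes) -> bytes:
    """CFP side: XOR with the hash of the expected DRR to estimate H0."""
    return xor_bytes(masked, sha256(expected_result))


b0 = b"example raw bitstream bytes"      # placeholder for raw bitstream B0
expected_drr = b"PASS,PASS,PASS,PASS"    # hypothetical expected result vector

masked_ok = mask_bitstream_hash(b0, expected_drr)
masked_bad = mask_bitstream_hash(b0, b"PASS,FAIL,PASS,PASS")

print(recover_bitstream_hash(masked_ok, expected_drr) == sha256(b0))   # True
print(recover_bitstream_hash(masked_bad, expected_drr) == sha256(b0))  # False
```

Because XOR is its own inverse, the CFP obtains the correct bitstream hash only when the tenant's design rule result hashes to the same value as the expected result; any other result yields an unrelated 32-byte value.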
In some embodiments, the tenant computing entity 102 verifies a DecryptStrapper certificate (DSCert) with the CA 402 to protect against man-in-the-middle attacks on encrypted bitstreams. In some embodiments, the tenant computing entity 102 recovers N̂DS by asymmetrically decrypting NDS_Safe with TSRoTPriv. In some embodiments, the tenant computing entity 102 confirms that the asymmetric decryption of NDS_Auth with a DecryptStrapper public key (DSPub) and TSRoTPriv matches the recovered N̂DS.
The tenant computing entity 102 further generates an encrypted bitstream B2 430 based at least in part on the first encrypted bitstream B1 422. In some embodiments, generating encrypted bitstream B2 430 comprises symmetrically encrypting the first encrypted bitstream B1 422 with the recovered N̂DS. The SDRR 426 and encrypted bitstream B2 430 are then sent to the CFP system 101 to initiate deployment.
CFP system 101 may determine DRR 416 from SDRR 426 by asymmetrically decrypting SDRR 426 with a CFP private key (CFPPriv) 432. The CFP system 101 may then recover an estimated hash of B0, Ĥ0, by generating an XOR 434 of the DRR 416 (from SDRR 426) and a hash of an expected design rule result (HDRR) 436. The encrypted bitstream B2 430 and Ĥ0 may be provided to DecryptStrapper 460 by the CFP system 101. In some embodiments, the DecryptStrapper 460 recovers encrypted bitstream B̂1 438 by symmetrically decrypting B2 with NDS. In some embodiments, the DecryptStrapper 460 may then recover bitstream B̂0 440 by symmetrically decrypting recovered encrypted bitstream B̂1 438 with Ĥ0.
DecryptStrapper 460 generates a comparison 442 between a hash of recovered bitstream B̂0 440 and Ĥ0. Should there be any malicious circuitry in the RTL code provided to the BI 450 by the tenant computing entity 102, the DRR 416 will differ from the HDRR 436. As a result, the Ĥ0 derived from the DRR 416 will differ from the hash of recovered bitstream B̂0 440. Thus, when the SDRR 426 is used to obtain Ĥ0, the resulting value will be incorrect and the hash of the recovered bitstream B̂0 440 will not match Ĥ0. On the other hand, if all, or a prerequisite subset, of the design rules were followed or violated as expected by the CFP, the hash of recovered bitstream B̂0 440 will equal Ĥ0. In the event that the hash of recovered bitstream B̂0 440 does not match Ĥ0, the DecryptStrapper 460 may refuse to program the programmable logic 444 and alert the CFP system 101. If the comparison 442 indicates that the hash of recovered bitstream B̂0 440 equals Ĥ0, the DecryptStrapper 460 programs programmable logic 444 with the recovered B̂0 440.
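By way of non-limiting illustration, the two-layer bitstream encryption and the comparison performed by the DecryptStrapper may be sketched as follows. The disclosure does not name a symmetric cipher, so the sketch substitutes a toy XOR stream cipher keyed through SHAKE-256; this stand-in is an assumption made for illustration and is not suitable for production use. SHA-256 as the hash, the function names, and the placeholder bitstream are likewise hypothetical, and the asymmetric steps used to deliver the nonce and the estimated hash are omitted.

```python
import hashlib
import hmac
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in symmetric cipher: XOR with a SHAKE-256 keystream.

    Applying it twice with the same key returns the input.
    Illustration only; a real deployment would use a vetted cipher.
    """
    keystream = hashlib.shake_256(key).digest(len(data))
    return bytes(x ^ y for x, y in zip(data, keystream))


def tenant_encrypt(raw_bitstream: bytes, nonce: bytes) -> bytes:
    """Tenant side: B1 = Enc(B0, H0), then B2 = Enc(B1, nonce)."""
    b1 = toy_cipher(sha256(raw_bitstream), raw_bitstream)
    return toy_cipher(nonce, b1)


def decryptstrapper_recover(b2: bytes, h0_estimate: bytes, nonce: bytes):
    """Undo both layers; return the bitstream only if its hash matches."""
    b1_recovered = toy_cipher(nonce, b2)
    b0_recovered = toy_cipher(h0_estimate, b1_recovered)
    if hmac.compare_digest(sha256(b0_recovered), h0_estimate):
        return b0_recovered  # safe to program the programmable logic
    return None              # refuse to program and alert the provider


nds = secrets.token_bytes(32)        # placeholder DecryptStrapper nonce
b0 = b"example raw bitstream bytes"  # placeholder raw bitstream
b2 = tenant_encrypt(b0, nds)

# Estimate equals the true bitstream hash (design rule result as expected).
print(decryptstrapper_recover(b2, sha256(b0), nds) == b0)  # True

# Estimate is wrong (design rule result differed from the expected one).
wrong_estimate = sha256(b"unexpected design rule result")
print(decryptstrapper_recover(b2, wrong_estimate, nds))    # None
```

Keying the inner layer with the hash of the bitstream means that a wrong estimate both fails to decrypt the bitstream and fails the hash comparison, so the recovered data is never programmed in that case.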
In some embodiments, the process 500 comprises, at operation 502, providing a BI to a tenant computing entity. For example, the CFP system 101 may provide a BI to a tenant computing entity 102. In some embodiments, the BI comprises an executable and one or more design compiling and design checking tools. In some embodiments, the tenant computing entity comprises a TSRoT. In some embodiments, the BI comprises a plurality of acceptable hash states that are associated with the TSRoT. In some embodiments, the TSRoT is configured to terminate execution of the BI based at least in part on the TSRoT comprising a hash state that is not in the plurality of acceptable hash states.
In some embodiments, the process 500 comprises, at operation 504, receiving a certificate chain from the tenant computing entity via the BI. For example, the CFP system 101 may receive a certificate chain from the tenant computing entity 102 via the BI. In some embodiments, a TSRoT certificate and a BI certificate are received.
In some embodiments, the process 500 comprises, at operation 506, validating the certificate chain with a CA. For example, the CFP system 101 may validate the certificate chain with a CA. In some embodiments, the TSRoT certificate and the BI certificate are validated with a CA. For example, the CFP system 101 may validate the TSRoT certificate and the BI certificate with a CA to mitigate “man-in-the-middle” attacks between the CFP system 101 and the TSRoT, particularly of design rules.
In some embodiments, the process 500 comprises, at operation 508, providing a design rule data object to the tenant computing entity via the BI. For example, the CFP system 101 may provide a design rule data object to the tenant computing entity via the BI. In some embodiments, the design rule data object is generated by (i) generating an EDR nonce and (ii) symmetrically encrypting a design rule with the EDR nonce. In some embodiments, an EDR key is generated by asymmetrically encrypting the EDR nonce with a BI public key and a TSRoT public key.
In some embodiments, the process 500 comprises, at operation 510, receiving a secure design rule result data object and an encrypted bitstream. For example, the CFP system 101 may receive a secure design rule result data object and an encrypted bitstream.
In some embodiments, the BI is configured to generate the secure design rule result data object by (i) determining an exclusive or (XOR) of a hash of a design rule result with a hash of the raw bitstream and (ii) asymmetrically encrypting the XOR with a cloud programmable logic provider public key. In some embodiments, the BI is configured to generate the encrypted bitstream by applying a first symmetric encryption on the raw bitstream with the hash of the raw bitstream. In some embodiments, the BI is configured to generate the encrypted bitstream by applying a second symmetric encryption on the first symmetrically encrypted raw bitstream with a programmable logic-side root of trust nonce.
In some embodiments, the design rule result is determined based at least in part on the secure design rule result data object by asymmetrically decrypting the secure design rule result data object with a cloud programmable logic provider private key. In some embodiments, an estimate hash of the raw bitstream is recovered by generating an XOR of the design rule result and a hash of an expected design rule result. In some embodiments, using the programmable logic-side root of trust nonce, the first symmetrically encrypted raw bitstream is recovered by symmetrically decrypting the second symmetrically encrypted raw bitstream with the programmable logic-side root of trust nonce. In some embodiments, using the programmable logic-side root of trust nonce, the raw bitstream is recovered by symmetrically decrypting the recovered first symmetrically encrypted raw bitstream with the estimate hash of the raw bitstream.
In some embodiments, the process 600 comprises, at operation 602, determining an EDR nonce by asymmetrically decrypting an EDR key with a BI private key and a TSRoT private key. For example, the CFP system 101 may determine an EDR nonce by asymmetrically decrypting an EDR key with a BI private key and a TSRoT private key.
In some embodiments, the process 600 comprises, at operation 604, determining a design rule by symmetrically decrypting a design rule data object with the EDR key. For example, the CFP system 101 may determine a design rule by symmetrically decrypting a design rule data object with the EDR key.
In some embodiments, the process 600 comprises, at operation 606, extracting one or more design rule components from the design rule. For example, the CFP system 101 may extract one or more design rule components from the design rule.
In some embodiments, the process 600 comprises, at operation 608, generating a design rule result based at least in part on the one or more design rule components. For example, the CFP system 101 may generate a design rule result based at least in part on the one or more design rule components.
In some embodiments, the process 600 comprises, at operation 610, generating a raw bitstream. For example, the CFP system 101 may generate a raw bitstream.
In some embodiments, the process 600 comprises, at operation 612, generating a hash of the raw bitstream. For example, the CFP system 101 may generate a hash of the raw bitstream.
Returning to
In some embodiments, the process 500 comprises, at operation 514, programming a cloud programmable logic instance with a decryption of the encrypted bitstream based at least in part on the match. For example, the CFP system 101 may program a cloud programmable logic instance with a decryption of the encrypted bitstream based at least in part on the match. In some embodiments, the cloud programmable logic instance comprises a cloud field-programmable gate array.
Exemplary algorithms for performing operations of the disclosed embodiments may be further understood with reference to
It should be understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application.
Many modifications and other embodiments of the present disclosure set forth herein will come to mind to one skilled in the art to which the present disclosures pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the present disclosure is not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claim concepts. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.
This application claims the priority of U.S. Provisional Application No. 63/624,558, entitled “SYSTEM AND METHOD FOR SECURE CLOUD FPGA DEPLOYMENT USING A CERTIFICATE AUTHORITY AND ROOTS OF TRUST,” filed on Jan. 24, 2024, the disclosure of which is hereby incorporated by reference in its entirety.
This invention was made with government support under 2007320 awarded by The National Science Foundation and 1801599 awarded by The National Science Foundation. The government has certain rights in the invention.
| Number | Date | Country |
|---|---|---|
| 63624558 | Jan 2024 | US |