This application claims priority to PRC Patent Application No. 202210926062.X filed Aug. 3, 2022, which is incorporated herein by reference for all purposes.
The present application relates to technology in the field of information security, and particularly to a processor for performing function operations on ciphertext data and plaintext data.
With the continuous development of new types of internet networks, data is growing explosively, and huge amounts of data are often stored in cloud servers in the mode of entrusted computing services. Some data stored in the cloud often contains private information, or the data security mechanism in the cloud is imperfect, so some data may be leaked easily. Thus, private data should be encrypted for protection; however, once the data is encrypted, the structure of the original data is destroyed, and it is therefore no longer feasible to process the information. For this reason, there is a need for a cryptographic technique that can encrypt the data while ensuring that the encrypted data can still be processed. A fully homomorphic encryption algorithm not only protects the privacy of the original data, but also supports arbitrary homomorphic addition and homomorphic multiplication of ciphertext data, providing a general security solution for cloud computing and big data environments.
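To illustrate the homomorphic property concretely, the following toy sketch uses a DGHV-style symmetric integer scheme (a simplified example for illustration only; the parameter choices are arbitrary and offer no real security), in which addition and multiplication carried out directly on ciphertexts survive decryption:

```python
import random

# Toy DGHV-style symmetric scheme over the integers (illustration only, not secure).
P = 1_000_003  # secret odd modulus acting as the key; chosen arbitrarily here

def enc(bit, p=P):
    r = random.randint(0, 10)        # small noise term
    q = random.randint(1, 1 << 20)   # large random multiple of the key
    return bit + 2 * r + p * q

def dec(c, p=P):
    return (c % p) % 2               # strip the key multiple, then the noise

a, b = 1, 1
ca, cb = enc(a), enc(b)
assert dec(ca + cb) == (a + b) % 2   # homomorphic addition (XOR on bits)
assert dec(ca * cb) == a * b         # homomorphic multiplication (AND on bits)
```

In real fully homomorphic schemes, the noise term grows with each ciphertext multiplication and must be managed (e.g., by bootstrapping), which is part of what makes the operations so computationally heavy.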
However, homomorphic encryption requires complex operations, and the operation process requires a large amount of data exchange with the cache and/or memory. Therefore, how to meet these requirements at low cost has become one of the most important issues to be addressed in the related field.
One embodiment of the present disclosure is directed to a processor, characterized in that the processor includes a processor core and a memory. The processor core includes: a homomorphic encryption instruction execution module, configured to perform a homomorphic encryption operation, wherein the homomorphic encryption instruction execution module includes a plurality of instruction set architecture extension components, and the plurality of instruction set architecture extension components are respectively configured to perform a sub-operation related to the homomorphic encryption; and a general-purpose instruction execution module, configured to perform a non-homomorphic encryption operation. The memory is vertically stacked with the processor core and is used as a cache or scratchpad of the processor core.
Another embodiment of the present disclosure is directed to a multi-core processor, which includes a plurality of the foregoing processors.
The processor and processor core of the present disclosure can be used to perform homomorphic encryption operations and non-homomorphic encryption operations. Since the memory is arranged outside of the processor core, the processor core and the memory are vertically stacked in a three-dimensional space, and the memory is used as the cache or scratchpad memory of the processor core, the memory can be arranged to have a larger storage capacity, and the bandwidth between the processor core and the memory is also greatly increased.
Aspects of the present disclosure are best understood from the following detailed description when read with the accompanying figures. It should be noted that, in accordance with the standard practice in the field, various structures are not drawn to scale. In fact, the dimensions of the various structures may be arbitrarily increased or reduced for the clarity of discussion.
For example,
For example,
The following disclosure provides many different embodiments, or examples, for implementing different features of the provided subject matter. Specific examples of elements and arrangements are described below to simplify the present disclosure. These are, of course, merely examples and are not intended to be limiting. For example, the formation of a first feature over or on a second feature in the description that follows may include embodiments in which the first and second features are formed in direct contact, and may also include embodiments in which additional features may be formed between the first and second features, such that the first and second features may not be in direct contact. In addition, the present disclosure may repeat reference numerals and/or letters in various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
Moreover, spatially relative terms, such as “beneath,” “below,” “lower,” “above,” “upper”, “on” and the like, may be used herein for ease of description to describe one element or feature's relationship to another element(s) or feature(s) as illustrated in the figures. These spatially relative terms are intended to encompass different orientations of the device in use or operation in addition to the orientation depicted in the drawings. The apparatus may be otherwise oriented (rotated 90 degrees or at other orientations) and the spatially relative descriptors used herein may likewise be interpreted accordingly.
As used herein, although terms such as “first”, “second” and “third” describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms may be used only to distinguish one element, component, region, layer or section from another. For example, terms such as “first”, “second” and “third”, when used herein, do not imply a sequence or order unless clearly indicated by the context.
As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. The term “connect,” and its derivatives, may be used herein to describe the structural relationship between components. The term “connected to” may be used to describe two or more components in direct physical or electrical contact with each other. The term “connected to” may also be used to indicate that two or more components are in direct or indirect (with intervening components therebetween) physical or electrical contact with each other, and/or that the two or more components collaborate or interact with each other.
Generally, if a general-purpose processor is used to process instructions related to homomorphic cryptographic operations, the operations become extremely complex and lengthy. Therefore, many accelerator solutions for homomorphic cryptography have been proposed in the related art. However, over-optimization of the hardware tends to sacrifice the flexibility of the processor and makes the processor incompatible with most specification requirements. Therefore, the present disclosure provides a solution that not only takes into account the performance of the processor in performing both homomorphic and non-homomorphic cryptographic operations, but also easily keeps the processor compatible with various usage scenarios with different specifications when performing homomorphic cryptographic operations; the details are discussed below.
Since homomorphic cryptographic operations require a larger amount of data access than non-homomorphic cryptographic operations, in order to prevent the storage space and bandwidth from becoming performance bottlenecks, the present disclosure arranges the memory outside the processor core to increase the storage space and stacks the memory on top of the processor core in the form of a three-dimensional integrated circuit to obtain a high bandwidth.
In
The three-dimensional integrated circuit package 10 can be further coupled to a substrate 22, solder balls 24 and a heat sink cover 26. The substrate 22 can be a semiconductor substrate (e.g., a silicon substrate), an interposer or a printed circuit board, etc.; discrete passive devices such as resistors, capacitors, transformers, etc. (not shown) may also be coupled to the substrate 22. The solder balls 24 are attached to the substrate 22, wherein the processor 20 and the solder balls 24 are located on opposite sides of the substrate 22. The heat sink cover 26 is mounted on the substrate 22 and wraps around the processor 20. The heat sink cover 26 may be formed using a metal, a metal alloy, etc., such as a metal selected from the group consisting of aluminum, copper, nickel, cobalt, etc.; the heat sink cover 26 may also be formed from a composite material selected from the group consisting of silicon carbide, aluminum nitride, graphite, etc. In some embodiments, an adhesive 28 may be provided on top of the processor 20 for adhering the heat sink cover 26 to the processor 20 to improve the stability of the three-dimensional integrated circuit package 10. In some embodiments, the adhesive 28 may have good thermal conductivity so as to accelerate the dissipation of heat generated during operation of the processor 20. In some embodiments, the memory 14 may be arranged below the processor core 12 such that the memory 14 is located between the processor core 12 and the substrate 22.
The processor 20 can be applied in a server in a cloud environment (hereinafter, a cloud server) for processing data in different formats. More specifically, the processor core 12 in the processor 20 can perform functional computation on ciphertext data and plaintext data according to a user's request, where the ciphertext data has a first format and the plaintext data has a second format; the memory 14 is used as a cache and/or scratchpad memory of the processor core 12 for storing intermediate or final computation results obtained during the functional computation. In certain embodiments, the user may encrypt the data to be computed with a homomorphic encryption algorithm to obtain ciphertext data and send the instructions (containing the function to be computed), the ciphertext data and the plaintext data to the cloud server; after the processor 20 located in the cloud server computes the ciphertext data and the plaintext data separately, it returns the computation results to the user. In some embodiments, the user can upload and store (for a long term) the plaintext data and the ciphertext data obtained by the homomorphic encryption algorithm in the cloud server; the processor 20 can compute on the plaintext data and the ciphertext data stored in the cloud server according to the instructions sent by the user, and then send the computation results to the user or store them in the cloud server.
An instruction receiving module 340 of the processor core 12 is coupled to the homomorphic encryption instruction execution module 310 and the general-purpose instruction execution module 320, and is used to receive instructions and correspondingly control the homomorphic encryption instruction execution module 310 and the general-purpose instruction execution module 320 to perform corresponding operations according to the type of the received instruction. Generally, the instructions received by the processor 20 include homomorphic encryption instructions related to ciphertext data processing and non-homomorphic encryption instructions related to plaintext data processing. When the instruction receiving module 340 receives a homomorphic encryption instruction, it will assign the homomorphic encryption instruction to the homomorphic encryption instruction execution module 310; when the instruction receiving module 340 receives a non-homomorphic encryption instruction, it will assign the non-homomorphic encryption instruction to the general-purpose instruction execution module 320.
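The routing described above can be sketched as follows (the instruction tags and module names in this sketch are hypothetical, invented only to illustrate the dispatch performed by the instruction receiving module 340; the actual ISA encoding is not specified in the text):

```python
# Hypothetical set of opcodes that count as homomorphic encryption instructions.
HOMOMORPHIC_OPS = {"HE_ADD", "HE_MUL", "NTT", "KEYSWITCH", "MODSWITCH"}

def dispatch(instr):
    """Route an instruction by type: homomorphic encryption instructions go to the
    homomorphic encryption instruction execution module, everything else to the
    general-purpose instruction execution module."""
    if instr["op"] in HOMOMORPHIC_OPS:
        return "homomorphic_encryption_instruction_execution_module"
    return "general_purpose_instruction_execution_module"

assert dispatch({"op": "HE_MUL"}) == "homomorphic_encryption_instruction_execution_module"
assert dispatch({"op": "ADD"}) == "general_purpose_instruction_execution_module"
```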
The homomorphic encryption instruction execution module 310 can include a plurality of instruction set architecture extension components 312, and each instruction set architecture extension component 312 is configured to perform a sub-operation related to homomorphic encryption. In certain embodiments, the sub-operations performed by the instruction set architecture extension components 312 can include performing a number theoretic transform (NTT) operation, a KeySwitch operation, a modulus operation or a data manipulation operation, etc. on ciphertext data; in other words, each instruction set architecture extension component 312 only has the capability to perform a specific sub-operation. In the present embodiment, before the instruction receiving module 340 transfers the homomorphic encryption instruction to the homomorphic encryption instruction execution module 310, it will first break down the homomorphic encryption instruction into a plurality of sub-operations, and then assign the sub-operations to at least a portion of the instruction set architecture extension components 312 of the homomorphic encryption instruction execution module 310 according to the properties of the sub-operations and the purposes and number of the instruction set architecture extension components 312. In the present embodiment, the type and complexity of the computation functions to be processed and the desired speed and hardware cost can be used to determine which functional instruction set architecture extension components 312 are to be included and how many instruction set architecture extension components 312 are to be configured for each function. The number (3) of instruction set architecture extension components 312 shown in
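As an illustration of one such sub-operation, below is a naive O(n^2) number theoretic transform over a small prime field (the parameters q = 17, n = 4, omega = 4 are toy values chosen for readability; a hardware extension component would operate over far larger rings and use a fast butterfly implementation):

```python
# Toy NTT over Z_q with q = 17, length n = 4, and omega = 4 a primitive
# 4th root of unity mod 17 (4**4 = 256 = 1 mod 17, 4**2 = 16 != 1 mod 17).
q, n, omega = 17, 4, 4
assert pow(omega, n, q) == 1 and pow(omega, n // 2, q) != 1

def ntt(a):
    # Forward transform: A[i] = sum_j a[j] * omega^(i*j) mod q
    return [sum(a[j] * pow(omega, i * j, q) for j in range(n)) % q
            for i in range(n)]

def intt(A):
    # Inverse transform: scale by n^-1 and use omega^-1
    n_inv = pow(n, -1, q)
    w_inv = pow(omega, -1, q)
    return [n_inv * sum(A[j] * pow(w_inv, i * j, q) for j in range(n)) % q
            for i in range(n)]

x = [1, 2, 3, 4]
assert intt(ntt(x)) == x  # round trip recovers the input
```

In ciphertext arithmetic, the NTT is what turns polynomial multiplication into cheap pointwise multiplication, which is why it is a natural candidate for a dedicated extension component.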
The processor core 12 can further include a storage manager 330, coupled between the homomorphic encryption instruction execution module 310 and the memory 14 of
In certain embodiments, a plurality of the foregoing processors 20 may be arranged and coupled in a two-dimensional mesh network to form a multi-core processor, such as a thousand-core processor. The plurality of processors 20 in the multi-core processor may be configured to perform different functional computations, and the plurality of processors 20 are connected in series with each other to perform parallel computing. In some embodiments, the plurality of processor cores 12 of the plurality of processors 20 may be located together on one bare chip; the plurality of memories 14 of the plurality of processors 20 may be located together on another bare chip.
Specifically, in the processor core 12A, the instruction receiving module 340 is responsible for receiving instructions, identifying whether a received instruction is a homomorphic encryption instruction or a non-homomorphic encryption instruction, assigning homomorphic encryption instructions to the micro-operator 350 and assigning non-homomorphic encryption instructions to the general-purpose instruction execution module 320. The micro-operator 350 will assign the sub-operations of a homomorphic encryption instruction to specific or non-specific instruction set architecture extension components 312 according to the capabilities (e.g., performing one or more of the number theoretic transform operation, KeySwitch operation, modulus operation or data manipulation operation) and workloads of the instruction set architecture extension components 312.
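A minimal sketch of such capability- and workload-aware assignment is given below (the component capability sets and sub-operation names are hypothetical, chosen only to mirror the examples in the text; the sketch always picks the least-loaded component able to run each sub-operation):

```python
# Hypothetical extension components with capability sets and running load counts.
components = [
    {"id": 0, "caps": {"ntt"}, "load": 0},
    {"id": 1, "caps": {"ntt", "keyswitch"}, "load": 0},
    {"id": 2, "caps": {"modulus", "data_move"}, "load": 0},
]

def assign(sub_op):
    """Assign a sub-operation to the least-loaded capable component, as a stand-in
    for the micro-operator's capability- and workload-based scheduling."""
    capable = [c for c in components if sub_op in c["caps"]]
    if not capable:
        raise ValueError(f"no extension component supports {sub_op}")
    best = min(capable, key=lambda c: c["load"])  # least-loaded wins ties by order
    best["load"] += 1
    return best["id"]

plan = [assign(op) for op in ["ntt", "ntt", "keyswitch", "modulus"]]
# The two NTT sub-operations land on different components, balancing the load.
```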
As mentioned above, the type and complexity of the computation functions to be processed and the desired speed and hardware cost can be used to determine which functional instruction set architecture extension components 312 are to be included and how many instruction set architecture extension components 312 are to be configured for each function. That is, the configuration of the plurality of instruction set architecture extension components 312 in the homomorphic encryption instruction execution module 310 often needs to be adjusted according to the application of the product in which the processor 20 is located. Therefore, in some embodiments, a reconfigurable architecture can be adopted for the homomorphic encryption instruction execution module 310 to save the time and money required to redevelop the chip.
For example,
The processor 20 and/or multi-core processor proposed in the present application is capable of handling the complex operations required for homomorphic encryption in an efficient and cost-effective manner through the three-dimensional structure of the processor core 12 and memory 14, together with a flexible design in the homomorphic encryption instruction execution module 310.
The foregoing outlines features of several embodiments of the present application so that those skilled in the art may better understand the aspects of the present disclosure. Those skilled in the art should appreciate that they may readily use the present disclosure as a basis for designing or modifying other processes and structures for carrying out the same purposes and/or achieving the same advantages of the embodiments introduced herein. Those skilled in the art should also realize that such equivalent constructions do not depart from the spirit and scope of the present disclosure, and that they may make various changes, substitutions, and alterations herein without departing from the spirit and scope of the present disclosure.
Number | Date | Country | Kind
---|---|---|---
202210926062.X | Aug 2022 | CN | national