Embodiments relate to systems and methods for enterprise-wide tokenization of Payment Card Industry data.
A system may use public cloud technology to deliver core critical business services to customers and stakeholders. In some cases, the system may need to distribute sensitive data across internal and/or external applications. Distributing sensitive data on the public cloud, however, increases the risk of inappropriate access to, or theft of, that sensitive data.
Systems and methods for tokenizing data in a public cloud are disclosed. According to an embodiment, a method may include: (1) receiving, at a tokenization service in a public cloud and from a client application, source data associated with one of a plurality of namespaces; (2) generating, by the tokenization service, a token for the source data according to a token format rule, wherein the token format rule specifies one or more digits in the token format that are reserved; (3) encrypting, by the tokenization service, the source data; (4) associating, by the tokenization service, the token with the encrypted source data; (5) persisting, by the tokenization service, the association between the token and the encrypted source data in a token table in the public cloud; and (6) providing, by the tokenization service, the token to the client application.
In one embodiment, the method may also include: computing, by the tokenization service, a hash of the source data; and persisting, by the tokenization service, the hash with the token in a source hash table in the public cloud.
In one embodiment, the source data may include debit card information or credit card information.
In one embodiment, the credit card information or the debit card information may include a primary account number.
In one embodiment, the method may also include validating, by the tokenization service, an entitlement or permission of the client application to tokenize the source data.
In one embodiment, a first digit of the token is reserved and prevents the token from being identified as a payment token.
In another embodiment, a first digit and a second digit of the token are reserved and identify the namespace for the source data.
In one embodiment, the association is partitioned in the token table using an application identifier and/or the namespace, and the hash and the token are partitioned in the source hash table using the application identifier and/or the namespace.
In one embodiment, the method may also include persisting, by the tokenization service, the token format for the namespace, a prefix for the token, and an encryption master key identifier for the namespace in a metadata table.
In one embodiment, the step of encrypting, by the tokenization service, the source data may include: obtaining, by the tokenization service, encryption keys from an encryption keys service; and encrypting, by the tokenization service, the source data using the encryption keys.
According to another embodiment, a non-transitory computer readable storage medium may include instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising: receiving, from a client application, source data associated with one of a plurality of namespaces; generating a token for the source data according to a token format rule, wherein the token format rule specifies one or more digits in the token format that are reserved; encrypting the source data; associating the token with the encrypted source data; persisting the association between the token and the encrypted source data in a token table in a public cloud; and providing the token to the client application.
In one embodiment, the non-transitory computer readable storage medium may also include instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising: computing a hash of the source data; and persisting the hash with the token in a source hash table in the public cloud.
In one embodiment, the source data may include debit card information or credit card information.
In one embodiment, the credit card information or the debit card information may include a primary account number.
In one embodiment, the non-transitory computer readable storage medium may also include instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising validating an entitlement or permission of the client application to tokenize the source data.
In one embodiment, a first digit of the token is reserved and prevents the token from being identified as a payment token.
In another embodiment, a first digit and a second digit of the token are reserved and identify the namespace for the source data.
In one embodiment, the association is partitioned in the token table using an application identifier and/or the namespace, and the hash and the token are partitioned in the source hash table using the application identifier and/or the namespace.
In one embodiment, the non-transitory computer readable storage medium may also include instructions stored thereon, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising persisting the token format for the namespace, a prefix for the token, and an encryption master key identifier for the namespace in a metadata table.
In one embodiment, the instructions that cause the one or more computer processors to encrypt the source data may include instructions, which when read and executed by one or more computer processors, cause the one or more computer processors to perform steps comprising: obtaining encryption keys from an encryption keys service; and encrypting the source data using the encryption keys.
For a more complete understanding of the present invention, the objects and advantages thereof, reference is now made to the following descriptions taken in connection with the accompanying drawings in which:
Embodiments relate to systems and methods for tokenization in the public cloud.
Embodiments may tokenize sensitive data (e.g., source data) before moving it onto the public cloud. For example, tokens may be used in place of the sensitive data, thereby providing useful information to external parties without exposing sensitive data. In embodiments, the tokens cannot be calculated back to their original values, thereby allowing for better protection of sensitive data.
Embodiments may generate tokens that may be idempotent within a logical boundary (e.g., a logical container to segregate/bucketize different types of data) that represents a type of data. This logical boundary may be referred to as a “namespace.” Each namespace may be protected using its own tokenization and de-tokenization entitlement, where permissions to a namespace may be managed by data owners. The use of namespaces provides for multitenancy (e.g., two tenants may use the same tokenization service but their data is isolated using the namespace construct).
Examples of namespaces may include credit card namespaces and debit card namespaces.
In embodiments, all data stored against a specific namespace is tokenized using the token format associated with that namespace; thus, all tokens in that namespace conform to that single format.
In embodiments, repeated invocations of tokenization with a specific source value against a specific namespace will result in the same token being returned.
In embodiments, the risk of data breaches or theft may be significantly reduced by using a token to protect sensitive data, such as debit or credit card information (e.g., primary account numbers (PANs)).
Each namespace may be associated with a different encryption key.
In embodiments, a token may have the same number of digits as a debit or credit card (e.g., 16 digits), but may not resemble or match any existing PAN across payment networks. To that end, the first two digits of a token may be reserved. The first digit will not use any number associated with a payment network (e.g., Visa, Mastercard, American Express, Discover). Examples of digits that are not used may include 0, 3, 4, 5, and 6. Thus, digits 1, 2, 7, 8, and 9 may be used as the first digit of the token. Because these digits are not used by the payment networks, a value that begins with one of them is not a primary account number, and there is no conflict between the token and any primary account number.
In addition, because of this formatting, the token will not be flagged or identified as a PAN by internal systems.
In embodiments, a combination of the first digit and the second digit may be used to uniquely identify the namespace under which the token was generated and/or issued. For example, digits 7 and 0 together may identify a credit card namespace, while digits 8 and 0 may identify a debit card namespace.
In embodiments, the first and second digit may also be used to identify a cobrand credit-card namespace (e.g., 7 and 1 may identify a first cobrand partner, 7 and 2 may identify a second cobrand partner, etc.). The first and second digit may also identify payment aggregators (e.g., 9 and 1 may identify a first payment aggregator, 9 and 2 may identify a second payment aggregator, etc.).
It should be noted that this combination of digits is exemplary only and other combinations may be used as is necessary and/or desired.
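The token format described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the names `NAMESPACE_PREFIXES`, `EXCLUDED_FIRST_DIGITS`, `generate_token`, and `namespace_of` are hypothetical, the prefixes 70 and 80 are the exemplary values from the text, and the digits after the prefix are assumed to be drawn at random.

```python
import secrets

# Hypothetical namespace-to-prefix map, using the exemplary values from the text.
NAMESPACE_PREFIXES = {"credit_card": "70", "debit_card": "80"}

# First digits that the text lists as not used for tokens.
EXCLUDED_FIRST_DIGITS = set("03456")

TOKEN_LENGTH = 16  # same number of digits as a typical card number


def generate_token(namespace: str) -> str:
    """Return a random 16-digit token that starts with the namespace prefix."""
    prefix = NAMESPACE_PREFIXES[namespace]
    if prefix[0] in EXCLUDED_FIRST_DIGITS:
        raise ValueError(f"prefix {prefix!r} starts with an excluded digit")
    # Assumption: the remaining digits are random and unrelated to the source data.
    body = "".join(
        secrets.choice("0123456789") for _ in range(TOKEN_LENGTH - len(prefix))
    )
    return prefix + body


def namespace_of(token: str) -> str:
    """Recover the namespace from the first two digits of a token."""
    for namespace, prefix in NAMESPACE_PREFIXES.items():
        if token.startswith(prefix):
            return namespace
    raise KeyError(f"no namespace registered for prefix {token[:2]!r}")
```

Because the namespace is encoded in the first two digits, a consumer of the token can route it by prefix alone, without any access to the source data.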
Referring to
Client application 115 may interface with cloud provider 120, which may provide a public cloud.
A plurality of client electronic devices 110 and/or client applications 115 may be provided. For example, one client application may provide source data, such as debit or credit card information, for tokenizing, and another client application may use the token received from tokenization service 134 as a surrogate for the source data. Another client application may retrieve the source data that is associated with a token from tokenization service 134 as needed.
Cloud provider 120 may provide compute layer 130 and secure storage 140. Compute layer 130 may provide services, such as encryption key service 132 and tokenization service 134. Secure storage 140 may include source hash table 142, token table 144, and metadata table 146.
Tokenization service 134 may receive the source data from the client application. It may generate a token for the source data in each namespace, and may encrypt the data using keys obtained from encryption key service 132. Tokenization service 134 may tokenize the source data according to a tokenization format that may be stored or defined in token format rules 136, which may specify the format that generated tokens must adhere to. For example, token format rules 136 may reserve the first and/or second digit of the token, and may identify digits that may be used and/or digits that may not be used; the first digit, for instance, may not be 0, 3, 4, 5, or 6. A combination of the first digit and the second digit may identify the namespace for the token generation, such as 70 for a credit card namespace and 80 for a debit card namespace.
Tokenization service 134 may encrypt the source data according to an encryption scheme for the namespace.
Tokenization service 134 may also generate a hash (e.g., SHA-512) of the source data.
Tokenization service 134 may persist the hash of the source data in source hash table 142 of secure storage 140 with the token. In one embodiment, source hash table 142 may support look-up operations, such as by using a hash of the source data.
In one embodiment, the hash and the token may be partitioned in source hash table 142 based on the application id and/or the namespace.
In one embodiment, the application id may identify client application 115 and/or a system of record (not shown) that owns the source data.
Tokenization service 134 may persist the encrypted source data, the hash, and the token in token table 144 of secure storage 140.
In one embodiment, the encrypted source data, the hash, and the token may be partitioned in token table 144 based on the application id and/or the namespace.
Metadata table 146 may store the format (e.g., token type) for each namespace, the token prefix, and/or an encryption master key identifier for each namespace. For incoming requests (e.g., a single item or multiple items), metadata table 146 may be used to perform authorization checks for each item in the request. For example, for each item in the request, embodiments may check that the record exists in metadata table 146 for the requested combination of the application identifier and namespace, and that a corresponding entitlement for this combination is provided in the authentication token received with the request.
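The per-item authorization check described above can be sketched as follows. This is a minimal illustration: the `NamespaceMetadata` fields, the sample records, and the representation of entitlements as a set of (application identifier, namespace) pairs extracted from the authentication token are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

AppNamespace = Tuple[str, str]  # (application identifier, namespace)


@dataclass(frozen=True)
class NamespaceMetadata:
    token_format: str    # e.g., token type
    token_prefix: str    # reserved leading digits
    master_key_id: str   # encryption master key identifier


# Hypothetical metadata table keyed by (application identifier, namespace).
metadata_table = {
    ("app-1", "credit_card"): NamespaceMetadata("numeric-16", "70", "mk-credit-001"),
    ("app-1", "debit_card"): NamespaceMetadata("numeric-16", "80", "mk-debit-001"),
}


def authorize_items(
    items: Iterable[AppNamespace], entitlements: Set[AppNamespace]
) -> List[bool]:
    """Return one decision per item: a metadata record must exist for the
    (application identifier, namespace) pair AND the caller's authentication
    token must carry a matching entitlement."""
    decisions = []
    for pair in items:
        record_exists = pair in metadata_table
        entitled = pair in entitlements
        decisions.append(record_exists and entitled)
    return decisions
```

Checking each item separately lets a multi-item request be partially authorized, rather than failing as a whole when one item lacks an entitlement.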
One or more third parties 150 may receive and use the token instead of the actual data. In one embodiment, third parties 150 may provide the token to tokenization service 134 to retrieve the data associated with the token.
Referring to
In step 205, a computer program, such as a client application executed by a client electronic device, may provide source data to a tokenization provider for tokenization under a specified namespace. The source data may include, for example, a debit card number, a credit card number, etc.
In one embodiment, the source data may be received via a REST API interface as part of the request body. The body may include the namespace under which the requested source data needs to be tokenized.
In step 210, a tokenization service in the cloud may receive the source data for a namespace, and in step 215, the tokenization service may generate a token for the source data. In one embodiment, the tokenization service may tokenize the source data according to token format rules that may specify the format that generated tokens must adhere to. For example, the token format rules may reserve the first and/or second digit of the token, and may identify digits that may be used and/or digits that may not be used.
An example of a token format rule is that the first digit may not be 0, 3, 4, 5, or 6. A combination of the first digit and the second digit may identify the namespace for the token generation, such as 70 for a credit card namespace and 80 for a debit card namespace.
The token may have no relation to the original source data and cannot be calculated back to the source data.
In step 220, the tokenization service may generate a hash of the source data. For example, the tokenization service may generate a SHA-512 hash of each source data value.
In step 225, an encryption key service in the compute layer may manage data encryption keys for encrypting the source data. In one embodiment, the source data may be encrypted with, for example, application level encryption. In another embodiment, each namespace may have its own encryption scheme.
In step 230, the tokenization service may use keys from the encryption key service to encrypt the source data.
In step 235, the tokenization service may persist the hash and the token in, for example, a source hash table. The source hash table may be in secure cloud storage.
In one embodiment, an application identifier and an identifier of the system of record that owns the source data may be persisted with the token in the source hash table.
In one embodiment, the source hash table may be partitioned by namespace.
In step 240, the tokenization service may persist the hash, the encrypted source data, and the token in, for example, a token table. The token table may be in secure cloud storage.
In one embodiment, an application identifier and an identifier of the system of record that owns the source data may be persisted with the token in the token table.
In one embodiment, the tokenization service may persist the format (e.g., token type) for each namespace, the token prefix, and/or an encryption master key identifier for each namespace in a metadata table.
In step 245, the tokenization service may return the token to the client application. The token may then be used as a surrogate for the actual value (e.g., the PAN for the debit or credit card).
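Steps 210 through 245 can be sketched end to end as follows. This is a minimal in-memory illustration under stated assumptions, not the disclosed implementation: the tables are plain dictionaries, all names are hypothetical, the hash is looked up before a new token is generated so that repeated requests return the same token, and `placeholder_encrypt` is a base64 stand-in that provides NO confidentiality and merely marks where a real cipher keyed by the encryption key service would be called.

```python
import base64
import hashlib
import secrets
from typing import Callable, Dict, Tuple

PREFIXES = {"credit_card": "70", "debit_card": "80"}  # exemplary values from the text

# Hypothetical in-memory stand-ins for the tables in secure storage.
source_hash_table: Dict[Tuple[str, str], str] = {}   # (namespace, hash) -> token
token_table: Dict[Tuple[str, str], dict] = {}        # (namespace, token) -> record


def placeholder_encrypt(plaintext: str) -> bytes:
    # NOT encryption: base64 only marks the spot where a real cipher,
    # using keys from the encryption key service, would be applied.
    return base64.b64encode(plaintext.encode("utf-8"))


def tokenize(
    namespace: str,
    source_value: str,
    encrypt: Callable[[str], bytes] = placeholder_encrypt,
) -> str:
    # Step 220: hash the source value (SHA-512, as in the example in the text).
    digest = hashlib.sha512(source_value.encode("utf-8")).hexdigest()

    # Assumption: an existing hash means the value was tokenized before,
    # so the same token is returned for the same namespace.
    existing = source_hash_table.get((namespace, digest))
    if existing is not None:
        return existing

    # Step 215: generate a random token with the namespace prefix,
    # retrying on the (unlikely) event of a collision.
    while True:
        token = PREFIXES[namespace] + "".join(
            secrets.choice("0123456789") for _ in range(14)
        )
        if (namespace, token) not in token_table:
            break

    ciphertext = encrypt(source_value)                   # step 230
    source_hash_table[(namespace, digest)] = token       # step 235
    token_table[(namespace, token)] = {                  # step 240
        "hash": digest,
        "encrypted_source": ciphertext,
    }
    return token                                         # step 245
```

Keying both tables by namespace means the same source value tokenized under two namespaces yields two unrelated tokens, which is one way to obtain the isolation that the namespace construct is meant to provide.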
Referring to
In step 305, the client application may provide a token to a tokenization service in a compute layer of a public cloud. In one embodiment, the client application may also provide the application identifier and the namespace.
In step 310, the tokenization service may construct a query using the token, the application identifier, and the namespace.
In step 315, the tokenization service may query the token table for the source data.
In one embodiment, the tokenization service may validate that the client requesting the decrypted data is entitled or permissioned to access the source data.
In one embodiment, the tokenization service may perform an authorization check for each item in the client request. For example, the tokenization service may check that a record exists in the metadata table for the requested combination of the application identifier and namespace, and that a corresponding entitlement for this combination is provided in the authentication token received with the request.
In step 320, the tokenization service may receive the encrypted source data as a result of the query.
In step 325, the tokenization service may decrypt the encrypted source data. The tokenization service may apply the same scheme that was used to encrypt the data to decrypt the source data.
In step 330, the tokenization service may return the source data to the client application.
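The de-tokenization steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the token table is a dictionary keyed by (application identifier, namespace, token), the sample record and all names are hypothetical, entitlements are modeled as a set of (application identifier, namespace) pairs, and `placeholder_decrypt` reverses a base64 stand-in rather than a real cipher.

```python
import base64
from typing import Callable, Dict, Set, Tuple


class NotEntitledError(Exception):
    """The caller lacks a de-tokenization entitlement for the namespace."""


class TokenNotFoundError(Exception):
    """No record matches the token, application identifier, and namespace."""


# Hypothetical token table: (app id, namespace, token) -> encrypted source data.
token_table: Dict[Tuple[str, str, str], bytes] = {
    ("app-1", "credit_card", "7012345678901234"): base64.b64encode(b"4111111111111111"),
}


def placeholder_decrypt(ciphertext: bytes) -> str:
    # NOT decryption: reverses the base64 stand-in used in place of a real cipher.
    return base64.b64decode(ciphertext).decode("utf-8")


def detokenize(
    token: str,
    app_id: str,
    namespace: str,
    entitlements: Set[Tuple[str, str]],
    decrypt: Callable[[bytes], str] = placeholder_decrypt,
) -> str:
    # Entitlement check before any data is read.
    if (app_id, namespace) not in entitlements:
        raise NotEntitledError(f"{app_id} may not de-tokenize in {namespace}")

    key = (app_id, namespace, token)      # step 310: construct the query
    ciphertext = token_table.get(key)     # steps 315 and 320: query and receive
    if ciphertext is None:
        raise TokenNotFoundError(token)

    return decrypt(ciphertext)            # steps 325 and 330: decrypt and return
```

Placing the entitlement check ahead of the query is a design choice of this sketch; it keeps an unauthorized caller from learning whether a given token exists.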
Referring to
In step 405, the client application may provide the source data to a tokenization service in a compute layer of a public cloud. In one embodiment, the client application may also provide an application identifier and the namespace.
In step 410, the tokenization service may compute a hash of the source data. In one embodiment, the tokenization service may hash the source data using the process for the namespace.
In one embodiment, the tokenization service may hash an application identifier if provided.
In step 415, the tokenization service may query the source hash table for the computed source hash.
In one embodiment, the tokenization service may validate that the client is entitled or permissioned to access the source data.
In one embodiment, the tokenization service may perform an authorization check for each item in the client request. For example, the tokenization service may check that a record exists in the metadata table for the requested combination of the application identifier and namespace, and that a corresponding entitlement for this combination is provided in the authentication token received with the request.
If, in step 420, the query returns no results, in step 425, the tokenization service may output a “token not found” response to the client application.
If, in step 420, the query returns a token, in step 430, the tokenization service may return the token to the client application. The token may then be used as a proxy for the source data.
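The look-up in steps 410 through 430 can be sketched as follows. This is a minimal illustration: the source hash table is modeled as a dictionary partitioned by (application identifier, namespace), the response shape is an assumption, and the names `store_token` and `lookup_token` are hypothetical.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical source hash table: partition key -> {source hash -> token}.
source_hash_table: Dict[Tuple[str, str], Dict[str, str]] = {}


def _sha512_hex(value: str) -> str:
    return hashlib.sha512(value.encode("utf-8")).hexdigest()


def store_token(app_id: str, namespace: str, source_value: str, token: str) -> None:
    """Record the hash-to-token mapping in the partition for this app and namespace."""
    partition = source_hash_table.setdefault((app_id, namespace), {})
    partition[_sha512_hex(source_value)] = token


def lookup_token(app_id: str, namespace: str, source_value: str) -> dict:
    """Return the token previously issued for the source value, if any."""
    digest = _sha512_hex(source_value)                            # step 410
    partition = source_hash_table.get((app_id, namespace), {})    # step 415
    token = partition.get(digest)
    if token is None:                                             # steps 420 and 425
        return {"status": "token not found"}
    return {"status": "ok", "token": token}                       # step 430
```

Because only the hash is stored in this table, the look-up never needs to decrypt the source data.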
The co-relation identifier may be any value that is mapped to the token.
In embodiments, downstream systems of the system of record (SOR) may be partially impacted, in that they will not be able to de-tokenize a token until the tokenization service has been updated to reflect the mapping of that token to its source value.
Prefetching may include both prefetching tokens and mapping the tokens. In prefetching, a client electronic device may make a prefetch call (e.g., an API call) to a tokenization service for one or more tokens of a certain token type. The call returns a list of tokens and each token's co-relation identifier. At this time, the tokens are not associated with a source value. For example, if a de-tokenization request were sent for one of the tokens, the tokenization service would return a “Token not mapped” error.
The client electronic device may save the list of tokens and corresponding co-relation identifiers for later use.
When a client has a source value, the client may map or associate the source value with one of the unassigned tokens and may share the mapping with a downstream system. The client may then call the tokenization service with a map request to update the mapping between the token and the source value. The request may include the source value and the co-relation identifier, and the tokenization service may retrieve the token associated with the co-relation identifier and map the source value to the token.
Mapping may be done in a single synchronous call; prefetching, which involves the token creation, may be performed in a separate call at a different time.
In step 505, a client application executed by a client electronic device may request one or more tokens from the tokenization service for future use. The tokens may be of one or more types. In one embodiment, the client application may execute an API call to request the tokens.
In step 510, the tokenization service may generate the requested tokens and a co-relation identifier for each of the tokens, and may store the tokens and the co-relation identifiers in a token vault.
In step 515, the tokenization service may return the tokens and co-relation identifiers to the client application.
In step 520, the client application may identify a source value for a token and may map the source value to one of the unassigned tokens.
In step 525, the client application may send a mapping request including the token, the mapped source value, and the corresponding co-relation identifier to the tokenization service. The mapping request may be sent in an API call.
In step 530, the tokenization service may identify the token corresponding to the co-relation identifier and may map the source value to the token.
In step 535, the tokenization service may return a successful status to the client application for each successful token mapping.
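The prefetch-then-map flow of steps 505 through 535 can be sketched as follows. This is a minimal in-memory illustration under stated assumptions: the `TokenVault` class and its method names are hypothetical, co-relation identifiers are modeled as random UUID strings, the default prefix 70 is the exemplary credit card prefix, and the source value is stored unencrypted purely to keep the example short.

```python
import secrets
import uuid
from typing import Dict, List, Optional, Tuple


class TokenNotMappedError(Exception):
    """The token was prefetched but has no source value mapped to it yet."""


class TokenVault:
    def __init__(self) -> None:
        self._token_by_correlation: Dict[str, str] = {}
        self._source_by_token: Dict[str, Optional[str]] = {}

    def prefetch(self, count: int, prefix: str = "70") -> List[Tuple[str, str]]:
        """Steps 510 and 515: issue unmapped tokens with co-relation identifiers."""
        issued = []
        for _ in range(count):
            token = self._new_token(prefix)
            correlation_id = str(uuid.uuid4())
            self._token_by_correlation[correlation_id] = token
            self._source_by_token[token] = None  # not yet mapped
            issued.append((token, correlation_id))
        return issued

    def map_source(self, correlation_id: str, source_value: str) -> str:
        """Step 530: find the token for the identifier and map the source value."""
        token = self._token_by_correlation[correlation_id]
        self._source_by_token[token] = source_value
        return token

    def detokenize(self, token: str) -> str:
        source = self._source_by_token[token]  # KeyError if never issued
        if source is None:
            raise TokenNotMappedError("Token not mapped")
        return source

    def _new_token(self, prefix: str) -> str:
        while True:
            token = prefix + "".join(secrets.choice("0123456789") for _ in range(14))
            if token not in self._source_by_token:
                return token
```

A client would call `prefetch` ahead of time, hold the returned pairs, and later call `map_source` once a source value is known; until then, `detokenize` raises `TokenNotMappedError` for the prefetched token.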
The disclosure of U.S. patent application Ser. No. 18/527,074, filed Dec. 1, 2023, is hereby incorporated, by reference, in its entirety.
Hereinafter, general aspects of implementation of the systems and methods of embodiments will be described.
Embodiments of the system or portions of the system may be in the form of a “processing machine,” such as a general-purpose computer, for example. As used herein, the term “processing machine” is to be understood to include at least one processor that uses at least one memory. The at least one memory stores a set of instructions. The instructions may be either permanently or temporarily stored in the memory or memories of the processing machine. The processor executes the instructions that are stored in the memory or memories in order to process data. The set of instructions may include various instructions that perform a particular task or tasks, such as those tasks described above. Such a set of instructions for performing a particular task may be characterized as a program, software program, or simply software.
In one embodiment, the processing machine may be a specialized processor.
In one embodiment, the processing machine may be a cloud-based processing machine, a physical processing machine, or combinations thereof.
As noted above, the processing machine executes the instructions that are stored in the memory or memories to process data. This processing of data may be in response to commands by a user or users of the processing machine, in response to previous processing, in response to a request by another processing machine and/or any other input, for example.
As noted above, the processing machine used to implement embodiments may be a general-purpose computer. However, the processing machine described above may also utilize any of a wide variety of other technologies including a special purpose computer, a computer system including, for example, a microcomputer, mini-computer or mainframe, a programmed microprocessor, a micro-controller, a peripheral integrated circuit element, a CSIC (Customer Specific Integrated Circuit) or ASIC (Application Specific Integrated Circuit) or other integrated circuit, a logic circuit, a digital signal processor, a programmable logic device such as an FPGA (Field-Programmable Gate Array), PLD (Programmable Logic Device), PLA (Programmable Logic Array), or PAL (Programmable Array Logic), or any other device or arrangement of devices that is capable of implementing the steps of the processes disclosed herein.
The processing machine used to implement embodiments may utilize a suitable operating system.
It is appreciated that in order to practice the method of the embodiments as described above, it is not necessary that the processors and/or the memories of the processing machine be physically located in the same geographical place. That is, each of the processors and the memories used by the processing machine may be located in geographically distinct locations and connected so as to communicate in any suitable manner. Additionally, it is appreciated that each of the processor and/or the memory may be composed of different physical pieces of equipment. Accordingly, it is not necessary that the processor be one single piece of equipment in one location and that the memory be another single piece of equipment in another location. That is, it is contemplated that the processor may be two pieces of equipment in two different physical locations. The two distinct pieces of equipment may be connected in any suitable manner. Additionally, the memory may include two or more portions of memory in two or more physical locations.
To explain further, processing, as described above, is performed by various components and various memories. However, it is appreciated that the processing performed by two distinct components as described above, in accordance with a further embodiment, may be performed by a single component. Further, the processing performed by one distinct component as described above may be performed by two distinct components.
In a similar manner, the memory storage performed by two distinct memory portions as described above, in accordance with a further embodiment, may be performed by a single memory portion. Further, the memory storage performed by one distinct memory portion as described above may be performed by two memory portions.
Further, various technologies may be used to provide communication between the various processors and/or memories, as well as to allow the processors and/or the memories to communicate with any other entity; i.e., so as to obtain further instructions or to access and use remote memory stores, for example. Such technologies used to provide such communication might include a network, the Internet, Intranet, Extranet, a LAN, an Ethernet, wireless communication via cell tower or satellite, or any client server system that provides communication, for example. Such communications technologies may use any suitable protocol such as TCP/IP, UDP, or OSI, for example.
As described above, a set of instructions may be used in the processing of embodiments. The set of instructions may be in the form of a program or software. The software may be in the form of system software or application software, for example. The software might also be in the form of a collection of separate programs, a program module within a larger program, or a portion of a program module, for example. The software used might also include modular programming in the form of object-oriented programming. The software tells the processing machine what to do with the data being processed.
Further, it is appreciated that the instructions or set of instructions used in the implementation and operation of embodiments may be in a suitable form such that the processing machine may read the instructions. For example, the instructions that form a program may be in the form of a suitable programming language, which is converted to machine language or object code to allow the processor or processors to read the instructions. That is, written lines of programming code or source code, in a particular programming language, are converted to machine language using a compiler, assembler or interpreter. The machine language is binary coded machine instructions that are specific to a particular type of processing machine, i.e., to a particular type of computer, for example. The computer understands the machine language.
Any suitable programming language may be used in accordance with the various embodiments. Also, the instructions and/or data used in the practice of embodiments may utilize any compression or encryption technique or algorithm, as may be desired. An encryption module might be used to encrypt data. Further, files or other data may be decrypted using a suitable decryption module, for example.
As described above, the embodiments may illustratively be embodied in the form of a processing machine, including a computer or computer system, for example, that includes at least one memory. It is to be appreciated that the set of instructions, i.e., the software for example, that enables the computer operating system to perform the operations described above may be contained on any of a wide variety of media or medium, as desired. Further, the data that is processed by the set of instructions might also be contained on any of a wide variety of media or medium. That is, the particular medium, i.e., the memory in the processing machine, utilized to hold the set of instructions and/or the data used in embodiments may take on any of a variety of physical forms or transmissions, for example. Illustratively, the medium may be in the form of a compact disc, a DVD, an integrated circuit, a hard disk, a floppy disk, an optical disc, a magnetic tape, a RAM, a ROM, a PROM, an EPROM, a wire, a cable, a fiber, a communications channel, a satellite transmission, a memory card, a SIM card, or other remote transmission, as well as any other medium or source of data that may be read by the processors.
Further, the memory or memories used in the processing machine that implements embodiments may be in any of a wide variety of forms to allow the memory to hold instructions, data, or other information, as is desired. Thus, the memory might be in the form of a database to hold data. The database might use any desired arrangement of files such as a flat file arrangement or a relational database arrangement, for example.
In the systems and methods, a variety of “user interfaces” may be utilized to allow a user to interface with the processing machine or machines that are used to implement embodiments. As used herein, a user interface includes any hardware, software, or combination of hardware and software used by the processing machine that allows a user to interact with the processing machine. A user interface may be in the form of a dialogue screen for example. A user interface may also include any of a mouse, touch screen, keyboard, keypad, voice reader, voice recognizer, dialogue screen, menu box, list, checkbox, toggle switch, a pushbutton or any other device that allows a user to receive information regarding the operation of the processing machine as it processes a set of instructions and/or provides the processing machine with information. Accordingly, the user interface is any device that provides communication between a user and a processing machine. The information provided by the user to the processing machine through the user interface may be in the form of a command, a selection of data, or some other input, for example.
As discussed above, a user interface is utilized by the processing machine that performs a set of instructions such that the processing machine processes data for a user. The user interface is typically used by the processing machine for interacting with a user either to convey information or receive information from the user. However, it should be appreciated that in accordance with some embodiments of the system and method, it is not necessary that a human user actually interact with a user interface used by the processing machine. Rather, it is also contemplated that the user interface might interact, i.e., convey and receive information, with another processing machine, rather than a human user. Accordingly, the other processing machine might be characterized as a user. Further, it is contemplated that a user interface utilized in the system and method may interact partially with another processing machine or processing machines, while also interacting partially with a human user.
It will be readily understood by those persons skilled in the art that embodiments are susceptible to broad utility and application. Many embodiments and adaptations of the present invention other than those herein described, as well as many variations, modifications and equivalent arrangements, will be apparent from or reasonably suggested by the foregoing description thereof, without departing from the substance or scope.
Accordingly, while the embodiments of the present invention have been described here in detail in relation to its exemplary embodiments, it is to be understood that this disclosure is only illustrative and exemplary of the present invention and is made to provide an enabling disclosure of the invention. Accordingly, the foregoing disclosure is not intended to be construed or to limit the present invention or otherwise to exclude any other such embodiments, adaptations, variations, modifications or equivalent arrangements.
This application is a Continuation-In-Part of U.S. patent application Ser. No. 18/527,074 filed Dec. 1, 2023, the disclosure of which is hereby incorporated, by reference, in its entirety.
 | Number | Date | Country |
---|---|---|---|
Parent | 18527074 | Dec 2023 | US |
Child | 18412307 | | US |