Simplified Deletion of Personal Private Data in Cloud Backup Storage for GDPR Compliance

Information

  • Patent Application
  • Publication Number
    20220012360
  • Date Filed
    September 27, 2021
  • Date Published
    January 13, 2022
Abstract
In a public cloud that stores data in a database system for a plurality of entities as primary data and as one or more secondary backup copies of the primary data, the data being stored in predefined data fields of data records, personal private data of each entity is stored encrypted using an encryption/decryption key that is unique to each different entity. The encryption/decryption keys are stored in the cloud in a key store of a key management system. To delete the personal private data of a particular entity, for example to comply with the right to be forgotten pursuant to the GDPR, the encryption/decryption key for that particular entity is deleted from the key store to render permanently inaccessible all copies of that entity's personal private data.
Description
BACKGROUND

This invention relates generally to data management in cloud-based environments, and more particularly to methods and systems for simplified deletion of selected data in backup storage copies.


There are situations where it is desirable to selectively delete or otherwise render inaccessible certain data contained in fields of stored data records. For instance, in the United States, HIPAA (Health Insurance Portability and Accountability Act) regulations require that a person's health-related data be kept confidential and not disclosed except to authorized entities; and the European Union has mandated compliance with strict rules on personal data privacy pursuant to the General Data Protection Regulation (GDPR) legislation. GDPR, which is broadly applicable to any organization, vendor, or service provider, among other holders of private personal data of their customers and/or users, requires that such private personal data be kept confidential and not disclosed to unauthorized recipients. Additionally, a significant provision of GDPR afforded to persons is the “right to be forgotten”. This requires holders of a person's private personal data such as names, identification numbers, financial and social security information, credit card data, etc., to erase all or particular parts of such data from data records upon request of the person so that the data are inaccessible. This applies not only to production copies of the data, but also to data residing in all backup copies. The penalties on holders of private personal data for failure to comply may be severe.


This requirement to delete a user's personal data, and other similar requirements to make certain types of data inaccessible, pose a complex challenge to organizations which hold both primary and secondary copies of relevant data. Personal data are typically stored in particular predetermined fields of a user's record in a database. Organizations can implement personal data erasure in a production database with reasonable effort by accessing a user's record and either deleting the private data fields or overwriting them with arbitrary or random data. However, it is a bigger challenge to erase or delete these fields in all other copies of the database that are stored as backups on the same or on another storage system, and on disaster recovery copies in another location. There is no easy way to access and delete or alter specific records in database secondary copies which does not require accessing all such copies. To delete the data in these other copies, the database copies must be presented by the backup/data recovery system and either attached to a database host for deletion of the relevant data fields, or a special tool must be used for changing these data fields without a database host. Where the copies are at a remote site, or reside on tape with no immediate physical access and must be delivered to a facility where they can be mounted and processed, it is an even more complex and resource-intensive endeavor, especially where the remote copies are stored in a cloud. Such approaches are far too complex and challenging to be acceptable to most organizations.


It is desirable to provide systems and methods which address these and other problems associated with the selective deletion of all primary and secondary copies of certain selected types of data stored in databases, and which afford simple and efficient approaches for the deletion of all copies of selected data. It is to these ends that the present invention is directed.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 is a diagrammatic block diagram illustrating a cloud environment comprising an embodiment of a system in accordance with the invention;



FIG. 2 is a workflow diagram of an embodiment of a process in accordance with the invention for accessing data in the system of FIG. 1; and



FIG. 3 is a workflow diagram of an embodiment of a process in accordance with the invention for deleting selected data stored in the system of FIG. 1.





DESCRIPTION OF PREFERRED EMBODIMENTS

The invention is particularly applicable to application and database systems running in a public cloud and provided as services, such as FaaS (“function-as-a-service”) cloud computing services, and will be described in that context. As will become apparent, however, this is illustrative of only one utility of the invention, and the invention may be beneficially employed in other types of systems and contexts.


As will be described, in one aspect the invention affords an approach to deleting all copies of particular selected data, such as personal data of a user, to comply, for instance, with the right to be forgotten. In accordance with an aspect of the invention, access to relevant sensitive data, such as personal data, is restricted to those with proper access permission, and the right to be forgotten is implemented by rendering the sensitive data permanently inaccessible instead of deleting or erasing it. Rendering data permanently inaccessible is equivalent to deletion of the data, and the term “deletion” will be used herein to mean “permanently inaccessible”. This is accomplished, in accordance with a preferred embodiment, by maintaining relevant sensitive data encrypted at rest to restrict unauthorized access to the data and deleting the decryption key when it is desired to render the data permanently inaccessible, as will be described. Additionally, in another aspect, as will also be described, the invention preferably uses the gateway API service of the cloud as a mediator between a cloud database and a cloud application to ensure that the fields of data records that store the selected user data are encrypted on write access and decrypted on read access, as needed.


As noted, the invention is especially applicable to public cloud systems that provide database storage services, such as AWS (Amazon Web Services) Aurora or others. In the description that follows, for convenience the invention will be described in reference to the AWS public cloud using its services and terminology. However, other public clouds such as Microsoft Azure, Google Cloud Platform and others offer similar services and may be used as well.



FIG. 1 is a diagrammatic block diagram of an AWS public cloud system in which the invention may be deployed. Referring to FIG. 1, a public cloud 10 is typically accessed by an organization's front end processing system, shown at 12 in FIG. 1 for context. Public clouds 10, such as AWS, afford a services platform that offers computing power as virtual processors and memory embodying executable instructions for controlling the operations of the processors, database storage, content delivery and other functionalities that may be defined and configured by users, as desired, without the necessity of providing hardware and software. Referring to FIG. 1, cloud 10 may comprise an EC2 virtual server compute instance comprising a processor and associated memory running an application 14 that may be a customer relationship management (CRM) application that is selected and configured by the customer/user organization or another system for a desired operation. The cloud may further comprise a database storage system 16 that includes the AWS Relational Database Service (RDS); a gateway API instance 18 that mediates between the application and the database system; an AWS Lambda computing platform 20 that runs user-selected code, such as a user-selected encryption/decryption cryptographic algorithm, in response to events and automatically manages computing resources required by that code; and a KMS (key management system) 22 that manages encryption and decryption keys for the cryptographic algorithm. The cloud 10 may also comprise a personal fields database 24 storing information that identifies the data fields of data records as defined by the user organization, as will be described.


The CRM application 14 may be any cloud-based application or system selected by the user organization that processes, manages and stores data as data records in database 16. The data may comprise a separate data record for each of a plurality of different entities of the organization, and each record may comprise a plurality of fields containing different types of data about an entity of the user and to which different access permissions may be applicable. Some of the data types may include sensitive private or confidential personal data about the entities; other fields may contain more general non-private information about entities such as entity identifying information employed by the user to manage an entity. Personal private or confidential information may comprise, for example, client or patient financial or health data to which access must be restricted. The organization may define those fields in the data records which store sensitive private or confidential data and should have restricted access, and those fields that contain personal data about an entity that is non-private and need not be protected. These data fields of data records may be identified along with the sensitivity type of the data each field stores in the personal fields database 24. A user-selected cryptographic function for encryption and decryption performed by the Lambda service 20 may access this personal fields database to determine which data fields are for private data and require encryption and decryption. The cryptographic function is preferably a symmetric encryption/decryption cryptographic algorithm, such as AES-256, for example. The KMS 22 may, in one embodiment, store a unique symmetric encryption/decryption key for each user entity of the organization for use by the encryption/decryption algorithm.
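
The relationship between the personal fields database 24 and the per-entity keys in the KMS 22 can be illustrated with a minimal in-memory sketch. This is only an illustration under stated assumptions: the field names, the `KeyStore` class and the `private_fields` helper are hypothetical and do not come from the application, and a real deployment would use the cloud provider's key management service rather than a Python dictionary.

```python
import secrets

# Hypothetical contents of the personal fields database (24):
# field name -> sensitivity type, as defined by the organization.
PERSONAL_FIELDS = {
    "entity_id": "non-private",
    "display_name": "non-private",
    "ssn": "private",
    "card_number": "private",
}


class KeyStore:
    """In-memory stand-in for the KMS key store (22): one symmetric key per entity."""

    def __init__(self):
        self._keys = {}

    def create_key(self, entity_id):
        # 32 random bytes, i.e. a key sized for a 256-bit cipher such as AES-256.
        key = secrets.token_bytes(32)
        self._keys[entity_id] = key
        return key

    def get_key(self, entity_id):
        # Returns None when the entity has no key (never created, or deleted).
        return self._keys.get(entity_id)

    def delete_key(self, entity_id):
        self._keys.pop(entity_id, None)


def private_fields(fields_db):
    """List the fields that the cryptographic function must encrypt and decrypt."""
    return sorted(name for name, kind in fields_db.items() if kind == "private")


store = KeyStore()
store.create_key("entity-1")
store.create_key("entity-2")
print(private_fields(PERSONAL_FIELDS))
```

Keeping the field classification separate from the keys mirrors the description: the fields database says which fields need protection, and the key store says which key protects a given entity.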


During operation, the organization's application 14 running on a compute instance EC2 accesses the database through API calls via the gateway API 18. The gateway API is preferably configured to serve as a mediator, as will be described in connection with FIG. 2, that activates the cryptographic function provided by the Lambda service 20 to provide encryption and decryption of those personal private data fields of data records as defined in the personal fields database 24. This enables the data record of a user to comprise encrypted fields, containing private personal data of an entity, that are accessed for reading and writing using the cryptographic function, and unencrypted fields, containing non-private/non-confidential data, that may be accessed directly. More importantly, as will also be described, encrypted fields also facilitate easy identification and deletion, or rendering inaccessible, of those data fields that contain sensitive information or private personal data to comply with the right to be forgotten or other access restrictions.



FIG. 2 is a functional workflow illustrating a process in accordance with an embodiment of the invention for accessing data in the database 16 of the cloud system of FIG. 1. Database system 16 may comprise a primary database for storing primary data, and one or more other databases for storing secondary or backup copies of the primary data. As indicated above, each entity of the organization may have a data record, comprising a plurality of data fields, stored as primary and secondary copies of data in the database. The data fields of a record may contain a mix of data types having different levels of sensitivity and different requirements for protection. In a preferred embodiment as described herein, there may be two types of data and two levels of protection, i.e., sensitive private personal data that must be maintained confidential and other non-sensitive data for which there is no requirement of confidentiality. At step 30 in FIG. 2, the organization may define the format of the particular data fields of data records, including the type of data that is contained in each field and the type or level of sensitivity and the required protection of such data. In an embodiment, all user data records may have the same format so that the corresponding data fields of different users' data records contain the same type of data. The data fields, format and data type of each field of each user's data record may be identified in the personal fields database 24. At 32, the organization may additionally assign to each user an identifier (ID) which is included in an appropriate field of the user's record, and assign and store a unique key for each user in the KMS 22 for use by the cryptographic function for encryption and decryption of that user's records.


As described above and more fully below, the gateway API is preferably configured by the user to act as a mediator between the application and the database. The API may continually and transparently monitor and intercept requests from the application 14 or front end 12 for access to specific user data in the database. As appropriate, the API may call the cryptographic function 20 and retrieve the user's key from the KMS to service the request. This avoids the necessity of modifying each application that may be running on an EC2 instance in the cloud to call the cryptographic function, so that the application may request access to user data without regard to whether it is encrypted.


Upon receiving a request at 34 from the application 14 for writing a record, the gateway API 18 may activate the cryptographic function running on the Lambda compute service 20. At 36, the cryptographic function identifies the user and the protected data fields of a data record using information in the personal fields database, and encrypts the data being written to the protected data fields with the user's personally assigned unique key. At 38, the cryptographic function rebuilds the user's data record with the required data fields encrypted, and at 40 writes the rebuilt data record to the database.


Reading a record involves a process substantially similar to writing. Upon receiving a request, as from the application 14, for access to read a user's record, the gateway API 18 calls the cryptographic function, which identifies the personal private data fields; identifies the user from the ID field of the data record; retrieves the appropriate decryption key from the KMS based upon the user's ID; decrypts the encrypted personal private data fields; rebuilds the record; and returns the record with decrypted fields to the application.
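
The write and read paths described above can be sketched as a pair of mediator functions. This is a simplified illustration, not the claimed implementation: the dictionaries standing in for the personal fields database, the KMS and the cloud database are assumptions, the function names are hypothetical, and the cipher is a deliberately simple stand-in (XOR with a SHAKE-256 keystream) used only so that the sketch runs with the Python standard library. It is not AES-256 and is not suitable for protecting real data.

```python
import hashlib
import secrets

PRIVATE_FIELDS = {"ssn", "card_number"}   # assumed contents of the personal fields database
KEYS = {"u1": secrets.token_bytes(32)}    # assumed per-user keys held by the KMS
DATABASE = {}                             # stand-in for the cloud database


def _xor_stream(key, nonce, data):
    # Stand-in cipher (NOT AES-256): XOR the data with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def write_record(record):
    """Write path: encrypt the protected fields, rebuild the record, store it."""
    key = KEYS[record["id"]]
    rebuilt = {}
    for name, value in record.items():
        if name in PRIVATE_FIELDS:
            nonce = secrets.token_bytes(16)
            rebuilt[name] = nonce + _xor_stream(key, nonce, value.encode("utf-8"))
        else:
            rebuilt[name] = value
    DATABASE[record["id"]] = rebuilt


def read_record(user_id):
    """Read path: find the user's key from the ID field, decrypt, rebuild, return."""
    stored = DATABASE[user_id]
    key = KEYS[stored["id"]]
    rebuilt = {}
    for name, value in stored.items():
        if name in PRIVATE_FIELDS:
            nonce, body = value[:16], value[16:]
            rebuilt[name] = _xor_stream(key, nonce, body).decode("utf-8")
        else:
            rebuilt[name] = value
    return rebuilt


original = {"id": "u1", "name": "Ada", "ssn": "123-45-6789",
            "card_number": "4111111111111111"}
write_record(original)
print(read_record("u1") == original)
```

The application only ever calls `write_record` and `read_record` with plain values, which reflects the point made above that the application need not know whether a field is stored encrypted.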


Referring to FIG. 3, at 44, when an entity wishes to be forgotten pursuant to GDPR, for instance, or when sensitive data of the entity otherwise needs to be deleted, an administrator or other authorized entity of the organization may, at 46, access the KMS, as from the front end, and simply delete the key that is associated with the entity. With the key deleted, the encrypted data fields remain in the database (in all primary and secondary copies of the data), but both the encrypted primary and secondary copies of the entity's private data are rendered permanently inaccessible, which is equivalent to the data having been deleted. Thereafter, any request for data of the entity will return that entity's data record with the personal private data fields encrypted.
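
The effect of deleting a key on primary and secondary copies can be illustrated with a short sketch. The names `key_store`, `primary`, `backup` and `read_record` are hypothetical, the stores are plain in-memory dictionaries, and the cipher is again a standard-library stand-in (XOR with a SHAKE-256 keystream) rather than AES-256, so this is an illustration of the idea under those assumptions and not a production design.

```python
import copy
import hashlib
import secrets


def _xor_stream(key, nonce, data):
    # Stand-in cipher (NOT AES-256): XOR the data with a SHAKE-256 keystream.
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_field(key, text):
    nonce = secrets.token_bytes(16)
    return nonce + _xor_stream(key, nonce, text.encode("utf-8"))


def decrypt_field(key, blob):
    return _xor_stream(key, blob[:16], blob[16:]).decode("utf-8")


key_store = {"u1": secrets.token_bytes(32)}          # stand-in for the KMS key store
primary = {"u1": {"id": "u1", "name": "Ada",
                  "ssn": encrypt_field(key_store["u1"], "123-45-6789")}}
backup = copy.deepcopy(primary)                      # secondary copy holds the same ciphertext


def read_record(db, user_id, private=("ssn",)):
    record = dict(db[user_id])
    key = key_store.get(user_id)
    if key is None:
        # Key deleted: the private fields are returned still encrypted.
        return record
    for name in private:
        record[name] = decrypt_field(key, record[name])
    return record


before = read_record(backup, "u1")
del key_store["u1"]        # the deletion step: only the key is removed
after_primary = read_record(primary, "u1")
after_backup = read_record(backup, "u1")
print(before["ssn"], type(after_backup["ssn"]).__name__)
```

Neither `primary` nor `backup` is modified by the deletion. The sketch also makes the underlying assumption visible: the data are only inaccessible if no other copy of the key survives anywhere, for example in a cache or in a backup of the key store itself.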


As can be seen, the invention offers a simple and efficient method and system for quickly and seamlessly deleting multiple primary and secondary copies of selected data without the necessity of locating and mounting the multiple copies of the data on a database host or using some other method to delete the data. As such, it affords an easy and efficient way of implementing the GDPR right to be forgotten, as well as for managing data stored in a public cloud to which access may not be possible.


While the foregoing has been with reference to particular embodiments of the invention, it will be appreciated that the principles of the invention are also applicable to other embodiments and uses. For instance, while an embodiment of the invention has been described above for handling only two types of data—sensitive protected personal data and unprotected data, other embodiments of the invention are applicable to handling multiple different types of data having multiple different protection requirements and access restrictions. By defining the data fields of an entity's data records to store different types of data to which different protections are applicable and different entities are authorized access, and by assigning a plurality of different keys to the different data fields, upon the gateway API receiving a request for access to protected fields, the gateway API may access the personal fields database to verify access authorization and retrieve appropriate keys associated with the requested data fields to service the access request. For example, different groups or entities within an organization may have different access permissions. The different keys may be used to control access and afford specific protections to the data.
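
The multi-class variant described above can be sketched by keying the key store on both the entity and the class of data. The class names, field names and the `ClassKeyStore` helper below are hypothetical illustrations, not part of the application, and access-permission checks are omitted for brevity.

```python
import secrets

# Hypothetical mapping of protected fields to classes of private data.
FIELD_CLASSES = {
    "ssn": "identity",
    "card_number": "financial",
    "diagnosis": "health",
}


class ClassKeyStore:
    """In-memory stand-in for a key store holding one key per (entity, class)."""

    def __init__(self):
        self._keys = {}

    def create_key(self, entity_id, data_class):
        self._keys[(entity_id, data_class)] = secrets.token_bytes(32)

    def get_key(self, entity_id, data_class):
        return self._keys.get((entity_id, data_class))

    def delete_class(self, entity_id, data_class):
        # Removes access to one class of an entity's data; other classes are untouched.
        self._keys.pop((entity_id, data_class), None)


def readable_fields(store, entity_id):
    """Fields of the entity whose class key still exists and can thus be decrypted."""
    return sorted(
        field for field, data_class in FIELD_CLASSES.items()
        if store.get_key(entity_id, data_class) is not None
    )


store = ClassKeyStore()
for data_class in set(FIELD_CLASSES.values()):
    store.create_key("u1", data_class)

store.delete_class("u1", "financial")
print(readable_fields(store, "u1"))
```

Deleting the `financial` key leaves the identity and health fields decryptable, which corresponds to rendering only a selected class of an entity's private data inaccessible.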


It will also be appreciated that changes may be made to the embodiments described herein without departing from the principles of the invention, the scope of which is defined by the appended claims.

Claims
  • 1. A method of managing data stored in a cloud database system, the data comprising a primary copy of the data and one or more secondary copies of the data, said data comprising a separate data record for each of a plurality of different entities of a user, each said data record having a plurality of different data fields storing different data, the method comprising: defining a sensitivity type for the data stored in the data fields of said data records in said cloud database system, the sensitivity type comprising, for each entity, private data having restricted access and non-private data having unrestricted access, said private data being stored in the cloud database system in encrypted form using a different encryption key for each said entity, and said non-private data being stored in unencrypted form; storing information in the cloud identifying for each said entity the data fields of a data record of said entity that contain private data in encrypted form, and storing in a key store in the cloud a decryption key for each different entity, wherein there are different classes of private data stored in said data fields of said data records, each different class of private data of an entity having a corresponding decryption key stored in said key store; and deleting a selected class of the private data of a selected entity by deleting the corresponding decryption key in said key store in said cloud to render only said selected class of private data inaccessible.
  • 2. The method of claim 1, wherein all copies of said selected class of private data of the selected entity are encrypted using the same encryption key, and said deleting said decryption key renders inaccessible all of said copies of said selected class of private data of the selected entity.
  • 3. The method of claim 1, wherein said cloud comprises a compute instance executing an encryption algorithm for encrypting and decrypting private data, and said deleting said private data comprises deleting a decryption key in said key store for said encryption algorithm.
  • 4. The method of claim 3, wherein said compute instance further executes a user-defined cloud application, and said cloud further comprises a gateway API, the gateway API mediating requests from said cloud application for access to private entity data in said database system by invoking said encryption algorithm to encrypt and decrypt said requests.
  • 5. The method of claim 4, wherein said compute instance in the cloud further executes user-selected code that responds to events and automatically manages cloud resources as required by said events.
  • 6. The method of claim 1, wherein said private data is encrypted using a symmetric cryptographic algorithm that uses the same key for encryption and decryption.
  • 7. A method of managing data stored in a cloud database system, the data comprising a primary copy of the data and one or more secondary backup copies of the primary data, said data comprising a data record for each of a plurality of different entities, each said data record having a plurality of different data fields, the method comprising: defining a sensitivity type for the data stored in the data fields of said data records in said cloud database system, the sensitivity type comprising, for each entity, private data having restricted access, said private data being stored in the cloud database system in encrypted form using a different encryption key for each entity, wherein there are different classes of private data stored in said data fields of said data records, each different class of private data of an entity having a corresponding unique encryption/decryption key stored in a key store in said cloud; receiving a request to delete a particular class of private data of a selected entity that is stored in said cloud database system; accessing from a fields database in said cloud information identifying for the selected entity the data fields of a data record of said selected entity that store said particular private data; and deleting from said cloud key store a decryption key that corresponds to said particular class of private data and that is necessary for decrypting the particular class of private data of the selected entity, to render inaccessible all copies of said class of private data of the selected entity stored in said database system.
  • 8. The method of claim 7, wherein said deleting comprises accessing the key store and deleting the corresponding decryption key of said particular class of private data of the selected entity from said key store.
  • 9. The method of claim 7 further comprising servicing a request for a class of personal private data of an entity that has been deleted by returning the data record of that entity with the personal private data fields encrypted.
  • 10. The method of claim 7, wherein said private data is encrypted using a symmetric cryptographic algorithm that uses the same key for encryption and decryption.
  • 11. Non-transitory storage medium embodying executable instructions for controlling a processor to perform a method of managing data stored in a cloud database system, the data being stored in multiple different fields of separate data records for each of a plurality of different entities of a user, said data being of different sensitivity types comprising different classes of private data having restricted access and non-private data having unrestricted access, the method comprising: storing said different classes of private data of an entity in different fields of said data records in encrypted form using different encryption keys for each said different class of private data, and storing information identifying for each entity the fields of each data record of said each entity that contain private data in encrypted form and the class of said private data; further storing in a key store in the cloud corresponding unique decryption keys for each different entity and for each different class of private data; and deleting a selected class of private data of a selected entity by deleting a corresponding decryption key in said key store to render said selected class of private data of the selected entity inaccessible.
  • 12. The non-transitory storage medium of claim 11, wherein all copies of said selected class of private data of the selected entity are encrypted using the same encryption key, and said deleting said corresponding decryption key renders inaccessible all copies of said selected class of private data of the selected entity.
  • 13. The non-transitory storage medium of claim 11, wherein said cloud comprises a compute instance executing an encryption/decryption algorithm for encrypting and decrypting said selected private data of said selected entity, and a cloud gateway API that receives a request to delete said selected private data and causes said compute instance to delete said corresponding decryption key in said key store.
  • 14. The non-transitory storage medium of claim 13, wherein said compute instance in the cloud further executes an application that responds to events and manages cloud resources in response to said events.
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 16/431,563, filed Jun. 4, 2019, the disclosure of which is incorporated by reference herein.

Continuations (1)
        Number    Date      Country
Parent  16431563  Jun 2019  US
Child   17486542            US