This invention relates to real-time publish-subscribe communication and protocols.
Unlike point-to-point request/reply systems, where data is exchanged between pairs of endpoints, in Publish-Subscribe systems the Publisher entity may have to send data to many subscribing entities (Subscribers), which can range from a handful to hundreds, thousands, or more.
Many of these systems may be used for critical applications that require security. Security requires an authentication phase where the Publisher can securely identify Subscribers and determine they have the necessary permissions to receive the information they send. Likewise, the Subscribers need to authenticate the Publishers to ensure they are entitled to produce the information they send.
Beyond authentication, Publishers and Subscribers need to securely establish (exchange or derive) Session Keys that can be used to cryptographically protect (via encryption and/or message authentication) the actual data exchanged. The process of securely establishing Session Keys with multiple Subscribers can be quite expensive in terms of CPU and bandwidth as it would normally require sending a new secure message to each individual Subscriber.
The present invention addresses the needs in the art.
In one embodiment, the invention is a method for performing secure and scalable distribution of symmetric keys from a publisher to one or more subscribers in a publish-subscribe system. The method includes having a plurality of applications, each application having a plurality of participants, each participant containing a plurality of publishers and subscribers. The method further includes having a cryptographic symmetric key for each publisher to encode data samples sent by the publisher to one or more of the subscribers, where the cryptographic symmetric key is derived from a key material and a key revision, where the key material is a piece of cryptographic information unique per publisher and where the key revision is a piece of cryptographic information unique per participant; where a participant can generate a plurality of key revisions. The unique key material for the publisher is distributed by the participant containing the publisher to the other participants. One of the key revisions is distributed by the participant containing the publisher to the other participants. A new cryptographic symmetric key for the publisher is derived from the distributed unique key material for the publisher and one of the distributed key revisions for the participant containing the publisher.
In another embodiment, the invention is a method for performing secure and scalable distribution of cached data samples from a publisher to one or more subscribers in a publish-subscribe system. The method includes having a plurality of applications, each application having a plurality of participants, each participant containing a plurality of publishers and subscribers. The method further includes having a plurality of cryptographic symmetric keys for each publisher to encode data samples sent by the publisher to one or more of the subscribers. The method further includes having a cache of samples in the publisher, where each sample is encoded with one of the plurality of cryptographic symmetric keys. The publisher stores a finite history of the most recent cryptographic symmetric keys, where a new cryptographic symmetric key removes the oldest cryptographic symmetric key from the finite history, and where samples in the cache of samples encoded using the oldest cryptographic symmetric key are re-encoded using the latest cryptographic symmetric key in the cryptographic symmetric key history. The publisher sends a window of the most recent cryptographic symmetric keys in the cryptographic symmetric key history to one or more of the subscribers. The publisher sends a sample from the cache of samples to one or more of the subscribers, where the publisher re-encodes the sample with the latest cryptographic symmetric key in the cryptographic symmetric key history if the cryptographic symmetric key used to encode the sample is outside the window sent to the one or more subscribers.
In yet another embodiment, the invention is a method for performing secure and scalable distribution of cryptographic symmetric keys and cached data samples encoded using the cryptographic symmetric keys from a publisher to one or more subscribers in a publish-subscribe system. This method is a combination of the above-described methods, where a cryptographic symmetric key is derived from a key material and a key revision.
Unlike point-to-point request/reply systems, where data is exchanged between pairs of endpoints, in Publish-Subscribe systems the Publisher entity may have to send data to many subscribing entities (Subscribers), which can range from a handful to hundreds, thousands, or more.
Many of these systems may be used for critical applications that require security. Security requires an authentication phase where the Publisher can securely identify Subscribers and determine they have the necessary permissions to receive the information they send. Likewise, the Subscribers need to authenticate the Publishers to ensure they are entitled to produce the information they send.
Beyond authentication, Publishers and Subscribers need to securely establish (exchange or derive) Session Keys that can be used to cryptographically protect (via encryption and/or message authentication) the actual data exchanged. In this context, we define Key Material (KM) as a piece of cryptographic information from which an entity (Publisher or Subscriber) can derive a Session Key. We use the term (secure) encoding to refer to the process of cryptographically protecting data (converting plain data to encrypted data and/or adding a message authentication code). Likewise, we use the term (secure) decoding to refer to the process of validating the message authentication code and/or extracting the plain data from the cryptographically protected data.
For scalability in Publish-Subscribe systems, it is desirable for a Publisher to share the same cryptographic KM with multiple Subscribers. That way the data does not need to be encrypted (or protected by a message-authentication tag) multiple times and there is no need to keep track of many separate Session Keys for a single Publisher.
However, a Publisher that shares KM with multiple Subscribers may need to change the Session Keys at certain times. For example, if a Session Key has been used to encode too many messages, if the Publisher needs to revoke access permissions for one or more existing Subscribers, or if the criteria to determine who has access to the information has changed.
The process of securely establishing Session Keys with multiple Subscribers can be quite expensive in terms of CPU and bandwidth as it would normally require sending a new secure message to each individual Subscriber.
This invention provides an efficient and scalable solution for cryptographic (session) key regeneration and distribution to many Subscribers to achieve better-than-linear scaling with the number of Subscribers, providing support for scalable dynamic Publishers' and Subscribers' renewal, revocation, and expiration.
In addition, Publish-Subscribe systems are often able to “cache” previously-published data and send it to Subscribers that join the system after the data was published. This cached data is usually stored in encoded form (that is, securely encoded using the Publisher Session Key) so subsequent re-sending of previously-published data does not require spending resources encoding the same data again. The problem with this approach is that after a Session Key change, the cached data needs to be encoded again with the new Session Key.
One approach is for the Publisher to simply encode all of the cached data again whenever its Session Key changes. The problem with this approach is that the cache could contain thousands (or even hundreds of thousands) of individual messages, which makes re-encoding all the data, and therefore changing the Session Key, very expensive.
This invention provides a scalable solution for management of the securely-protected cached messages that avoids encoding them each time the Session Key changes.
To better illustrate the concepts in this invention, the rest of this document uses a Data Distribution Service (DDS) system as an example of a Publish-Subscribe system to which this invention can apply.
Setting the Stage of the Invention
To fully support Dynamic Certificate Renewal, Revocation, and Expiration on a DDS Security system, we need the following main elements:
We now briefly introduce these elements.
Session Key Regeneration and Redistribution
The way DDS Security's built-in plugins enforce access control is through the Cryptographic plugin. The Cryptographic plugin controls who has access to the system by selectively sharing the appropriate Key Material. Specifically, the way the Cryptographic plugin prevents an unauthorized Participant from accessing a DDS system is by not sharing with that Participant the sender's (Participant, DataWriter, or DataReader) information needed to derive the Session Keys used for protecting the RTPS messages, submessages, and user data.
Consequently, to effectively allow for kicking out from a DDS Security system a Participant whose certificate has expired or has been revoked (we will refer to this Participant as a Revoked Participant or non-trusted Participant), we need two things:
Other peer-to-peer Publish-Subscribe systems typically use similar mechanisms to share key material from the sender to all the receivers, whether that key material is specific to a single sender or to groups of senders.
Existing mechanisms for (Session) Key distribution are inefficient because they require exchanging all of the new DataWriter Key Material: this introduces a high cost both in terms of network overhead (traffic exchanged) and CPU processing (associated with the reliable delivery of the DataWriter Key Material).
Other peer-to-peer Publish-Subscribe systems will encounter similar scalability issues whenever a sender needs to regenerate key material that was previously shared with multiple receivers.
Participant Identity Certificate Revocation and Expiration
Typical access control mechanisms rely on first authenticating the identity of the actor that wants access to a resource, and then checking that the authenticated actor has the necessary permissions.
Publish-Subscribe systems, and specifically DDS Security, operate the same way. The authentication and access control checks are typically performed at “discovery” or “connection” time, and based on those checks the Key Material is exchanged with the Participants that pass them.
However, the access controls cannot stop after the initial access grant: In general, the fact that an actor or Participant has permissions at a point in time to do something does not grant those permissions indefinitely. There are multiple reasons for that.
Because of this, it becomes necessary to be able to “rescind” or “revoke” the access of Participants that had previously been granted access and therefore already have the Key Material previously sent to them.
The (Session) Key Regeneration and Redistribution mechanism we presented in Section Session Key Regeneration and Redistribution provides us with the tools needed to securely remove a Participant from the system. With the mechanism to remove a Participant in place, it becomes possible to enforce Identity Certificate validity at many points in time, for example:
Participant Identity Certificates Renewal
DDS Security does not provide mechanisms to propagate changes in the Identity Certificate to other Participants. This lack of mutability for the Identity Certificate forces users to perform full Participant destruction and creation to renew the Participant certificate. Of course, this is not acceptable for systems requiring high availability, as destroying and creating a Participant will result in communication loss and trigger full discovery.
Secure Historical DataWriter Samples Re-Encoding
DDS supports delivering historical samples to late joiners. DDS (through the DDS Security specification) also supports protecting the sample's content.
An efficient way (and also the one that RTI follows) of implementing sample content protection in combination with historical sample delivery is to store the samples encoded in the DataWriter sample queue, so there is no need to encode them again upon resending. While storing encoded samples works well when Session Keys remain unchanged during the whole DataWriter lifecycle, it becomes a problem when the Session Key needs to change (and therefore the samples need to be re-encoded, which has a significant impact on CPU usage).
This invention relates to a method for scalable key regeneration and redistribution for publish-subscribe systems, including those based on the Data Distribution Service (DDS) standard and those using the Real-Time Publish-Subscribe (DDSI-RTPS) wire protocol standard.
Original Contributions
This invention is about the following main original contributions:
Efficient Key Regeneration Mechanism for Publish-Subscribe Systems
To enforce fine-grained access control, a publish-subscribe system typically needs to create and maintain different Key Material for each separately-protected Endpoint (e.g., each DataWriter or DataReader); that way, sharing the Key Material used for that Endpoint does not “leak” information that can be used to decode data from other DataWriters or DataReaders.
Generating Key Material can be an expensive operation in terms of CPU, as it typically requires the creation of cryptographically-secure random numbers and the use of Key-Derivation Functions. If a Participant needs to re-generate the Key Material for all the Endpoints it contains, the burden of generating that Key Material can be significant.
We created an efficient key regeneration mechanism that allows generating many different pieces of Key Material for the different Endpoints within the same Participant (e.g., creating new, unique Key Material for every DataReader and DataWriter in the Participant) with an effort that is significantly less than linear in the number of Endpoints contained by the Participant.
The mechanism is based on sharing a Participant-level secret random NONCE that can be used in combination with the original DataWriter Key Material to derive a new set of Session Keys.
This part of the invention is further described in the following sections:
Scalable Key Redistribution Mechanism for Publish-Subscribe Systems
We created a scalable key redistribution mechanism that allows for the trusted Participants in the system to receive the needed new (re-generated) Session Keys in a way that significantly reduces the network traffic.
Since the re-generated Session Key is derived from the original Key Material and a Participant-level random NONCE, and the trusted Participants have already received the original Key Material, it is sufficient to send them the new Participant-level random NONCE; they then use it to derive the new Session Keys themselves.
In this sense, the number of messages to be delivered from one Participant to the rest of the trusted Participants goes from (RemoteParticipants×LocalDataWriters) to (RemoteParticipants).
This part of the invention is further described in the following sections:
Seamless No-Communication-Loss Key Transition
We created a strategy to achieve a seamless key transition, without loss of communication. In this sense, we leverage any underlying reliability features available from the Publish-Subscribe infrastructure.
In the case of DDS/RTPS, we leverage DDS reliability features to achieve transitioning from a set of Session Keys to a new set without breaking the communication between the two involved Participants.
In particular, after we send new Key Revision Tokens to all of the trusted remote Participants, we take advantage of the RTPS reliability protocol to detect when all of these remote Participants have received the Key Revision Tokens we sent, and only then do we start using the new Session Keys derived from the new Key Revision information. We combine this with the definition of a timeout to avoid holding the transition for too long in case one of the remote Participants becomes unresponsive.
This part of the invention is further described in the following sections:
Efficient Historical Samples Management Mechanism for Secure DataWriters
We created a very efficient management mechanism for historical samples, which is based on the following concepts:
Architectural Design
Requirements
Key Regeneration and Redistribution: Requirements
After generating new Session Keys, the local Participant needs to deliver them to the remote Participants it trusts, so the local Participant can keep communicating with them. The local Participant shall not distribute the new Session Keys to non-trusted Participants.
To be scalable, the granularity of the Session Key regeneration shall be at the remote Participant level: Connext Secure will not support regenerating the Session Keys for individual DataWriters or DataReaders.
The transition to the new Session Keys should happen seamlessly: communication should not break, liveliness should not be lost, and therefore Participants should not need to initiate a new discovery process.
Non-volatile data-protected DataWriters store historical data encoded in their DataWriter queues. We need to make sure to provide a mechanism for supporting the delivery of this historical data to late joiners after a rekeying event has happened.
There will be no changes concerning who can receive historical data: trusted late joiners will be able to receive any historical data that was produced at any point in the past.
The solution should work and remain secure on long-running systems.
Persistence Service needs to support the mutability of the Session Keys of the Persistence Service DataWriter.
The solution should still interoperate with older Connext versions when the key regeneration feature is disabled.
Participant Revocation and Expiration: Requirements
To avoid making the plugins even more complex, keep as much state as possible within the core libraries.
PLUGINS will support a new API to receive updated PropertyQos configuration. As part of this project, only the CRL property, Identity Certificates, and Identity CAs will be supported (see RE-R3. Support mutable CRL property in the PLUGINS).
This API will be exposed as a new Domain Participant API for the main Connext DDS APIs.
PLUGINS will support either passing a new CRL in data format or file format. If using file format, users can provide either a path to a different file, or provide a path to an already loaded file that has been updated.
Upon passing an updated CRL to the plugins, the plugins will store the updated state, but they will not take any action yet: core will be driving the revocation process.
PLUGINS will support new APIs (validate_local_identity_status, validate_remote_identity_status) to validate the status of an identity (represented by an Identity Handle associated with a Participant) against the currently valid CRL state, expiration dates, and Identity/OCSP CAs' own status (Identity and OCSP CAs could also expire).
By calling these APIs, the core will be able to determine if any authenticated Participant's certificate is revoked or expired.
PLUGINS will support new APIs (validate_local_permissions_status, validate_remote_permissions_status) to validate the status of a Permissions Document (represented by a Permissions Handle associated with a Participant) against the currently valid expiration dates. We will check both the Permissions CA and the Permissions Document for expiration.
By calling these APIs, the core will be able to determine if any authenticated Participant's permissions are expired.
Remote Participant certificate (or permissions) revocation will be treated the same as expiration: the remote Participant will be removed from the local Participant, but will still be able to start a new authentication (which will fail unless the revocation has been lifted by the applicable CA or a new valid non-revoked certificate is presented).
CORE will periodically check for authenticated Participants' identity & permissions status (see RE-R4. Support API in the PLUGINS to validate the identity status of a known Participant and RE-R5. Support API in the PLUGINS to validate the permissions status of a known Participant). If any (one or multiple) remote authenticated Participant certificate (or permissions) is no longer valid, the core will:
Note that if any authenticated remote Participant's permissions are not valid yet (for example, because of the not-before date), the remote Participant will be completely removed from the local Participant. Important: removed, not ignored; we may need to review the current logic.
If the local Participant certificate (or permissions) is not valid (either because it is expired or because it is not yet valid) upon creation, Participant creation will fail.
CORE will keep track of the last N Identity Certificates associated with Participants that left the system since the last key regeneration event. These Identities are considered when calling validate_remote_identity_status and validate_remote_permissions_status. This list is purged when there is a key regeneration event.
When N is reached, trigger a key regeneration event and remove those N certificates. N is configurable with a default of 50. These Identity Certificates will also be part of the checks done as part of RE-R6. Support Dynamic Participant Certificate Expiration.
This will ensure that we will renew keys if at any point in the past we shared keys with a Participant that holds a currently invalid certificate, even if that Participant is not matched anymore with the local Participant.
Note that no special action is required for a recreated local Participant: if the local application is restarted, then its Key Materials are also fresh, and therefore the original list of Participants that the Session Keys have been shared with is no longer relevant.
Note that ignoring a Participant (which is a public API) by itself will not trigger key regeneration. If a user wants to securely stop communication with previously trusted Participants, the user will need to call ignore_participant() for all of those Participants and, once all of the ignore participant calls have been completed, then force_key_regeneration().
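For illustration, this sequence might look as follows in the Connext C API. This is a minimal sketch: DDS_DomainParticipant_ignore_participant() is the standard DDS operation, while DDS_DomainParticipant_force_key_regeneration() is a hypothetical binding for the force_key_regeneration() operation referenced above.

```c
/* Sketch: securely cutting off previously trusted Participants.
 * DDS_DomainParticipant_ignore_participant() is the standard DDS API;
 * DDS_DomainParticipant_force_key_regeneration() is a HYPOTHETICAL
 * binding for the force_key_regeneration() operation named above. */
#include "ndds/ndds_c.h"

DDS_ReturnCode_t revoke_participants(
    DDS_DomainParticipant *participant,
    const DDS_InstanceHandle_t *handles,
    int count)
{
    DDS_ReturnCode_t retcode;
    int i;

    /* Step 1: ignore every Participant whose access must be rescinded. */
    for (i = 0; i < count; i++) {
        retcode = DDS_DomainParticipant_ignore_participant(
                participant, &handles[i]);
        if (retcode != DDS_RETCODE_OK) {
            return retcode;
        }
    }
    /* Step 2: only after ALL ignore calls have completed, regenerate the
     * keys so ignored Participants cannot derive the new Session Keys. */
    return DDS_DomainParticipant_force_key_regeneration(participant);
}
```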
Upon configuration change or applicable API call, CORE should trigger a status check for the local and remote Participant statuses, so if an identity/permission is no longer trusted, the kicking out of the associated Participant is not delayed.
Participant Identity Certificates Renewal: Requirements
To avoid making the plugins even more complex, keep as much state as possible within the core libraries.
Upon passing an updated local Identity Certificate to the plugins, the plugins will store the updated state for the certificate, but they will not take any action yet: core will be driving the renewal process.
CORE will trigger the update for the local Participant's Identity certificate in the PLUGINS by using the PLUGINS API validate_local_identity_status introduced in RE-R4. Support APIs in the PLUGINS to validate the identity status of a known Participant. This API will return a specific status notifying CORE that the identity is valid and was recently updated.
The new certificate must have the same subject name (as this is tied to the Participant GUID) and public key as the previous identity certificate.
This will be done by propagating an AuthenticatedPeerCredentialToken to all currently trusted remote Participants through the SecureVolatileChannel built-in channel.
Upon passing an updated remote Identity Certificate to the plugins, the plugins will store the updated state, but they will not take any action yet: core will be driving the renewal process.
CORE will drive this process through a new PLUGINS API, set_remote_credential_token.
The new certificate must have the same subject name (as this is tied to the Participant GUID) and public key as the previous identity certificate.
The rest of the process will be handled by RE-R6. Support Dynamic Remote Participant Certificate Expiration or Revocation.
Upon passing an updated local Identity CA certificate to the plugins, the plugins will store the updated state, but they will not take any action yet: core will be driving the renewal process.
This includes the Identity CA and the OCSP CA (the CA used to verify the signature of OCSP responses).
CORE will trigger the update for the local Participant's Identity certificate against the updated CA in the PLUGINS by using the PLUGINS API validate_local_identity_status introduced in RE-R4. Support APIs in the PLUGINS to validate the identity status of a known Participant. Note that if only the CA has changed (but not the Identity) validate_local_identity_status will just return valid/not valid (it will not trigger Identity Cert propagation).
The new CA certificate must have the same public key as the previous CA certificate.
CORE will drive the new Permissions Document update to all of the currently trusted remote Participants without triggering new authentication processes or losing liveliness.
This will be done by propagating an AuthenticatedPeerCredentialToken to all currently trusted remote Participants through the SecureVolatileChannel built-in channel.
Upon passing an updated remote Permissions Document to the plugins, the plugins will store the updated state, but they will not take any action yet: core will be driving the renewal process.
CORE will drive this process through a new PLUGINS API, set_remote_credential_token.
The rest of the process will be handled by RE-R6. Support Dynamic Remote Participant Certificate Expiration or Revocation.
Design Decisions
Key Regeneration and Redistribution: Design Decisions
To meet the requirements we defined in section Key Regeneration and Redistribution: Requirements, we made the following design decisions:
Participant Revocation and Expiration: Design Decisions
Participant Identity Certificates Renewal: Design Decisions
General Flow
Key Regeneration and Redistribution: General Flow
Supporting Basic Case of Key Regeneration and Distribution
Connext DDS Secure sender Session Key redistribution will have the following steps:
Supporting Data Protection for Historical Data
One of the main challenges introduced by key revisions is how to handle the samples encoded in the DataWriter queue. While RTPS and submessage protection kinds are computed “on the fly” with the latest revision, samples in the DataWriter queue are encoded when they are added to the queue, and they remain encoded thereafter. After adding key revisions, this is now a problem because we need to either:
We want to be efficient both bandwidth-wise and CPU-wise. To achieve this, we came up with the following strategy:
Key Revisions Lifecycle
DDS Entities apply RTPS and submessage protection upon generating/sending RTPS messages. As a consequence of this, DDS Entities will always use the latest key revision available when encoding for these protection kinds. Data protection works differently: DataWriters exercise data protection upon adding samples to the DataWriter Queue.
Just reencoding the full DataWriter history to use the latest key revision each time a key revision is generated would scale poorly for non-volatile DataWriters. To address this issue, we define two concepts:
Non-Configurable KRW
To make system configuration easier, we only allow for two possible values for the KRW:
Moving the KRW
Upon new key revision generation, the plugins will not remove the oldest member of the key revision window yet. The Participant will propagate the new key revision to the remote Participants so they can update their windows. Once all of the trusted remote Participants have acknowledged the reception of the new key revision, CORE will mark the new key revision as active, and then the oldest member of the key revision window will be removed in the plugins. If, while waiting for acknowledgments, the Participant attempts to generate a new key revision, CORE will post an event to do this generation later. As long as the latest acknowledged revision is not the latest revision that was generated, this event will be postponed.
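The following is a minimal C sketch of the acknowledgment-driven activation just described. All names are illustrative rather than actual Connext internals; the logic mirrors the paragraph above: activation (and removal of the oldest KRW member) happens only once all trusted remote Participants have acknowledged the new revision, and any generation requested in the meantime is postponed.

```c
#include <stdbool.h>

/* Illustrative state (not actual Connext internals). */
struct KeyRevisionState {
    int latest_generated;      /* newest revision id created */
    int latest_acknowledged;   /* newest revision id acked by ALL trusted peers */
    int active;                /* revision currently used for encoding */
    bool generation_pending;   /* a generation request arrived while waiting */
};

/* Called when a new key revision is requested. */
void on_generate_request(struct KeyRevisionState *s)
{
    if (s->latest_acknowledged != s->latest_generated) {
        /* A revision is still in flight: postpone, per the design above. */
        s->generation_pending = true;
        return;
    }
    s->latest_generated++;          /* plugins create the revision ... */
    /* ... CORE propagates it to all trusted remote Participants here. */
}

/* Called when every trusted remote Participant has acked `revision`
 * (or the key-redistribution timeout removed the stragglers). */
void on_all_acked(struct KeyRevisionState *s, int revision)
{
    s->latest_acknowledged = revision;
    s->active = revision;           /* mark the new revision active ... */
    /* ... and only now remove the oldest member of the KRW. */
    if (s->generation_pending) {    /* run the postponed generation */
        s->generation_pending = false;
        on_generate_request(s);
    }
}
```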
When a Participant discovers a new remote Participant, it will obtain the key revisions belonging to the current local KRW from the plugins as crypto tokens, and then share those crypto tokens with the discovered Participant.
Purging Old Key Revisions
As mentioned earlier, the KRW is a PLUGINS concept that represents the set of Key Revisions for a local Participant that are available to remote Participants' DataReaders so they can decode historical data-protected samples. However, this KRW does not limit the number of key revisions the local Participant needs to keep around.
Since a DataWriter needs the key revision a given sample was encoded with to be able to re-encode that sample, and since DataWriters will re-encode samples lazily (only upon repairing a sample that has a key revision outside of the KRW), we need some sort of resource limit to prevent the list of old key revisions from growing unbounded. This is the Key Revision Max. History Depth (KRMHD) (default value: 0; range: 0 or 7-59652323), and it is managed at the Participant level. This parameter will be immutable. Note: if the KRW could take any value, we would need a KRMHD with a minimum of 2, because we need to keep re-encoding samples with the oldest revision before introducing the new revision. Since the KRW can be 1 or 7, we need a minimum of 7 for the KRMHD.
When the number of Key Revisions a Participant has created and not destroyed reaches the KRMHD, the Participant will purge the oldest key revision to make room for a new one. To achieve this, it will check, for each of its DataWriters, the oldest key revision in use in its DataWriter Queue. Each DataWriter whose oldest key revision matches the key revision to be removed will re-encode (with the latest active key revision) all of the samples encoded with the oldest key revision.
Note that since we generally re-encode lazily, we cannot make assumptions about key revisions in use by a DataWriter based on SN order. We will need to check the key revision_id for every sample we need to evaluate.
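A minimal sketch of this purge step follows, using hypothetical types (the actual Connext data structures differ). Because re-encoding is lazy, the loop inspects the revision_id of every sample rather than relying on SN order:

```c
/* Illustrative purge loop (names are hypothetical, not Connext internals). */
struct Sample { int key_revision_id; /* ... encoded payload ... */ };
struct DataWriterQueue { struct Sample *samples; int length; };

/* Re-encode with the latest active revision; stands in for the
 * re_encode_serialized_payload SPI described later in this document. */
void re_encode(struct Sample *sample, int latest_active_revision)
{
    /* ... decode with the old Session Key, encode with the new one ... */
    sample->key_revision_id = latest_active_revision;
}

void purge_revision(struct DataWriterQueue *queues, int num_writers,
                    int revision_to_purge, int latest_active_revision)
{
    int w, i;
    for (w = 0; w < num_writers; w++) {
        /* Lazy re-encoding means samples are NOT in revision order by SN,
         * so every sample's revision_id must be inspected. */
        for (i = 0; i < queues[w].length; i++) {
            if (queues[w].samples[i].key_revision_id == revision_to_purge) {
                re_encode(&queues[w].samples[i], latest_active_revision);
            }
        }
    }
}
```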
Interaction with Compression
Because we compress, then encrypt, we do not need to recompress when we re-encode.
Implications on Integrity/Confidentiality
Keeping more than one (the latest) key revision active has implications on integrity and confidentiality:
Participant Revocation and Expiration
Participant Identity Certificates Renewal
New Types and SPIs
Key Regeneration and Redistribution: New Types and SPIs
RTI Security IDL
RTI API Detailed Description
In this section, we will follow DDS Security notation. In Implementation Detailed Design we will detail the exact mapping for the Connext DDS Secure implementation of these APIs (e.g., instead of OctetSeq type we use DDSBuffer type to pass sequences of bytes).
This function is called when we need to regenerate new keys. If necessary (due to KRMHD limit being reached), this function will remove the oldest key revision from the local_participant_crypto's list of key revisions in order to make room for the new key revision. Before calling this function, you must call re_encode_serialized_payload on all of the samples encoded with the oldest key revision.
Parameter key_revision_id: This output parameter identifies a key revision.
Returns true on success and false on failure.
This function is called on the plugins after create_local_key_revision is called, and only once the key revision info has been delivered to all of the relevant remote Participants. This function is responsible for notifying the senders that they should start using the new derived key for that CryptoHandle.
Parameter revision_id: This parameter identifies the revision to be activated. It may not be the latest revision if there has been another key change while waiting for a previous revision to be delivered.
Returns true on success and false on failure.
This function is called on the plugins after create_local_key_revision is called. This function is responsible for generating the message contents for key revisions.
Parameter latest_key_revision_tokens: This output parameter contains the contents of a message that should be sent to existing remote Participants after a new key revision is created. It should contain one token for the latest key revision that was just created. Existing remote Participants only need to learn about the latest revision, since they already know about the previous revisions.
Parameter all_key_revision_tokens: This output parameter contains the contents of a message that should be sent to newly-discovered remote Participants. It should contain many tokens, one for each key revision in the KRW. Newly-discovered remote Participants need to learn about all available revisions.
Parameter max_all_key_revision_tokens: This parameter contains the maximum number of elements that all_key_revision_tokens should contain. If the local Participant currently has no data-protected DataWriters that are reliable or non-volatile, then this parameter shall be 2. Otherwise, it shall be 7.
Parameter local_participant_crypto: The local Participant CryptoHandle, which internally contains the list of key revisions.
Returns true on success and false on failure.
This function is called on the plugins after the key revision tokens created by create_local_key_revision_tokens are sent.
Parameter key_revision_tokens: The key revision tokens created by create_local_key_revision_tokens.
Returns true on success and false on failure.
This function is called on the plugins after the output of create_local_key_revision_tokens is received. This function is responsible for processing the message contents for key revisions.
Parameter local_participant_crypto: Unused, but set_remote_participant_crypto_tokens also has it.
Parameter remote_participant_crypto: This parameter will be updated with a new key revision.
Parameter remote_key_revision_tokens: This parameter contains the message contents. It contains one token per revision_id within the begin-end range.
Returns true on success and false on failure.
This function is called on the plugins after checking that the key revision version stored with the serialized sample's crypto header belongs to a revision that went out of the KRW. It is also called on the plugins after the KRMHD limit has been reached, and samples encoded with the oldest key revision need to be re-encoded with a new key revision. The goal is to re-encode the encoded_serialized_payload using the latest active key revision.
Parameter encoded_serialized_payload: The caller passes in the serialized payload encoded with an old key. The plugins will repopulate this buffer with the serialized payload encoded with the latest active key revision. The plugins will use their own scratch buffer where the plugins can put the decoded serialized payload (since the plugins need to decode and then re-encode the serialized payload).
Parameter crypto_handle: The DataWriter's crypto handle that was used to encode the payload. This CryptoHandle also contains the key revision that will be used to provide the new encoding.
Returns true on success and false on failure.
This function is called on the plugins when restoring a sample from durable DataWriter history. It is called under the following conditions:
The first two conditions are necessary because the CryptoHeader has a different format depending on whether or not key revisions are enabled (see New CryptoTransformIdentifier_v2 structure).
Parameter encoded_serialized_payload: same as re_encode_serialized_payload
Parameter key_revisions_previously_enabled: true if key revisions were previously enabled. This information should be retrievable from the durable DataWriter history. See Restore=0.
Parameter historical_key_revision_tokens: the key revision tokens retrieved from the durable DataWriter history. This will be used to decode the sample.
Parameter crypto_handle: The DataWriter's crypto handle, which contains the key revision that will be used to provide the new encoding.
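For reference, the following C-style prototypes are an illustrative reconstruction of the SPIs described in this section, inferred from the parameter descriptions above. Names marked as assumed, the exact types (e.g., OctetSeq vs. DDSBuffer), and calling conventions may differ from the actual implementation.

```c
#include <stdbool.h>

/* Opaque handles and token types; layouts are illustrative only. */
typedef struct ParticipantCryptoHandle ParticipantCryptoHandle;
typedef struct DatawriterCryptoHandle DatawriterCryptoHandle;
typedef struct KeyRevisionToken KeyRevisionToken;
typedef struct Buffer { unsigned char *data; unsigned int length; } Buffer;

/* Creates a new key revision, purging the oldest one first if the
 * KRMHD limit has been reached. */
bool create_local_key_revision(
        ParticipantCryptoHandle *local_participant_crypto,
        unsigned int *key_revision_id /* out */);

/* Marks a delivered revision as active (function name assumed). */
bool activate_local_key_revision(
        ParticipantCryptoHandle *local_participant_crypto,
        unsigned int revision_id);

/* Generates the token messages for existing and newly-discovered peers. */
bool create_local_key_revision_tokens(
        KeyRevisionToken *latest_key_revision_tokens /* out */,
        KeyRevisionToken *all_key_revision_tokens /* out */,
        unsigned int max_all_key_revision_tokens,
        ParticipantCryptoHandle *local_participant_crypto);

/* Releases tokens once they have been sent (function name assumed). */
bool return_key_revision_tokens(
        const KeyRevisionToken *key_revision_tokens,
        unsigned int token_count);

/* Processes received tokens, updating the remote crypto handle. */
bool set_remote_key_revision_tokens(
        ParticipantCryptoHandle *local_participant_crypto /* unused */,
        ParticipantCryptoHandle *remote_participant_crypto,
        const KeyRevisionToken *remote_key_revision_tokens,
        unsigned int token_count);

/* Re-encodes a payload with the latest active key revision. */
bool re_encode_serialized_payload(
        Buffer *encoded_serialized_payload /* in: old key; out: new key */,
        DatawriterCryptoHandle *crypto_handle);

/* Re-encodes a payload restored from durable DataWriter history
 * (function name assumed). */
bool re_encode_restored_serialized_payload(
        Buffer *encoded_serialized_payload,
        bool key_revisions_previously_enabled,
        const KeyRevisionToken *historical_key_revision_tokens,
        DatawriterCryptoHandle *crypto_handle);
```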
New KeyRevision Tokens ParticipantGenericMessage class
If GenericMessageClassId is GMCLASSID_SECURITY_KEY_REVISION_TOKENS, the message_data attribute shall contain a KeyRevisionTokenSeq having N elements.
This message is intended to send key_revisions from one DomainParticipant to another.
The destination_participant_guid shall be set to the GUID of the destination DomainParticipant.
The destination_endpoint_guid shall be set to GUID_UNKNOWN. This indicates that there is no specific endpoint targeted by this message: it is intended for the whole DomainParticipant.
The source_endpoint_guid shall be set to GUID_UNKNOWN.
The message_class_id shall be set to “dds.sec.key_revision_tokens”.
The message_data shall have one element per key revision. For each element:
revision_secret_seed is a random array of 32 bytes (256 bits, matching the AES-256 key length). revision is a counter that increments by one every time the KeyRevisionInfo is changed for a given Participant. Using the KeyRevisionInfo received from a remote Participant, a Participant can compute new key material for every single original key material it has previously received (i.e., any previously received remote DataWriter key material, remote DataReader key material, and Participant key material).
Key Material Derivation
The new key material is calculated as follows:
These calculations map to RFC5869 (HMAC-based Extract-and-Expand Key Derivation Function (HKDF)) sections 2.2 and 2.3 as follows:
T = T(1) | T(2) | T(3) | ... | T(N)
To derive the new_master_salt, we apply the algorithm once (to obtain TSALT(1)):
So we have:
To derive the new_master_sender_key, we apply the algorithm once (to obtain TKEY(1)):
So we have:
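As an illustration, the following C sketch shows a derivation consistent with the RFC 5869 mapping above, using OpenSSL's one-shot HMAC with SHA-256. The use of revision_secret_seed as the extract salt, the original key material as the input keying material, and the literal info labels are assumptions for illustration only.

```c
/* Sketch of deriving new key material per RFC 5869 HKDF (SHA-256).
 * The roles of the inputs (seed as salt, original material as IKM) and
 * the info labels are assumptions. Link with -lcrypto. */
#include <string.h>
#include <openssl/evp.h>
#include <openssl/hmac.h>

#define KEY_LEN 32 /* AES-256 */

/* HKDF-Extract (RFC 5869, section 2.2): PRK = HMAC-Hash(salt, IKM). */
static void hkdf_extract(const unsigned char *salt, size_t salt_len,
                         const unsigned char *ikm, size_t ikm_len,
                         unsigned char prk[KEY_LEN])
{
    unsigned int len = KEY_LEN;
    HMAC(EVP_sha256(), salt, (int)salt_len, ikm, ikm_len, prk, &len);
}

/* HKDF-Expand (section 2.3), first block only:
 * T(1) = HMAC-Hash(PRK, info | 0x01); one block suffices for 32 bytes. */
static void hkdf_expand_t1(const unsigned char prk[KEY_LEN],
                           const char *info, unsigned char out[KEY_LEN])
{
    unsigned char msg[64];
    size_t info_len = strlen(info);
    unsigned int len = KEY_LEN;
    memcpy(msg, info, info_len);
    msg[info_len] = 0x01;
    HMAC(EVP_sha256(), prk, KEY_LEN, msg, info_len + 1, out, &len);
}

void derive_new_key_material(
    const unsigned char revision_secret_seed[KEY_LEN], /* from the token */
    const unsigned char *master_salt, size_t salt_len,  /* original KM */
    const unsigned char *master_key, size_t key_len,    /* original KM */
    unsigned char new_master_salt[KEY_LEN],
    unsigned char new_master_sender_key[KEY_LEN])
{
    unsigned char prk[KEY_LEN];

    /* One expand application each for the salt and the sender key,
     * i.e., TSALT(1) and TKEY(1) as in the text above. */
    hkdf_extract(revision_secret_seed, KEY_LEN, master_salt, salt_len, prk);
    hkdf_expand_t1(prk, "salt", new_master_salt);

    hkdf_extract(revision_secret_seed, KEY_LEN, master_key, key_len, prk);
    hkdf_expand_t1(prk, "key", new_master_sender_key);
}
```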
Notes
New CryptoTransformIdentifier_v2 structure
If a Participant enables the key regeneration feature, then it will serialize CryptoTransformIdentifier_v2 in all of its crypto headers. Otherwise, it will serialize CryptoTransformIdentifier in all of its crypto headers.
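As an illustration, a plausible layout for the two structures is sketched below in C. The CryptoTransformIdentifier fields follow the DDS Security specification; extending it with a key_revision_id field in the _v2 variant is an assumption based on the description above, not the normative wire format.

```c
/* Illustrative layouts. CryptoTransformIdentifier follows DDS Security;
 * the _v2 extension with a key_revision_id is an ASSUMED layout based on
 * the description above, not the normative wire format. */
#include <stdint.h>

typedef struct CryptoTransformIdentifier {
    uint8_t transformation_kind[4];
    uint8_t transformation_key_id[4];
} CryptoTransformIdentifier;

typedef struct CryptoTransformIdentifier_v2 {
    uint8_t transformation_kind[4];
    uint8_t transformation_key_id[4];
    uint32_t key_revision_id; /* which key revision encoded this payload */
} CryptoTransformIdentifier_v2;
```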
Revocation and Expiration: New Types and APIs
Flow Description
Key Regeneration and Redistribution: Examples
Generating and Distributing Key Revisions
Entities
Flow
Purging Key Revisions Upon Reaching Key Revision Max. History Depth
When a Participant reaches the KRMHD limit (that is, the maximum number of locally created key revisions), it needs to purge the oldest key_revision to make room for the new key_revision.
If the Participant contains data-protected DataWriters with samples in their queues, it will need to re-encode any sample that was encoded using the oldest key_revision. This is required because the key_revision is needed to re-encode the sample. Consequently, the Participant needs to make sure there are no encoded samples relying on the key_revision that is going to be destroyed.
Entities
Flow
Implementation Detailed Design
Lazily Reencoding a Historical Sample Because its Old Key Revision is Outside the Key Revision Window
Purging Key Revisions Upon Reaching Key Revision Max. History Depth
Motivation
If we don't cache key revisions at all, then we would have to iterate through the entire DataWriter history to check if samples need to be reencoded. This could be slow for ODBC. If the DataWriter history contains 1 sample with key revision 0 and 10000 samples with key revision 1, and we're purging key revision 0, then that's 10000 unnecessary iterations and fetches.
If we maintain an inline list of {key revision ID, sample} and cache a REDAInlineListNode in the metadata of every sample, then we would introduce 3 pointers of memory overhead for every single sample, historical or not. In a real scenario, many of the live data samples may never get resent as repairs or historical data. We should not punish a large number of live data samples just to make re-encoding a small number of historical data samples faster when it comes time to purge a key revision, which is not a common event.
A hybrid approach would be to have an inline list where each node has 1) a revision ID, 2) the lowest possible SN that was encoded with that revision ID, and 3) the highest possible SN that was encoded with that revision ID. This approach would consume less memory than the second approach and be faster than the first approach in most cases.
Hybrid Approach
Note that under this approach, having one low SN sample and one high SN sample that have not been sent in a while (e.g., because of content filtering) will force us to iterate through all of the samples between the two (as opposed to just 2 samples if using an ordered list of samples based on when reencoding happened). To make this efficient, we introduce the use of REDASequenceNumberIntervalList.
REDASequenceNumberIntervalList
REDASequenceNumberIntervalList is a data structure representing a list of sequence number intervals. A sequence number interval is a set of consecutive sequence numbers that are grouped together based on a certain state (userData). Two consecutive intervals can be merged if there is no gap in sequence numbers between them and they share the same userData. The userData has an expiration time that indicates when it is no longer valid. The userData expiration allows merging sequence number intervals with different userData that otherwise could never be merged.
For example, in
The REDASequenceNumberIntervalList also allows changing the userData and expiration time for an existing sequence number interval. Changing the userData may also lead to the merging of consecutive sequence number intervals if they share the same userData after the change.
The sequence number intervals in the REDASequenceNumberIntervalList are ordered based on two different criteria (see
The ordering per expiration time allows for fast lookup and invalidation of all of the intervals already expired.
The following describes how the REDASequenceNumberIntervalList is used to facilitate fast re-encoding of samples with an old revisionId using a new revisionId.
To do that, this invention uses the revisionId as both the userData and the expiration time. Because the expiration time is the revisionId, finding all the samples with the old revisionId has an algorithmic complexity of O(1), which speeds up the re-encoding.
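A minimal C sketch of this usage follows, with a plain singly-linked list standing in for the internal REDASequenceNumberIntervalList: because intervals are ordered by expiration (here, the revisionId itself), all intervals encoded with a purged revision sit at the head of the list and can be popped without scanning.

```c
#include <stddef.h>

/* Illustrative stand-in for the internal REDASequenceNumberIntervalList:
 * each node covers consecutive SNs sharing one revisionId (the userData),
 * and nodes are kept ordered by expiration time, which here is the
 * revisionId itself. */
struct SnInterval {
    long first_sn;            /* first SN in this interval */
    long last_sn;             /* last SN in this interval */
    int revision_id;          /* userData AND expiration time */
    struct SnInterval *next;  /* list ordered by revision_id */
};

/* Pops every interval whose revision is being purged. Because expired
 * intervals sit at the head, this is O(1) per expired interval; the
 * caller then re-encodes the samples whose SNs fall in those intervals. */
struct SnInterval *pop_expired(struct SnInterval **head, int purged_revision)
{
    struct SnInterval *expired = NULL;
    while (*head != NULL && (*head)->revision_id <= purged_revision) {
        struct SnInterval *node = *head;
        *head = node->next;
        node->next = expired;
        expired = node;
    }
    return expired;
}
```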
Reencoding Instances
Problem: we need to store encoded instances in durable DataWriter history. We now need to solve the problem of reencoding the instances.
Key Idea:
Workflow of Reencoding Instances
The PRESWriterHistoryDriver keeps a REDASequenceNumber _nextInstanceSn. It starts at 1. Whenever we initialize a new instance, we set the instance's SN to the DataWriter's _nextInstanceSn, and we increment _nextInstanceSn. Whenever we serialize a key in a dispose message, we use the instance's SN to populate the REDASequenceNumberIntervalList for samples.
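A minimal sketch of this virtual-SN assignment (type and field names assumed):

```c
/* Illustrative virtual-SN assignment for instances (names assumed). */
typedef long SequenceNumber;

struct WriterHistoryDriver {
    SequenceNumber next_instance_sn; /* starts at 1 */
};

struct Instance {
    SequenceNumber sn;
};

void init_instance(struct WriterHistoryDriver *whd, struct Instance *instance)
{
    /* Give the instance its own SN so dispose messages carrying its
     * serialized key can be tracked in the same
     * REDASequenceNumberIntervalList used for samples. */
    instance->sn = whd->next_instance_sn++;
}
```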
Storing Key Revisions in Persistent Storage
Motivation: Although key revision information is common across DWs within the same Participant, there are two problems with storing key revision information in the Participant:
For these reasons, we will duplicate the key revision information across DWs. This should be fine because the information is not kept in memory.
For more information about how key_revision_tokens are used and their role please refer to Key regeneration and redistribution: new types and SPIs.
Restore=0 (Creating DW from Scratch)
Restore=1 (Creating DW with State Restored from a Previous DW)
Interaction with Batching
Today Connext Secure encodes each individual sample of a batch. Reencoding would be simplified if 1) we encode the entire batch, and 2) we flush as soon as we activate a new key revision (so that all samples in a batch have the same revision).
Encoding the entire batch also helps to support batching+compression+payload protection.
Public Interface Design
For functionality that requires user interaction, this section explains how the user will be able to use the functionality.
Configuration
This section describes the public configuration. For example, if a feature requires a new QoS, the QoS will be documented here. The design rationale for choosing a specific way to configure the functionality will be part of this section as well.
DDS.PARTICIPANT.TRUST_PLUGINS.MAX_KEY_REDISTRIBUTION_DELAY.SEC
This integer property is configurable in the core library. Per KR-R2-a, a new key revision won't take effect until one of these conditions is true:
If this timeout occurs, the remote Participants that have not yet acknowledged the new key revision will be completely removed. To be consistent with dds.participant.trust_plugins.authentication_timeout.sec, the default value is 60. The range is 1 to RTI_INT32_MAX, or -1 for unlimited.
DDS.PARTICIPANT.TRUST_PLUGINS.KEY_REVISION_WINDOW_SIZE
This integer property is configurable in the core library. It controls the number of active key revisions that may be used for sending repair payloads. If the value is 0, then key redistribution is disabled.
DDS.PARTICIPANT.TRUST_PLUGINS.KEY_REVISION_MAX_HISTORY_DEPTH
This integer property is configurable in the core library. It controls the number of key revisions that are used to encode samples in the DataWriters' queues.
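For illustration, these properties might be set in a participant QoS profile as follows. The property names come from this section; the values and the XML property structure are illustrative of Connext property configuration, not a normative example.

```xml
<!-- Illustrative property settings; names from this section, values are
     examples only (e.g., 60 is the documented default for the delay). -->
<participant_qos>
  <property>
    <value>
      <element>
        <name>dds.participant.trust_plugins.max_key_redistribution_delay.sec</name>
        <value>60</value>
      </element>
      <element>
        <name>dds.participant.trust_plugins.key_revision_window_size</name>
        <value>2</value>
      </element>
      <element>
        <name>dds.participant.trust_plugins.key_revision_max_history_depth</name>
        <value>7</value>
      </element>
    </value>
  </property>
</participant_qos>
```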
API Design
This section describes and documents the public APIs for the new functionality. The design rationale for the new API will be part of this section as well.
This section must include the design for the different languages that will be supported such as: C, Traditional C++, Modern C++, Java, .NET, Ada, Python, Lua, Javascript, etc.
See priority document for references on:
This application claims priority from U.S. Provisional Patent Application 63/390,475 filed Jul. 19, 2022, which is incorporated herein by reference.
| Number | Date | Country |
|---|---|---|
| 63/390,475 | Jul 2022 | US |