The present invention relates generally to data protection mechanisms and, more particularly, to protecting personal information management (PIM) data in untrusted domains such as email systems on the Internet.
Cloud computing encompasses placing an enterprise's business operations, word processing documents, sales information, and personal information management solutions such as e-mail, calendar and contact information on Internet servers hosted by third parties or Internet service providers. Cloud computing allows an enterprise or company to effectively outsource its information technology needs, email servers, and other personal information management systems to service providers that specialize in providing large scale network servers and hosted solutions.
For example, Microsoft provides hosted e-mail solutions for enterprises, which makes it unnecessary for companies to provide in-house solutions and to maintain and service their own enterprise servers within their local area network. As the cost of electronic data storage decreases, as the speed, redundancy and efficiency of network servers increase, and as Internet connections become more reliable and ubiquitous, small businesses and even some large scale enterprises are migrating towards cloud computing systems for email and other personal information management solutions.
However, whenever an enterprise places its data outside its own company walls and entrusts the data to third party service providers, there is always a risk that the privacy and confidentiality of the data could be compromised. One of the factors that limits adoption of cloud computing for email and other communication technologies is a distrust of the cloud solution providers or a reluctance to put company sensitive data outside of corporate control.
Cloud solution providers typically use basic username and password authentication. However, this mechanism is not considered sufficiently strong for many corporate security policies. Some cloud solution providers are believed or known to mine the data stored in their data storage systems for marketing information. This invention aims to reduce or solve these concerns.
The present invention overcomes the above-described problems with enterprise cloud computing solutions by providing a system and method for ciphering e-mail and other personal information management information. The present invention accomplishes this by use of security transformation systems and methods described below.
In one preferred embodiment according to the present invention, a security transformation system and method is disclosed which includes an e-mail client, a cipher proxy, a dictionary database and an Internet e-mail system, such as an Internet service provider's e-mail system. According to this embodiment, a message generated by either the user's client computer or a third party is received at the user's Internet e-mail system. The message is then transformed using a cipher mechanism to encrypt the essential fields of the email using a cipher dictionary. When the message is accessed, it is decrypted using a reverse cipher security transformation method, and the original message is restored.
In another embodiment according to the present invention, the cipher dictionary and e-mail fields are encrypted using well known encryption methods including symmetric encryption, asymmetric encryption, and Public Key Infrastructure.
In yet another preferred embodiment according to the present invention, a process for coding messages occurs as follows: a message is ciphered from terms in a dictionary; if new terms are encountered, a new set of mappings is created in the dictionary database and the terms are replaced with the ciphered terms; a subset of the dictionary is created for terms of the message; the subset dictionary is encrypted using an encryption algorithm; the encrypted subset dictionary is attached to the message in an extended attributes field; the coded message is transmitted to an Internet e-mail system; and the message is then decrypted and run through the reverse security transformation process.
Other and further features and advantages will be apparent from the following detailed description of preferred embodiments of the present invention when read in conjunction with the accompanying drawings. It should be understood that the embodiments described are provided for illustrative and exemplary purposes only, and that variations to, and combinations of, the several elements and features thereof are contemplated as being within the scope of the invention.
In the drawings, which illustrate what is currently considered to be the best mode for carrying out the invention:
Preferred embodiments of the invention describe a system and method for providing email continuity that protects email content when the data is transmitted over and stored in the Internet. Email content may include, but is not limited to, email messages, calendar items, meeting requests, meeting acceptance/rejection notices, contacts, tasks, notes and journal items. Preferred embodiments of the invention protect searchable email content that is stored by performing a term substitution cipher replacing each term or word with a substitute term or word. This cipher is used to protect data in untrusted domains at an Internet e-mail system, such as an Internet service provider's email system.
Preferred embodiments of the invention are intended to work with all types of e-mail systems and protocols, including, for example, e-mail systems such as Microsoft Exchange and IBM Lotus Notes, well known e-mail protocols and formats such as SMTP, MIME, POP and IMAP, and messaging interfaces such as Microsoft's MAPI and IBM Lotus' VIM.
An email message typically includes a number of standard headers, associated with the Simple Mail Transfer Protocol (SMTP), that are used in routing and delivering mail. An embodiment of this invention replaces the terms in fields not necessary for further transporting email with cipher terms. The replacement algorithm allows the message to retain its original formatting, but all the natural language words will be replaced with ciphered terms.
In this embodiment, a term substitution cipher is a mechanism that replaces each term in a message with, for example, a randomly chosen term. In accordance with this embodiment, the mapping between those terms is stored in a local dictionary. For example, the phrase “the sky is blue” might be mapped to “z12 z18 z9 z35”. The dictionary would hold the mappings between the natural language terms and the cipher terms. The algorithm for performing this mapping is that each time a new natural language term is encountered, a randomly selected cipher term is chosen and added to the dictionary. These terms are sequential integers based on a key to avoid dictionary problems. Encoding or decoding a message is done by looking up each word or each cipher term and determining its corresponding entry in the dictionary.
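The lookup-based encoding and decoding described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class name `TermCipher`, the definition of a term as a run of word characters, the case-sensitive treatment of terms, and the use of a random number with a "z" prefix as the cipher term are all choices made for the example.

```python
import re
import secrets

# Assumption: a "term" is a run of word characters; everything else
# (spaces, punctuation, line breaks) is copied through unchanged.
TERM = re.compile(r"\w+")


class TermCipher:
    """Hypothetical local dictionary mapping plain terms to cipher terms."""

    def __init__(self):
        self.to_cipher = {}  # plain term -> cipher term
        self.to_plain = {}   # cipher term -> plain term

    def _new_cipher_term(self):
        # Draw random candidates until one is unused, so the mapping stays reversible.
        while True:
            candidate = "z" + str(secrets.randbelow(10**9))
            if candidate not in self.to_plain:
                return candidate

    def encode(self, text):
        def substitute(match):
            term = match.group(0)
            if term not in self.to_cipher:
                # New natural language term: add a fresh mapping to the dictionary.
                cipher_term = self._new_cipher_term()
                self.to_cipher[term] = cipher_term
                self.to_plain[cipher_term] = term
            return self.to_cipher[term]
        return TERM.sub(substitute, text)

    def decode(self, text):
        # Terms with no dictionary entry are left as they are.
        return TERM.sub(lambda m: self.to_plain.get(m.group(0), m.group(0)), text)
```

Because only the matched terms are replaced, the ciphered text keeps the spacing and punctuation of the original, and `decode(encode(text))` returns the original text for any text encoded with the same dictionary.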
Alternatively, optionally or additionally, in order to thwart word-frequency-analysis attacks, words can have multiple entries in the dictionary. Thus, “the”, which occurs frequently, might be coded as “z12”, or as “z96”, or as “z13”, etc., and the algorithm can randomly choose which coding will be used at any given point. When this approach is used, however, searches that operate in cipher-space will have to be expanded. Thus, in the simple case, a search for “the” can be coded as a search for “z12”; in the optional case, a search for “the” must be coded as a search for any of “z12”, “z96”, or “z13”.
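The multiple-entry variant and the corresponding search expansion might look like the sketch below. The class name `HomophonicDictionary`, the fixed number of codes per term, and the "z" plus random number format are assumptions made for illustration; a deployment could equally assign more codes to more frequent terms.

```python
import secrets


class HomophonicDictionary:
    """Hypothetical dictionary in which a plain term may have several cipher terms."""

    def __init__(self, codes_per_term=3):
        self.codes_per_term = codes_per_term  # assumed fixed for this sketch
        self.codes = {}     # plain term -> list of cipher terms
        self.to_plain = {}  # cipher term -> plain term

    def _fresh_code(self):
        # Draw random candidates until one is unused.
        while True:
            candidate = "z" + str(secrets.randbelow(10**9))
            if candidate not in self.to_plain:
                return candidate

    def encode_term(self, term):
        if term not in self.codes:
            self.codes[term] = []
            for _ in range(self.codes_per_term):
                code = self._fresh_code()
                self.codes[term].append(code)
                self.to_plain[code] = term
        # Pick one of the equivalent codes at random for this occurrence.
        return secrets.choice(self.codes[term])

    def expand_search(self, term):
        # A cipher-space search must match any code assigned to the term.
        return set(self.codes.get(term, ()))
```

A search for a plain term is then issued against the mail store as a search for any member of `expand_search(term)`, which is the expansion cost the paragraph above refers to.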
Using this embodiment, the data stored at the Internet e-mail system will retain all of its formatting but all the natural language words will be replaced with ciphered terms. This term substitution algorithm can be applied selectively to various fields in the email message such as, but not limited to:
According to preferred embodiments, a determination of which fields are coded is table-driven. Thus, a transformation can be applied to any component of the email based on specific needs. The substitution cipher is applied to fields that need to be searched based upon terms in the semi-trusted email store. Fields needed by the email store to manage items (e.g., message-id) are not modified. Other fields and MIME attachment objects (including HTML bodies and text bodies) can be transformed by the substitution cipher if term-based searching is desired, or can be encrypted using AES or other encryption methods that are known by one of ordinary skill in the art. Accordingly, a security transformation on an e-mail message field encompasses the term substitution cipher, or encryption, or any other means known to one of ordinary skill to reversibly obscure the contents of such a field from view by an observer or attacker.
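A table-driven selection of transformations could be sketched as below. The table contents, the field names other than message-id, the function name `transform_fields`, and the representation of a message as a dictionary of field names to text are assumptions for the example; the cipher and encryption routines are supplied by the caller rather than fixed here.

```python
# Hypothetical policy table: field name -> transformation.
FIELD_POLICY = {
    "message-id": "keep",          # needed by the mail store to manage the item
    "date": "keep",                # assumed control field
    "subject": "cipher",           # assumed to need term-based searching
    "body": "cipher",
    "x-internal-note": "encrypt",  # illustrative extended attribute
}


def transform_fields(fields, cipher, encrypt, default_action="encrypt"):
    """Apply the table-driven transformation to a dict of field name -> text.

    `cipher` and `encrypt` are caller-supplied callables (str -> str);
    `default_action` covers fields that are absent from the table.
    """
    actions = {"keep": lambda value: value, "cipher": cipher, "encrypt": encrypt}
    result = {}
    for name, value in fields.items():
        action = FIELD_POLICY.get(name.lower(), default_action)
        result[name] = actions[action](value)
    return result
```

Keeping the decision in a table means a deployment can change which fields are ciphered, encrypted, or left alone without changing the transformation code, and `default_action` lets the handling of unrecognized fields follow customer preference.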
Any given deployment of an embodiment of the invention can be tuned to apply the term substitution, encrypt, or even remove extended attributes. Additional attributes have also been observed in certain emails such as:
Typical behavior for email software is to store unaltered any attribute that is syntactically correct but unrecognized. According to certain embodiments of the invention, a deployment would monitor the attributes that are transformed and which type of transformation is applied based upon the email software used. Default behavior for unknown types would include encryption or cipher-substitution based upon customer preference. Thus, a variety of deployment models are possible and included in this invention. Preferred deployment models include but are not limited to: a proxy mechanism that intercepts and applies the ciphering transformation to messages in transit from one email store to one at a service provider. The proxy could run on a client computer, as a web service in an enterprise, a service in the Internet, a plug-in for client mail software, an Internet browser plug in, a software module in a client-based email continuity solution, a software module in an email archiving solution, or other possible network locations.
Optionally and/or in addition to having a terms dictionary that is used at the point of cipher substitution for all messages in a folder or mailbox, a subset of the dictionary terms used in the email message is stored in an encrypted form with the message. This dictionary subset is encrypted using a symmetric key. Symmetric-key algorithms are a class of algorithms for cryptography that use trivially related, often identical, cryptographic keys for both decryption and encryption. An example of this is the Advanced Encryption Standard, AES. The symmetric key is then also encrypted and stored with the mail message. The encryption of the symmetric key is done using Public Key Infrastructure (PKI) technology. A Public Key Infrastructure (PKI) is an arrangement that binds public keys with respective user identities by means of a certificate authority (CA). The user identity must be unique for each CA. The binding is established through the registration and issuance process. The PKI role that assures this binding is called the Registration Authority (RA). For each user, the user identity, the public key, their binding, validity conditions and other attributes are made unforgeable in public key certificates issued by the CA. The symmetric key is encrypted using a set of public keys that would include at a minimum the user and a corporate (or “auditor”) key. Thus, for each encryption of a symmetric key under a public key, a separate encrypted key would be stored. Any encryption and decryption mechanism known to one of ordinary skill in the art is contemplated for use in this invention.
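The per-message structure described above, an encrypted dictionary subset plus one separately encrypted copy of the symmetric key per public key, can be outlined as follows. This sketch deliberately does not choose a cryptographic library: the symmetric and public key operations are passed in as callables, and the function names, the JSON serialization of the subset, and the field names `encrypted_subset` and `wrapped_keys` are assumptions made for illustration.

```python
import json
import re


def dictionary_subset(ciphered_text, to_plain):
    """Keep only the dictionary entries whose cipher terms occur in this message."""
    terms = set(re.findall(r"\w+", ciphered_text))
    return {term: to_plain[term] for term in terms if term in to_plain}


def seal_dictionary_subset(subset, public_keys, new_symmetric_key,
                           symmetric_encrypt, public_key_encrypt):
    """Build the structure stored with one message.

    subset             : dict of cipher term -> plain term used by this message
    public_keys        : dict of key-holder name -> public key (e.g. user, auditor)
    new_symmetric_key  : callable returning a fresh symmetric key (bytes)
    symmetric_encrypt  : callable (key, plaintext bytes) -> ciphertext bytes
    public_key_encrypt : callable (public key, plaintext bytes) -> ciphertext bytes
    """
    key = new_symmetric_key()
    payload = json.dumps(subset, sort_keys=True).encode("utf-8")
    return {
        "encrypted_subset": symmetric_encrypt(key, payload),
        # One separately encrypted copy of the symmetric key per public key.
        "wrapped_keys": {
            holder: public_key_encrypt(public_key, key)
            for holder, public_key in public_keys.items()
        },
    }
```

In a deployment, `symmetric_encrypt` would typically be an AES routine and `public_key_encrypt` a public key operation backed by certificates from the PKI; each holder of a matching private key can then recover the symmetric key independently of the others.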
According to preferred embodiments of the invention, a process for the coding of a message may include the following steps:
When the corporation needs access to ciphered mail in a user's mailbox, the message would be retrieved, the symmetric key would be unlocked using the corporate private key, the dictionary subset for that message would be decrypted using the symmetric key, and the message would be run through the reverse term substitution cipher process.
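The retrieval sequence just described might be sketched as below. The sketch assumes the message carries a record with an `encrypted_subset` value and a `wrapped_keys` mapping from key-holder name to an encrypted copy of the symmetric key, and that the subset is a JSON object mapping cipher terms to plain terms; those names and formats, like the function name, are hypothetical. The decryption routines are supplied by the caller so that no particular cryptographic library is implied.

```python
import json
import re


def open_sealed_message(ciphered_text, sealed, holder, private_key,
                        private_key_decrypt, symmetric_decrypt):
    """Recover the original text of one message for a given key holder.

    private_key_decrypt : callable (private key, ciphertext bytes) -> symmetric key
    symmetric_decrypt   : callable (key, ciphertext bytes) -> plaintext bytes
    """
    # 1. Unlock the symmetric key with the holder's private key.
    key = private_key_decrypt(private_key, sealed["wrapped_keys"][holder])
    # 2. Decrypt the dictionary subset stored with the message.
    subset = json.loads(symmetric_decrypt(key, sealed["encrypted_subset"]).decode("utf-8"))
    # 3. Reverse the term substitution; unknown terms are left unchanged.
    return re.sub(r"\w+", lambda m: subset.get(m.group(0), m.group(0)), ciphered_text)
```

Because the subset travels with the message, the corporate key holder does not need access to the user's full local dictionary to read an individual message.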
It should be noted that there are several alternative embodiments of the process suggested above, and certain steps of the process may be altered, omitted, performed non-sequentially, or rearranged in any permutation that yields the successful ciphering, storage and transmission of the message in accordance with the true spirit of the invention as contemplated by one of ordinary skill in the art. Further, system components may be distributed across software and hardware, may be co-located on the same platform, may run on the same client or server, or may be hosted on the Internet or located within the same enterprise, whenever such distribution or location of components accords with the scope and spirit of the invention. Moreover, the use of the client may be interchanged with the use of a server or vice versa where such substitution is a trivial and insubstantial modification of the design architecture. Further preferred embodiments of e-mail clients may include personal computers, smartphones, cell phones, PDAs, laptops or other portable communication devices.
Benefits of preferred embodiments of the invention include the fact that data stored in the semi-trusted Internet store cannot be easily searched or data-mined. The control fields in email are not touched, so email service is unaffected. This mechanism works with any language whose text is represented using an encoding system such as Unicode. Further, one of ordinary skill will readily see how to apply this to alternate text formats such as HTML and XML.
Another significant feature and benefit of the system and method is that data stored at an Internet Service Provider would not be able to be mined by the Service Provider. This protects the user and the user's company from having their email read by the third party Service Provider's software and its employees. This protects critical confidential information from being used to garner knowledge about a company's business without significant and illegal effort.
Described embodiments of the invention also protect the data if a mailbox password is cracked. The ciphered email would be useless to anyone who obtained access to the account. Unlike other encryption techniques which only encrypt the body of the message so that email can still be forwarded or replied to and routed, this cipher method allows all sensitive information to be protected while retaining the ability to manage email in the semi-trusted store since the control fields in the email are not touched allowing services to continue to operate. The operational characteristics of the email service are maintained for the user since email is sent from the client in its original form and transformed back to the original form when retrieved from the service for forwarding or reply actions.
While certain embodiments of the present invention have been described, these embodiments are not intended to limit the scope of the present invention. Various modifications of the above described embodiments can be made by those skilled in the art in view of the technology disclosed and the knowledge available to one of ordinary skill in the art. These modifications and alternative embodiments are within the scope and true spirit of the present invention. The scope of the invention is, therefore, indicated by the appended claims rather than the foregoing description. All changes which come within the meaning and range of equivalency of the claims are to be embraced within their scope.
This application claims benefit from provisional application Ser. No. 61/098,679, filed Sep. 19, 2008, entitled System and Method for Cipher E-Mail Protection.
Number | Date | Country
---|---|---
61098679 | Sep 2008 | US