The present invention relates to a method for storing information in DNA. The present invention addresses storage of all kinds of digital information, whether it is a text file, an image file or an audio file. Large sequences are divided into multiple segments.
DNA is the best molecular electronic device ever produced on earth, because DNA can store, process and provide information for the growth and maintenance of living systems. All living species are the result of a single cell produced during reproduction. In most cases this single cell does not contain most of the materials required for fabricating a living system, but it contains all the information and processing capability needed to fabricate one by taking materials from the environment; an example is the development of a baby from a zygote, which contains rearranged DNA sequences of the parents. DNA is a ready-to-use 2 nm nanowire and can be synthesized in any sequence of the four bases, i.e. A, T, G and C. The DNA of every living organism (micro or macro) consists of a large number of DNA segments, where each segment represents a processor that executes a particular biological process for growth and the maintenance of life. Other important characteristics that make DNA the material of choice for future molecular devices are: DNA, the building block of life, can store information for billions of years; its information storage capacity is tremendous, as can be imagined from the fact that 1 gram of DNA contains as much information as 1 trillion CDs [1]; it uses four bases (A, T, G, C) instead of 0 and 1; it is extremely energy efficient (10^19 operations per joule); synthesis of any imaginable sequence is possible; and semiconductors are approaching their limits.
Clelland et al., 1999 [2], and Bancroft et al., 2001 [3] [U.S. Pat. No. 6,312,911], developed a DNA based steganographic technique for sending secret messages. Although their prime objective was steganography (the art of information hiding), they used DNA as a storage and transmission device for the secret message. They encrypted the plaintext message into DNA sequences and retrieved the message using the encryption/decryption key. They used three DNA bases to represent a single alphanumeric character; as DNA has 4 bases (A, T, C, G), a maximum of 64 (4×4×4) characters can be represented using this scheme, whereas a total of 256 extended ASCII characters are required to represent the complete set of digital information. Hence, Clelland's scheme cannot be used to address the complete set of digital information and has limited scope.
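The capacity argument above can be checked with a few lines of arithmetic. The following is a minimal sketch only; the function names `codewords` and `min_bases` are illustrative and do not come from the cited work or from the present method.

```python
def codewords(n_bases: int, alphabet: str = "ATGC") -> int:
    """Number of distinct sequences of length n_bases over the alphabet."""
    return len(alphabet) ** n_bases


def min_bases(n_symbols: int, alphabet: str = "ATGC") -> int:
    """Smallest codeword length whose sequence count covers n_symbols."""
    n = 1
    while codewords(n, alphabet) < n_symbols:
        n += 1
    return n


print(codewords(3))    # 64: capacity of a three-base codeword
print(min_bases(256))  # 4: bases needed for the extended ASCII set
```

A three-base codeword therefore falls short of the 256 values of a byte, while a four-base codeword covers them exactly.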
The main object of the present invention is to develop a comprehensive DNA based information storage technique.
Another object of the present invention is to encrypt the complete extended ASCII character set in terms of a minimum number of DNA bases.
Another object of the present invention is to develop software to encrypt/decrypt data in terms of DNA bases.
Yet another object of the present invention is to design suitable primers to flank both ends of the encrypted and synthesized information.
The present invention provides a method for storing information in DNA. The present invention addresses storage of all kinds of digital information, whether it is a text file, an image file or an audio file. Large sequences are divided into multiple segments.
a. Information storage in DNA. Structure of prototypical single segment information storage in DNA strand.
b. Information storage in DNA. Structure of prototypical multi segment information storage in DNA strand.
The present invention provides a method for storing information in DNA. The present invention addresses storage of all kinds of digital information, whether it is a text file, an image file or an audio file. Large sequences are divided into multiple segments.
The method enables the storage of information in DNA. In another embodiment, software based on the above method enables all 256 extended ASCII characters to be defined in terms of DNA sequences. The basic concept is to use the minimum number of bases to define each extended ASCII character. With simple permutation, one base gives 4 sequences, i.e. A, T, G, C. Similarly, two bases give 4×4=16 different sequences, three bases give 4×4×4=64 distinct sequences, and four bases give 4×4×4×4=256 distinct sequences. Therefore, with a set of 4 bases, the complete extended ASCII set has been encoded. Software named “DNASTORE” has been developed in Visual Basic 6.0 for encryption and decryption of digital information in terms of DNA bases. Using DNASTORE, the complete extended ASCII character set can be encoded in 256 different ways.
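A four-base-per-character mapping of this kind may be sketched as follows. This is an illustration under stated assumptions, not the DNASTORE implementation: the actual character-to-sequence table is not given here, so the sketch assumes a plain base-4 positional code with the digit order A=0, T=1, G=2, C=3, and the names `BASES`, `encode` and `decode` are hypothetical.

```python
# Assumed digit order (A=0, T=1, G=2, C=3); not DNASTORE's actual table.
BASES = "ATGC"


def encode(data: bytes) -> str:
    """Map each byte to four bases, most significant base-4 digit first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def decode(dna: str) -> bytes:
    """Invert encode(): read four bases at a time back into one byte."""
    if len(dna) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        value = 0
        for base in dna[i:i + 4]:
            value = value * 4 + BASES.index(base)  # ValueError on a non-ATGC base
        out.append(value)
    return bytes(out)


message = "CSHU".encode("latin-1")  # one byte per extended ASCII character
sequence = encode(message)
print(sequence, decode(sequence).decode("latin-1"))
```

Because every byte value from 0 to 255 receives its own four-base codeword, the same two functions handle text, image or audio data without change.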
In yet another embodiment, plain text, an image or any other digital information is encrypted in terms of DNA sequences using the encryption key (software). If the information overflows the limit, i.e. it cannot be synthesized in a single piece, it is encrypted and fragmented into a number of segments. Synthesis of the encrypted sequence(s) is carried out using a DNA synthesizer.
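The fragmentation step may be sketched as a simple fixed-length split. The synthesis limit is not specified here, so `max_len` is a hypothetical parameter and the function name `fragment` is illustrative.

```python
def fragment(dna: str, max_len: int) -> list[str]:
    """Split an encoded sequence into consecutive segments of at most max_len bases."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [dna[i:i + max_len] for i in range(0, len(dna), max_len)]


segments = fragment("ATGCATGCAT", 4)
print(segments)  # ['ATGC', 'ATGC', 'AT']
```

Joining the segments in order restores the original sequence, which is why each segment must carry its position when it is synthesized separately.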
In yet another embodiment, a fixed number of different DNA primer sequences have been designed, each assigned a number that represents the segment position, e.g. segment 1, segment 2 . . . segment n. These are called header primers. Two tail primers have also been designed: one denotes a continuation segment and the other a termination segment.
In yet another embodiment, the DNA segment(s) is/are flanked by known PCR primers [as described earlier] at both ends, i.e. header primers are attached at the beginning of a segment and tail primers are attached at its end. If there is only one segment, it is flanked at the beginning by header primer number 1 and at the end by the termination tail primer. However, if there is more than one segment, the segments are attached to header primers numbered 1, 2, 3 . . . n respectively, and each is attached at its end to a continuation tail primer, except for the last segment, which is attached to the termination tail primer.
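The flanking rule can be sketched at the level of plain strings. All primer sequences below are placeholders invented for illustration; they are not the primers designed for the method, and the sketch ignores strand orientation and complementarity, which a real PCR design would have to respect.

```python
# Placeholder primer sequences, invented for illustration only.
HEADER_PRIMERS = {
    1: "GATTACAGATTACAGATTAC",
    2: "CCTAGGTTCCTAGGTTCCTA",
    3: "TTGACCAATTGACCAATTGA",
}
TAIL_CONTINUE = "ACGTTGCAACGTTGCAACGT"
TAIL_TERMINATE = "GGCCAATTGGCCAATTGGCC"


def flank(segments: list[str]) -> list[str]:
    """Attach header primer k to segment k and the matching tail primer to its end."""
    if not segments:
        raise ValueError("at least one segment is required")
    if len(segments) > len(HEADER_PRIMERS):
        raise ValueError("more segments than available header primers")
    strands = []
    last = len(segments)
    for number, payload in enumerate(segments, start=1):
        # Only the final segment carries the termination tail.
        tail = TAIL_TERMINATE if number == last else TAIL_CONTINUE
        strands.append(HEADER_PRIMERS[number] + payload + tail)
    return strands


for strand in flank(["ATGC", "TTAA"]):
    print(strand)
```

A single-segment message receives header primer 1 and the termination tail, which matches the single-segment case described above.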
The SM DNA is then mixed with the enormously complex denatured DNA strands of genomic DNA of a human or other organism. As the human genome contains about 3×10^9 nucleotide pairs, fragmented and denatured human DNA provides a very complex background for storing the encrypted DNA. The DNA can be stored and transported on paper, cloth, buttons etc.
In still another embodiment, only a recipient knowing the sequences of both primers [starting and tail] would be able to extract the message, using PCR to isolate and amplify the encrypted DNA strand. The isolated and amplified DNA can then be sequenced using an automated DNA sequencer. The DNA sequence obtained can then be converted into the digital message using the encryption/decryption key (software key).
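The software side of recovery, after PCR and sequencing have produced the flanked strands, may be sketched as follows. This is a string-level analogue only: the primer sequences are invented placeholders, the function name `reassemble` is hypothetical, and the sketch assumes error-free reads in the same orientation as the synthesized strands.

```python
# Placeholder primer sequences, invented for illustration only.
HEADER_PRIMERS = {
    1: "GATTACAGATTACAGATTAC",
    2: "CCTAGGTTCCTAGGTTCCTA",
    3: "TTGACCAATTGACCAATTGA",
}
TAIL_CONTINUE = "ACGTTGCAACGTTGCAACGT"
TAIL_TERMINATE = "GGCCAATTGGCCAATTGGCC"


def reassemble(strands: list[str]) -> str:
    """Strip known primers, order segments by header number, and join the payloads."""
    payloads = {}
    terminal = None
    for strand in strands:
        for number, header in HEADER_PRIMERS.items():
            if strand.startswith(header):
                break
        else:
            continue  # no known header primer: treat the read as background DNA
        body = strand[len(header):]
        if body.endswith(TAIL_TERMINATE):
            payloads[number] = body[:-len(TAIL_TERMINATE)]
            terminal = number
        elif body.endswith(TAIL_CONTINUE):
            payloads[number] = body[:-len(TAIL_CONTINUE)]
    # Every segment from 1 up to the terminating one must be present.
    if terminal is None or set(payloads) != set(range(1, terminal + 1)):
        raise ValueError("missing segment or termination tail")
    return "".join(payloads[n] for n in range(1, terminal + 1))


reads = [
    HEADER_PRIMERS[2] + "TTAA" + TAIL_TERMINATE,
    "ACGTACGT",  # background read without known primers
    HEADER_PRIMERS[1] + "ATGC" + TAIL_CONTINUE,
]
print(reassemble(reads))  # ATGCTTAA
```

The reassembled sequence is then passed to the decryption key to obtain the digital message; a reader without the primer sequences has no way to tell message strands from background reads in this sketch.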
In yet another embodiment, the key is helpful in the secret and secure transfer of information, particularly for spying and military purposes. It may also be helpful in anti-theft and anti-counterfeiting applications, product authentication, copyright infringement cases etc.
Encryption and decryption of a textual message “CSHU” in terms of DNA bases may be defined as
Isolation and decryption of the above encrypted DNA sequence
Some examples of DNA encryption for textual data
A JPEG image encrypted in terms of DNA bases
In example 2, a JPEG image of the Indian flag having a file size of 1981 bytes has been encrypted in terms of DNA bases. A total of 7924 DNA bases (4 bases per byte) are required to encrypt the complete image. Since the sequence is large, fragmenting the sequence into smaller segments is required.
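The base count follows directly from the file size, and the number of segments follows from whatever payload length one segment can carry. The sketch below reproduces that arithmetic; the payload limit of 100 bases per segment is a hypothetical value chosen only for illustration, since the actual limit is not stated here.

```python
FILE_SIZE_BYTES = 1981
BASES_PER_BYTE = 4

total_bases = FILE_SIZE_BYTES * BASES_PER_BYTE
print(total_bases)  # 7924


def segments_needed(n_bases: int, max_payload: int) -> int:
    """Ceiling division: segments required to carry n_bases of payload."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return -(-n_bases // max_payload)


# Hypothetical limit of 100 payload bases per segment.
print(segments_needed(total_bases, 100))  # 80
```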
Number | Date | Country |
---|---|---|
60459140 | Mar 2003 | US |