In one embodiment, a data storage system is provided. The data storage system includes a freeze-tolerant data storage medium. The freeze-tolerant data storage medium has a grid with a plurality of grid elements. Each grid element of the plurality of grid elements includes a storage container. The data storage system also includes a molecule generator that is configured to generate a plurality of different plant-based molecules representing data. The data storage system further includes a molecule depositor, which is coupled to the molecule generator. The molecule depositor is configured to deposit different combinations of the plurality of different plant-based molecules into different storage containers of the plurality of storage containers.
In another embodiment, a method is provided. The method includes providing a substrate, and coating at least one surface of the substrate with an antifreeze layer. The method also includes providing a plurality of storage containers on the substrate. The plurality of storage containers is configured to store different combinations of a plurality of different plant-based molecules representing data.
In yet another embodiment, a data storage medium is provided. The data storage medium includes a substrate, and an antifreeze layer coated on at least one surface of the substrate. The data storage medium also includes a plurality of storage containers on the substrate. The plurality of storage containers is configured to store different combinations of a plurality of different plant-based molecules representing data.
Other features and benefits that characterize embodiments of the disclosure will be apparent upon reading the following detailed description and review of the associated drawings.
Embodiments of the disclosure provide synthetic data storage systems that are based in part on physical and chemical properties of Arecaceae (a family of perennial flowering plants).
In some embodiments, molecules that are similar to molecules found in Arecaceae are generated and utilized to represent data, and media having antifreeze properties of Arecaceae are developed and utilized to store the generated molecules that represent the data. Prior to providing additional detail regarding different embodiments of the disclosure, an example general embodiment is described below in connection with
It should be noted that like reference numerals are used in different figures for same or similar elements. It should also be understood that the terminology used herein is for the purpose of describing embodiments, and the terminology is not intended to be limiting. Unless indicated otherwise, ordinal numbers (e.g., first, second, third, etc.) are used to distinguish or identify different elements or steps in a group of elements or steps, and do not supply a serial or numerical limitation on the elements or steps of the embodiments thereof. For example, “first,” “second,” and “third” elements or steps need not necessarily appear in that order, and the embodiments thereof need not necessarily be limited to three elements or steps. It should also be understood that, unless indicated otherwise, any labels such as “left,” “right,” “front,” “back,” “top,” “bottom,” “forward,” “reverse,” “clockwise,” “counter clockwise,” “up,” “down,” or other similar terms such as “upper,” “lower,” “aft,” “fore,” “vertical,” “horizontal,” “proximal,” “distal,” “intermediate” and the like are used for convenience and are not intended to imply, for example, any particular fixed location, orientation, or direction. Instead, such labels are used to reflect, for example, relative location, orientation, or directions. It should also be understood that the singular forms of “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
It will be understood that, when an element is referred to as being “connected,” “coupled,” or “attached” to another element, it can be directly connected, coupled or attached to the other element, or it can be indirectly connected, coupled, or attached to the other element where intervening or intermediate elements may be present. In contrast, if an element is referred to as being “directly connected,” “directly coupled” or “directly attached” to another element, there are no intervening elements present. Drawings illustrating direct connections, couplings or attachments between elements also include embodiments, in which the elements are indirectly connected, coupled or attached to each other.
In data storage system 100, molecule generator 104 and molecule depositor 106 together act as a writer to store information as molecule combinations on data storage medium 102. Reader 108 identifies the molecule combinations on the data storage medium 102, and thereby retrieves the stored information from the data storage medium 102. Controller 110 may include one or more processors and one or more memories that store program code having instructions that are executable by the processor(s). Controller 110, which is coupled to molecule generator 104, molecule depositor 106 and reader 108, may control operations of those components.
In embodiments of the disclosure, the molecules generated by molecule generator 104 and deposited on data storage medium 102 by molecule depositor 106 are molecules that are formed using amino acids found in Arecaceae, and the data storage medium 102 has antifreeze or freeze-tolerant properties of (or similar to) Arecaceae. Details regarding antifreeze properties of Arecaceae are provided further below.
In data storage system 100, freeze-tolerant data storage medium 102 has a grid 111 including multiple grid elements 112. In some embodiments, each of the grid elements 112 includes a storage container 114 that is configured to store molecules deposited by depositor 106. In one embodiment, each storage container 114 may include an adhesive that holds the molecules in place when deposited into the container 114. In other embodiments, no containers 114 are used, and the molecules may be directly deposited on the grid elements 112. In such embodiments, the adhesive that is configured to hold the molecules in place when deposited may be applied directly on the grid elements 112. Details regarding how the molecules are generated are provided further below.
Synthetic data storage system 100 may include a library of characters 115 that maps characteristics of different molecules and/or molecule combinations to different combinations of bits. Thus, when binary data arrives at the synthetic data storage system 100 from a host 116 for storage, controller 110 first identifies, from the library 115, the molecule types corresponding to the binary data. Then, based on the molecule types identified by controller 110, molecule generator 104 generates the molecules by one or more chemical reactions and provides them to molecule depositor 106. Molecule depositor 106 deposits the molecules in storage containers 114. To carry out a read operation for the stored data (for example, in response to a read command received from host 116), controller 110 directs reader 108 to the container(s) 114 that include molecules that represent the requested information. The reader 108 identifies the molecules, and returns the identified information to the controller 110. The controller 110 converts the identified information into binary data, and returns the data to the host 116.
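The write and read paths described above can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the names `Medium`, `write`, and `read`, the use of one molecule identifier per byte value, and the sequential allocation of containers are all assumptions introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Medium:
    """Hypothetical stand-in for medium 102: numbered containers holding molecule IDs."""
    size: int
    containers: dict = field(default_factory=dict)  # container index -> molecule ID

    def deposit(self, index: int, molecule_id: str) -> None:
        if not 0 <= index < self.size:
            raise IndexError("no such grid element")
        self.containers[index] = molecule_id

    def identify(self, index: int) -> str:
        return self.containers[index]


# Hypothetical library 115: every byte value maps to one molecule identifier.
BITS_TO_MOLECULE = {value: f"M{value:03d}" for value in range(256)}
MOLECULE_TO_BITS = {mol: value for value, mol in BITS_TO_MOLECULE.items()}


def write(medium: Medium, data: bytes, start: int = 0) -> list:
    """Write path: look up the molecule type for each byte, then deposit it."""
    locations = []
    for offset, byte in enumerate(data):
        index = start + offset
        medium.deposit(index, BITS_TO_MOLECULE[byte])
        locations.append(index)
    return locations


def read(medium: Medium, locations: list) -> bytes:
    """Read path: identify the molecules, then convert them back to binary."""
    return bytes(MOLECULE_TO_BITS[medium.identify(i)] for i in locations)


medium = Medium(size=1000)
where = write(medium, b"The")
recovered = read(medium, where)  # round-trips to the original bytes
```

Keeping the library as a pair of inverse dictionaries makes the read path the exact mirror of the write path, which matches the controller's role of translating in both directions.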
In the embodiment shown in
In one embodiment, the data storage medium 102 includes a glass plate, with one or more surfaces of the glass plate coated with an antifreeze layer (e.g., an antifreeze protein (AFP)) having chemical properties similar to those of Arecaceae. Physical and chemical properties of Arecaceae are listed below in Tables 1 and 2, respectively.
1-1.2
Plants such as Arecaceae have their own deoxyribonucleic acid (DNA), and properties such as density and durability are among the premises that were considered for the disclosure. Regarding density, published research indicates that one gram of DNA can potentially store 215 petabytes of data. An average hard disk drive in a laptop can house just one millionth of that amount. Thus, as indicated above, embodiments of the disclosure encode data at a molecule level (using synthesized plant-based molecules) and store the molecules representing large amounts of data in a medium that extends the data shelf life. Regarding durability, the natural physical and chemical properties of palm leaves, if preserved under the right conditions, can extend the shelf life of the medium to 600 years or more.
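As a quick check on the density comparison above, the arithmetic can be worked through as follows. The sketch assumes decimal units (1 petabyte = 10^15 bytes, 1 gigabyte = 10^9 bytes); the constant and variable names are illustrative only.

```python
PETABYTE = 10**15  # bytes, decimal (SI) convention assumed
GIGABYTE = 10**9   # bytes, decimal (SI) convention assumed

dna_bytes_per_gram = 215 * PETABYTE                # figure quoted in the text
laptop_drive_bytes = dna_bytes_per_gram // 10**6   # "one millionth of that amount"

laptop_drive_gb = laptop_drive_bytes / GIGABYTE
print(f"{laptop_drive_gb:.0f} GB")  # prints "215 GB"
```

Under these assumptions, one millionth of 215 petabytes is 215 gigabytes, which is the scale of drive the comparison has in mind.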
Palm leaves include AFPs, also known as thermal hysteresis proteins. AFPs are compound proteins (e.g., a protein complex combining amino acids with other substances, usually sugar) that have the following properties:
Antifreeze prevents growth of crystals when a data storage medium (e.g., 102 of
Unlike polymers of nucleotides, where alphabets can be formed from only the 4 nucleotide bases A, T, G and C, polymers of amino acids (known as proteins) can draw on 22 different amino acids. Each synthesized small plant-based molecule is created using amino acids. Amino acids utilized to generate the plant-based molecules in some embodiments are listed below in Table 3.
Each amino acid has a spectral color, and when a molecule is created with amino acid combinations, that molecule is associated with a color. In some embodiments, each molecule is created using a multi-component reaction (MCR) process, and each molecule represents a letter of an alphabet. Accordingly, a word library may be created from molecules.
Some reasons for choosing an MCR process for generating molecules are as follows:
1) MCRs are convergent reactions in which three or more components react to form a product.
2) MCRs involve a single operational step, and may be carried out in a single pot/container.
3) An MCR reduces waste generation due to its convergent nature and its single operational step.
4) An MCR does not generate a by-product, and is considered to be an eco-friendly reaction system.
5) MCRs are substantially atomically balanced (e.g., a number of atoms of any given element does not change in any reaction, which maximizes the incorporation of all materials used in the process).
6) Due to weak negative reactions, MCRs are usually safe processes.
7) MCRs save time, and are energy efficient.
MCR methodology is regarded as a well-suited tool for creating combinatorial libraries. Since MCRs can generate diverse libraries, they can be used to formulate unique molecules. As indicated above, in embodiments of the disclosure, MCRs are used to combine amino acids to make unique molecules. Each unique molecule represents a unique character, and a group of molecules forms a word. Each generated character can be associated with a unique color. For example, the 26 letters of the alphabet correspond to 26 unique molecules, and each unique molecule is associated with a unique color. Thus, a library of words may be created based on the generated molecules. Examples for creating data from amino acids are provided below.
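One way to picture such a character library is sketched below. It is only an illustration under loudly stated assumptions: the one-letter amino acid codes are the standard ones and the entry for T (formed from R+N+D) follows the worked example in this description, but every other letter-to-molecule assignment and the integer "color index" are invented here as placeholders.

```python
from itertools import combinations
from string import ascii_uppercase

# Standard one-letter codes for the 20 common amino acids.
AMINO_ACIDS = "ARNDCEQGHILKMFPSTWYV"

# Only this entry comes from the text; all other assignments are arbitrary.
KNOWN = {"T": ("R", "N", "D")}

# Unordered triples of distinct amino acids not already reserved above.
unused = (t for t in combinations(AMINO_ACIDS, 3) if t not in KNOWN.values())

# Hypothetical library: one unique triple per letter of the alphabet.
letter_to_triple = {
    letter: KNOWN.get(letter) or next(unused) for letter in ascii_uppercase
}

# Placeholder for the unique color of each molecule (an index, not a real spectrum).
letter_to_color = {letter: index for index, letter in enumerate(ascii_uppercase)}


def encode_word(word: str) -> list:
    """Return a (molecule, color index) pair for each letter of the word."""
    return [
        ("+".join(letter_to_triple[ch]), letter_to_color[ch])
        for ch in word.upper()
    ]


encoded = encode_word("The")  # first entry is ("R+N+D", 19)
```

A real library would derive the color from the measured spectral properties of each molecule rather than from an index.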
An amino acid has both a basic amine group and an acidic carboxylic acid group. The 20 amino acids in Table 3 are grouped into polar and non-polar amino acids. Distinct letters may be formed from polar or non-polar amino acids.
Consider the word “The,” whose letters have the following 8-bit binary equivalents in an ASCII-binary character table:
T is 01010100
h is 01101000
e is 01100101
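The ASCII lookup for a word can be reproduced with a few lines of code. The helper name `ascii_bits` is hypothetical; the sketch simply formats each character's ASCII code as an 8-bit binary string.

```python
def ascii_bits(word: str) -> list:
    """Return the 8-bit ASCII binary string for each character of the word."""
    return [format(byte, "08b") for byte in word.encode("ascii")]


bits = ascii_bits("The")  # ['01010100', '01101000', '01100101']
```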
As can be seen in
To form a molecule (T) from R+N+D, two R, two N and one D are utilized together with an additional standard component. A simple demonstration is as follows:
Convert R, N and D into their binary equivalents:
R is 01010010
N is 01001110
D is 01000100
As noted above, T is 01010100
R*R=01101001000100
(R*R)/N=01010110
01010110+N=010101100
010101100−D=01101000
01101000−00010100=01010100 (which is T)
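The arithmetic above can be replayed as a sanity check. The sketch below is only a transcription of the printed steps, not the disclosed method; the function name and the use of integer (floor) division are assumptions. It also makes one caveat visible: the printed third intermediate, 010101100 (172), is twice 01010110 (86), whereas literally adding N (78) to 86 gives 164. The sketch therefore carries the printed value forward and reports the literal sum separately.

```python
def replay_t_derivation() -> dict:
    """Replay the printed R, N, D arithmetic and return every intermediate."""
    r, n, d = ord("R"), ord("N"), ord("D")  # 82, 78, 68
    standard = 0b00010100                   # additional standard component (20)

    squared = r * r                # 6724, printed as 01101001000100
    quotient = squared // n        # 86, printed as 01010110 (floor division assumed)
    printed_third = 0b010101100    # 172, the value printed in the text
    literal_sum = quotient + n     # 164, what "+N" yields if taken literally
    minus_d = printed_third - d    # 104, printed as 01101000
    result = minus_d - standard    # 84, printed as 01010100

    return {
        "squared": squared,
        "quotient": quotient,
        "printed_third": printed_third,
        "literal_sum": literal_sum,
        "minus_d": minus_d,
        "result": result,
    }


steps = replay_t_derivation()
letter = chr(steps["result"])  # "T"
```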
Data stored in the form of molecules on a data storage medium such as 102 of
The above examples describe storage of single words as molecules, and the retrieval of stored single words by identifying the molecules. In general, data storage systems in accordance with embodiments of the disclosure may be employed for storing any amount of data. For example, if there are 500-1000 grid elements in total in the data storage medium, a total of 500-1000 molecules can be created by using 22 amino acids in different combinations. In general, molecule generation may be carried out once and may be dependent on a number of grid elements. As indicated in an example provided above, a mixture of three amino acids (R+N+D) forms one molecule, which may store one byte of information. A larger combination of amino acids can stretch up to larger data sets (e.g., 26). It should be noted that the grid elements on a data storage medium are labelled/numbered to enable tracking of molecules for storage and retrieval of information.
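A rough count is consistent with these figures. The sketch assumes, purely for illustration, that a molecule is identified by an unordered choice of three distinct amino acids out of 22, that each grid element holds one molecule, and that each molecule stores one byte as in the R+N+D example; none of these constraints is stated as a requirement in the description.

```python
from math import comb

AMINO_ACID_COUNT = 22

# Number of unordered triples of distinct amino acids (assumed molecule model).
distinct_triples = comb(AMINO_ACID_COUNT, 3)  # 1540

report = {}
for grid_elements in (500, 1000):
    report[grid_elements] = {
        # Can every grid element receive a different triple?
        "unique_molecule_per_element": distinct_triples >= grid_elements,
        # One byte per molecule, one molecule per element (assumed).
        "capacity_bytes": grid_elements,
    }
```

Under this model there are 1540 possible triples, enough to give each of 500 to 1000 grid elements a distinct molecule.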
The illustrations of the embodiments described herein are intended to provide a general understanding of the structure of the various embodiments. The illustrations are not intended to serve as a complete description of all of the elements and features of apparatus and systems that utilize the structures or methods described herein. Many other embodiments may be apparent to those of skill in the art upon reviewing the disclosure. Other embodiments may be utilized and derived from the disclosure, such that structural and logical substitutions and changes may be made without departing from the scope of the disclosure. Additionally, the illustrations are merely representational and may not be drawn to scale. Certain proportions within the illustrations may be exaggerated, while other proportions may be reduced. Accordingly, the disclosure and the figures are to be regarded as illustrative rather than restrictive.
One or more embodiments of the disclosure may be referred to herein, individually and/or collectively, by the term “invention” merely for convenience and without intending to limit the scope of this application to any particular invention or inventive concept. Moreover, although specific embodiments have been illustrated and described herein, it should be appreciated that any subsequent arrangement designed to achieve the same or similar purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all subsequent adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the description.
The Abstract of the Disclosure is provided to comply with 37 C.F.R. § 1.72(b) and is submitted with the understanding that it will not be used to interpret or limit the scope or meaning of the claims. In addition, in the foregoing Detailed Description, various features may be grouped together or described in a single embodiment for the purpose of streamlining the disclosure. This disclosure is not to be interpreted as reflecting an intention that the claimed embodiments employ more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter may be directed to less than all of the features of any of the disclosed embodiments.
The above-disclosed subject matter is to be considered illustrative, and not restrictive, and the appended claims are intended to cover all such modifications, enhancements, and other embodiments, which fall within the true spirit and scope of the present disclosure. Thus, to the maximum extent allowed by law, the scope of the present disclosure is to be determined by the broadest permissible interpretation of the following claims and their equivalents, and shall not be restricted or limited by the foregoing detailed description.
This application claims priority to U.S. Provisional Application No. 63/213,946, filed on Jun. 23, 2021, the content of which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
63213946 | Jun 2021 | US