Filed herewith and expressly incorporated herein by reference in its entirety is a Sequence Listing submitted electronically as an ASCII text file via EFS-WEB. The ASCII copy, created on ______, is named______ and is ______ bytes in size.
This disclosure is in the field of sensors, and also in the field of nano-electronics. More specifically, it is in the field of molecular electronic sensors. It is also in the field of genetic analysis, and more specifically, in the field of measuring samples for their content of specific DNA molecules of interest. It is also in the field of infectious disease monitoring, and in particular the detection, monitoring or diagnosis of viral infectious diseases. In particular, this includes monitoring of viral disease such as COVID-19.
In the field of genetic analysis, it is important to be able to determine if a given sample of biological material contains a target DNA or RNA segment of interest. Another important example is species identification, where a sequence characteristic of a species is searched for within the sample. This is, for example, important for the environmental monitoring of, and diagnosis of, infectious disease. For example, a DNA segment that identifies a pathogen, such as a parasite, bacterium or virus, can be looked for within the sample taken from the environment, or from an animal or person that may be infected. This is especially important for the environmental surveillance and epidemiology of viral diseases with the potential for large scale, rapidly progressing infection or pandemics, such as COVID-19. It is also important in genetic analysis to look for known genetic variants that may occur, relative to a given segment of DNA. This type of measurement is known as genotyping in the context of looking for known variants in humans and animals. In the context of pathogens, this often takes the form of identification of strains, which are defined by DNA or RNA variants relative to a reference genome sequence, or by the sequence differences between two genomes.
It is also important to be able to determine the concentration level of a DNA or RNA segment of interest in a sample. One such example is in gene expression analysis, where the activity level or expression level of genes, represented in the form of messenger RNA, can be assessed in a sample. This is important, for example, in studying gene function, or in characterizing the pathology of cancers for research, diagnosis and treatment. Another such example is in Non-Invasive Prenatal Testing (NIPT), which requires measurement of levels of non-maternal cell-free DNA fragments in blood samples. Another similar example is Liquid Biopsy, for early detection or recurrence monitoring of cancer, which may look to detect the levels of known mutant sequences in blood samples. Another example is Comparative Genomic Hybridization (CGH), where the relative concentration of a segment of genomic DNA in a sample is used to detect genomic duplication or deletion events, both in diagnosing germline disease such as Down Syndrome (Trisomy 21), and in characterizing genomic alterations in cancers as a component of Precision Medicine for Oncology. Another example arises in the field called metagenomics, where the goal is to characterize complex populations of diverse organisms present in an environmental sample, such as a soil or water sample, by extracting and quantifying the abundance of different forms of genomic DNA present in the organisms in the sample. Of particular interest for health and disease is the special case of assessing microbiomes, such as the gut microbiome or oral microbiome, for the populations of bacteria present.
For the purpose of quantifying such complex populations, one common approach is to use PCR to target a common “barcode of life” DNA segment that is present in all the organisms of interest, and has enough diversity to distinguish species and strains of interest. In this approach, the focus becomes identifying and measuring the relative concentrations of these fragments.
These general classes of genetic analysis problems (measuring the presence or concentration of DNA segments of interest) have been addressed using well known modern molecular biology tools such as PCR, DNA Microarrays, and DNA sequencing. Older techniques predating these, such as Southern Blots (for measuring DNA targets) and Northern Blots (for measuring RNA targets), have also been applied to these problems. All such assays, including Southern and Northern Blots, employ the process of DNA “hybridization” as a fundamental part of the detection scheme.
As shown in
Similar to what is shown in
Molecular electronics is a general field of technology in which single molecules are placed as a component in an electrical circuit, to perform some useful electrical functions such as transducing some chemical or molecular event into an electrical signal. This general concept is illustrated in
The inventions described and claimed herein have many attributes and embodiments including, but not limited to, those set forth or described or referenced in this Brief Summary. The inventions described and claimed herein are not limited to, or by, the features or embodiments identified in this Summary, which is included for purposes of illustration only and not restriction.
It is the object of this invention to disclose and provide a molecular electronics sensor that utilizes DNA hybridization as its primary sensing mechanism, in order to obtain benefits of testing speed, simplicity, robustness, and broad applicability for genetic analysis that hybridization-based detection can provide.
A molecular electronics sensor for genetic analysis is also disclosed and provided, including methods of use for this sensor. These embodiments have the benefits of providing for faster testing, lower cost testing, lower cost test apparatus, and testing that is simpler to perform, and that also enables highly distributed deployment or point-of-use deployment of such testing systems, including mobile use and home use of such testing systems.
An all-electronic, single molecule detector of DNA or RNA segments of interest and the concentration of such segments is also disclosed and provided. Certain embodiments include methods for these sensors to be deployed in a semiconductor chip format, and more desirably in a CMOS chip device format.
Another object of the invention is to provide methods to perform highly multiplexed measurements on such chip devices, in order to provide the benefits of low cost, rapid and portable testing, and the benefits that such chip-based devices, systems and kits can be manufactured at extremely high volume and low cost by leveraging the existing manufacturing base of the semiconductor industry. It is also the object of the invention to provide methods by which these disclosed devices and systems can be used to address genetic analysis problems of importance specifically in the areas of the diagnosis and treatment of disease.
In another aspect, DNA tag arrays and DNA tag reporter assays are disclosed and provided using the hybridization sensors and sensor array chips disclosed herein as a universal and preferred means of performing a broad range of multiplex biomarker assays. These tag arrays provide the benefits of having a common detection platform based on the well-established method of hybridization detection, for many diverse assays, whether DNA, RNA or other nucleic-acid based assays, as well as protein detection assays or other biochemical analyte assays. These provide the benefit of massively multiplex detection capability, allowing scalable and high levels of multiplexing of diverse analyte assays. These also provide the benefit of highly optimized, robust and uniform performance of the reporter detection. These tag arrays further provide the benefit of separation of the primary detection assay, which can be done under preferred or standard solution-phase conditions to generate the reporter tags, from the reporter readout part of the assay, which is performed on the molecular electronics hybridization sensor array chip. The reporter tags further provide the advantage that they can be readily amplified, by standard PCR processes for copying DNA, to improve detection sensitivity, whereas the primary target of detection may otherwise not be readily amplifiable, such as for DNA targets that contain epigenetic marks, such as methylation, or detection targets that are not DNA, such as protein targets or other molecular targets.
It is also the object of the invention to disclose and provide methods that extend these benefits to the problems of genetic analysis that occur in the field of infectious disease, for detection of the pathogens or pathogenic strains that cause such diseases, including pathogens in the form of parasites, fungi, bacteria and viruses. In particular, this is a benefit for viral diseases, such as influenza, colds/respiratory viruses, including rhinoviruses and adenoviruses, AIDS virus/HIV, Ebola, Dengue, other hemorrhagic fever viruses, Hanta, Zika and West Nile Virus, SARS, MERS, and novel viruses with pandemic potential, such as COVID-19, for which the benefits of low cost, broadly deployed, rapid testing have great value in preventing or controlling the potentially rapid spread of these diseases, which can have massive public health and economic impact.
It is also the object of the invention to extend these benefits to infectious disease testing in the domain of sexually transmitted diseases (STDs), which are predominantly caused by pathogenic parasites, fungi, bacteria and viruses, and where it is a particular benefit to have detection systems that are well suited to widespread deployment for use in community clinics or in the privacy of the home, by virtue of being low cost, rapid, simple to use, smart and connected electronic testing devices.
The features and advantages of the embodiments of the present disclosure will become more apparent from the detailed description set forth below when taken in conjunction with the drawings. The drawings and the associated descriptions are provided to illustrate embodiments of the disclosure and not to limit the scope of what is claimed.
As used herein, the term “DNA” refers generally not only to the formal meaning of deoxyribonucleic acid, but also, in contexts where it would make sense, to the well-known nucleic acid analogs of DNA that are used throughout molecular biology and biotechnology, such as RNA, or RNA or DNA that comprises modifications such as bases with chemical modifications, such as addition of conjugation groups at the 5′ or 3′ termini or on internal bases, or which includes nucleic acid analogs, such as PNA or LNA. DNA may generally refer to double stranded or single stranded forms as well, in contexts where this makes sense and unless specifically designated otherwise. In particular, when referring to hybridization and the probes and targets for this as DNA, they are interpreted in this broader sense of any of these analogs which undergo hybridization to form a bound duplex.
As used herein, the term “hybridization” or “DNA hybridization” refers to the process by which a single stranded segment of DNA in solution pairs with its reverse complement sequence to form a duplex molecule via Watson-Crick base pairing, and forming a double helical segment. It is understood here this includes the cases of DNA-RNA pairs forming, RNA-RNA pairs forming, and that such DNA could also include modified bases or nucleic acid analogs such as PNA or LNA. It is understood this pairing can occur between single strands of different length, such pairing occurring between the complementary segments of these longer sequences.
As used herein, the terms “complement”, “match”, “exact match” and “reverse complement” of a given segment of single stranded DNA or RNA all refer to another single strand of DNA or RNA that will hybridize properly with this strand to form a duplex with Watson-Crick base pairings (and base pairing U-A for RNA-DNA or RNA-RNA pairings, as RNA has Uracil (U) instead of Thymine (T)) for the segment of interest.
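By way of non-limiting illustration, the reverse complement relationship defined above may be sketched in software as follows. The function name `reverse_complement` and its `as_rna` parameter are hypothetical and are introduced here only for illustration; the sketch assumes sequences written 5′ to 3′ over the standard bases A, C, G and T (or U), and does not address modified bases or analogs such as PNA or LNA.

```python
# Illustrative sketch only; names are hypothetical, not part of the disclosure.
# Sequences are assumed to be written 5' to 3' using A, C, G, T (or U for RNA).

_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str, as_rna: bool = False) -> str:
    """Return the strand that pairs with `seq` by Watson-Crick base pairing.

    The input may be DNA or RNA; U is treated as T for pairing purposes.
    If `as_rna` is True, the result is written with U in place of T.
    """
    normalized = seq.upper().replace("U", "T")
    # Reverse the strand, then complement each base.
    result = "".join(_DNA_COMPLEMENT[base] for base in reversed(normalized))
    return result.replace("T", "U") if as_rna else result


if __name__ == "__main__":
    print(reverse_complement("ATGC"))               # DNA complement strand
    print(reverse_complement("AAGC", as_rna=True))  # RNA complement strand
```

An unrecognized base character raises a `KeyError` in this sketch, which is a deliberate simplification.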
As used herein, the term “hybridization probe” refers to a specific segment of DNA (or RNA) that is to be used to bind a complementary strand of interest. Such a strand of interest may exist within a sample or complex pool of known or unknown DNA or RNA fragments, or a diverse set of oligos presented in a solution environment that allows for the hybridization reaction. This term also may refer to the segment that will be anchored in place for exposure to the test sample solution. In context, the hybridization probe may refer to the single molecule of interest, or to a quantity of such molecules that all have the same sequence. A hybridization probe in many instances may be a short segment of DNA, in the range of 10-100 bases, but in general can be a DNA strand of any length. As used herein, the hybridization probe may generally refer to a DNA segment for which only a portion of it is used to hybridize to a target of interest, and other portions of it may serve different purposes, such as spacers, segments comprising conjugation sites, segments intended to hybridize to other distinct targets, segments intended to bind DNA primers, or sites for binding of decoding probes used to produce location maps for the sensor on a chip, including segments that are sites for hybridization to targets that are decoding probes that are DNA hybridization oligos, including such oligos used for combinatorial decoding, which oligos may be otherwise labelled or unlabeled with additional signaling groups to aid in decoding.
As used herein, the term “primer” refers to a single stranded DNA oligo that has a hybridization binding site on a single stranded DNA template molecule of interest, and the term “primer binding” refers to hybridization of the oligo to its target site. This term arises from the well-known process of priming a single strand for synthesis of the complementary strand by a polymerase enzyme; however, in the present context, primers and primer binding are merely an alternate way to refer to an oligo DNA that binds to its complementary site via hybridization, in a context where the primer is typically a relatively short segment, 6-60 bases, and more commonly 12-40 bases, or 16-25 bases in length.
As used herein, the term “primer extension reaction” or “primer extension” refers to the reaction in which a polymerase enzyme binds to the free 3′ end of a primer that is hybridized onto a complementary template strand to form a duplex, and, provided with suitable dNTP substrates, then synthesizes a complementary strand of the template, extending the phosphate backbone of this strand from this initial 3′ end of the primer. Such a primer extension reaction may extend just a single base, or more commonly, it may extend multiple bases along the template. Such an extension process may go to the end of the template, or terminate before the end of the template is reached, depending on reaction conditions and properties of the enzyme.
As used herein, the term “decoding probe” generally refers to any molecule whose binding and subsequent detection is used for a process of determining the location map of where hybridization probes for different targets are located on a sensor pixel array. In this context, it is assumed there are a multiplicity of different types of DNA hybridization probes, having different target DNA as defined by the probe sequences, and that molecules of these types have been randomly assembled into a sensor pixel array, or otherwise placed in such a way that their location in the pixel array is unknown. In this context, each hybridization probe is assumed to have physically linked or connected to it one or more binding sites that would bind to one or more of the decoding probe molecules. The series of decoding probes is applied to such an array in series or in pooled form, allowed to bind to their specific targets on the hybridization probes, and the bound state is read out using the detectable signal generated by the binding probes. Such binding probes in preferred embodiments are single stranded DNA oligo hybridization probes, with hybridization targets on or linked to the DNA hybridization probes on the array. In preferred embodiments, the detectable signal is the electrical hybridization signal measurable by the sensor. In other preferred embodiments, dye labels on such probes could be read out with an optical microscope imaging system. Other embodiments could use binding probes that are not based on DNA hybridization, such as using aptamers or antibodies or libraries of small molecules.
As used herein, the term “combinatorial decoding” generally refers to any process of decoding the location map of hybridization probes on an array, where a series of outcomes of multiple decoding probe binding reactions is used to generate a unique identifier “barcode” for the array probes, that determines the hybridization probe identity.
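By way of non-limiting illustration, the barcode lookup underlying combinatorial decoding may be sketched as follows. The function names `rounds_needed` and `decode_pixels`, the codebook layout, and the example probe names are all hypothetical; the sketch assumes that each decoding round yields one discrete outcome per pixel (for example, bound or not bound) and that the barcode assigned to each hybridization probe type is known in advance.

```python
# Illustrative sketch only; names and data layout are hypothetical.

def rounds_needed(num_probe_types: int, outcomes_per_round: int = 2) -> int:
    """Smallest number of decoding rounds r such that
    outcomes_per_round ** r >= num_probe_types."""
    if num_probe_types < 1 or outcomes_per_round < 2:
        raise ValueError("need at least 1 probe type and 2 outcomes per round")
    rounds, capacity = 0, 1
    while capacity < num_probe_types:
        capacity *= outcomes_per_round
        rounds += 1
    return rounds


def decode_pixels(observations, codebook):
    """Map each pixel to a probe identity using its series of round outcomes.

    observations: dict of pixel id -> sequence of per-round outcomes
    codebook:     dict of barcode tuple -> probe identity
    Pixels whose barcode is not in the codebook are mapped to None.
    """
    return {pixel: codebook.get(tuple(outcomes))
            for pixel, outcomes in observations.items()}


if __name__ == "__main__":
    codebook = {(1, 0, 1): "probe_A", (0, 1, 1): "probe_B"}
    observed = {0: [1, 0, 1], 1: [0, 1, 1], 2: [0, 0, 0]}
    print(rounds_needed(1000))
    print(decode_pixels(observed, codebook))
```

Under these assumptions the number of rounds grows only logarithmically with the number of probe types, which is the practical appeal of a combinatorial scheme for large arrays.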
As used herein, the term “hybridization target” means DNA or RNA molecules which contain the complement of a DNA hybridization probe. Such a target could be the exact complementary strand, but in most cases, it will be a longer strand that contains a segment exactly complementary to the probe. In the context of discussing a target with mismatches, it refers to molecules which match to the probe except at one or more bases, as indicated. In the context of hybridization, “perfect match” means a sequence that correctly hybridizes to the probe, with no mispaired bases, while a “mismatch” refers to a sequence that may bind to the probe, but has one or more mispaired bases, i.e. bases not engaged in the standard Watson-Crick bond found in natural double helix DNA-DNA or double helix DNA-RNA pairings. Such incomplete pairing will have reduced stability compared to the perfect match binding, which can generally be used in assay methodologies to discriminate perfect matches from mismatched forms (binding of which is also known as cross-hybridization).
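By way of non-limiting illustration, the distinction between a perfect match and a mismatch may be sketched by counting mispaired bases between a probe and a candidate site on a longer target strand. The function names `count_mismatches` and `best_binding_site` are hypothetical; the sketch assumes DNA sequences written 5′ to 3′ over A, C, G and T, considers only ungapped alignments, and says nothing about the actual stability of any duplex.

```python
# Illustrative sketch only; names are hypothetical. Counts mispaired bases
# for an ungapped, antiparallel alignment of a probe against a target site.

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def count_mismatches(probe: str, segment: str) -> int:
    """Number of positions at which `segment` fails to pair with `probe`."""
    if len(probe) != len(segment):
        raise ValueError("probe and segment must have equal length")
    # The perfectly matching segment is the reverse complement of the probe.
    expected = "".join(_COMPLEMENT[b] for b in reversed(probe.upper()))
    return sum(1 for e, s in zip(expected, segment.upper()) if e != s)


def best_binding_site(probe: str, target: str):
    """Return (offset, mismatches) for the site on `target` with the fewest
    mismatches; ties are resolved in favor of the smallest offset."""
    n = len(probe)
    if len(target) < n:
        raise ValueError("target is shorter than probe")
    candidates = ((offset, count_mismatches(probe, target[offset:offset + n]))
                  for offset in range(len(target) - n + 1))
    return min(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    probe = "ATGCCG"
    print(best_binding_site(probe, "TTCGGCATTT"))  # contains a perfect match
    print(best_binding_site(probe, "TTCGACATTT"))  # single-base mismatch
```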
As used herein, the term “hybridization assay” means any assay or test that comprises the process of hybridization.
As used herein, the term “sample” or “biosample” refers to any material that is intended for testing. Such material could be in solid or liquid form, and may also generally be in some form of container, such as a tube, and/or reside on a carrier medium such as a swab or filter paper. Such material could comprise tissue, cells, bodily fluids, excrement, food products, portions of plants, or materials collected by a swab, air filter or water filter. Such material may also be maintained with some form of preservative or stabilizing agents. These terms may refer to the material in the state as initially collected, or materials that have undergone process steps, such as to extract or amplify DNA or RNA, prior to being in a form suitable to introduce to the sensor device.
As used herein, a “DNA tag reporter assay”, “tag reporter assay” or “tag assay” refers to an assay in which DNA tags are produced by the assay reaction, and where the detection results of the assay are encoded by the presence of the corresponding tags, or by the abundance of such tags produced.
As used herein, a “DNA tag” refers to a single stranded DNA oligo that may be used as a reporter tag in a tag reporter assay. It is understood that, in contexts where this makes sense, the term “DNA tag” may further refer more generally to a larger DNA segment that contains the tag segment, the double-stranded DNA duplex form where the tag constitutes one strand, or the reverse complement of the tag which would hybridize to the tag to yield the duplex form. As used herein, the term “tag complement” refers to the reverse complement of the tag in question, or, in contexts where it makes sense, a larger strand comprising the reverse complement.
As used herein, the term “tag probe” refers to the hybridization probe that has as its specific target the tag in question. Thus, such a probe consists of or comprises the reverse complement of the tag. The term “tag sensor” refers to the hybridization sensor that comprises the tag probe.
As used herein, a “DNA tag array”, or “tag array”, refers to a molecular electronics hybridization sensor array in which the probes on the array correspond to the single-stranded DNA tag complements, or single stranded DNA tags, corresponding to a given set of DNA tags.
As used herein, a “DNA tag set” or “tag set”, refers to a specific set of DNA tags, and in preferred embodiments, such a set that has been designed and otherwise selected to perform well for readout with hybridization tag arrays, under a common hybridization reaction condition.
As used herein, the term “bipartite tag” refers to tags whose sequences are formed by joining together A and B sequences, possibly with joining sequence inserted, and wherein in actual assays, the corresponding physical tags are generated by physical joining of such partial tag sequences, by the methods as in
As used herein, the term “PCR” refers broadly to any methods that use polymerase or reverse-transcriptase reactions to produce multiple copies of sequences from source DNA or RNA. In this context, the term “copies” may in general refer to single stranded reverse complements of segments of the source molecule, or single stranded exact copies of segments of the source molecule, or double stranded forms where one strand is identical to a segment of the source molecule. The term “copies” also may refer to the product of methods where an RNA template is converted to DNA molecules of the corresponding sequence, or a DNA template is converted to RNA molecules of the corresponding sequence. Such “PCR” methods in this context may include methods with linear amplification or exponential amplification, relative to time or cycle numbers. Such methods include those that use specific primers, or degenerate primers. Such methods also include isothermal reactions that occur in continuous time, or reactions that rely on thermal or chemical cycling. The “PCR” process may produce copies of specific target segments of the source DNA or RNA, as defined by specific primers, or may produce copies from many sites or random sites, as may result from degenerate primers. In particular, “PCR” in this context may refer to isothermal amplification methods that can be used to rapidly produce large amounts of DNA copy fragments from a source genome, of RNA or DNA, using one of the many well-known methods, such as Rolling Circle or RCA, Genomify, or LAMP, and with such a method incorporating a reverse-transcriptase in the case of RNA starting material.
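By way of non-limiting illustration, the difference between linear and exponential amplification relative to cycle number may be sketched with an idealized copy-count model. The function name `copies_after_cycles` and the `efficiency` parameter are hypothetical, and the model is a textbook idealization assumed here for illustration only: it ignores reagent depletion, plateau effects and any method-specific kinetics.

```python
# Illustrative sketch only; an idealized model, not a description of any
# particular amplification chemistry. Names are hypothetical.

def copies_after_cycles(initial_copies: float, cycles: int,
                        mode: str = "exponential",
                        efficiency: float = 1.0) -> float:
    """Idealized copy count after a number of amplification cycles.

    exponential: every existing copy is templated each cycle, so the count
                 is multiplied by (1 + efficiency) per cycle.
    linear:      only the original templates are copied each cycle, so the
                 count grows by initial_copies * efficiency per cycle.
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    if mode == "exponential":
        return initial_copies * (1.0 + efficiency) ** cycles
    if mode == "linear":
        return initial_copies * (1.0 + efficiency * cycles)
    raise ValueError("mode must be 'exponential' or 'linear'")


if __name__ == "__main__":
    for n in (0, 10, 20, 30):
        print(n,
              copies_after_cycles(10, n, "linear"),
              copies_after_cycles(10, n, "exponential"))
```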
As used herein, “amplification” of DNA or RNA in a sample material refers to the use of PCR methods such as above, to make copies of the source DNA or RNA.
As used herein, “pathogen” refers to any disease-causing agent that has a genome, such as parasites, fungi, viruses, or bacteria, or other single or multicellular organisms that cause disease.
As used herein, “strain” refers to the genetic variants within a species, i.e. members of the same species that have genomes that differ in sequence.
As used herein, “molecular electronics” refers to devices in which a single molecule or single molecular complex is integrated into an electronic circuit.
As used herein, a “molecular electronics sensor” is a device that transduces molecular interactions with target molecules in solution into electronic signals, using a single molecule or molecular complex integrated into an electrical circuit as the primary transduction mechanism.
As used herein, a “molecular complex” refers to a small number of molecules that are held together by chemical conjugation, bioconjugation, or covalent or non-covalent bonds, such that the assembly is expected to retain this configuration or affiliation during the process of assembling it onto nanoelectrodes, and during use of the resulting sensor in assays. Such small number of molecules may be just two, such as a DNA oligo probe bound to a bridge, but in other contexts may be in the range of 2-10, 10-100, or 100-1000.
As used herein, “nanoelectrodes” are conducting elements that define a nanometer scale gap, and have dimensions of nanometer scale height and width, and substantially longer length, which provide an electrical conducting connection into a circuit.
As used herein, “bridge” or “bridge molecule” refers to any type of molecular wire or conducting molecule that may be used to make a conducting connection across the gap between nanoelectrodes. Such molecules include biopolymers, double stranded DNA, peptide or protein alpha helices, graphene nanoribbons, pilin filaments or bacterial nanowires, other multichain proteins or conjugates of multiple single-chain proteins, antibodies, carbon nanotubes, or conducting polymers such as PEDOT. Such molecules may include attachment groups that provide for specific attachment to, and/or self-assembly to, the nanoelectrode contacts.
As used herein, “semiconductor chip” refers to an integrated circuit chip comprising semiconductor materials such as Silicon or Gallium, and fabricated with techniques from the semiconductor industry.
As used herein, “CMOS chip” refers to an integrated circuit chip, fabricated using CMOS process techniques from the semiconductor industry. CMOS is an acronym for Complementary Metal-Oxide Semiconductor, and refers to a specific manufacturing process for making integrated circuit chips of the type most produced for processors, DRAM memory, and digital imager devices. As used herein, “CMOS chip” also refers to a device fabricated at the foundries that make such chips in industry, but which may also be post-processed for purposes of the present disclosure, using processes for adding or exposing accessible nanoelectrodes, and suitably protecting such nanoelectrodes, for use in the molecular electronics sensors.
As used herein, the term “chip” used in isolation refers to a “semiconductor chip” or “CMOS chip”.
As used herein, the term “pixel” refers to a sensor and measurement circuit that is repeated throughout a regular rectangular array of such identical circuits on a chip. A pixel may in context refer to just the measurement circuit, which here is a form of current meter measuring circuit, or may also include the sensor transducer element or elements affiliated with the circuit, which here are the molecular electronic component, i.e. molecule attached to nanoelectrodes. For definiteness, the term “measurement pixel” as used herein refers to the measurement circuitry of the pixel, and the term “sensor pixel” refers to the pixel circuit affiliated with a given sensor element. The origins of this term come from image sensors, where such pixels contained light sensing elements and measurement circuitry, which captured an element of a picture, but in the present context, as used herein the term pixel is unrelated to light sensing or imaging, and the pixels disclosed herein are sensing chemical interactions, not light.
As used herein, the term “sensor” refers to the complex consisting of the nanoelectrodes, bridge and hybridization probe, which is the primary transducer of interactions of the hybridization probes to electrical signals. In contexts where it makes sense, sensor could also refer to this plus the supporting current measurement circuitry, including the pixel circuits. “Sensor pixel” refers to the pixel circuitry that provides measurements to a particular sensor.
As used herein, the term “signal group” or “signal enhancing group” refers to a chemical group that could be added to an oligo, and such that the presence of this group complexed into the probe-bridge complex, versus dissociation from this complex, produces a detectable signal. In particular, such a group may be displaced from the critical position by target probe binding, or may be brought into proximity as a label on the target strand.
As used herein, the term “secondary structure” refers to the physical conformation that a DNA strand takes in response to bonds it forms with itself or other molecules. In particular, this includes the structures that form from hybridization between portions of a DNA molecule, or between two DNA molecules. This also includes structure that may result from the DNA strand interacting with the bridge. Secondary structure can be induced by hybridization binding, and other forms of binding.
In various aspects of this disclosure, a DNA hybridization probe, which is a short piece of single stranded DNA, is attached by various means of conjugation to a bridge molecule that itself spans between two nano-electrodes and is suitably attached to each on either end, by some means of conjugation or binding. This configuration is further established within an aqueous solution. One preferred embodiment of this configuration is illustrated in
It is an object of the invention that this disclosed composition can function as a molecular electronics sensor for hybridization events. When such a molecular electronic sensor is exposed to a solution that contains single stranded DNA material, if the hybridization probe encounters and binds by hybridization to its intended exact match target, a distinguishable signal is produced in the measured current. This is illustrated in
In some preferred embodiments, the nature of the detectable signal produced by the presence of the target is a series of spikes in the current, which correspond to target DNA binding to the probe and then coming off of the probe. This is in general an expected behavior, as the hybridization binding is reversible. The rate of binding “on” is influenced by the concentration of the target DNA in solution, as well as by the composition of the buffer, pH, temperature, etc., and the rate of coming “off” for a bound target likewise depends on temperature and such properties of the solution.
In some preferred embodiments, the properties of the observed spikes, such as the length of time from exposure to first observation of an exact-match step up, the time between pulses, or the ratio of time on to time off, or other properties of the on/off rates, are relatable to the concentration of the target, and therefore provide a measure of concentration of the target of interest. Thus, by analyzing the data to extract and compute such measures, this provides a molecular electronic single molecule hybridization sensor that can detect concentration of a target DNA of interest in a sample, including samples with a background that may be a complex pool of off-target fragments.
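By way of non-limiting illustration, the extraction of such measures from a sampled current trace may be sketched as follows. The function name `dwell_statistics`, the fixed-threshold state assignment, and the convention that the bound state corresponds to current above the threshold are assumptions made here for illustration only; a practical system may use a different signal polarity, filtering, or a more robust state detection method, and relating the resulting measures to concentration would require calibration not shown here.

```python
# Illustrative sketch only; names, the fixed threshold and the signal polarity
# (bound state = current above threshold) are assumptions for illustration.

def dwell_statistics(current, sample_interval_s: float, threshold: float):
    """Summarize on/off behavior of a uniformly sampled current trace."""
    if not current:
        raise ValueError("current trace is empty")
    # Assign each sample to the bound ("on") or unbound ("off") state.
    states = [value > threshold for value in current]

    # Run-length encode consecutive samples that share the same state.
    runs = []
    for state in states:
        if runs and runs[-1][0] == state:
            runs[-1][1] += 1
        else:
            runs.append([state, 1])

    on_times = [n * sample_interval_s for state, n in runs if state]
    # Off times include any leading and trailing unbound periods.
    off_times = [n * sample_interval_s for state, n in runs if not state]
    first_on = states.index(True) if True in states else None

    def mean(values):
        return sum(values) / len(values) if values else None

    return {
        "num_events": len(on_times),
        "time_to_first_event_s":
            None if first_on is None else first_on * sample_interval_s,
        "mean_on_s": mean(on_times),
        "mean_off_s": mean(off_times),
        "on_fraction": sum(states) / len(states),
    }


if __name__ == "__main__":
    trace = [1, 1, 5, 5, 5, 1, 1, 1, 5, 1]  # arbitrary units
    print(dwell_statistics(trace, sample_interval_s=0.1, threshold=3))
```

In this sketch, the time to the first event, the mean off time and the on fraction correspond respectively to the time from exposure to first observation, the time between pulses, and the ratio of time on to total time described above.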
In some preferred embodiments, a perfect match hybridization between the probe and target DNA will produce a detectable signal, and a single-base mismatch in an off-target DNA relative to the probe DNA in the sensor will produce a distinguishably altered signal, and further levels of mismatches will produce even more distinguishably diminished signals, or little or no detectable signal. In this way, the sensor signal can be used to distinguish targets that are a perfect match to the hybridization probe DNA from other fragments that have even a single base mismatch. This provides sufficient sensitivity and selectivity to perform the genetic analysis applications of genotyping and strain determination, which often require the ability to discriminate DNA targets that differ by as little as a single base mismatch, or otherwise may differ by just a few bases of mismatch, or by insertions or deletions of one or a few bases.
In particular, the identification of Single Nucleotide Polymorphism (SNP) genotypes becomes possible, as these require single base discrimination among different DNA segments. For this, we disclose a method for SNP genotyping, where in the organism or biosample of interest, two or more sequence variants may be present, differing between any two by as little as one base substitution, or one base insertion or deletion, and specific hybridization probes are made for each such sequence, and put into molecular electronic sensors of the type disclosed. From a primary biosample of interest, DNA or RNA is suitably purified, using any of various means well known to those skilled in molecular biology, such as using an extraction column or phenol-chloroform extraction, and the purified sample is applied to these sensors, either separately, in different reaction volumes, or within one reaction volume applied to a device containing all such sensors. Such a device could be a CMOS chip with all such sensors present on the chip. By monitoring the signal results from each sensor, it can then be determined which, if any, of the variant targets are present in the sample, and this information can be used to determine a genotype for subsequent interpretation, or to identify the presence of one or more specific strains of a pathogen. In preferred embodiments of this method, this could be determining the strain of a parasite, fungus, bacterium or virus.
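By way of non-limiting illustration, the final step of the method above, in which per-sensor signal results are combined into a determination of which variants are present, may be sketched as follows. The function names, the use of a per-sensor event rate as the summary signal, the single fixed threshold, and the diploid two-allele genotype format are all assumptions made here for illustration only.

```python
# Illustrative sketch only; names, the event-rate summary and the fixed
# threshold are assumptions, not the disclosed signal processing.

def variants_present(event_rates, min_rate: float):
    """Return the sorted variant labels whose sensor signal meets the threshold.

    event_rates: dict of variant label -> summary signal from the sensor(s)
                 carrying the probe for that variant (e.g. events per second)
    """
    return sorted(v for v, rate in event_rates.items() if rate >= min_rate)


def call_biallelic_genotype(event_rates, min_rate: float) -> str:
    """Combine variant presence calls into a diploid genotype string."""
    present = variants_present(event_rates, min_rate)
    if not present:
        return "no call"
    if len(present) == 1:
        return f"{present[0]}/{present[0]}"   # only one allele detected
    if len(present) == 2:
        return f"{present[0]}/{present[1]}"   # both alleles detected
    return "ambiguous"                        # more than two variants detected


if __name__ == "__main__":
    print(call_biallelic_genotype({"A": 5.0, "G": 0.1}, min_rate=1.0))
    print(call_biallelic_genotype({"A": 5.0, "G": 4.2}, min_rate=1.0))
```

For pathogen strain identification, `variants_present` alone may be the more relevant output in this sketch, since a sample may contain any number of strains.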
Referring again to the general hybridization sensor disclosed, and as illustrated in
In preferred embodiments, the DNA hybridization probe may be between 4 and 200 bases in length, preferably between 10 and 100 bases in length, and more preferably 12 to 60 bases in length. In applications requiring single base discrimination, such as SNP genotyping or pathogen strain determination, the probe length is preferably 10 to 35 bases, or more preferably 15 to 25 bases. The probe may also comprise other nucleic acids or nucleic acid analogs, such as RNA, PNA or LNA, which may provide stronger binding or greater specificity of binding. Such probes can have reduced length. The probe DNA may also comprise a fluorescent group, such as a FAM dye molecule on the 5′ end, and such groups can be used for quality control or characterization in the synthesis and purification of the probe-bridge conjugates, or in the characterization of the assembled sensors, such as in optically assessing whether a nanoelectrode has received a probe-bridge complex.
In preferred embodiments, the molecular bridge illustrated in
In preferred embodiments, the application could be gene expression analysis of any cellular samples, and in general could be any application where methods such as DNA microarrays have been used for gene expression. In preferred embodiments, this would include gene expression applied to tumor tissue as may be used in cancer diagnostics.
In preferred embodiments, the application could be SNP genotyping in human, animal, or other cellular samples, and in general any application where methods such as DNA microarrays have been used for genotyping. In preferred embodiments, this would be applied to SNP genotyping in humans.
In preferred embodiments, the application is massively multiplexed hybridization probe detection and/or concentration measurement of targets in a complex pool. The level of multiplexing could be up to 100 different probes, 1000 probes, 10,000 probes, 100,000 probes, 1 million probes, 10 million probes, 100 million probes, or 1 billion probes. This provides an alternative to DNA microarrays for such high levels of multiplex detection, with the advantages of an all-electronic, chip-based system, single molecule sensitivity, speed, low-cost consumables and instrument, and compact mobile, portable, or point-of-use instruments.
In preferred embodiments, the application could be species identification, in particular determining what species a given tissue sample is taken from, or identifying which pathogens, such as bacteria or viruses, are present in a given sample. In preferred embodiments, this could be testing of environmental samples for the presence of a given virus, such as COVID-19. In a preferred embodiment, the sample could be a tissue, or fluid, from any of the common vectors for viral transmission, such as bats, birds, rodents or mosquitoes. In preferred embodiments, the sample could be material filtered from air or water, or material swabbed from a surface. In preferred embodiments, the sample could be a biosample taken from a human or animal subject, such as saliva, mucus, buccal swab, blood, sweat, urine, stool, or exhaled air.
In preferred embodiments, the application could be strain identification, in particular determining what strain of a pathogen, such as a bacterium or virus, is present in a given sample. In preferred embodiments, this could be testing of environmental samples for the presence of a given strain of a virus, such as COVID-19. The samples for this could be the same as for the previous species identification application.
In preferred embodiments, these molecular electronics hybridization sensors are deployed on integrated circuit semiconductor chip devices, where such chips include the circuitry to supply voltages to the sensors, measure currents in the sensors, and transfer such data off-chip, and to control such operations. In preferred embodiments, and as illustrated in
These circuit blocks indicated in the schematic, as well as the blocks of the pixel, or the pixel circuit itself, can be fulfilled by many possible detailed circuit designs and IC layouts, well-known to those skilled in the art of VLSI Integrated Circuit Design, Digital Circuit Design, Mixed-Signal Circuit Design, and Analog Circuit Design. The architectures and schematics shown in
In preferred embodiments, the molecular electronic sensor is deployed onto a CMOS chip device, which is a specific form of semiconductor chip and chip manufacturing process. The advantage of using CMOS chips is the very large manufacturing base for such chips, and related supply chains, as well as the aggressive scaling roadmap for such devices. The majority of chips presently made are of the CMOS type, including the common processors, memory, and digital imaging chips used in commercial products. Another advantage is that aggressive scaling has shrunk the features on such chips down to near the 1 nm scale, so that such processes are in principle capable of producing the nano-electrodes needed for the presently disclosed sensor, thereby enhancing the manufacturability of the devices disclosed herein.
In preferred embodiments, such a chip operates synchronously, by each pixel acquiring a single current measurement value, and then the array of such values are transferred off chip as a “frame” of digital data, in a row-by-row fashion as indicated in
In such sensor array chips, in some preferred embodiments, as illustrated in
In preferred embodiments, the chip pixel array architecture is such that nearby pixels in the array share a common staging area, where the many nanoelectrode pairs for these neighbors are all located, and suitably electrically routed back to these adjacent pixel circuits. This applies to both cases where each pixel has one sensor affiliated with it, or cases of M multiple sensors per pixel. The use of such staging areas further improves the efficiency of circuit layout, and allows the staging area to have a larger opening, and better accessibility for nanofabrication or molecular assembly purposes or to facilitate wetting by the solution.
For arrays of sensors on chip, in one preferred embodiment multiple probes with the same target are represented on all or part of the array. In preferred embodiments, these multiple measures of the same target can be aggregated or averaged together to produce a more robust detection of presence/absence of the target, or to provide more sensitive detection or lower detection limits, or to provide a measure of concentration. For example, in preferred embodiments, if N sensors have the same hybridization probe and target, and a fraction f of these register a detection event within a measurement time T of exposure to the sample, then detection becomes more robust or sensitive or accurate if it is declared only when f exceeds a minimum threshold fmin, i.e. f>fmin. Or, in other preferred embodiments, the ratio f/T, which is the rate of detection, provides a measure of the concentration of the target in the sample. In other preferred embodiments, more detailed analysis of the f(t) curve acquired during the time interval [0,T] could provide various robust fits to the slope of this curve, or this curve could be fit to characteristic profiles or measured calibration curves produced by known reference concentrations, to provide a measure of concentration from these multiple probe measurements on the chip array. Such aggregation measures also typically provide a related estimate of confidence or measurement uncertainty, such as a suitable mean and standard deviation. If the individual probes otherwise directly provide concentration measures, these can also be averaged together, by various well-known means of averaging measurements, to produce a more accurate estimate of concentration, as well as error bars or confidence intervals on the measurement, based on the spreads or standard deviations observed in the set of individual measurements.
This provides benefits of greater accuracy and measurement confidence for the concentration estimate for the target of interest.
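The replicate aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the boolean encoding of per-sensor detection events, and the example numbers are hypothetical and are not part of the disclosure.

```python
from statistics import mean, stdev

def detection_fraction(hits):
    """Fraction f of the N replicate sensors that registered a detection event.

    `hits` is an assumed encoding: one boolean per replicate sensor.
    """
    return sum(hits) / len(hits)

def target_detected(hits, f_min):
    """Declare the target present only if f exceeds the threshold fmin."""
    return detection_fraction(hits) > f_min

def detection_rate(hits, exposure_time):
    """Rate of detection f/T, usable as a relative measure of concentration."""
    return detection_fraction(hits) / exposure_time

def pooled_estimate(per_sensor_estimates):
    """Mean and sample standard deviation of per-sensor concentration estimates."""
    return mean(per_sensor_estimates), stdev(per_sensor_estimates)

# Illustrative data: N = 8 replicate sensors, 6 of which saw a binding event.
hits = [True, False, True, True, False, True, True, True]
f = detection_fraction(hits)
detected = target_detected(hits, f_min=0.5)
rate = detection_rate(hits, exposure_time=30.0)
```

Converting f/T or a fitted f(t) slope into an absolute concentration would additionally require calibration against known reference concentrations, which this sketch does not attempt.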
For arrays of sensors on chip, in another preferred embodiment different hybridization probes with different targets are represented on some or all of the array. In preferred embodiments, these multiple measures of different targets provide the ability for multiplex or in-parallel measurements of the set of targets of interest. This has the advantage of lower cost of testing the targets, faster testing of the targets, simpler testing of the targets, or the use of less sample material or fewer reagents to test for all the targets, versus separately testing for such targets on separate devices. This is generally referred to as multiplex testing or parallel testing, and is widely appreciated as a potential benefit to testing systems.
In preferred embodiments of sensor array devices, both forms of multiple probes will be present, i.e. for the given set of hybridization probes with the respective targets of interest, each specific type of hybridization probe will be represented by multiple, replicated sensors on the array, providing the benefits of redundant, replicate measurement described above, and the multiplex probes for the multiple targets will be represented on the array, to also provide the benefits of multiplexing. The resulting compound benefit is multiplex testing, with confident measurements for each target that have the benefits of statistical replication for accuracy and confidence interval estimation. For this purpose, it is a benefit in preferred embodiments to allow for very large chip-based arrays of probes, in preferred embodiments up to 100 probes, up to 1000 probes, up to 10,000 probes, up to 100,000 probes, up to 1 million probes, up to 10 million probes, up to 100 million probes, up to 1 billion probes, or up to 10 billion probes.
Multiplex Probe Maps and Decoding Methods
In such preferred embodiments of sensor arrays in which multiple distinct hybridization probe sensors are deployed on one chip for the purpose of multiplex testing, there is a map provided that specifies what probe type is present at each different pixel location or sensor location (if multiple sensors are affiliated with each pixel). This allows the measured sensor data read out from the pixel array to be related to which probe target was being assessed at each sensor. Such a map may be produced by various techniques, based on how the probe molecules are prepared and applied to assemble into the array. This map is referred to as the probe map for the sensor array or pixel array.
In one preferred embodiment for establishing this map, the pixels are exposed in a spatially controlled manner, during sensor assembly, to the different solutions containing the different probe types (or, instead of just the probe molecule, in preferred embodiments it is the pre-formed probe-bridge molecular complex that is assembled into the sensors), so that the probe map is known from which solutions were applied to which pixels. In preferred embodiments, such spatial control can be achieved by mechanically applying solution only to certain regions of the chip, instead of applying solution to the entire chip pixel array. In other preferred embodiments, this can be achieved by applying a probe assembly solution to the entire chip array, but using a voltage-driven assembly process such that only electrically activated pixels will assemble probe molecules into the sensor nanoelectrodes. In preferred embodiments, this could be done by applying a voltage to electrodes that either attracts or repels the probes or probe-bridge complexes, such as using a positive voltage to attract the negative charge on DNA in solution, or a negative voltage to repel such DNA. This relies on the well-known process of electrophoresis. In other preferred embodiments, an AC voltage may be used to selectively attract or repel the probe or probe-bridge molecules, using the well-known process of dielectrophoretic forcing. In particular, in one preferred embodiment, a solution containing the particular probe type or probe-bridge type for a particular target is applied to the chip for a short period of time, and in a low concentration, such that diffusive transport alone is unlikely to deliver these molecules to bind to nanoelectrodes on the chip array.
However, at the desired nanoelectrode locations for these molecules, an AC voltage of proper frequency and amplitude is applied to create a dielectrophoretic force that drives the molecules to concentrate near the electrode gaps of the intended sites, so that the intended probes or probe-bridge complexes selectively bind there. The solution is then flushed away, and the next probe may be introduced and similarly directed to its own target sites. This may be done for individual probes, or for pools of distinct probe types, in which case the locations of the pooled probes are restricted to a much smaller set of possible sites, but the probe type from the pool is still randomly distributed across electrodes within those site sets, and further location information would be required to complete the map to the individual sensor level. In preferred embodiments, the low concentrations of the probes used may be in the range of 1 pM (pico-molar) to 100 nM (nano-molar), the exposure time used may be in the range of 0.1 s to 100 s, the amplitude of the voltage used may be in the range of 0.1 V to 10 V, and the frequency of AC modulation used may be in the range of 1 kHz to 100 MHz.
In other preferred embodiments, the probe map may be constructed by a process of decoding hybridization probe locations using the result of special binding reactions, with special detectable probes (not necessarily DNA hybridization probes) that are designed to be able to locate or localize each specific hybridization probe type on the array. In this approach, each distinct hybridization probe type is provided with one or more binding sites, directly coupled to or integrated into the probe molecule (or, in preferred embodiments, into the pre-formed probe-bridge molecular complex), where such sites are capable of producing an observable binding signature in response to binding with the corresponding decoding probe molecules, and with such a signature being localizable to the resolution of a specific probe site (nanoelectrode pair) or pixel, as required for the complete hybridization probe map. One or a series of these decoding signatures can thereby be affiliated to a specific probe site on the array, and these may be used to decode which particular type of hybridization probe is present at the site. In this method, the hybridization probes (or probe-bridge complexes) are applied as a pool to the pixel array, allowing them to randomly assemble into the nanoelectrodes on the pixels. After they are so assembled, in a series of subsequent reactions, known decoding probe molecules are applied to the array, and the array is observed for the production of their unique binding signatures, localized to sensor sites. For each site, there is at the end of this procedure a resulting series of signals localized to that site that are sufficient in combination to determine confidently which hybridization probe must be present at the site.
Such observation of a decoding signature in preferred embodiments may be done by monitoring electrical signals from the pixels and respective sensor sites, that are produced by the binding of the decoding probe. In preferred embodiments, these decoding probes are themselves DNA hybridization probes, whose targets are affiliated with the hybridization probes, and their hybridization events also can produce detectable signals in the sensor. The decoding scheme may require one or multiple such decoding hybridization targets affiliated with the hybridization probe. There are many ways such targets can be affiliated with a hybridization probe,
In other preferred embodiments, these signatures of the decoding probes may be optical signatures from dye molecules or fluorescent groups, such as Quantum Dots, on the decoding probes, and such signatures may be acquired by microscopic imaging of the chip, under white light or fluorescent light, and widefield imaging or confocal scanning conditions. This is sufficient to localize the optical signals to within approximately 1 micron, or one wavelength of the emitted light, of the location of the decoding probe itself, therefore providing spatial resolution in the range of approximately 0.5 microns to 1 micron. This is sufficient to localize the probes in preferred embodiments. If well-known super resolution imaging methods are employed, localization below the wavelength of light (the so-called diffraction limit) is possible, down to 100 nm or down to 20 nm. This is sufficient resolution to localize probes in many preferred embodiments using such optically labelled
One preferred embodiment of a decoding method to produce the probe map is “direct decoding”, in which individual decoding probes specific to the individual hybridization probes are used to directly locate the sites of each probe type on the array. In a preferred embodiment, the decoding probes are hybridization probes. Assuming there are N hybridization probe types H1, H2, . . . Hi, . . . HN, in this method there are provided N decoding hybridization probes D1, D2, . . . , Di, . . . , DN. These should have distinct sequence targets that have very low cross-hybridization between them. A target oligo for Di is physically affiliated to probe Hi, for i=1 . . . N, such as by any of the means represented in
Another family of preferred embodiments for decoding methods to produce the probe map is generally termed “combinatorial encoding and decoding”. In these embodiments, a series of decoding probe reactions are applied, and for each given probe site on the array, the series of detection/non-detection results from these reactions provides enough information in aggregate to uniquely determine the identity of the probe at the site. Several canonical exemplar embodiments of such combinatorial methods are given here. It is understood that there are many variations, reformulations, and combinations of these provided canonical exemplars that can be used as alternative decoding schemes for building the probe map, and which would also be obvious, from the canonical examples, to one skilled in the theory of codes. All such obvious variations, reformulations, and combinations are meant to be encompassed by these canonical exemplar embodiments.
The canonical combinatorial decoding embodiments provided may be described succinctly as follows. To this end, the assays to be performed, their order, and their outcomes are arranged and represented with 0/1 in a way that directly relates decoding probe assay results to probe identification codes. Assume there are N hybridization probe types H1, H2, . . . Hi, . . . HN for which a location map is desired. In various preferred embodiments of this method, there is provided a set of N distinct K-bit binary code strings {B1, B2, . . . Bi, . . . BN}, where these Bi are strings of length K composed of the symbols “0” and “1”, such as, for example, the string B=“1001011” in a case where the length is K=7. The code Bi is assigned to probe Hi, for i=1, . . . , N, and these codes will be used in the physical encoding and decoding process to identify this probe for the probe map on the array. Note that any set of N such strings will provide a valid encoding for the methods that follow, although special sets of such strings, as described in preferred embodiments below, can also provide the additional feature of error detection and correction in the decoding measurement process used in array assays. Also, note that as there are exactly 2^K distinct strings of length K, it is required, in order to have enough such binary codes, that 2^K≥N. Indeed, for any K satisfying this, preferred embodiments include the choice of any subset of N strings from the master set of 2^K possibilities, and if N=2^K, a preferred embodiment is simply to use all K-bit strings, listed in any order. Note that in these code assignments, if all the code strings {B1, B2, . . . Bi, . . . BN} have the same binary digit in position j (i.e. the jth digit is always 0, or always 1), this position is uninformative and can be eliminated from the strings, reducing their length K to K-1.
This can be repeated to remove all such uninformative positions in the strings, so as to reduce the number of physical encoding probes required in the methods below.
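The code assignment and the pruning of uninformative positions can be illustrated with a short Python sketch. The function names are hypothetical, and assigning codes in counting order is just one of the many valid choices the text allows (any N distinct K-bit strings would do).

```python
def min_code_length(n_probes):
    """Smallest K such that 2**K >= N (at least 1)."""
    return max(1, (n_probes - 1).bit_length())

def assign_codes(n_probes, k=None):
    """Assign a distinct K-bit string Bi to each probe Hi, i = 1..N.

    Codes are taken in counting order purely for illustration.
    """
    k = min_code_length(n_probes) if k is None else k
    if 2 ** k < n_probes:
        raise ValueError("need 2**K >= N distinct codes")
    return [format(i, f"0{k}b") for i in range(n_probes)]

def drop_uninformative_positions(codes):
    """Remove every position j whose digit is identical across all codes."""
    keep = [j for j in range(len(codes[0]))
            if len({code[j] for code in codes}) > 1]
    return ["".join(code[j] for j in keep) for code in codes]

codes = assign_codes(3, k=4)                   # deliberately longer than needed
compact = drop_uninformative_positions(codes)  # leading constant zeros removed
```

A single pass suffices in this sketch, because deleting a constant position cannot make any remaining position constant.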
In one family of preferred embodiments, there is further provided a set of 2K decoding probes that are hybridization probes denoted as Dij, where i=0 or 1, and j=1 . . . K. These decoding probes should have distinct target sequences, and preferably low potential for cross-hybridization. For the probe Hi, the associated physical encoding targets are taken to be the target DNA oligos of the encoding probes D(b1)1, D(b2)2, . . . , D(bK)K, where b1, b2, . . . , bK are the binary digits of the encoding string Bi, i.e. bj is the jth digit of string Bi. These encoding probe target DNA oligos are then to be physically linked or affiliated with the physical hybridization probe oligo, such as by the methods illustrated in
In another family of preferred embodiments, the situation is as in the above embodiments, but the physical encoding is done in a more compact form. For the probe Hi, the associated physical encoding targets are again taken from the target DNA oligos of the encoding probes D(b1)1, D(b2)2, . . . , D(bK)K, but the probe is physically tagged only with the D1x targets, i.e. it is not tagged with any of the D0x targets, and the decoding applies only the K reactions of the D1x probes, D11, D12, . . . , D1j, . . . , D1K. The results of these trial assays can be recorded as testj=1 if D1j binds at a probe site, and testj=0 if it does not bind. In this case, the result string (test1)(test2) . . . (testj) . . . (testK) is the same binary string as recovered in the previous embodiment, because there, if D1j did bind, testj=1, as in the present method, and if D1j did not bind, this is the same as D0j binding, which is recorded as 0 both there and in the present method. Thus, the same probe map decoding is achieved. It is a benefit of this embodiment that fewer physical target oligos need to be linked to each hybridization probe, and overall, the method requires only half as many physical encoding probes to be produced, and their associated targets to be produced and linked to probes.
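The compact scheme can be simulated as follows. This is a minimal sketch, assuming idealized error-free binding; the function names, site labels, and example codes are hypothetical and serve only to show how the K test results at a site reproduce the code Bi.

```python
def tag_positions(code):
    """1-based positions j where bj = 1; the probe carries the D1j target there."""
    return {j for j, bit in enumerate(code, start=1) if bit == "1"}

def read_site(tags_at_site, k):
    """Apply D11..D1K in turn; testj = 1 if D1j binds at the site, else 0."""
    return "".join("1" if j in tags_at_site else "0" for j in range(1, k + 1))

def build_probe_map(site_tags, codes, k):
    """Map each site to the probe whose code matches the observed test string."""
    code_to_probe = {code: probe for probe, code in codes.items()}
    return {site: code_to_probe.get(read_site(tags, k))
            for site, tags in site_tags.items()}

# Hypothetical 3-bit codes for three probe types.
codes = {"H1": "101", "H2": "011", "H3": "110"}
# Simulated random assembly: which probe landed at which sensor site.
assembled = {"site_a": "H2", "site_b": "H3", "site_c": "H1", "site_d": "H2"}
site_tags = {site: tag_positions(codes[p]) for site, p in assembled.items()}
probe_map = build_probe_map(site_tags, codes, k=3)
```

In this simulation the recovered `probe_map` equals the hidden assignment `assembled`, using K=3 reactions rather than the 2K=6 of the non-compact scheme.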
Another family of preferred embodiments of methods for making probe location maps may be described efficiently and succinctly as follows. Again, reaction procedures and outcomes are efficiently encoded by 0/1 indicators that allow direct interpretation of decoding assay results for an unknown probe as the binary code identifying the probe. This method relies on reacting pools of decoding probes, rather than individual probe reactions, within an otherwise similar logical framework. Assume there are hybridization probe types H1, . . . , HN, with assigned K-bit binary codes {B1, . . . , BN}. There are then further provided the same number N of decoding probes that are hybridization probes, denoted as D1, . . . , DN. These decoding probes should have distinct target sequences, and preferably low potential for cross-hybridization. The target of each Di is to be physically linked to the corresponding probe Hi, such as by the means illustrated in
In another family of preferred embodiments, the situation is as in the above embodiments, but the physical encoding is done in a more compact form as follows. Only the K pools P11, P12, . . . , P1j, . . . P1K are physically constructed, and these are reacted to the array in a series of K reactions. For each site on the array, the result is recorded as trialj=1 if hybridization was observed with pool P1j, and 0 otherwise. The resulting string (trial1)(trial2) . . . (trialK) that encodes this outcome is identical to the string in the above embodiments, and therefore this string provides the code string Bi that identifies the probe Hi. The results of reacting these K pools to the array, therefore, decode all occurrences of all probes on the array, and provide the required probe map. This requires half as many pool constructions and hybridizations as the previous embodiments.
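A minimal Python sketch of the compact pooled scheme follows. It assumes that pool P1j contains exactly those decoding probes Di whose code Bi has a 1 in position j; this pool composition is an assumption made for illustration, as are the function names and example codes.

```python
def build_pools(codes):
    """Return the K pools P11..P1K as sets of probe indices.

    Assumed composition: pool P1j holds Di for every i with bit j of Bi equal to 1.
    `codes` maps probe index i to its K-bit string Bi.
    """
    k = len(next(iter(codes.values())))
    return [{i for i, code in codes.items() if code[j] == "1"}
            for j in range(k)]

def decode_site(probe_at_site, pools):
    """trialj = 1 if pool P1j hybridizes at the site, i.e. contains its Di."""
    return "".join("1" if probe_at_site in pool else "0" for pool in pools)

# Hypothetical 2-bit codes for three probe types H1..H3.
codes = {1: "01", 2: "10", 3: "11"}
pools = build_pools(codes)           # K = 2 pooled reactions
observed = decode_site(3, pools)     # trial string seen at a site holding H3
```

Under this assumption each probe carries only the single target of its own Di, and the K pooled reactions at a site reproduce that probe's code string Bi.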
Preferred Embodiments for Error Detection and Correction in Probe Mapping
As noted, in the above embodiments of decoding methods, any set of N binary K-bit strings {B1, . . . , BN} provides an encoding and decoding method, and so a great number of possible methodologies are outlined above. Within this framework, the specification of specific code word sets for preferred embodiments can provide substantial benefits. For illustration of this point, note that in the combinatorial decoding schemes above, if the number of probes is N=2^K, each and every K-bit binary string is then necessarily assigned a probe, in 1-to-1 fashion. However, in this minimal code length K scenario, if an error were made in measuring the code of a probe in the above methods, it would produce the code of a different probe, since all codes are used, and thus result in incorrect decoding. Allowing a larger binary coding string length K than the minimum required allows for robustness against such errors. Specifically, it is possible that the set of binary codes {B1, B2, . . . , BN} is chosen as a set that allows for error correction or detection, such that if a code string from this set were corrupted by one or more bit flipping errors, it is possible to determine that such corruption has occurred, and with some encodings, also to correct it back to the uncorrupted, error-free state. This will provide protection against errors that could be made in the decoding measurement process outlined above, in the form of a false detection of hybridization (error of 0→1), or missing a true hybridization (error of 1→0), so that such errors do not lead to incorrect or indeterminate decoding of probe identity. Many such error correction or error detection encodings are known to those skilled in the art of error correction methods for binary data.
In some preferred embodiments, one such method is the use of binary strings that add one or more parity bits at the end of an initial given string, which provide power to detect or correct certain errors. Another preferred embodiment is the use of Hamming Codes and Hamming distance to detect and correct errors. In this class of methods, the assigned number N of code words must be only a small fraction of all possible binary codes of length K, and the precise code words are taken to have highly distinct bit sequences; for example, these could be N randomly selected code words from all 2^K >> N possibilities. In such a case, if there is a corrupted code, it may be detected because it does not match any of the assigned codes, and it can be corrected back to the closest of the allowed assigned code strings, with closeness measured by the Hamming Distance (number of mismatches between the digits of two binary strings). This general technique affords some power for error correction of at least a limited number of bit errors, and for any proposed set of code words, {B1, . . . , BN}, the error correction properties can be directly and exactly assessed by brute-force examination of all possible corrupted versions of each Bi, noting for which corruptions this process corrects them. Preferred embodiments of such methods are provided by specific Hamming Codes, which are string sets {B1, . . . BN} that have optimal or highly effective and uniform error correction by this means of correcting to the Hamming distance closest allowed code. In general, many other error correction encoding schemes are known to those skilled in the art of coding theory, and any of these schemes defined for K-bit strings can be used to provide K-mer code word sets that also have powerful error correction capabilities, and which can be used here to correct for possible decoding hybridization errors.
In general, this provides a mechanism with arbitrarily good power to correct errors, at the cost of larger K (and therefore more physical decoding probes and more decoding reactions).
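The nearest-code correction and the brute-force assessment of a code set can be sketched in Python as below. The four 5-bit code words are a hypothetical example chosen to have pairwise Hamming distance of at least 3; the function names are likewise illustrative only.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length binary strings differ."""
    return sum(x != y for x, y in zip(a, b))

def min_distance(codes):
    """Smallest pairwise Hamming distance d within the code set."""
    return min(hamming(a, b) for a, b in combinations(codes, 2))

def correct(observed, codes):
    """Return the unique closest assigned code, or None if the closest is tied."""
    ranked = sorted(codes, key=lambda c: hamming(observed, c))
    if len(ranked) > 1 and hamming(observed, ranked[0]) == hamming(observed, ranked[1]):
        return None  # ambiguous: error detected but not correctable
    return ranked[0]

def add_parity(code):
    """Append one even-parity bit, which allows detection of any single bit flip."""
    return code + str(code.count("1") % 2)

codes = ["00000", "01011", "10101", "11110"]  # hypothetical code set, K = 5, N = 4
d = min_distance(codes)
correctable = (d - 1) // 2                    # bit errors guaranteed correctable
fixed = correct("01111", codes)               # "01011" with its third bit flipped
```

Here N=4 codes use K=5 bits instead of the minimal K=2, which is the price paid in extra decoding probes and reactions for tolerating one erroneous hybridization call per site.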
In preferred embodiments, the decoding probes used in the above decoding methods, electrical or optical, are shorter oligos, such as in the range of 8-25 bases, and any two such targets have multiple mismatches between them to reduce cross-hybridization, preferably 2 or more, and more preferably 4 or more. In preferred embodiments, they may be PNA probes, so that a short probe can have stronger binding and higher Tm, and the impact of single mismatches on reducing cross-hybridization can be greater. In preferred embodiments, all of the methods disclosed above can be used with electronic detection of decoding probe hybridization provided by the sensor chip array, or, in other preferred embodiments, using optically labeled decoding probes (such as a dye label, a Quantum Dot label, a gold nanoparticle label, or any other label detectable by microscopy and compatible with attachment to a single molecule DNA oligo) and localization of probe binding by microscopic imaging.
In another preferred embodiment of the above decoding methodologies, the objective and benefit is to have a decoding method in which the number of decoding targets added to each probe is a number J that can be specified as desired, so as to control the amount of hybridization target added to the probes for decoding purposes. This can be achieved as follows, using the compact form of the first family of preferred methods above. The binary code strings {B1, . . . , BN} are defined as follows: for the set of numbers (1, . . . , L), for some L, represent a subset S of this set by the L-bit string (b1)(b2) . . . (bL), where bi=1 if i is in the subset S, and 0 if not, so that the code length here is K=L. This is sometimes called the indicator function for the subset. E.g., the subset (2,4) would have indicator string 0101000 . . . 0. There are 2^L such strings, corresponding to the membership indicator strings of all 2^L subsets of (1, . . . , L). In this setting, define as the codes the set of all strings that have exactly J 1's in them. The number of such strings is known in combinatorics as “L choose J”, and is L!/(J! (L−J)!), where “n!” denotes n factorial=n×(n−1)× . . . ×2×1. When this set of code strings is used in the specified “compact” forms of the methods above, this has the advantage that for the physical encoding, wherein a target is added for every 1 occurring in the encoding string Bi, there are always exactly J such 1's, and so exactly J hybridization targets are added to encode each hybridization probe. This therefore has the advantage of controlling the amount of target material added for decoding, to be J oligo targets. For any desired number of hybridization probes N to be encoded, and any desired J>1, L can be chosen large enough so that L!/(J! (L−J)!) is >=N, and therefore provides enough such codes. The cost of achieving this is that L encoding probes are required. For example, suppose there were N=1024 hybridization probe types.
One option would be to take all K=10-bit binary strings, and assign all of these as codes. However, in the above methods, each probe would get linked to either 10 targets (in the non-compact scheme), or a variable number of targets between 0 and 10 (in the compact scheme). The decoding would require 20 reactions in the full scheme, or 10 in the compact scheme. In contrast, when restricted to linking J=2 targets per probe, L=46 encoding probes and reactions are required, but allowing J=3 reduces this to L=20, and J=4 allows L=15. These are generally more desirable, such as requiring 15 probes and reactions, but only needing to add 4 decoding oligo targets to each probe. However, these do not provide any error correction capability, as a single bit error would produce a 3 element or 5 element subset indicator string, which does not have a unique Hamming distance closest string among the 4 element set indicators.
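The arithmetic in this example can be checked with a short Python sketch that finds the smallest L with “L choose J” at least N and enumerates the corresponding fixed-weight indicator strings. The function names are hypothetical, and taking the first N subsets in lexicographic order is an arbitrary illustrative choice.

```python
from itertools import combinations, islice
from math import comb

def min_probe_count(n_probes, j):
    """Smallest L such that L choose J >= N."""
    L = j
    while comb(L, j) < n_probes:
        L += 1
    return L

def indicator(subset, L):
    """L-bit indicator string of a subset of (1, ..., L), using 1-based members."""
    return "".join("1" if i in subset else "0" for i in range(1, L + 1))

def fixed_weight_codes(n_probes, j):
    """First N indicator strings having exactly J ones, with minimal length L."""
    L = min_probe_count(n_probes, j)
    subsets = islice(combinations(range(1, L + 1), j), n_probes)
    return [indicator(set(s), L) for s in subsets]

# Reproduce the N = 1024 example: L required for J = 2, 3, 4.
required = {j: min_probe_count(1024, j) for j in (2, 3, 4)}
codes = fixed_weight_codes(1024, 4)   # 15-bit strings, four 1's each
```

For N=1024 this gives L=46, 20 and 15 for J=2, 3 and 4 respectively, since 46 choose 2 = 1035, 20 choose 3 = 1140 and 15 choose 4 = 1365 are the first values to reach 1024.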
Chip-Based Systems
In preferred embodiments, the disclosed chip-based multiplex hybridization probe sensor devices are deployed in a compact, low cost electronic instrument that is suitable for distributed use, field use, or point-of-care use. Such instrument architectures in preferred embodiments comprise a chip board that mates to the chip, a motherboard that hosts the chip, an FPGA-based control and data transfer subsystem, a data processing subsystem, which may comprise CPUs, GPUs, FPGAs or other signal processing hardware, a fluidics subsystem, on-instrument data storage, and off-instrument data transfer systems.
In preferred embodiments, this chip device is deployed into a cartridge that in preferred embodiments also allows for some or all other liquid reagents or input sample required for operation to be on-cartridge, to allow for a partially or fully dry instrument platform. In preferred embodiments, this cartridge is run on a desktop instrument that provides for a user interface, a control computer controlling chip and system functions, control of any on-board fluidics or actuators that control on-cartridge fluidics to supply sample and reagents to the chip, transfer of data from the chip to internal storage or data processors, such as FPGA, GPU or CPU data processors, and transfer of data off instrument via direct internet or wireless connectivity to remote or cloud-based data centers. Such a system also provides a sample prep system, internally or as a companion instrument, that takes biosamples of interest and converts them to the form for on-chip application. In preferred embodiments, such a system can have a compact form factor suitable for mobile use. In other preferred embodiments, such a system can have a highly compact form factor suitable for point-of-use or point-of-care or in-the-field deployment. Such point-of-use applications in preferred embodiments would include testing stations deployed at airports, transportation hubs, hospitals, schools, stadiums, cruise ships, transport ships, or other major sites of congregation, or deployed at sites of business or commercial activity. In preferred embodiments, such testing stations would be deployed for use in the home, for personal testing and monitoring. In preferred embodiments, point-of-use systems may be deployed in the field for military, police, customs or border control point-of-contact testing, or other in-the-field testing and monitoring applications, such as testing of commercial vehicles, trains or aircraft for presence of pathogens.
Tag Reporter Assays
For many types of DNA target detection applications, it is possible to have the detection response encoded by the production of a short segment of synthetic single stranded DNA oligo, or a “DNA tag”, such that the detection of the target is represented by the presence of the tag, and the relative abundance of the target may further be represented by the relative abundance of the tag. Thus, the DNA tag becomes a reporter that can be used to read out the assay results. This can be generalized beyond the detection of DNA targets: a broad array of other molecular detection applications can be formulated in assays that represent the detection results via a reporter DNA tag. There are multiple benefits to such tag reporter assays, including: (1) the assay results can be read out by hybridization sensors that have the tag as a target, thereby allowing a universal, simple and convenient readout sensor for many diverse molecular detection assays; (2) tags can be designed for optimal and predictable hybridization performance with the sensor, unlike trying to make hybridization probes constrained by native DNA target sequences of interest in the primary detection assay, which are more variable in properties when used as hybridization probes; (3) tag reporters fundamentally allow for a very high degree of multiplexing, since the number of distinguishable DNA tag types is essentially unlimited, and hybridization sensor arrays provide for multiplex readout, unlike other common forms of reporters, such as dye labels, which are severely limited in distinguishable colors or dye spectra or the ability to group multiple dye molecules into a label, and which require increasingly complex optical systems for multiplex readout; (4) tag reporters allow the primary detection reaction to take place under standard and optimal solution phase conditions, instead of performing primary detection under the less ideal conditions that may exist near the molecular electronics sensor (proximity of surfaces, other sensor molecular components, and varying voltages and charge transfer, etc.); (5) DNA tags can be easily amplified by PCR reactions, so there is a simple and convenient way to amplify the reporter signal as needed for more sensitive detection, and this provides sensitivity down to the single target molecule detection limit.
Examples of preferred embodiments of tag reporter assays are illustrated in
This circularizing ligation can be highly specific to the probe end segments exactly matching the target strand of interest. As shown in
Considering the processes illustrated in
The tag reporter assays illustrated in
Multiplex Tag Reporter Assays, DNA Tag Sets, and Tag Array Chips
The DNA tag reporter assays illustrated above generally can be extended to multiplex versions of each assay, simply by using different reporter tags to report out the different analyte detections. This easily scalable multiplexing is a major advantage of using DNA tag reporters. In preferred embodiments, the resulting different tags generated by the assay are detected by corresponding hybridization sensors in a pooled hybridization reaction. Therefore, in such preferred embodiments, there is a corresponding sensor array chip whose hybridization probes comprise the complements of all the tags that could be generated in the multiplex assay. This preferred embodiment of a multiplex tag reporter assay is illustrated in
In performing this general multiplex method, in preferred embodiments the tags generated from the multiplex assay may be purified before hybridization, to remove unwanted components of the primary reactions, or to exchange into a different buffer for the hybridization reaction. In addition, in preferred embodiments, if the tags contain any common sequence segments that would promote cross-hybridization (such as illustrated in the A:B tags of
The tag sensor array is a hybridization sensor array of the general type disclosed above, produced by the means described above. The sensor for a given single stranded tag DNA comprises the single stranded complementary sequence for the tag, and is a hybridization sensor as disclosed above. The hybridization probes represented on the array comprise the complements of all N tags under consideration. In preferred embodiments, the expected number of sensors for each tag would be in the range of 1-10, or in the range of 10-100, or in the range of 100-1000, or in the range of 1000-10,000. In preferred embodiments, such higher numbers may confer more sensitivity or greater dynamic range of measurement. In other preferred embodiments, the sensor array may be divided using fluidics chambers into subarrays which represent subsets of the N tag set. In preferred embodiments, there may be 2, up to 4, up to 8, up to 16, up to 32, or up to 64, or more individually fluidically addressable subarrays, each of which contains probes for one or more tags from the N tag set.
In preferred embodiments, the N tag probes are assembled randomly into the array, and which of the tag probes is present at each pixel is decoded by the means disclosed above. In preferred embodiments, the tags themselves are used to do this decoding. In one preferred embodiment, they are serially and individually hybridized to the array, over the course of N serial hybridizations, to directly identify which pixel probes for which tag. In other preferred embodiments, the tags are used to perform combinatoric decoding and mapping of the array, by hybridizing a series of tag pools to the array. In preferred embodiments, each pool contains approximately half of the N tags, and such pools are constructed so that approximately M=Log2[N] such pooled reactions are sufficient to simultaneously decode the sensor probe identities, as disclosed in more detail previously. This pooled decoding approach is preferable to the serial decoding when the number of tags exceeds small numbers, such as N>8, N>12, or N>16.
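By way of non-limiting illustration, the pooled combinatoric decoding described above can be sketched in Python as follows. The function names are hypothetical, and the sketch makes two simplifying assumptions that are not taken from the disclosure: sensor responses are ideal (a pixel responds in a pooled reaction exactly when its tag is in the pool), and tags are numbered 1 to N so that every tag appears in at least one pool, which means that the number of pools is the number of bits needed to write N, rather than exactly Log2[N].

```python
import random


def build_pools(n_tags):
    """Pool m holds every tag whose number (1..n_tags) has binary bit m set."""
    n_pools = n_tags.bit_length()  # bits needed to write n_tags in binary
    return [
        {tag for tag in range(1, n_tags + 1) if (tag >> m) & 1}
        for m in range(n_pools)
    ]


def decode_pixels(responses):
    """responses[m][p] is True if pixel p responded in pooled reaction m.

    Returns the decoded tag number per pixel (0 means no response in any pool).
    """
    n_pixels = len(responses[0])
    return [
        sum(1 << m for m, pool_response in enumerate(responses) if pool_response[p])
        for p in range(n_pixels)
    ]


if __name__ == "__main__":
    n_tags = 16
    rng = random.Random(1)
    # Random assembly: each pixel carries the probe for one unknown tag.
    pixel_tags = [rng.randint(1, n_tags) for _ in range(12)]
    pools = build_pools(n_tags)
    # Idealized readout of each pooled hybridization reaction.
    responses = [[tag in pool for tag in pixel_tags] for pool in pools]
    print(len(pools), "pooled reactions decode", len(pixel_tags), "pixels")
    print(decode_pixels(responses) == pixel_tags)
```

In this sketch, each pool contains roughly half of the tags, and the pattern of responses of a pixel across the pooled reactions spells out the number of its tag in binary.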
As noted, the single stranded tag sequences in a tag reporter assay may in principle be any distinct sequences that are compatible with the details of the assay, such as for example any sequences in the length range of approximately 10-100 bases that is compatible with PCR amplification, in the case of the assay of
The concept of such an optimized DNA tag set is illustrated in more detail
In preferred embodiments of a tag set, the candidate tags for a tag set are designed theoretically to have desirable hybridization properties, using theoretical models for modelling hybridization interactions. In further preferred embodiments, such designed candidate tags and their complements are synthesized into physical DNA oligos, the corresponding tag complement arrays are produced, and then the tag set is experimentally screened under the desired candidate reaction conditions with the tag array chips, using an experimental design that can assess the different tag sensor performance properties of interest, which may include high sensitivity, high specificity, high signal to noise ratio, large dynamic range, linear response to concentration, low levels of undesirable cross hybridization or off target hybridization sensor responses, repeatable and consistent performance or low coefficient of variability across replicate experiments, and robust performance across the range of hybridization reaction conditions of interest. Based on such empirical screening, a subset of tags is selected that have good verified performance across all of synthesis, array fabrication, and hybridization sensor performance for multiplex tag assays.
In this way, tag sets that are well-designed and empirically verified for performance can be produced. In preferred embodiments, the probes for such a designed and verified tag set are synthesized in bulk with thorough application of well-known DNA synthesis quality controls, standard array chips are produced with probes for the tag set, and the resulting tag designs and tag arrays are used for diverse applications and classes of tag reporter assays.
In preferred embodiments, the experimental design for empirical selection of a tag set may include hybridizing pools of tags that represent different subsets of the tags, different tag concentrations, and different hybridization reaction parameters, such as reaction temperature, buffer composition, and duration or washing conditions. Buffer composition may in particular include the salt ion concentration (such as Na+, or K+) and the divalent cation concentration (such as Mg++), which have a strong effect on the melting point of hybridized DNA. In preferred embodiments, the tag set and hybridization reaction are co-optimized in this screening process, such that the outcome is an optimally performing tag set with a corresponding optimal hybridization reaction. In preferred embodiments, the screening will in particular identify tags that cross hybridize to off target sensors (i.e. those that do not have the tag complement), and will also identify tags that do not hybridize well to their target sensor. In preferred embodiments, the design will also identify the ideal or preferred reaction temperature, salt concentration, and divalent cation concentration, as well as the concentration of any well-known and common additives to hybridization reactions that may be used to control or modify hybridization stringency, alter melting point temperatures, or otherwise block unwanted interactions, such as betaine, DMSO, glycerol, formamide, SDS, Tween-20, Triton-X, BSA, or blocking DNA, as well as other such well known and widely used additives.
In one preferred embodiment of the tag screening experimental design, each tag is individually hybridized to the array, at a representative range of concentrations, and under candidate reaction conditions, to directly check that it does not produce off target response, and that its on-target sensor has good response. Poor performing tags are then eliminated from the final set. This is not directly feasible when there are a large number of tags, such as N>100. In such a case, small subsets of tags could be similarly screened in pools, such as n tags at a time, n«N, e.g. n=10. Tags with bad specific sensor performance are so identified for elimination, while if the pool shows off-target performance, each tag in that pool can be individually tested to determine which is the source of the off-target behavior, and in so doing identify the tags to be eliminated for off-target effects. More generally, even more efficient experimental designs can be used, which test for multiple effects within each pooled reaction. Many such informative experimental designs that test multiple factors within each pool would be obvious to those with expertise in DNA hybridization reactions and experimental design. The design of such experiments is also an area that is well known to those skilled in the statistical design of experiments (DOE), and such experimental designs can be produced with the help of well-known and widely used DOE analysis software, such as the JMP software package from the SAS corporation.
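By way of non-limiting illustration, the pooled screening strategy described above, in which a pool is tested first and its tags are individually tested only if the pool shows off-target response, can be sketched in Python as follows. The function screen_in_pools and the has_off_target callable are hypothetical; the callable stands in for a physical hybridization experiment, and the sketch assumes that a pool shows off-target response exactly when at least one of its tags does so individually.

```python
def screen_in_pools(tags, pool_size, has_off_target):
    """Return (tags flagged for off-target response, number of reactions run).

    has_off_target(pool) stands in for one hybridization experiment and
    returns True if any off-target sensor responds to that pool of tags.
    """
    flagged = []
    reactions = 0
    for start in range(0, len(tags), pool_size):
        pool = tags[start:start + pool_size]
        reactions += 1  # one pooled hybridization reaction
        if has_off_target(pool):
            # Follow up by testing each tag of the offending pool on its own.
            for tag in pool:
                reactions += 1
                if has_off_target([tag]):
                    flagged.append(tag)
    return flagged, reactions


if __name__ == "__main__":
    tags = ["tag%d" % i for i in range(100)]
    bad = {"tag7", "tag42"}  # simulated off-target tags, for illustration only
    flagged, reactions = screen_in_pools(
        tags, 10, lambda pool: any(t in bad for t in pool)
    )
    print(flagged, reactions)
```

In this simulated case, with N=100 tags, pools of n=10, and two off-target tags in different pools, 30 reactions are used rather than the 100 needed for individual screening of every tag.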
In preferred embodiments, the candidate DNA tag sequences for a tag set are designed to promote good performance as hybridization tags in tag reporter assays. Such preferred sequence design features may include sequences that promote efficient DNA synthesis, and sequences that will have similar hybridization responses, within some particular hybridization assay condition, in terms of uniformity across all tags of: sensitivity of sensor response, specificity of sensor response, dynamic range of sensor response, linearity of sensor response to concentration of the target tag, and absence of unwanted hybridization interactions that tend to impair such performance, such as tags that have internal secondary structure (i.e. a single tag molecule folding onto itself), the tag complement having internal secondary structure (both of which would interfere with the desired hybridization of the tag target with the tag probe), unwanted hybridization of tags with the incorrect tag probe (off target binding to sensors), and unwanted tag vs tag solution phase hybridizations (which could reduce the amount of such tags available to bind with the respective tag sensors). In preferred embodiments, such sequence design specifications include matching the melting point, Tm, for all the tag-tag complement duplexes, relative to a given hybridization reaction condition (buffer composition, reaction temperature, tag concentration), to be within a narrow range around a target melting point, and also include ensuring that the melting points for tag self-interactions are much lower than Tm, and that the melting points for cross-hybridizations of tags with off-target tag probes are much lower than Tm.
In preferred embodiments, such sequence designs will target the Tm to all be in an interval of [Tm1, Tm2], where in various preferred embodiments, the target tag melting point Tm1 may be near 90° C., near 80° C., near 70° C., near 60° C., near 50° C., near 40° C., near 30° C., near 20° C., or near 10° C., and the range of tag melting points for the set, Tm2−Tm1, is preferably less than 5° C., or preferably less than 2° C., or preferably less than 1° C. In preferred embodiments, sequences of the tag set are designed so that the melting points of all unwanted reactions will be much lower than the desired Tm range, by some threshold difference Delta. In particular, all tag self-interactions (and tag complement self-interactions) in the set will have melting points below Tm1−Deltaself, and all cross-hybridization interactions of tag i vs the tag j probe, i≠j, (as well as tag i vs tag j interactions) will have melting point temperatures below Tm1−Deltacross. In preferred embodiments, these Delta temperature separations between the desired perfect match hybridization and unwanted hybridizations will be greater than 10° C., greater than 20° C., or greater than 30° C., or greater than 40° C. In preferred embodiments, the tag sequences in a tag set may all be of the same length; this may promote more convenient and efficient DNA synthesis, as well as promote uniformity of sensor performance, by reducing physical variability between sensors (such as, for example, the amount of molecular charge on the sensor).
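By way of non-limiting illustration, the melting point design criteria described above can be expressed as a simple check, sketched here in Python. The function name and its default thresholds are hypothetical, and the melting points passed in are assumed to have been obtained beforehand from a thermodynamic prediction tool or from measurement; the sketch does not compute them.

```python
def meets_melting_criteria(tm_perfect, tm_self, tm_cross,
                           max_spread=2.0, delta_self=20.0, delta_cross=20.0):
    """Check a tag set against the Tm window and Delta separation criteria.

    tm_perfect: perfect match duplex melting points, one per tag (deg C)
    tm_self:    melting points of all self-interactions (deg C)
    tm_cross:   melting points of all cross-hybridizations (deg C)
    """
    tm1, tm2 = min(tm_perfect), max(tm_perfect)
    if tm2 - tm1 >= max_spread:
        return False  # perfect match Tm values are not tightly matched
    if any(t >= tm1 - delta_self for t in tm_self):
        return False  # some self-interaction is too stable
    return all(t < tm1 - delta_cross for t in tm_cross)


if __name__ == "__main__":
    # Illustrative values only, not data from the tables of this disclosure.
    print(meets_melting_criteria([49.5, 50.5], [19.9], [20.4],
                                 max_spread=2.0, delta_self=29.0, delta_cross=29.0))
```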
In preferred embodiments, such tag sets may have up to 10 tags, up to 100 tags, up to 1000 tags, up to 10,000 tags, up to 1 million tags, or up to 10 million tags, or up to 100 million tags. For example, over 10 million SNP variants have been identified in the human genome, so a tag set with over 10 million tags would be needed to assay for all of these in a single assay. Of these, nearly 1,000,000 SNPs with clinical associations have been found, so a tag set of 1,000,000 tags would be appropriate for assaying for all of those. Over 10,000 SNPs with strong clinical indications have been found, so a tag set on this scale would be required to assay for all of those, and over 600 SNPs have clinical actions associated with them, so a tag set on the scale of 1000 tags would be required to assay for them. For another example, over 14 common sequence variant strains have been identified for the SARS-CoV-2 virus, therefore a tag set with up to 100 tags would be appropriate to assay for these strains.
The tables below show specific embodiments of tag set designs, for 16-32 tags. These are intended to illustrate various of the sequence design criteria disclosed above, and one process by which such tags can be designed. For these tag designs, the DINAMelt web-based software package was used to predict secondary hybridization structures, and their associated free energy, entropy and melting points (see Markham, N. R. & Zuker, M. (2005) DINAMelt web server for nucleic acid melting prediction. Nucleic Acids Res., 33, W577-W581). This is intended to illustrate the type of secondary structure melting point prediction software that can be used for such design efforts. This software performs secondary structure calculations for individual strands (self-interaction) as well as between strands (perfect match tag-tag complement or cross-hybridized). The hybridization reaction conditions that are allowed to be specified for these calculations consist of the reaction temperature, T, the molar concentration of sodium ions ([Na+]), the molar concentration of divalent magnesium ions ([Mg++]), and the molar concentration of the strands undergoing the interaction (CT). The calculations below were all done for reaction conditions of T=37° C., [Mg++]=0M, and CT=0.00001M, and for two different [Na+] concentrations, 0.1M and 1M, as indicated below.
The tag sets were designed by starting from randomly generated n-mer oligo candidate tags, formed with equal 25% frequency of A, G, T, C, and starting from 1000-2000 such random oligos, selecting subsets that have their perfect match duplex hybridization Tm in a narrow target range within a few ° C. of the mean Tm for all such oligos, and then further sub-selecting for such oligos which have their self-interaction Tmself below this range by a desired Delta in the range of 20-40° C., and then further selecting for a subset that had Tmcross for all cross-hybridizations of the tag against the complement of the other tags also lower than Tm by a similar Delta. Oligos were further eliminated if they contained a run of more than 3 G's in a row or more than 3 C's in a row (i.e. if they contain . . . GGGG . . . or . . . CCCC . . . ) in order to avoid the unwanted secondary structure of so-called “G tetrads” in the tag or its complement, which are another unwanted DNA secondary structure not necessarily predicted by tools such as DINAMelt, or reflected in the associated melting points from DINAMelt. Starting from 1000-2000 candidate tags, a final set of 16-20 tags was produced matching all these design specifications. The example sets below were chosen to represent examples where the perfect match Tm are near various points of interest in the range of 30-90° C., and where the unwanted interactions had Tm that were all lower by 20-40° C. from the perfect matches. Each tag oligo set has tags of the same length, and the length is the major variable that controls the Tm target of the set. The method illustrated here can be readily generalized to generate tag sets of any size N, by starting with larger random candidate sets. Other methods than random generation can be used to generate starting candidates for the tag set, such as starting with sequence strings that have a large Hamming Distance between them, or other methods of generating strings that have many mismatches between any two members of the set. Many such methods of generating initial sequence candidate sets are well known and obvious to those with expertise in algorithms for the analysis of text strings or in coding theory. The selection criteria here are also meant to be illustrative, and not limiting. Other criteria for eliminating sequences with unwanted secondary structures or interactions are also obvious to those expert in DNA secondary structure analysis.
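By way of non-limiting illustration, the first stages of the candidate selection procedure described above (random generation with equal base frequencies, elimination of oligos with runs of more than 3 G's or 3 C's, and selection of oligos whose perfect match Tm lies in a narrow window around the mean) can be sketched in Python as follows. All function names are hypothetical. The melting point estimate used here is the simple Wallace rule, Tm=2×(A+T)+4×(G+C) in ° C., which is only a crude stand-in assumed for this sketch; it is not the DINAMelt calculation used for the tables, and the subsequent self-interaction and cross-hybridization filters, which require a secondary structure prediction tool, are not shown.

```python
import random
import re
from statistics import mean

# Matches a run of more than 3 G's or more than 3 C's in a row.
LONG_GC_RUN = re.compile(r"G{4,}|C{4,}")


def random_oligos(count, length, rng):
    """Random n-mers with equal 25% frequency of A, G, T, C."""
    return ["".join(rng.choice("AGTC") for _ in range(length)) for _ in range(count)]


def has_long_gc_run(seq):
    """True if the oligo contains ...GGGG... or ...CCCC..."""
    return LONG_GC_RUN.search(seq) is not None


def wallace_tm(seq):
    """Crude Tm estimate (Wallace rule); a placeholder, not DINAMelt."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def select_candidates(oligos, tm_of=wallace_tm, window=2.0):
    """Drop long G/C runs, then keep oligos with Tm within `window` of the mean."""
    kept = [s for s in oligos if not has_long_gc_run(s)]
    if not kept:
        return []
    center = mean(tm_of(s) for s in kept)
    return [s for s in kept if abs(tm_of(s) - center) <= window]


if __name__ == "__main__":
    rng = random.Random(0)
    candidates = random_oligos(1000, 10, rng)
    selected = select_candidates(candidates)
    print(len(candidates), "candidates,", len(selected), "pass the first filters")
```

A run of C's in a tag corresponds to a run of G's in its complement, so checking each tag for both kinds of run also covers the tag probe.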
This example tag set is based on 10-mer tags. Table 1 shows the tag sequences, written 5′-3′, as well as the reverse complement sequences, written 5′-3′, i.e. the sequences of the tag complements or tag probes on the array. These have Tm near 50° C. or near 40° C., depending on the sodium concentration, while the undesirable structure melting points are all lower by nearly 30° C. The specific critical melting point parameters that characterize the design criteria are as follows: at sodium ion concentration [Na+]=1.0 M in the hybridization reaction, the melting points of the tag duplexes are in the range of Tm=49.5-50.5° C., a range of 1° C., while the self and cross hybridization reactions all have Tmself<20° C. and Tmcross<20.5° C., which are at least Delta=29° C. below the perfect match melting point. At a 10-fold lower sodium ion concentration [Na+]=0.1 M in the hybridization reaction, these temperatures are reduced by roughly 10° C.: specifically, at this condition, the melting points of the tag duplexes are in the range of Tm=39.0-40.1° C., while the self and cross hybridization reactions all have Tmself<10° C. and Tmcross<10° C., which are at least Delta=29° C. below the perfect match melting point. These parameters can be summarized as
AAGCAGAACC
AGGCCTACAT
GGAATCCGTT
GAGGTTTCGA
CACCTGAGAC
CGGATGTCAA
AGGATTGTGC
CGTCACCTTT
CCTAACAGCC
GAACTGCACA
CACAACCACA
CGGCTTAAGA
GACACCGAAT
TAGTACGTCG
GCCTCCAAAT
GGATCTGTCG
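By way of non-limiting illustration, the tag probe sequences for the 16 10-mer tags listed above, i.e. their reverse complements written 5′-3′, can be computed as sketched in the following Python example. The function name reverse_complement and the list name TABLE_1_TAGS are hypothetical; the probe sequences printed are computed from the tags listed above rather than copied from a table.

```python
# Watson-Crick base pairing used to build the complementary strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

TABLE_1_TAGS = [
    "AAGCAGAACC", "AGGCCTACAT", "GGAATCCGTT", "GAGGTTTCGA",
    "CACCTGAGAC", "CGGATGTCAA", "AGGATTGTGC", "CGTCACCTTT",
    "CCTAACAGCC", "GAACTGCACA", "CACAACCACA", "CGGCTTAAGA",
    "GACACCGAAT", "TAGTACGTCG", "GCCTCCAAAT", "GGATCTGTCG",
]


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, both written 5'-3'."""
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for tag in TABLE_1_TAGS:
        print(tag, reverse_complement(tag))
```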
This example tag set is based on 14-mer tags, shown in Table 2. These have Tm near 60° C. or near 50° C., while the undesirable structure melting points are all lower by approximately 35° C. The specific critical melting point design parameters for the tag set are:
This example tag set is based on 20-mer tags, shown in Table 3. These have Tm near 70° C. or near 60° C., while the undesirable structure melting points are all lower by approximately 30° C. The specific critical melting point design parameters for the tag set are:
ATATTGCCACGCTAAGTTGG
GAGATTCCGAGCATTTCCTC
TAATTGAGTAAGGCGAGAGG
TGTAAGCCATATTGCCGGAT
TTCTGTTTGGATGCCTTGTT
CGCTATGATGACTACTCCGT
AGTTGGATTGTAAGCAGCAG
TGGAAGGATTGTCGGTTAGA
AACGCAAAGGATACAAGGTT
TTGTCAAATTGCAACTCGGT
CCTGTTCAGTATTGCGTTGA
TGTCGAAAGTCATTAGGCCT
GGACTGAGAACTTTAAGCGC
TCTGTTCTTTCGCACACTAC
CCATGCGTGTTGTCTATGAC
CATGTACGCGGTATCAATGG
GACTACTTGCTCAATCTGCC
ACTGGTTAGGCGTATCTGAG
TATACGAAGCCTCACTGACA
TATCATGCTTGGACCTTTCT
CGATCTCAATAACTCGCTGC
GTTATGAGTTGTACCGCTGG
ACCTTAGGTCAGTCTTCCAT
ACTGTGCACGACATAATTGG
This example tag set is based on 30-mer tags, shown in Table 4. These have Tm near 80° C. or near 70° C., while the undesirable structure melting points are all lower by approximately 30° C. The specific critical melting point design parameters for the tag set are:
This example tag set is based on 40-mer tags, shown in Table 5. These have Tm near 85° C. or near 75° C., while the undesirable structure melting points are all lower by approximately 30° C. The specific critical melting point design parameters for the tag set are:
This example tag set is based on 50-mer tags, shown in Table 6 (reverse complement sequences are not shown, for brevity). These have Tm near 90° C. or near 76° C., while the undesirable structure melting points are all lower by approximately 30° C. The specific critical melting point design parameters for the tag set are:
This example tag set is based on 60-mer tags, shown in Table 7 (reverse complement sequences are not shown, for brevity). These have Tm near 90° C. or near 80° C., while the undesirable structure melting points are all lower by approximately 35° C. The specific critical melting point design parameters for the tag set are:
The design of sequences for bipartite tag sets, for use in assays such as that illustrated in
Table 8 shows the example A and B ½ tags, written 5′-3′, which are 10-mers in this illustration. These are used to create the A:B tag sets shown in subsequent tables.
When these 10-mer ½ tags of Table 8 are joined with no insert, the resulting 20-mer bipartite tag set of 16 tags is shown in Table 9, along with the reverse complements of the tags. This 16-tag set of bipartite tags has Tm near 75° C. at a target sodium ion concentration of 1.0M, while the undesirable structure melting points are all lower by approximately 10° C. The specific critical melting point design parameters for the bipartite tag set are:
When these 10-mer ½ tags of Table 8 are joined with a T insert, the resulting 21-mer bipartite tag set of 16 tags is shown in Table 10, along with the reverse complements of the tags, and with the join insert underlined. This 16-tag set of bipartite tags has Tm near 75° C. at a target sodium ion concentration of 1.0M, while the undesirable structure melting points are all lower by approximately 10° C. The specific critical melting point design parameters for the bipartite tag set are:
When these 10-mer ½ tags of Table 8 are joined with an AAAC insert, the resulting 24-mer bipartite tag set of 16 tags is shown in Table 11, along with the reverse complements of the tags, and with the join insert underlined. This 16-tag set of bipartite tags has Tm near 80° C. at a target sodium ion concentration of 1.0M, while the undesirable structure melting points are all lower by approximately 10° C. The specific critical melting point design parameters for the bipartite tag set are:
When these 10-mer ½ tags of Table 8 are joined with a 10-base insert, ATTCGGTGTC, the resulting 30-mer bipartite tag set of 16 tags is shown in Table 12, along with the reverse complements of the tags, and with the join insert underlined. This 16-tag set of bipartite tags has Tm near 85° C. at a target sodium ion concentration of 1.0M, while the undesirable structure melting points are all lower by at least approximately 5° C. The specific critical melting point design parameters for the bipartite tag set are:
The above examples of tag set designs are illustrative, and not meant to be limiting. From these examples, and the procedures disclosed, there are many variations on the design methodology disclosed that would be obvious to those expert in DNA secondary structure calculations and hybridization assay principles, that can similarly be used to design tag sets of an arbitrary number of tags, that are uniformly matched in their perfect duplex hybridization properties, and which have greatly reduced potential for all unwanted hybridization reactions within the set, including self-interactions, or cross-interactions, involving the tags or tag complements. Such method variations may include not just matching of melting points, Tm, but also matching of other thermodynamic parameters of tag duplexes, such as the entropy differential, enthalpy differential, or free energy differential. Such method variations may also include consideration of an extended set of secondary structures that may occur for any given strand or pair of interacting strands, and estimating the effects of these multiple possible structures on the hybridization interactions within the tag set.
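By way of non-limiting illustration, the construction of bipartite A:B tag sets from ½ tags, with or without a join insert, can be sketched in Python as follows. The function name bipartite_tags is hypothetical, the ½ tag sequences shown are placeholders assumed for this sketch (they are not the Table 8 sequences), and the sketch assumes that the insert is placed between the A ½ tag and the B ½ tag.

```python
from itertools import product

# Placeholder 10-mer half tags, assumed for illustration only (not Table 8).
A_HALVES = ["AAGCAGAACC", "AGGCCTACAT", "GGAATCCGTT", "GAGGTTTCGA"]
B_HALVES = ["CACCTGAGAC", "CGGATGTCAA", "AGGATTGTGC", "CGTCACCTTT"]


def bipartite_tags(a_halves, b_halves, insert=""):
    """All A:B tags formed as A half + join insert + B half, written 5'-3'."""
    return [a + insert + b for a, b in product(a_halves, b_halves)]


if __name__ == "__main__":
    # The four join inserts discussed in the text: none, T, AAAC, and a 10-base insert.
    for insert in ("", "T", "AAAC", "ATTCGGTGTC"):
        tags = bipartite_tags(A_HALVES, B_HALVES, insert)
        print(len(tags), "tags of length", len(tags[0]), "with insert", repr(insert))
```

With four A and four B 10-mer ½ tags, each choice of insert gives 16 bipartite tags, of length 20, 21, 24 or 30 bases respectively, matching the tag lengths described for Tables 9 to 12.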
Tags Using Modified Nucleotides or Other Modifications
Another advantage of using the reporter tag framework, over direct detection of target native DNA, or other kinds of native targets, is that the reporter tags and/or their complementary probes on the sensor array may make use of modifications to DNA oligos that could enhance the performance of the hybridization sensor, such as providing greater signal, greater signal to noise, or less cross-hybridization. Such modifications in preferred embodiments may include nucleotide analogues, modified bases, or signal enhancing labels. In preferred embodiments, this may include the use of Locked Nucleic Acids (LNAs) or Peptide Nucleic Acids (PNAs), or other modified forms of nucleic acids (XNAs). In some preferred embodiments, as disclosed, the reporter tag is generated by a polymerase replication of the tag complement, and not all such modifications can be specifically propagated through such a process. For such modifications that do not propagate under polymerase copying, the primary modifications must be on the tag probe on the sensor array. In this case, the tag set can still enjoy enhanced benefits of these modifications, versus their use with native DNA targets, because, in preferred embodiments, the tag sequences can be designed to get maximal performance benefit of such modifications residing in the array tag probes. This includes using the base pairings that are most impacted by these modifications (e.g. which specific base-vs-analog-base complements do the most to increase the desired proper pair Tm, or decrease the unwanted mismatched pair Tm). Taking maximum advantage of these requires the tag design process outlined above to make use of such rules, or in preferred embodiments, of their impact on the Tm calculations. Other modifications can be propagated through polymerase copying reactions, including extended genetic code analogs, and such modifications could therefore reside in the reporter tags themselves. Also, tags produced by ligation of tag parts may readily include all such modifications under consideration.
Such signal enhancing groups that may be added to tags include, in preferred embodiments, the use of biotinylated nucleotides (dNTPs) in the tag production assays that involve polymerase-based copying, such that the resulting tags so produced are biotinylated via the presence of biotinylated nucleotides. As is well known to those expert in nucleic acid manipulation, this can be used for either subsequent tag purification processes to provide a purer pool of tags for better detection behavior on the sensor array, or this can be used for subsequent tag labeling by the biotin-avidin conjugation reaction, to add avidin-based groups to enhance the sensor signals. As is well known to those expert in nucleic acid manipulation, other forms of conjugatable nucleotides (dNTPs) that are compatible with polymerase extension can be used, such as those with click chemistry groups, or any of many other known bioconjugation groups, biotin-avidin being just one well known and widely used exemplar.
Many such modifications, well-known to those skilled in nucleic acid chemistry, have the known property of increasing the melting point Tm of duplexes over corresponding native forms, when used in one or both strands, or can also reduce the melting point of, and thereby further destabilize, the unwanted cross-hybridizations between tags, by increasing the energy costs of mismatched bases beyond the native levels. By using such modifications, the reporter tags can have much better hybridization properties as a set, relative to the detection sensor array, as compared to sets of native DNA reporter tags and tag probes, or also compared to native DNA hybridization target sets, such as natively occurring sequence segments of interest.
Methods and Applications for Infectious Disease. In preferred embodiments, such testing or monitoring applications referred to above include testing for the presence of pathogens, and in preferred embodiments, testing for parasites, fungi, bacterial pathogens or viral pathogens. Such parasites include Malaria, Giardia, and Toxoplasmosis. Such bacterial pathogens include Salmonella and E. coli. Such viral pathogens include influenza, flu viruses, cold viruses (including rhinovirus, adenovirus, and human coronavirus), HIV, Ebola, Dengue, Hanta, Zika and West Nile viruses, SARS, MERS, and COVID-19 virus, and novel viruses of DNA or RNA type related to or unrelated to these, that have a known genetic sequence to provide for defining relevant tag reporter assays such as those described in
The major elements of preferred embodiments for infectious disease pathogen detection applications are disclosed here, as illustrated in
In preferred embodiments for testing methods and applications, a primary biosample is acquired directly from a test subject or the environment, and then some form of sample prep is required to prepare materials to the proper state to apply to the sensor device for measurement. The primary sample in preferred embodiments could be tissue, saliva, mucus, buccal swab, blood, sweat, urine, stool, other bodily fluids, or exhaled air, or material filtered from air or water, or material swabbed from a surface. It could also be such samples acquired from plants or animals in the environment, or from food, or from known vectors in the environment that carry such pathogens, such as bats, rodents, mosquitoes or snails. The sample prep could in preferred embodiments be a crude cell lysate extract containing DNA, or could be DNA further purified from the sample by standard purification column or filter paper purifications, or other extraction such as phenol-chloroform. In preferred embodiments the purified sample could be the result of applying any of the many forms of PCR amplification reaction to the sample, which could in preferred embodiments be thermocycling or isothermal forms of PCR. In preferred embodiments, such sample prep is done by a self-contained sample prep device, or in other preferred embodiments, by such a device integrated with the sensor platform, such as in the case of fully integrated point-of-use testing devices.
In preferred embodiments for the testing method using such devices and systems, the test system is deployed at a testing site, and a primary biosample is collected and delivered to the testing site to be tested for the presence of a given pathogen or pathogen strain. A sample preparation process is applied to the primary sample to produce a product suitable for a tag reporter assay, and the resulting tag pool is applied to the molecular electronic sensor tag array chip device, which comprises a multiplicity of hybridization probe sensors that correspond to the tag set in use. The device signals are read out and undergo primary local signal processing, and these data are then transferred to a centralized or cloud-based server for subsequent additional analysis or testing outcome report generation.
In preferred embodiments, the testing site could be a centralized testing facility of high capacity, for a business, hospital or other organization, or for a region such as a city, county, state, or country. In other preferred embodiments, the testing site could be a field deployment site, or a point-of-contact site, such as at an airport, transportation hub, or major gathering site such as an arena or stadium, or at an immigration control checkpoint or temporary monitoring point set up by the military, police, or government officials. In other embodiments, the testing site could be a mobile van that is deployed to sites as needed. In other preferred embodiments, the testing site may be in the home for private individuals. In other preferred embodiments, the testing site could be autonomous environmental monitoring stations deployed into the field, stationary or mobile, including driving, flying or aquatic drones, that monitor samples acquired locally from the environment, such as through filtering of air, or water, or trapping of known disease vectors or carriers in the environment, such as insects, rodents, bats or birds, or aquatic snails. In preferred embodiments, mosquitoes are one such vector.
In preferred embodiments, the primary biosample could be obtained as a swab of a surface that collects material deposited on the surface, as filtered material collected from air or water, or a water sample, or as a bodily fluid sample or buccal swab or saliva or excrement or tissue sample provided from a person or animal, or as a sample of a food item, or agricultural product.
In preferred embodiments, the sample collection may be done in close proximity to the test system, such as within 1 foot, 10 feet, or 100 feet, and in preferred embodiments such samples are rapidly delivered to the test system, such as within 10 seconds, one minute, 10 minutes or 1 hour, in order to have the benefit of distributed sample collection combined with rapid testing and test results. In preferred embodiments, the sample collection includes the assignment of a unique ID to the sample, such as an alpha-numeric code, serial number, barcode or QR code, to be used for sample tracking, and affiliation of the final report back to the sample. In preferred embodiments, other identifying information may be collected and attached to the sample or affiliated with the sample ID, such as a personal identifier, for example a personal name, social security number, government issued ID number, employee number, date of birth, facial image or fingerprint.
In preferred embodiments, the sample preparation process comprises a PCR-based amplification method applied to the sample to produce amplified DNA material for detection. In other preferred embodiments, the sample preparation process is a process to extract and purify DNA or RNA without any amplification to produce purified material for detection. In preferred embodiments, this sample prep process is performed in a separate instrument from the sensor chip instrument, and the prepared material is transferred to that instrument. In other preferred embodiments, the sample prep processes are performed on a subsystem integrated into the same instrument that runs the sensor chip device.
In preferred embodiments, the tag reporter assay may be performed off the sensor chip, and the resulting tag pool may be applied to the tag array either in a purified form, which purifies for the released tags, or in unpurified form as produced by the tag reporter assay. In preferred embodiments, this tag reporter reaction may be done on a separate instrument, or integrated into the same instrument that runs the tag array sensor chip. In other preferred embodiments, the tag reporter assay may be performed in the flow cell or reaction volume of the tag array chip itself.
In preferred embodiments, the pathogen of interest is a pathogenic bacterium, such as E. coli, or Salmonella, or Listeria, and the corresponding tag reporter assay probes include specific DNA probes that target segments common to many strains of such bacteria of interest, such as in
Application to Viral Pandemics and COVID-19. In other preferred embodiments, the pathogen of interest is a virus, such as influenza, flu viruses, cold viruses (including rhinovirus, adenovirus, and human coronavirus), HIV, Ebola, SARS, MERS, and COVID-19, and novel viruses of DNA or RNA type related to or unrelated to these, that have a known genetic sequence to provide for defining tag reporter probes such as in the assays of
caaccaacag aatc
tattgt tagatttcct aatattacaa acttgtgccc ttttggtgaa
A double stranded nucleic acid sequence from COVID-19 used in embodiments herein is provided below.
An exemplary gRNA coupled to a bridge is provided below (1428 Int-RNA).
In certain embodiments, a nucleic acid probe comprises a nucleic acid sequence corresponding to or complementary to a selected portion of a nucleic acid sequence of a pathogen of interest described herein. In such embodiments, a nucleic acid probe may for example comprise a nucleic acid sequence of at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45 or 50 contiguous nucleotides of a pathogen of interest described herein. In alternative embodiments, a nucleic acid probe may comprise a nucleic acid sequence of 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30 to 35, 35 to 40, 40 to 45, or 45 to 50 contiguous nucleotides of a pathogen of interest described herein. In alternative embodiments, a nucleic acid probe may comprise any range of consecutive nucleic acid sequence numbering between 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, and 25, such as a range between 4 and about 18 contiguous nucleotides of a pathogen, between 5 and about 15 contiguous nucleotides of a pathogen, between 6 and 12 contiguous nucleotides of a pathogen, etc.
In other embodiments, the pathogen of interest is a virus selected from the group consisting of Adeno-Associated Virus, Adenovirus, Arena virus (Lassa virus), Alpha virus, Astrovirus, Bacille Calmette-Guerin ‘BCG’, BK virus (including associated with kidney transplant patients), Papovavirus, Bunyavirus, Burkitt's Lymphoma (Herpes), Calicivirus, California encephalitis (Bunyavirus), Colorado tick fever (Reovirus), Corona virus, Coronavirus, Coxsackie, Coxsackie virus A, B (Enterovirus), Crimea-Congo hemorrhagic fever (Bunyavirus), Cytomegalovirus, Cytomegaly, Dengue (Flavivirus), Diphtheria (bacteria), Ebola, Ebola/Marburg hemorrhagic fever (Filoviruses), Epstein-Barr Virus ‘EBV’, Echovirus, Enterovirus, Eastern equine encephalitis ‘EEE’, Togaviruses, Encephalitis, Enterovirus, Flavivirus, Hantavirus, Bunyavirus, Hepatitis A (Enterovirus), Hepatitis B virus (Hepadnavirus), Hepatitis C (Flavivirus), Hepatitis E (Calicivirus), Herpes, Herpes Varicella-Zoster virus, HIV Human Immunodeficiency Virus (Retrovirus), HIV-AIDS (Retrovirus), Human Papilloma Virus ‘HPV’, Cervical cancer (Papovavirus), HSV 1 Herpes Simplex I, HSV 2 Herpes Simplex II, HTLV-T-cell leukemia (Retrovirus), Influenza (Orthomyxovirus), Japanese encephalitis (Flavivirus), Kaposi's Sarcoma associated herpes virus KSHV (Herpes HHV 8), Kyusaki, Lassa Virus, Lentivirus, Lymphocytic Choriomeningitis Virus LCMV (Arenavirus), Measles (Rubella), Measles, Measles Micro (Paramyxovirus), Monkey Bites (Herpes strain HHV 7), Mononucleosis (Herpes), Morbilli, Mumps (Paramyxovirus), Newcastle disease virus, Norovirus, Norwalk virus (Calicivirus), Orthomyxoviruses (Influenza virus A, B, C), Papillomavirus (warts), Papova (M.S.), Papovavirus (JC-progressive multifocal leukoencephalopathy in HIV) (Papovavirus), Parainfluenza Nonsegmented (Paramyxovirus), Paramyxovirus, Parvovirus (B19 virus-aplastic crises in sickle cell disease), Picornavirus, Pertussis (bacteria), Polio (Enterovirus), Poxvirus (Smallpox), Prions, Rabies (Rhabdovirus), Reovirus, Retrovirus, Rhabdovirus (Rabies), Rhinovirus, Roseola (Herpes HHV 6), Rotavirus, Respiratory Syncytial Virus (Paramyxovirus), Rubella (Togaviruses), Bunyavirus, Flavivirus, Poxvirus, Vaccinia virus, Variola, Venezuelan Equine Encephalitis ‘VEE’ (Togaviruses), Wart virus (Papillomavirus), Western Equine Encephalitis ‘WEE’ (Togaviruses), West Nile Virus (Flavivirus), and Yellow fever (Flavivirus).
In preferred embodiments, the primary data analysis performed on the system includes data reduction algorithms that reduce the amount of data needed to be transferred off-system. Such methods may include discarding uninformative portions of the signal trace, subsampling or parameterization of parts of the signal trace, and general data compression algorithms known to those skilled in data compression, such as methods utilized in zip, gzip, bzip, and other common compression utilities. In preferred embodiments, the primary analysis also includes analysis of traces to produce a net hybridization intensity score for each probe on the sensor chip, and in preferred embodiments, a final call of detection, non-detection, or indeterminate measurement for each probe on the sensor chip. In other preferred embodiments, such analysis is done in the off-instrument phase of an analysis. In other preferred embodiments, the off-instrument analysis includes the generation of a final report that affiliates sample identifiers with the outcome of the test for the presence of pathogens of interest. Such identifiers may include a subject name or assigned ID or other identifier provided at the point of sample collection, as well as sample identifiers such as the time and place of sample collection, and time and place of sample processing on the sensor chip system.
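By way of illustration only, the data reduction and per-probe call described above might be sketched as follows. The subsampling step, the 2-byte packing of readings, and the score thresholds `detect_at` and `reject_at` are hypothetical values chosen for this sketch and are not taken from the disclosure; `zlib` is used here because it implements the DEFLATE method that underlies gzip.

```python
import zlib

def subsample(trace, step):
    """Keep every `step`-th sample of a current trace (simple decimation)."""
    return trace[::step]

def compress_trace(trace_pa):
    """Pack integer picoamp readings as 2-byte values and DEFLATE-compress them."""
    raw = b"".join(int(v).to_bytes(2, "big") for v in trace_pa)
    return zlib.compress(raw, 9)

def call_probe(score, detect_at=0.5, reject_at=0.1):
    """Three-way call from a net hybridization score (thresholds are hypothetical)."""
    if score >= detect_at:
        return "detected"
    if score <= reject_at:
        return "not detected"
    return "indeterminate"

# Synthetic trace: low-current baseline with one higher-current segment.
trace = [30] * 4000 + [60] * 1000 + [30] * 3000
reduced = subsample(trace, 4)
packed = compress_trace(reduced)
print(len(trace) * 2, "bytes raw ->", len(packed), "bytes after reduction")
print(call_probe(0.62), "/", call_probe(0.03), "/", call_probe(0.30))
```

In practice the choice between subsampling, parameterization, and lossless compression would depend on how much of the trace detail the off-instrument analysis needs.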
In preferred embodiments, the test is performed rapidly, with the time from providing the primary biosample, to completion of analysis and report generation being less than 24 hours, and in preferred embodiments, less than 8 hours, less than 4 hours, less than 1 hour, less than 30 minutes, or less than 15 minutes.
In a preferred embodiment, the system disclosed above is applied to the monitoring of the pandemic disease COVID-19, a viral disease outbreak in 2019 originating in Wuhan, China. In this application, the hybridization probes are selected to be complements to segments from the genome of the underlying virus, the Severe Acute Respiratory Syndrome Coronavirus 2, also designated SARS-CoV-2. This SARS-CoV-2 virus has a single stranded RNA genome, of size approximately 30,000 bases. One exemplar sequence for this genome is available at the Genbank® database as accession ID LC528232 (see https://www.ncbi.nlm.nih.gov/genbank/). Thus, in preferred embodiments where the tag reporter assay directly detects the genomic material by hybridization, this will be a DNA-RNA assay, and the sample prep must extract and purify RNA from the primary biosample. In preferred embodiments where the sample prep comprises a PCR amplification of the genome, this would be a reverse-transcriptase mediated PCR that produces amplified DNA product, either of specific target segments, or non-specific segments of the entire genome, and the resulting tag reporter assay is a DNA-DNA assay. By taxonomy, this virus is a specific strain of the Severe Acute Respiratory Syndrome-related Coronavirus (SARSr-CoV), which is a species of coronavirus that infects humans, bats and certain other mammals. There are hundreds of known strains of this virus, and hybridization probes must be chosen for sequence segments that distinguish the COVID-19 strain from other harmless strains, or other disease-causing strains, such as the strain designated SARS-CoV, which caused the SARS disease outbreak in 2002 in Guangdong Province, China. There are numerous sequence differences between these strains, providing many candidates for distinguishing target sequences.
In certain preferred embodiments, a nucleic acid probe comprises a nucleic acid sequence corresponding to or complementary to a selected portion of the entire nucleic acid sequence of a COVID-19 virus described herein. In such embodiments, a nucleic acid probe may for example comprise a nucleic acid sequence of at least 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45 or 50 contiguous nucleotides of a Coronavirus or Coronavirus-related virus of interest described herein.
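As a non-limiting sketch of how distinguishing target segments might be chosen, the following enumerates the length-k segments present in a target strain but absent from every comparison strain, and reports their reverse complements as candidate probes. The sequences shown are short hypothetical strings in the DNA alphabet, not real viral sequence, and the exact-match set difference is an assumed simplification; a practical selection would also screen for near matches and hybridization properties.

```python
def kmers(seq, k):
    """All contiguous length-k segments of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def reverse_complement(seq):
    """Reverse complement of a DNA-alphabet sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def distinguishing_probes(target, others, k):
    """Probes complementary to k-mers found in `target` but in none of `others`."""
    unique = kmers(target, k)
    for other in others:
        unique -= kmers(other, k)
    return sorted(reverse_complement(segment) for segment in unique)

# Hypothetical toy strains that differ at a single base.
target_strain = "ACGTTGCA"
other_strains = ["ACGTAGCA"]
probes = distinguishing_probes(target_strain, other_strains, k=4)
print(probes)
```

Every segment overlapping the differing base is unique to the target strain, which is why a single base difference yields several candidate probes.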
In some preferred embodiments for COVID-19 testing, primary samples would be environmental surface swabs, or air filters, and such testing provides for monitoring of the presence of the virus in a target location where such samples are collected. In other preferred embodiments for COVID-19 testing, primary samples would be saliva, buccal swab or mucus samples from individuals, and such testing provides for detection or diagnosis of subjects with active viral infections. In preferred embodiments of such testing, the present device provides for rapid, distributed testing. In preferred embodiments for this, the system provides for a test in less than 1 hour, or less than 30 minutes, or less than 15 minutes.
Other Infectious Disease Applications. In another preferred embodiment, the systems disclosed above can be applied to testing for and response to outbreaks of bacterial disease, such as when the food supply is contaminated by E. coli or Salmonella. For example, in preferred embodiments, this could be an outbreak where lettuce is contaminated by E. coli, or where ground beef is contaminated by Salmonella. In such cases, in preferred embodiments testing platforms are deployed in point-of-use format to sites of food production, such as farms, fields, and processing plants. In preferred embodiments such testing platforms are also deployed in point-of-use form, as well as mobile or distributed permanent or temporary monitoring installations, to points of distribution, such as warehouses, shipping centers, or grocery stores and restaurants. In preferred embodiments, such testing platforms are deployed to the end point of consumption, such as in the home, for home-based testing. In preferred embodiments, the aggregated cloud-based analysis, including Big Data, AI and machine learning techniques, can be used to track the outbreak, and pinpoint the source.
In another preferred embodiment, the systems disclosed above can be applied to testing for sexually transmitted diseases (STDs). In this application, it is an advantage of the present disclosure that such molecular electronic hybridization sensor systems can be deployed for rapid, low-cost testing in highly distributed fashion, such as in community or field clinics, or for the privacy of home use. For STDs, the causal pathogens to be detected may be parasites, such as Trichomoniasis, or fungi, such as Candidiasis (yeast infection), or bacteria such as Syphilis, Gonorrhea, or Chlamydia, or viruses such as Herpes, HPV, EBV, Hepatitis, and HIV. In such applications, the primary samples required are clinically well established, and may be a blood sample, or a swab of bodily fluids or of discharges, or from open sores. In preferred embodiments of such testing, the present device provides for rapid, distributed testing. In preferred embodiments for this, the system provides for a test in less than 1 hour, or less than 30 minutes, or less than 15 minutes. In preferred embodiments, the system provides the advantage of extreme personal privacy, with systems and test kits that can be used entirely within the home.
Experimental Demonstrations of Molecular Electronic Hybridization Sensors and Chips. Experiments that reduce these devices, methods and apparatus to practice are presented here.
The sensor embodiments used for these experiments are shown in
The specifics of the probe-bridge complex for these experiments are as follows. In all cases, the bridge molecule is a peptide that forms an alpha-helical protein structure, with primary 227 amino acid sequence SEQ1
The structure of this peptide is that of a repeat of the helix-promoting motif EAAAR, in which a central amino acid is replaced by a C to allow for Cysteine-mediated conjugation to the probe. The termini of this peptide consist of the repeats QQSWPISGSGQQSWPISGSGQQSWPISGSG, which is three repetitions of the metal binding peptide QQSWPIS, separated by short GSG spacers, and which provides for binding to the metal electrodes. In helical form, the length of this peptide is approximately 25 nm. It is used with nanoelectrode gaps in the range of 15-20 nm. This peptide was produced by bacterial protein expression of a synthetic gene encoding the peptide.
The conjugation of the hybridization probe to the bridge is done using a bifunctional cross-linker APN-BCN (Bicyclo[6.1.0]non-4-yn-9-ylmethyl (4-(cyanoethynyl)phenyl)carbamate; Sigma-Aldrich). The conjugation product is purified using a desalting spin-column, and the product peptide reacted with the hybridization probe oligonucleotide having an azide at the 5′ end, such that the 5′ end is conjugated proximal to the bridge. The resulting peptide/DNA complex was purified by size-exclusion chromatography and verified by SDS Gel electrophoresis. The relative quantities of fluorescein (for the labeled SEQ4), DNA and tryptophan were checked by UV-vis spectroscopy. In the case of the probe in
The target for this hybridization probe in experiments is the 14-mer SEQ3, which binds leaving a 5-base gap at the Bridge-end of the probe strand:
In the case of the longer probe indicated in
The various different primers (perfect match and with mismatches) bound to this sequence in various experiments are indicated in the table shown in
The experimental results shown were run at room temperature, in a standard molecular biology buffer, with composition comprising 50 mM NaCl; 10 mM Tris-HCl; 10 mM MgCl2; 1 mM DTT (pH 7.9 @ 25° C.).
Additional Molecular Electronics Hybridization Sensor Embodiments. In these experimental examples using the sensors shown in
As suggested by
In other preferred embodiments, an additional intermediary molecule may be used to complex the hybridization probe with the bridge, such as shown in
As shown in
As indicated in
Note that the embodiment shown in
There are many variations and combinations of the embodiments disclosed above that may provide useful benefits to molecular electronics hybridization sensors, methods and applications, and to one skilled in the art of molecular biology, these would be obvious variations, and these are therefore also all encompassed by the disclosure here.
The following examples are intended to illustrate but not to limit the invention.
The proprietary CMOS sensor array chips used in this study were designed at Roswell Biotechnologies Inc. and fabricated at TSMC in Taiwan, using a 180 nm CMOS node. These chips present a 16 k (16,384) sensor pixel array. Pixels are post-processed at the IMEC foundry (Leuven, Belgium) to have the tips of Ruthenium nano-electrodes exposed on the solution-facing surface of the chip, with such electrodes fabricated using either photolithography or e-beam lithography methods. The 16 k electrodes were fabricated to have various nano-gap sizes in different ranges: 10-12 nm, 14-16 nm, 17-20 nm and 20-30 nm. Gaps of 14-20 nm were used for present experiments, and other sizes were not analyzed for present experiments. The chips were mounted in custom-built instruments to supply support to chip operations and sensor pixel data collection. The data is collected from the 16 k sensor array at a frame rate of 1000 Hz, and current measurements have 10 bits of resolution.
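For a sense of scale, the raw data rate implied by the array size, frame rate and bit resolution stated above can be worked out as follows. This is a back-of-envelope sketch that assumes tightly packed 10-bit samples with no framing or protocol overhead, which the disclosure does not specify.

```python
SENSORS = 16_384          # 16 k pixel array
FRAME_RATE_HZ = 1_000     # frames per second
BITS_PER_SAMPLE = 10      # current measurement resolution

bits_per_second = SENSORS * FRAME_RATE_HZ * BITS_PER_SAMPLE
megabytes_per_second = bits_per_second / 8 / 1e6
gigabytes_per_minute = megabytes_per_second * 60 / 1e3

print(f"{bits_per_second / 1e6:.2f} Mbit/s")
print(f"{megabytes_per_second:.2f} MB/s")
print(f"{gigabytes_per_minute:.2f} GB per minute of recording")
```

A raw stream on the order of 20 MB/s is one motivation for performing data reduction on the instrument before transfer.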
Alpha Helix molecular wire bridge preparation. The peptide is a helix-forming sequence 242 amino acids in length, including an N-terminal FLAG sequence and metal-binding motifs at each end. In the alpha-helical conformation the length is ~25 nm. A single cysteine resides in the middle position as the attachment point for probes using alkyne/azide click chemistry. To attach a DNA to the peptide, the peptide was first modified using a thiol-reactive (45) 3-arylpropiolonitrile (APN)-PEG4-bicyclo [6.1.0] nonyne (BCN) (Conju-Probe, San Diego, CA), yielding a reactive bicyclo nonyne alkyne on the peptide. Typically, 100 μL of peptide solution (3 to 4 mg/mL in PBS) was first mixed with freshly prepared DTT or TCEP (2 mM final) and left at room temperature for an hour. Then the APN-BCN reagent dissolved in DMSO (1 M stock) was added to a final concentration of 0.01 M and mixed thoroughly by pipetting. The reaction was left at 4° C. for a minimum of 48 hours. The excess APN-BCN was removed by size-exclusion chromatography. The purified peptide-BCN was stored at −20° C. until needed. Further click reactions were done using oligos designed with azide to obtain the bridges used in this study. The reaction of BCN-azide was performed in PBS at molar excess of the oligo-azide to purified peptide-BCN prep. The final reaction product was further chromatographically purified to more than 95%. The oligos were blocked at the free 3′ end with a fluorescent dye (FAM or Cy3) to help detection of the peptide on SDS-PAGE. A gel shift on SDS-PAGE confirmed the bridge conjugation to oligos.
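As a worked example of the dilution stated above, the volume of 1 M APN-BCN stock needed to bring 100 μL of peptide solution to 0.01 M final can be estimated from C1·V1 = C2·(V1 + V2). This sketch assumes the volume of the previously added DTT or TCEP is negligible, which is an assumption made only for illustration.

```python
def stock_volume_ul(sample_ul, stock_molar, final_molar):
    """Stock volume V1 (in uL) such that stock_molar*V1 = final_molar*(V1 + sample_ul)."""
    return sample_ul * final_molar / (stock_molar - final_molar)

apn_bcn_ul = stock_volume_ul(sample_ul=100.0, stock_molar=1.0, final_molar=0.01)
print(f"{apn_bcn_ul:.2f} uL of 1 M stock per 100 uL of peptide solution")
```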
Sequences of oligos used in this study as DNA probes:
All oligo binding experiments were performed in a buffer of 50 mM Tris HCl pH 7.5, 4 mM DTT, 10 mM KCl and 10 mM SrCl2 (Buffer A). Primer P-3 binds with its 3′ terminus 3 nucleotides away from the bridge; the sequence is 5′-CCTGTCACCTGCAC, complementary to the 17mer.
All temperature melt experiments used the 45mer probe-peptide bridges. The two oligos used in this analysis are 2P-0: CCTCTGTGAAGGCCTGATCG and 2P-5: CCTCTGTGAAGGCCT. The temperature changes were controlled by the software interface, which communicates with a Peltier device attached to the chips. The temperature ramps were recorded as ignore and resume phases, while each two-degree step was recorded continuously for four minutes of data collection once stabilized at the desired temperature.
To assess the binding kinetics for match and mismatch oligos, the following oligos were designed against the 45mer probe bridge.
Exact Match 5′-CCTCTGTGAAGGCCTGATCG, 1 Mismatch 5′-CCTCTCTGAAGGCCTGATCG, 2 Mismatch 5′-CCTCTGTGAACCCCTGATCG, 3 Mismatch 5′-CCAGAGTGAAGGCCTGATCG. Targets (all 20-mers) were added separately, and binding kinetics were monitored to tabulate fraction bound and other parameters.
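The mismatch counts named for these targets can be checked directly against the exact-match sequence. The following sketch compares each listed 20-mer with the exact match position by position and reports the 1-based positions, counted from the 5′ end, at which they differ; the function name is illustrative only.

```python
EXACT = "CCTCTGTGAAGGCCTGATCG"
TARGETS = {
    "1 Mismatch": "CCTCTCTGAAGGCCTGATCG",
    "2 Mismatch": "CCTCTGTGAACCCCTGATCG",
    "3 Mismatch": "CCAGAGTGAAGGCCTGATCG",
}

def mismatch_positions(reference, other):
    """1-based positions (from the 5' end) where two equal-length oligos differ."""
    if len(reference) != len(other):
        raise ValueError("oligos must have equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(reference, other)) if a != b]

positions = {name: mismatch_positions(EXACT, seq) for name, seq in TARGETS.items()}
for name, where in positions.items():
    print(name, "->", len(where), "substitution(s) at", where)
```

The substitutions fall at position 6, at positions 11 and 12, and at positions 3 to 5, respectively, so each target carries the number of mismatches its name indicates.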
The experiments below involve attaching bridge molecules to metal electrodes on chips with nanoelectrode spacings in the desired range of 15 nm to 20 nm. The molecular bridge used here is a 25 nm peptide with specific metal binding sequences at both ends. The end groups allow it to self-assemble onto the electrodes under proper conditions. An “active bridging” protocol uses electrical forces to attract the bridge to the electrodes, to radically accelerate and enhance this assembly process, allowing assembly to be completed in seconds. Specifically, using dielectrophoretic trapping readily shortens this time to 10 seconds, and also allows working at much lower input concentrations of bridge molecules. The dielectrophoretic trapping protocol relies on the application of an AC voltage (here 100 kHz, 1.6 V peak-to-peak). The DC current on the sensor after bridging is compared with the value prior, to assess the jump in current indicative of successful bridging. A population of sensors showing substantial bridging current increases is thereby observed, typically over 10% of all available pixels on the chip, indicating the presence of the 25 nm peptide bridge spanning the electrode gap.
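The before-and-after current comparison described above might be sketched as follows. The per-pixel currents are synthetic, and the minimum jump of 20 pA used to call a pixel bridged is a hypothetical threshold, since the disclosure does not state a numerical criterion.

```python
def bridged_fraction(before_pa, after_pa, min_jump_pa):
    """Fraction of pixels whose DC current rose by at least min_jump_pa after bridging."""
    if len(before_pa) != len(after_pa):
        raise ValueError("need one before and one after reading per pixel")
    flags = [(after - before) >= min_jump_pa
             for before, after in zip(before_pa, after_pa)]
    return sum(flags) / len(flags), flags

# Synthetic DC currents (pA) for ten pixels before and after the bridging protocol.
before = [2, 3, 2, 2, 3, 2, 2, 3, 2, 2]
after = [2, 35, 3, 2, 3, 2, 40, 3, 2, 2]
fraction, flags = bridged_fraction(before, after, min_jump_pa=20)
print(f"{fraction:.0%} of pixels show a bridging current jump")
```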
DNA-DNA hybridization binding. The probe molecule attached to the molecular wire bridge for the work here is a single-stranded 17-mer DNA oligonucleotide with a specific sequence (
Analysis of single-molecule binding data using Hidden Markov Models (HMM). Hidden Markov Models are signal processing methods known to be well-suited to the analysis of such timeseries measurements, based on their extensive use in speech recognition. In the present case, the trace is fit to a 2-state HMM, a model that has also been successfully applied in the past to data from single-molecule biophysics experiments, particularly those using nanopores and enzymes. The HMM assigns the “hidden” bound and unbound states of the sensor oligo to segments of the signal representing two different current levels that correspond to either the “unbound” bridge state (low current) or the “bound” bridge state (higher current). In this case, the unbound state is identified as a low-current range (~30 pA) and the bound state is identified as the high current range (50 pA-70 pA). The key fundamental parameters that can be extracted from the HMM segmented signal trace are the individual waiting times between binding events, τ0, and the individual dwell times spent bound, τ1.
While these times are ideally exponentially distributed, note that the complete empirical distributions also potentially contain richer information about the more complex nature of the binding interactions. Knowing these fundamental event duration times allows calculation of the kinetic rates that characterize such a binding reaction. The durations of such events are random variables, with an exponential probability density distribution,
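A minimal sketch of two-state segmentation is given below, using Viterbi decoding with Gaussian emissions centered on the unbound and bound current levels. It is not the fitting procedure used in the experiments: the state means (30 pA and 60 pA), the noise width `sigma`, and the per-sample probability `p_stay` of remaining in a state are all assumed here rather than fit to data, and the trace is synthetic.

```python
import math

def viterbi_two_state(trace, means=(30.0, 60.0), sigma=5.0, p_stay=0.99):
    """Most likely unbound (0) / bound (1) state path for a current trace in pA."""
    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)

    def log_emit(x, s):
        # Gaussian log-likelihood, up to a constant shared by both states.
        return -0.5 * ((x - means[s]) / sigma) ** 2

    score = [math.log(0.5) + log_emit(trace[0], s) for s in (0, 1)]
    back = []
    for x in trace[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + (log_stay if p == s else log_switch) for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + log_emit(x, s))
        score = new_score
        back.append(pointers)
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):   # trace the best path backwards
        state = pointers[state]
        path.append(state)
    return path[::-1]

def run_lengths(path):
    """Collapse a state path into (state, duration_in_samples) runs."""
    runs = []
    for s in path:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    return [tuple(r) for r in runs]

# Synthetic trace (pA): baseline with one noise spike, a bound segment, then baseline.
trace = [30, 31, 29, 45, 30, 30, 60, 62, 58, 61, 59, 30, 29, 31]
segments = run_lengths(viterbi_two_state(trace))
print(segments)
```

Runs in state 1 correspond to dwell times τ1 and runs in state 0 to waiting times τ0; at a 1 kHz sampling rate each sample represents one millisecond. The isolated 45 pA sample is not called as a binding event because two state switches cost more than the emission evidence gains.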
where t is an event duration time,
The key kinetic rate parameters are the off rate, koff, which is computed from the totality of dwell times,
and the on rate, kon, computed from the waiting times,
In addition, the total fraction of time spent bound is another convenient summary statistic that is readily related to the concentration of the target in solution, and is also conveniently related to the overall classical binding affinity of the interaction, Kd, which here is defined at the single-molecule level as the target concentration at which the single probe molecule spends equal time bound and unbound. By formula, the fraction of time that the probe is bound to the target molecule (denoted “Fraction Bound” in all figures) is given by the sum of all the τ1 periods divided by the total of both the τ1 and τ0 periods: fb=Στ1/(Στ1+Στ0). Note that as target concentration increases, the mean waiting times will decrease in proportion, while dwell times should remain constant, being a property only of the interaction, and thus the fraction of time bound is expected to scale with molar concentration, [c], like fb=1/(1+Kd/[c]), where Kd is the empirical binding affinity concentration. Note that Kd can also be visualized as the inflection point in a titration curve plot of Fraction Bound vs. Target Concentration, as shown throughout this report. Also note that the net amount of time spent in the bound and unbound states can be conveniently visualized using vertical histograms of the measured current values in a signal segment (sampled at 1 kHz) as shown to the right of the traces in
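The summary statistics discussed above can be illustrated with a short sketch. It assumes the standard estimator for an exponential distribution, in which the rate is the reciprocal of the mean duration; the on rate computed here is an apparent rate at the tested concentration and is not normalized by concentration. The concentration dependence is modeled with a single-site form, c/(c + Kd), which equals one half when the concentration equals Kd. The dwell and waiting times are synthetic.

```python
def kinetics(dwell_s, wait_s):
    """Off rate, apparent on rate, and fraction bound from event durations in seconds."""
    k_off = len(dwell_s) / sum(dwell_s)          # 1 / mean dwell time
    k_on_apparent = len(wait_s) / sum(wait_s)    # 1 / mean waiting time
    fraction_bound = sum(dwell_s) / (sum(dwell_s) + sum(wait_s))
    return k_off, k_on_apparent, fraction_bound

def fraction_bound_model(conc, kd):
    """Single-site expectation: one half when conc equals kd, approaching 1 at high conc."""
    return conc / (conc + kd)

dwell_times = [0.5, 1.0, 1.5]   # synthetic tau_1 values (s)
wait_times = [2.0, 3.0, 4.0]    # synthetic tau_0 values (s)
k_off, k_on_app, fb = kinetics(dwell_times, wait_times)
print(f"k_off = {k_off:.2f} 1/s, apparent k_on = {k_on_app:.3f} 1/s, fb = {fb:.2f}")
print(fraction_bound_model(conc=10.0, kd=10.0))
```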
Single Molecule Thermodynamics: Melting Curves. Another application of this same type of assay is to determine the melting temperature (Tm) of the DNA duplex, which is defined here at the single-molecule level as the temperature at which the probe DNA molecule spends equal amounts of time in the bound and unbound states. This is directly observable in the single molecule binding traces. As shown in
Mismatch Sensitivity. The single-molecule binding probe signal trace contains rich information about the binding reaction, and is also highly sensitive to the specific binding target. This can be illustrated in fine detail for DNA oligo binding by looking at the impact of single-base mismatches in the target oligo sequence. As shown in
The buffer used for the experiment was 50 mM Tris, 10 mM KCl and 4 mM DTT, pH 7.5. Temps: 30, 40 and 50° C. Chips: Ruthenium. Bridge: N1Br21; EA3R-/5AzideN//iCy3/CCGCATTACGTTTGGTGGACC. Voltage: 0.7 V.
Primers: N1-full-21, N1-double, N1-Trip, N1-Quad, N1-Mis1a, N1-Mis1b, N1-Mis1c, N1-Mis-ends at 100 nM. Results: buffer TK-10 at pH 8.0; bridge N1-full 24 EA3R at 20 nM.
A dry baseline was recorded for a few seconds, and a wet baseline was recorded in the bridging solution itself. Bridging conditions at IV were as follows: 100 kHz for 10 s, 60 s at 1 MHz-odd, 30 s at 1 MHz-even, 10 s at 1 MHz-even. The oligonucleotides used in binding kinetics assays are described below.
In this experiment, N1Br26 (/5AzideN//iCy3/ccccgcattacgtttggtggaccctc) is used as the probe. It contains coding sequence for the N1 gene of COVID and is tethered on the peptide bridge, which acts as a sensor connecting the two electrodes (see the diagrammatic representation in the following file). A series of single, double, triple and quadruple mutants are designed into the primers (the nucleotide substitutions are shown in caps).
Table 15 above shows exemplary EA3R bridges tethered to study the SNP kinetics.
Table 16 above shows exemplary SNPs designed into the primers, ranging from a full match to four nucleotide substitutions, which can be distinguished using the kinetics of binding.
Embodiment 1: A molecular electronics sensor configured to detect a tag from a tag set, using any of the hybridization sensors disclosed, and comprising the reverse complement of the tag.
Embodiment 2: A tag reporter assay, producing a tag reporter, and using the sensor of Embodiment 1 to detect the tag.
Embodiment 3: A molecular electronics sensor array chip configured to detect the tags from a tag set.
Embodiment 4: A multiplex tag reporter assay, producing reporter tags from a given tag set, and using the sensor array chip of Embodiment 3 to detect the reporter tags.
Embodiment 5: The multiplex assay of Embodiment 4, for the detection of target DNA fragments, using linear detection probes as in
Embodiment 6: The multiplex assay of Embodiment 4, for the detection of allele-specific target DNA fragments, using linear allele-specific detection probes as in
Embodiment 7: The multiplex assay of Embodiment 4, used for multiplex detection of ligand binding reactions as in
Embodiment 8: The multiplex assay of Embodiment 4, used for multiplex detection of ligand binding interactions, and using bipartite tags as shown in
Embodiment 9: Tag sets designed to have good properties as disclosed, and functionally screened for good performance on the molecular electronic sensor array chip format of Embodiment 3.
Embodiment 10: Tag sets of Embodiment 9, where such tags in the set number up to 10, up to 100, up to 1000, up to 10,000, up to 100,000, up to 1,000,000, up to 10,000,000, or up to 100,000,000.
Embodiment 11: The tag sets of Embodiment 9, where the sets are designed to be Tm matched near 30° C., near 40° C., near 50° C., near 60° C., near 70° C., near 80° C., near 90° C., or near 100° C., relative to a given
Embodiment 12: The tag sets of Embodiment 9, where the set are designed to have the unwanted reaction melting points lower than the target tag Tm by at least 10° C., or 20° C., or 30° C., or 40° C., or 50° C., relative to a given hybridization assay condition.
Embodiment 13: The tag sets of Embodiment 9, where the tags are bipartite tags.
Embodiment 14: Tag sets containing modifications to enhance sensor performance, such as modified nucleotides, nucleotide analogues, or signal enhancing chemical groups or labeling groups which may be attached to DNA oligos, either the tags themselves, or to the tag complement, as well as the design and methods of using such tag sets.
Embodiment 15: A method for genotyping analysis of a DNA sample, consisting of using the tag reporter assays and tag sensor arrays, as disclosed.
Embodiment 16: A method for determining which strain of a pathogen is present in a DNA sample, consisting of using the tag reporter assays and tag sensor arrays, as disclosed.
Embodiment 17: A method for performing gene expression analysis of a sample, for a given set of genes, consisting of using the tag reporter assays and tag sensor arrays, as disclosed.
Embodiment 18: A method for pathogen detection, by applying the methods for detecting the presence of a target DNA segment above, consisting of using the tag reporter assays and tag sensor arrays, as disclosed.
Embodiment 19: A process for the applications of infectious disease detection, environmental monitoring, screening, or diagnosis, consisting of collecting a sample from the environment or a subject, providing this sample to a sample preparation system that prepares an extracted DNA sample, applying a multiplex tag reporter assay, providing this sample to the molecular electronics hybridization sensor chip system for the tag set, and processing the hybridization data to determine the presence of the pathogens of interest, as in and producing a final report out of detection or non-detection, using either local analysis, reporting and storage of data at the point of measurement, or remote or cloud-based analysis, reporting and storage of results.
Embodiment 20: The process of Embodiment 19, used for pandemic viral disease testing and monitoring, such as for COVID-19.
Embodiment 21: An apparatus and kit for testing for COVID-19 in environmental or human subject samples, based on the methods and processes of the above embodiments.
Embodiment 22: The process of Embodiment 19, for the detection of Sexually Transmitted Diseases.
Embodiment 23: An apparatus and kits for testing for STDs, based on the methods and processes of the above embodiments.
Embodiment 24: The process of the above embodiments, used for the detection of food borne pathogens.
Embodiment 25: An apparatus and kits for testing for food borne pathogens, based on the methods and processes of the above embodiments.
This application is a National Phase of PCT/US21/37469 filed on Jun. 15, 2021, which claims priority to U.S. provisional application having Ser. No. 63/039,337 by Barry Merriman et al., filed on Jun. 15, 2020, and entitled ‘Molecular Electronic Sensors for Multiplex Genetic Analysis using DNA Reporter Tags,’ both of which are incorporated by reference herein.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/037469 | 6/15/2021 | WO |
Number | Date | Country | |
---|---|---|---|
63039337 | Jun 2020 | US |