The disclosure relates to methods and systems for high throughput protein discovery.
As genome sequences of various organisms have become available, it is now possible to analyze protein functions on a genome-wide scale. Structural genomics and proteomics, therefore, have become major research foci. Several protein function screening platforms have been developed and used for various purposes, e.g., developing novel antibiotics and novel cancer therapies. However, these platforms have various limitations that prevent them from being used to characterize the large number of proteins, peptides, and enzymes that have been uncovered by genomic and metagenomic sequencing efforts. For example, in vivo cell-based platforms for protein expression and evaluation take advantage of the cell as a natural environment for efficient protein production and functional assays.
Such platforms based on prokaryotic organisms such as Escherichia coli or eukaryotic model cells such as Human Embryonic Kidney cells are often favored for being well characterized and simple to manipulate. However, using conventional methods of protein extraction and purification, the number of proteins that can be synthesized and studied is limited compared to the number of proteins identified from genomic and metagenomic sequencing. Pooled screening techniques that enable the simultaneous testing of multiple constructs suffer from readouts that cannot adequately measure a diverse set of protein functions or separate functional from non-functional protein candidates within a pool. Furthermore, all such cell-based methods are limited in that a significant number of proteins cannot be adequately expressed in vivo. For example, expressing heterologous proteins in E. coli often leads to insoluble aggregated folding intermediates, known as inclusion bodies.
There remains a need for an ultra-high throughput protein discovery platform to address pressing needs in human health, sustainability, and beyond.
The disclosure provides systems and methods of leveraging genomic and metagenomic sequencing with large-scale gene synthesis, non-cellular protein synthesis, and low volume protein functional assays for ultrahigh throughput protein discovery and characterization. The systems and methods described in the present disclosure can perform hundreds of thousands of reactions or more per run and, importantly, screen a diverse and versatile set of protein activities across a number of different domains and applications, including but not limited to genome editing, biologic drug discovery, agricultural insecticides, and advancing environmental sustainability. Such identified proteins can provide novel biotechnological applications, as well as add diversity of features and versatility to known protein activities.
In one aspect, the disclosure features microwell array systems having a microwell array including a plurality of isolated microwells, each microwell having side walls, a bottom wall, and a top opening, wherein the microwells are positioned in an array, and wherein each well comprises one or more filter holes arranged in the bottom wall of the microwell; a cover, e.g., a movable plate or immiscible fluid, arranged to optionally and selectively cap one or more of the filter holes; a reservoir to receive waste liquids exiting the microwells through the filter holes, through the top opening of microwells, or both; a substrate to receive contents of one or more of the microwells deposited at one or more locations of a microarray (also known as a “blotting plate”), wherein each location has a known coordinate within the microarray; a system for adding liquids to each microwell; a system for adding microbeads to each microwell; and a system for selecting and marking selected contents at specified locations in the microarray, and optionally, a system to decode contents with given coordinates.
In some embodiments, the volume of each isolated microwell is about 0.5 picoliters to about 100 nanoliters (nl), each microwell has a diameter of from about 5 to 200 microns, and each filter hole has a diameter of from about 0.5 to 150.0 microns. In certain embodiments, the volume of each isolated microwell is less than 100 nanoliters (nl), 50 nl, 10 nl, 5 nl, 1 nl, 500 picoliters (pl), 250 pl, 100 pl, 50 pl, 25 pl, 20 pl, 15 pl, 10 pl, 5 pl, or 1 pl.
In some implementations, each microwell has a diameter of less than 200, 150, 100, 90, 80, 70, 60, 50, 40, 30, 20, or 10 microns and each filter hole has a diameter of about 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 50, 75, 100, 125, or 150 microns.
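By way of illustration only, the relationship between the microwell dimensions and the microwell volumes recited above can be sketched as follows. The sketch assumes a cylindrical microwell and a hypothetical well depth; neither the well shape nor the depth is specified above, and the function name is illustrative.

```python
import math


def cylinder_volume_pl(diameter_um: float, depth_um: float) -> float:
    """Return the volume, in picoliters, of a cylindrical well.

    Assumption: the well is a right circular cylinder. The depth is a
    hypothetical parameter that the text does not specify.
    Unit conversion: 1 cubic micron = 1 femtoliter = 0.001 picoliter.
    """
    if diameter_um <= 0 or depth_um <= 0:
        raise ValueError("diameter and depth must be positive")
    volume_um3 = math.pi * (diameter_um / 2.0) ** 2 * depth_um
    return volume_um3 * 1e-3


if __name__ == "__main__":
    # Hypothetical (diameter, depth) pairs in microns, chosen so that the
    # diameters fall within the recited 5 to 200 micron range.
    for diameter_um, depth_um in [(10.0, 10.0), (50.0, 50.0), (200.0, 200.0)]:
        volume = cylinder_volume_pl(diameter_um, depth_um)
        print(f"{diameter_um:.0f} um x {depth_um:.0f} um -> {volume:.2f} pl")
```

Under these assumptions, a well 50 microns in diameter and 50 microns deep holds roughly 98 picoliters, which falls within the recited range of about 0.5 picoliters to about 100 nanoliters.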
In some embodiments, the microwell array has at least 5 K, 10 K, 50 K, 100 K, 250 K, 500 K, 1 M, 5 M, 10 M, or 15 M microwells. In some embodiments, the microwell array has at least 100, 1 K, 5 K, 10 K, 50 K, or 100 K microwells per cm².
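As a rough worked example of how the recited microwell counts and densities relate, the sketch below computes the array area needed for a given number of microwells and the center-to-center spacing of the wells. The square-grid layout and the function names are assumptions made for illustration; the layout of the array is not limited to a square grid.

```python
import math


def array_area_cm2(n_wells: int, wells_per_cm2: float) -> float:
    """Area in square centimeters needed for n_wells at a given density."""
    if n_wells <= 0 or wells_per_cm2 <= 0:
        raise ValueError("well count and density must be positive")
    return n_wells / wells_per_cm2


def square_pitch_um(wells_per_cm2: float) -> float:
    """Center-to-center spacing in microns, assuming a square grid."""
    if wells_per_cm2 <= 0:
        raise ValueError("density must be positive")
    pitch_cm = math.sqrt(1.0 / wells_per_cm2)
    return pitch_cm * 1e4  # 1 cm = 10,000 microns


if __name__ == "__main__":
    # One million wells at 100 K wells per square centimeter.
    print(array_area_cm2(1_000_000, 100_000))  # area in cm^2
    print(round(square_pitch_um(100_000), 1))  # pitch in microns
```

For example, one million microwells at 100 K microwells per cm² occupy about 10 cm², with a square-grid pitch of roughly 32 microns, so the well diameter at that density would need to be smaller than that pitch.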
In certain embodiments, the inner walls of each microwell are hydrophilic, and surfaces of the microwell array and of the cover are hydrophobic.
In some implementations, the system for adding liquids adds liquids to each microwell via capillary force.
In certain implementations, the system for adding liquids includes one or more microfluidic channels. In some embodiments, the system for adding liquids includes a liquid jetting system. In some embodiments, the system for adding liquids includes a pressure or vacuum pump.
In some implementations, the system for adding liquids to each microwell includes a motor arranged to rotate the microwell array to distribute a liquid across a surface of the microwell array and into each microwell by spin-coating.
In certain implementations, the motor is controlled to spin sufficiently fast to remove excess liquids once the microwells are filled by the liquids.
In some embodiments, the diameter of the filter holes is smaller than a diameter of beads used with the system.
In certain embodiments, each microwell includes two or more filter holes, wherein all filter holes are smaller than a diameter of beads used with the system and wherein second and any subsequent filter holes are smaller than the first filter hole.
In some embodiments, the filter hole (e.g., having a rectangular or triangular shape) cannot be sealed completely by the beads, and the remaining gap serves as a draining port for liquid.
In some implementations, the protein screening system further includes a centrifugation system arranged to empty waste liquids in the microwells by centrifugation.
In certain implementations, the liquid in each microwell is deposited on the substrate by centrifugation or air pressure.
In some embodiments, the liquids are reagents used for screening including emulsions, suspensions, and cell-free protein synthesis reagents.
In certain embodiments, each microwell includes one filter hole, wherein the filter hole is smaller than a diameter of the beads. The filter hole (e.g., having a rectangular, triangular, or other shape) is not sealed completely by the beads, and the remaining gap serves as a draining port for liquid.
In another aspect, the disclosure provides methods of identifying a nucleic acid molecule encoding a polypeptide and/or RNA having a desired bioactivity. The methods include:
(a) attaching a plurality of nucleic acid constructs to a plurality of beads;
(b) loading the plurality of beads into microwells in a microwell array, e.g., the microwell array system described herein, wherein each microwell in the microwell array receives one or more beads, e.g., at most one bead;
(c) incubating the nucleic acid constructs with in vitro transcription/translation (IVTT) reagents for a time sufficient to produce a plurality of polypeptides encoded by the nucleic acid constructs in the microwell array;
(d) depositing nucleic acid constructs or polypeptides from each microwell in the microwell array at specific discrete locations on a substrate to form a ‘blotting plate’ of nucleic acid constructs or polypeptides preserving the spatial relationship of the samples, wherein each location in the blotting plate has a known coordinate that corresponds to a specific microwell in the microwell array;
(e) determining a bioactivity of the polypeptides and/or RNA in the microwells or on the blotting plate and selecting a microwell or location on the blotting plate corresponding to a desired bioactivity; and
(f) determining which nucleic acid constructs correspond to the selected microwell or location on the blotting plate corresponding to the desired bioactivity, thereby identifying the nucleic acid construct that corresponds to the polypeptide and/or RNA having the desired bioactivity.
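The correspondence between a microwell and its location on the blotting plate, and the final lookup of constructs at selected locations, can be sketched as follows. This is a minimal sketch under stated assumptions: wells are numbered in row-major order, the blotting plate uses the same (row, column) coordinates as the microwell array without mirroring, and the layout dictionary and construct names are hypothetical.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]  # (row, column), zero-based


def well_index_to_coord(index: int, n_cols: int) -> Coord:
    """Convert a row-major well index into a (row, column) coordinate."""
    if index < 0 or n_cols <= 0:
        raise ValueError("index must be >= 0 and n_cols must be > 0")
    return divmod(index, n_cols)


def coord_to_well_index(coord: Coord, n_cols: int) -> int:
    """Convert a (row, column) coordinate back into a row-major index."""
    row, col = coord
    if row < 0 or not 0 <= col < n_cols:
        raise ValueError("coordinate is outside the array")
    return row * n_cols + col


def constructs_at(selected: List[Coord],
                  layout: Dict[Coord, List[str]]) -> Dict[Coord, List[str]]:
    """Look up the construct identifiers recorded at each selected location.

    Locations with no recorded construct map to an empty list.
    """
    return {coord: layout.get(coord, []) for coord in selected}


if __name__ == "__main__":
    # Hypothetical 2 x 3 layout with made-up construct identifiers.
    layout = {(0, 0): ["construct_A"], (1, 2): ["construct_B", "construct_C"]}
    hits = [well_index_to_coord(5, n_cols=3)]  # well 5 sits at row 1, column 2
    print(constructs_at(hits, layout))
```

If the deposition step mirrors the array (for example, when the substrate faces the microwell openings), the column index would need to be flipped before the lookup.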
In some embodiments, the methods further include assembling the plurality of nucleic acid constructs in each microwell by releasing oligo fragments of the nucleic acid constructs and assembling the oligo fragments, e.g., by polymerase cycling assembly, Golden gate assembly, or Gibson assembly.
In certain embodiments, each bead is bound to one or more nucleic acid constructs.
In some implementations, the one or more nucleic acid constructs at the location on the substrate that corresponds to the microwell containing the polypeptide having the desired bioactivity are selected by light-induced DNA trapping, light-induced surface charge switching, light-induced pH change, light-induced dissociation, laser microdissection, a micromanipulator, or another mechanical picking method.
In certain implementations, the one or more nucleic acid constructs at the location on the substrate that corresponds to the microwell containing the polypeptide having the desired bioactivity are selected by sealing the nucleic acid constructs with a sealing reagent.
In some embodiments, the one or more nucleic acid constructs at the location on the substrate that corresponds to the microwell containing the polypeptide having the desired bioactivity are selected by hybridizing the nucleic acid constructs with a set of fluorescence probes.
In certain embodiments, the one or more nucleic acid constructs at the location on the substrate that corresponds to the microwell containing the polypeptide having the desired bioactivity are selected by a light-activated nuclease that releases the one or more nucleic acid constructs into solution for collection and sequencing to identify the constructs that correspond to the polypeptides that exhibit the desired bioactivity.
In some implementations, the one or more nucleic acid constructs at the location on the substrate that corresponds to the microwell containing the polypeptide having the desired bioactivity are selected automatically when the polypeptide catalyzes a reaction that generates an air bubble, which expels liquid containing the nucleic acid constructs from the microwell.
In certain implementations, the one or more nucleic acid constructs at the location on the substrate that corresponds to the microwell containing the polypeptide having the desired bioactivity are selected automatically when the polypeptide catalyzes a reaction, or creates a condition, that deforms or dissolves the beads so that the nucleic acid constructs can pass through the filter holes.
In some embodiments, the bioactivity of the polypeptide is analyzed by a catalytic reaction, a binding assay, or a cleavage assay that results in an optical signal (e.g., fluorescence or absorption).
In one aspect, the disclosure also provides methods of adding a liquid to a plurality of isolated microwells on a microwell array. The methods include applying a liquid to the microwell array; rotating the microwell array at a first speed, thereby filling each microwell on the microwell array with the liquid; and rotating the microwell array at a second speed, thereby removing excess liquid from the top of the microwell array.
In some embodiments, the first speed is slower than the second speed. In certain embodiments, the liquid is applied to the microwell array continuously.
In some implementations, the liquid contains a plurality of beads. In certain implementations, the excess liquid that is removed from the microwell array is less than 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, or 80% of total liquid that is applied to the microwell array.
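The two-speed filling method described above can be represented, purely as an illustrative sketch, by a small protocol object. The sketch enforces the embodiment in which the first (filling) speed is slower than the second (clearing) speed. The speeds, durations, and names used here are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SpinStep:
    purpose: str       # "fill" or "clear"
    speed_rpm: float   # rotation speed (hypothetical values in the demo)
    duration_s: float  # time spent at that speed


def two_stage_protocol(fill_rpm: float, clear_rpm: float,
                       fill_s: float, clear_s: float) -> List[SpinStep]:
    """Build a fill-then-clear spin protocol.

    Enforces the embodiment in which the first speed is slower than the
    second speed; other embodiments are not modeled here.
    """
    if not 0 < fill_rpm < clear_rpm:
        raise ValueError("fill speed must be positive and below clear speed")
    if fill_s <= 0 or clear_s <= 0:
        raise ValueError("durations must be positive")
    return [SpinStep("fill", fill_rpm, fill_s),
            SpinStep("clear", clear_rpm, clear_s)]


def excess_fraction(applied_ul: float, removed_ul: float) -> float:
    """Fraction of the applied liquid that is removed as excess."""
    if applied_ul <= 0 or not 0 <= removed_ul <= applied_ul:
        raise ValueError("volumes must satisfy 0 <= removed <= applied, applied > 0")
    return removed_ul / applied_ul


if __name__ == "__main__":
    for step in two_stage_protocol(500.0, 3000.0, 30.0, 10.0):
        print(step)
    print(f"excess removed: {excess_fraction(100.0, 5.0):.0%}")
```

The excess fraction computed at the end can be compared against the recited limits (e.g., less than 5% or 10% of the total liquid applied).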
In another aspect, the disclosure features a centrifuge system having: a support for a microwell array, wherein the support is arranged for rotation and comprises a plate configured for connection to a microwell array and for receiving liquids from the microwell array; a liquid dispenser positioned above a surface of the microwell array and configured to dispense one or more liquids onto the center of the surface of the microwell array; a first motor arranged to rotate the support around a central axis (812) of the microwell array connected to the support when the microwell array is in a horizontal position; and a second motor arranged to move the microwell array into a vertical position, and to rotate the microwell array around an axis (912) that is perpendicular to the central axis (812) of the microwell array and parallel to the vertically positioned microwell array surface.
In some embodiments, the microwell array comprises a plurality of isolated microwells, each microwell having side walls, a bottom wall, and a top opening, and wherein each microwell comprises one or more filter holes arranged in the bottom wall of the microwell.
In certain embodiments, the volume of each isolated microwell is about 0.5 picoliters to about 50 nanoliters (nl), e.g., 1, 5, 10, 25, 50, 75, 100, 250, 500, 750, 1000 picoliters or 1, 5, 10, 20, 30, 40, or 50 nanoliters, each microwell has a diameter of from about 5 to 200 microns, e.g., 5, 10, 15, 25, 50, 75, 100, 150, or 200 microns, and each filter hole has a diameter of from about 0.5 to 40.0 microns, e.g., 0.5, 1.0, 5.0, 10.0, 20.0, 30.0, or 40.0 microns.
In some implementations, the first motor is controlled to rotate the support sufficiently fast to remove excess liquids from the surface of the microwell array once the microwells are filled by the liquids.
In certain implementations, the second motor is controlled to spin sufficiently fast to move liquids out of microwells through the filter holes and onto the plate.
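What counts as "sufficiently fast" depends on the rotor geometry and on the force needed to drive liquid through the filter holes, neither of which is quantified above. As a hedged sketch, the standard relation between rotation speed, rotation radius, and relative centrifugal force (RCF, in multiples of standard gravity) can be used to convert between a motor speed and the acceleration experienced at the array. The radius and speeds below are hypothetical.

```python
import math

STANDARD_GRAVITY_M_S2 = 9.80665


def rcf_from_rpm(rpm: float, radius_m: float) -> float:
    """Relative centrifugal force (x g) at a given speed and radius.

    Uses RCF = (angular velocity)^2 * radius / g, with the angular
    velocity in radians per second.
    """
    if rpm < 0 or radius_m <= 0:
        raise ValueError("rpm must be >= 0 and radius must be > 0")
    omega = 2.0 * math.pi * rpm / 60.0
    return omega ** 2 * radius_m / STANDARD_GRAVITY_M_S2


def rpm_for_rcf(target_rcf: float, radius_m: float) -> float:
    """Speed in rpm needed to reach a target RCF at a given radius."""
    if target_rcf < 0 or radius_m <= 0:
        raise ValueError("target RCF must be >= 0 and radius must be > 0")
    omega = math.sqrt(target_rcf * STANDARD_GRAVITY_M_S2 / radius_m)
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    # Hypothetical rotation radius of 10 cm.
    print(round(rcf_from_rpm(1000.0, 0.10), 1))
    print(round(rpm_for_rcf(500.0, 0.10)))
```

At a 10 cm radius, 1000 rpm corresponds to roughly 112 x g under this relation. The RCF actually required to move liquid through a given filter hole would need to be determined empirically.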
In one aspect, the disclosure also provides methods of selectively releasing one or more nucleic acid constructs from a substrate (e.g., plate, beads, microwell array). The methods include providing a substrate comprising an array of nucleic acid constructs; adding a photosensitive agent to the substrate; exposing one or more selected locations on the substrate to light, wherein the light induces the photosensitive agent to cross-link the polymer layer at the selected locations, thereby trapping nucleic acid constructs at the selected locations within the substrate; and washing the substrate with a wash solution, thereby releasing one or more nucleic acid constructs from unselected locations.
In some embodiments, the one or more selected locations are exposed to light by using a light projector with a predetermined pattern. In certain embodiments, the substrate plate is covered by a photomask, and the one or more selected locations are exposed to light by uncovering portions of the photomask at the selected locations.
In some implementations, the methods further include sequencing the one or more nucleic acid constructs in the wash solution. In certain implementations, the methods further include releasing and sequencing the nucleic acid constructs that are trapped by the cross-linked polymer.
In another aspect, the disclosure also provides methods of selectively releasing one or more nucleic acid constructs from a surface. The methods include providing a surface comprising an array of nucleic acid constructs, wherein the nucleic acid constructs are attached to the surface through an electronic charge interaction; and exposing one or more selected locations on the surface to light, wherein the light induces charge-switching of the surface, thereby releasing nucleic acid constructs at the selected locations on the surface.
In some embodiments, one or more selected locations are exposed to light by using a light projector with a predetermined pattern. In certain embodiments, the plate is covered by a photomask, and the one or more selected locations are exposed to light by uncovering portions of the photomask at the selected locations.
In some implementations, the methods further include sequencing the one or more nucleic acid constructs that are released from the plate. In certain implementations, the methods further include releasing and sequencing the nucleic acid constructs at unselected locations.
In one aspect, the disclosure further relates to methods for loading of beads into microwells such that microwells contain either one or no beads, and that a low percentage of the microwells contain two or more beads. The methods include obtaining a plurality of beads in a liquid; obtaining a microwell array system described herein, wherein each microwell comprises one or more larger filter holes and one or more smaller filter holes; wherein each larger filter hole has a diameter that is smaller than a smallest outer diameter of the plurality of beads and is sized to enable the beads to seat within and block the larger filter holes, thereby decreasing flow of the liquid through the larger filter holes; wherein each smaller filter hole has a diameter that is smaller than the diameter of the larger filter holes and sufficiently smaller than the smallest outer diameter of the plurality of beads such that the beads cannot block the flow of the liquid through the smaller filter holes; and wherein blocking of the larger filter holes by one bead automatically prevents any additional bead from entering the microwell because of a decreased flow rate of the liquid through the microwell, while the smaller filter holes enable the liquid to drain slowly from the microwell to relieve pressure and to inhibit the beads from unblocking the one or more larger filter holes.
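The size relationships recited for this loading method can be checked with a short sketch such as the following. It tests only the two orderings stated above (each larger filter hole is smaller than the smallest bead, and each smaller filter hole is smaller than every larger filter hole). How much smaller than the bead a smaller hole must be is not quantified above, so that margin is not checked. The function name and the example dimensions are hypothetical.

```python
from typing import List


def check_filter_holes(min_bead_diameter_um: float,
                       larger_holes_um: List[float],
                       smaller_holes_um: List[float]) -> List[str]:
    """Return a list of violated size constraints (empty if none).

    Checked constraints:
      1. each larger hole is smaller than the smallest bead diameter;
      2. each smaller hole is smaller than every larger hole.
    """
    if not larger_holes_um or not smaller_holes_um:
        raise ValueError("at least one larger and one smaller hole are required")
    problems: List[str] = []
    for hole in larger_holes_um:
        if hole >= min_bead_diameter_um:
            problems.append(
                f"larger hole {hole} um is not smaller than bead "
                f"{min_bead_diameter_um} um")
    smallest_larger_hole = min(larger_holes_um)
    for hole in smaller_holes_um:
        if hole >= smallest_larger_hole:
            problems.append(
                f"smaller hole {hole} um is not smaller than larger hole "
                f"{smallest_larger_hole} um")
    return problems


if __name__ == "__main__":
    # Hypothetical design: 10 um beads, one 6 um hole, two 2 um holes.
    print(check_filter_holes(10.0, [6.0], [2.0, 2.0]))
    # A design in which a bead could pass through the larger hole.
    print(check_filter_holes(10.0, [12.0], [2.0]))
```

A design that returns an empty list satisfies the stated orderings; it does not by itself show that single-bead loading will occur, which also depends on flow conditions.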
In another aspect, the disclosure provides methods of selectively trapping targets in one or more microwells of interest on a microwell array. The methods include identifying one or more microwells of interest; and selectively exposing the one or more microwells of interest to light to induce polymerization of a polymer solution in the one or more microwells of interest, thereby trapping targets in the one or more microwells of interest.
In some embodiments, identifying one or more microwells of interest includes analyzing fluorescent signals from the microwell array. In some embodiments, the one or more microwells of interest are exposed to light using a photomask. In certain embodiments, the one or more microwells of interest are exposed to light using a projector. In some embodiments, the targets are beads, nucleic acid constructs, or proteins.
In another aspect, the disclosure provides methods of selectively releasing targets in one or more microwells of interest on a microwell array. The methods include identifying one or more microwells of interest; selectively exposing the microwells on the array to light except the one or more microwells of interest, wherein targets in microwells on the array except the one or more microwells of interest are trapped in the microwells due to polymerization of a polymer solution; and collecting targets from the one or more microwells of interest.
In some embodiments, identifying one or more microwells of interest comprises analyzing fluorescent signals from the microwell array. In some embodiments, the one or more microwells of interest are exposed to light using a photomask. In some embodiments, the one or more microwells of interest are exposed to light using a projector. In some embodiments, the targets are beads, nucleic acid constructs, or proteins.
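The two illumination schemes described above (exposing only the microwells of interest to trap their targets, or exposing every microwell except those of interest so that only the targets of interest remain collectable) differ only in whether the exposure pattern is inverted. A minimal sketch of generating such patterns is shown below. The representation of a pattern as a grid of booleans (True meaning "expose to light") and the function names are assumptions for illustration, not a description of any particular projector or photomask interface.

```python
from typing import Iterable, List, Tuple

Coord = Tuple[int, int]  # (row, column), zero-based


def selection_mask(n_rows: int, n_cols: int,
                   wells_of_interest: Iterable[Coord]) -> List[List[bool]]:
    """Exposure pattern that illuminates only the wells of interest."""
    if n_rows <= 0 or n_cols <= 0:
        raise ValueError("array dimensions must be positive")
    mask = [[False] * n_cols for _ in range(n_rows)]
    for row, col in wells_of_interest:
        if not (0 <= row < n_rows and 0 <= col < n_cols):
            raise ValueError(f"well ({row}, {col}) is outside the array")
        mask[row][col] = True
    return mask


def invert_mask(mask: List[List[bool]]) -> List[List[bool]]:
    """Exposure pattern that illuminates every well except those selected."""
    return [[not exposed for exposed in row] for row in mask]


if __name__ == "__main__":
    trap_pattern = selection_mask(2, 3, [(0, 1)])
    release_pattern = invert_mask(trap_pattern)
    print(trap_pattern)     # expose only the well of interest
    print(release_pattern)  # expose all wells except the well of interest
```

In practice the pattern would also need to be registered to the physical positions of the microwells, which this sketch does not address.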
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials are described herein for use in the present invention; other, suitable methods and materials known in the art can also be used. The materials, methods, and examples are illustrative only and not intended to be limiting. All publications, patent applications, patents, sequences, database entries, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control.
Other features and advantages of the invention will be apparent from the following detailed description and figures, and from the claims.
The present disclosure relates to methods and systems for searching for proteins having various functions from millions of different organisms and for engineering these proteins for different purposes. Examples include antibodies, single-chain antibodies, ligases, transposases, methylases, nucleases, transcription factors, sortases, kinases, ubiquitinases, adenylases, proteases, phosphatases, deubiquitinases, anti-microbial peptides, defensins, and receptor-interacting peptides/proteins. In other examples, proteins can be one or more components of a Clustered Regularly Interspaced Short Palindromic Repeats (“CRISPR”) system.
The ultrahigh throughput protein discovery systems described herein allow one to access untapped resources of biodiversity. As compared to some traditional approaches, the ultrahigh throughput protein discovery described herein is based on genomic and metagenomic mining of many living organisms. After candidate sequences are identified, variations (e.g., random mutations or designed mutations) are introduced into the candidate sequences, and an ultrahigh throughput method is used to screen these proteins for specific, e.g., desired, functions. This approach can dramatically increase the efficiency of finding genes of interest, screening proteins for the desired function, and producing engineered proteins with desired characteristics. The methods described herein represent an ultrahigh throughput-screening tool, and can be used to develop, for example, gene therapies, diagnostic tools, and industrial catalysts, and can also be used in various fields, e.g., medicine, agriculture, and synthetic biology.
There are several key features to the methods and systems described herein for ultrahigh throughput protein discovery: 1) an in vitro synthetic pathway from DNA to RNA to protein and then to the final assay, all completely free of the environmental or cellular context of the original genetic material, providing a high level of control and additional freedom from toxicity; 2) the design and usage of the advanced microwell arrays that enable significantly greater versatility in reagent handling and reactions tested; 3) assay conditions and readouts that are consistent with the required scale, efficiency, and format; and 4) selection methods that enable efficient identification of specific constructs giving rise to positive signal from within a large number of reactions. The successful construction and implementation of the complete system requires a synergistic design with innovations in each of the key individual features, as well as in how they are combined into efficient methods of ultrahigh throughput protein discovery. An overview of these key features and their integration is provided below.
Once the natural or engineered protein sequences to be screened are determined, the DNA sequences coding for them are codon optimized and synthesized. This synthetic approach takes advantage of the rapid advances in DNA synthesis capabilities that have yielded increased lengths of high fidelity synthesis products at continually decreasing costs. This differs from past methods of biodiscovery or bioprospecting that have relied on harvesting and amplifying nucleic acids directly from environmental samples (see, e.g., WO1998058085) or required deciphering of specific growth conditions for organisms of interest, many of which were unable to be cultured in a laboratory. Additionally, the synthetic approach allows other functionalization and modification of DNA, including, but not limited to, nucleic acid modifications such as biotinylation, fluorescence tagging, alternative base chemistries such as dideoxy or phosphorothioate modifications for resistance to specific enzymatic activities, and sequence additions such as specific barcodes, hybridization sites, or expression elements such as promoters. Together, the synthetic approach starting at the DNA provides a much more versatile set of methods that can be leveraged for efficient processing and larger scale.
In some embodiments, the synthetic DNA is modified with either barcodes, biotin, or other tags to enable them to be efficiently loaded into microwells. This loading can be enabled either by direct attachment of the DNA to a functionalized surface of the microwells, or via an indirect mechanism in which the synthetic DNA are first loaded onto carriers such as microbeads, which are then deposited into a microwell array for downstream reactions.
After loading the synthesized DNA into the microwell array, the synthetic methods are used to generate the functional RNA and protein macromolecules. In some embodiments, cell free in vitro transcription and translation (IVTT) systems are used, enabling the expression of RNA and protein without any environmental or cellular constraints. This technique differs from traditional bio-discovery approaches in that the methods described herein do not require culturing specific organisms for obtaining bioactive compounds. Thus, the proteins are not subject to culture, toxicity, or other conditions that would need to be either laboriously optimized for individual proteins of interest or otherwise precluded from being screened in cells altogether. Additionally, the synthesis is rapid, yielding amounts of protein compatible with microwell-based reactions in a few hours, versus the days required for traditional methods of recombinant protein expression and purification. Together, these properties enable the use of cell-free synthesis described in these methods to provide greater versatility as the basis for the ultrahigh throughput protein screening platform.
The microwell array systems described herein are particularly well suited to enabling such a cell-free, synthetic approach to biodiscovery. Microwells are characterized by their capability to perform a large number of low-volume reactions simultaneously and by their reaction versatility. We describe in this system a series of liquid handling operations and instrumentation modifications to perform biological and biochemical reactions in microwell arrays while limiting the waste of reagents and/or samples that are typically costly and/or in very limited supply. We also describe embodiments in which bead-based filtering and washing take advantage of workflows for high throughput macromolecule manipulation and combine them with novel microwell designs to enable greater versatility in reaction conditions. Together, these provide novel capabilities of both throughput and functional diversity to enable ultrahigh throughput protein screening.
In certain embodiments of the invention, beads are used to load the synthesized nucleic acids into the advanced microwell arrays described herein. While beads have been used to separate desired reaction products from byproducts, buffers, and impurities, it is particularly difficult to handle beads in a microwell array screening system. The new microwell array systems described herein feature one or more filter holes at the bottom of the microwells, which can be optionally sealed by a movable plate or equivalent sealing mechanism, such as an oil sealing method. With this design, the microwell array systems can retain the reaction products (e.g., by attaching the reaction products to the beads or functionalized surfaces of the microwells) while other contents in the microwells are removed and/or exchanged. When one or both sides of the microwell array are sealed, the microwell array can also be used as standard vessels or containers for various reactions. The systems described herein allow numerous reactions (e.g., DNA synthesis, transcription, translation, and function assays) to occur in a single microwell platform, which makes large-scale biodiscovery (including e.g., gene synthesis, non-cellular protein synthesis, and screening assays) possible.
An ultrahigh throughput biodiscovery platform requires the ability to support versatile reactions and assays to assess function across a wide range of protein activities. Past discovery platforms based on low volume techniques such as microwells or microfluidics can be limited in their ability to facilitate exchange of reaction environments such as buffers, ion concentrations, substrates, and pH. In the system and methods described herein utilizing in vitro transcription and translation reagents (IVTT), the proteins of interest synthesized can be separated from the IVTT mixture, enabling a wide range of functional assays that may not otherwise be compatible with the IVTT mixture.
Additionally, we describe assay formats that can be adapted for this system so that they can be read out in parallel and at high throughput. In some embodiments, these methods include the use of reactions whose functional endpoint results in a fluorescent signal to enable rapid detection via microscopy. In other embodiments, the functional endpoint results in the generation or release of a nucleic acid barcode fragment or similar identifier that can be collected and sequenced to identify the proteins resulting in positive functional activity. Altogether, the ability to exchange the solutions in the reaction environment expands the range of assays that can be performed in low volume microwells, generalizing them to encompass more conventional macro-scale biochemical or molecular biology workflows that would be challenging and costly to scale.
The new systems are capable of assaying large numbers of proteins and reactions in a given run, and so one challenge is the specific identification and retrieval of the candidates that yielded the positive result. In bead or droplet-based systems, sorting has conventionally been used to enrich the samples that have a positive signal. In a microwell-based platform in which the reactions are not assayed sequentially and are instead distributed spatially, alternate methods of extracting the identity from positive signal are needed.
Given that the ability to identify specific constructs that provide a positive signal from a large number of reactions is a critical component of efficient protein discovery and engineering, several technologies that are compatible with the system and methods are described herein. These range from direct selection of the positive wells, to targeting constructs using barcodes onto a pre-labeled array, to decoding a randomized array, to other molecular biology reactions that enable the release of signals that enable the retrieval of the exact constructs that led to positive activity.
By combining and analyzing these two streams of information of the positive signal as well as the underlying sequence, the sequence-to-function information for a large number of genes can be obtained. In some embodiments, the reaction is run and the DNA sequence is analyzed in the same microwell.
The present disclosure provides ultrahigh throughput screening methods and platforms characterized by great versatility and ultralow volume of reaction reagents.
In a first general step, the systems can synthesize physical nucleic acid constructs (e.g., DNA and/or RNA) in a single linear or circular form. As used herein, the term “nucleic acid construct” refers to a DNA, RNA, or other nucleic acid molecule. Such nucleic acid molecules are synthetic, but can be or include naturally occurring sequences and/or manmade sequences.
The nucleic acid constructs encode either the active components (e.g., enzymes or ribozymes) or substrates of a reaction. In some embodiments, the nucleic acid constructs encode the active protein components (e.g., an enzyme, a ribozyme, or one or more components of a CRISPR system) of a reaction of interest. The active components can then act on a chemical or a biological substrate. In some embodiments, the nucleic acid constructs encode the substrate of a reaction (e.g., a ligand, or a protein that can be modified (e.g., phosphorylated) by an enzyme). The substrates can be modified or catalyzed by the active components in a reaction.
In a next step, the systems automatically dispense the synthesized nucleic acid constructs (e.g., DNA or RNA) into containers, e.g., microwells, droplets, or beads. In some embodiments, one copy or multiple copies of a single nucleic acid construct are dispensed into a single microwell or droplet, or a single copy or a few copies are dispensed and amplified within each microwell. In other embodiments, multiple constructs can be loaded onto a single microwell, droplet, or bead to enable expression of one or more constructs simultaneously. These microwells or droplets have a very low volume, and can range from about 0.5 picoliters to about 50 nanoliters.
In the next step, in vitro transcription and translation (IVTT) reagents are automatically added to the microwells or droplets to perform high throughput protein synthesis, which results in expression of RNA and protein products from the nucleic acid constructs. Then, the systems can automatically add to the microwells or droplets active reaction reagents and/or substrates that are common to all reactions. These materials can be introduced before, simultaneously with, or after the addition of IVTT reagents.
In the next step, the systems can incubate the RNA products (e.g., ribozymes, noncoding RNAs) and/or protein products generated from in vitro transcription and translation with the reaction reagents and/or substrates for a period of time (e.g., at least 5 minutes, 10 minutes, 15 minutes, 20 minutes, 30 minutes, 1 hour, 2 hours, 3 hours, 4 hours, or more) at a certain temperature (e.g., 25° C., 37° C., or other temperatures), sufficient to produce reaction products that can be detected and/or measured using massively parallel functional testing assays. Various known detection methods can be used, including spectroscopy (such as fluorescence spectroscopy, ultraviolet-visible spectroscopy (UV-VIS), Raman spectroscopy, surface-enhanced Raman spectroscopy, and absorption spectroscopy), spectrometry (e.g., fluorometry), surface plasmon resonance, field effect transistor, and second-harmonic generation. In addition to the various assay techniques, other methods can be used to specifically capture or release constructs that demonstrate the desired activity for identification. Based on the detection results, the systems automatically provide the user with a functional characterization report for the RNA products and/or protein products in each separate well or droplet, whose location within a coordinate system is known. Then the systems automatically determine and identify the nucleotide sequence information of the nucleic acid constructs that correspond to a specific reaction result.
Proteins with desired characteristics can be selected for further genome mining and engineering.
The systems described herein also provide a versatile platform for a variety of different assays. In these platforms, multiple assays can be developed for different protein activities. These activities include, e.g., enzymatic activity, binding activity, cleavage activity, and bond-formation. In some embodiments, these activities generate optical signals (e.g., fluorescence, chemiluminescence, phosphorescence, color change, absorption change, and precipitation) or non-optical signals (e.g., heat, pH, volume, capacitance, impedance, conductivity, and other physical change), which can be detected by appropriate devices. In some embodiments, one single assay can be used to detect high dynamic range for individual activities. In some embodiments, multiple related activities can be screened in a single assay.
In some embodiments of the system, microwell assays are used to provide a technology that enables high throughput and versatile reaction conditions. The microwell arrays can be used to synthesize and/or screen nucleic acid constructs, peptides, and proteins. The advanced microwell designs used in the microwell arrays described herein can serve as the container for low volume reactions (
As shown in
Each microwell can have a diameter (110) of from about 10 to about 100 microns (μm), e.g., from 20 to 100 μm, from 30 to 90 μm, or from 60 to 80 μm. In some embodiments, the diameter (110) is less than 100 μm, 90 μm, 80 μm, 70 μm, 60 μm, 50 μm, 40 μm, 30 μm, 20 μm, or 10 μm. In some embodiments, the diameter (110) is greater than 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, or 100 μm.
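To relate these diameters to the reaction volumes mentioned above, the following Python sketch estimates the volume of a single microwell. It assumes a simple cylindrical well and takes the well depth as an input; the cylindrical shape, the function name `microwell_volume_pl`, and the example depths are assumptions made for illustration and are not dimensions stated in this passage.

```python
import math

CUBIC_MICRONS_PER_PICOLITER = 1000.0  # 1 pL = 1e-12 L = 1000 cubic microns


def microwell_volume_pl(diameter_um: float, depth_um: float) -> float:
    """Volume in picoliters of an idealized cylindrical microwell."""
    if diameter_um <= 0 or depth_um <= 0:
        raise ValueError("diameter and depth must be positive")
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um**2 * depth_um
    return volume_um3 / CUBIC_MICRONS_PER_PICOLITER


# Hypothetical depths chosen equal to the diameter, for illustration only.
small_well_pl = microwell_volume_pl(diameter_um=10.0, depth_um=10.0)
large_well_pl = microwell_volume_pl(diameter_um=100.0, depth_um=100.0)
```

Under these assumed depths, a 10 μm well holds somewhat less than 1 picoliter and a 100 μm well holds somewhat less than 1 nanoliter.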
In some embodiments, the filter holes (108) have a diameter (112) of from about 0.5 to 40.0 μm, e.g., from 1 to 40 μm, from 1 to 30 μm, from 5 to 20 μm, or from 10 to 20 μm. In some embodiments, the diameter (112) is less than 40 μm, 30 μm, 20 μm, 10 μm, 5 μm, or 1 μm. In some embodiments, the diameter (112) is greater than 0.5 μm, 1 μm, 2 μm, 3 μm, 4 μm, 5 μm, 10 μm, 20 μm, 30 μm, or 35 μm.
In some embodiments, as shown in
In some embodiments, a larger filter hole is slightly smaller than the size, e.g., maximum outer diameter, of the beads, and the smaller filter holes are much smaller than the size of the beads. When a bead is loaded into the microwell, it blocks the large filter hole and significantly decreases the rate of flow through the microwell, which prevents other beads from entering that microwell. Such a self-limiting loading method can achieve higher single-bead loading ratios than regular Poisson loading (i.e., super-Poisson loading). In other words, many more microwells will contain exactly one bead than would be predicted by normal Poisson statistics. According to Poisson statistics, when the average rate of occurrence equals 1 (e.g., the number of beads equals the number of wells), the probability that a microwell contains a single bead is only 37%, while 26% of microwells will have two or more beads. This is why, under random loading, the average rate of occurrence is usually kept low, e.g., <0.3, to minimize the chance of having two or more beads in a well. The drawback is that the majority of microwells are then empty (e.g., 74% of microwells are empty when the average rate of occurrence is 0.3). Self-limiting loading methods allow higher values of the average rate of occurrence to be used while still limiting the fraction of microwells loaded with multiple beads.
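The Poisson figures quoted above can be reproduced with a short calculation. The following Python sketch is provided for illustration; the function names `poisson_pmf` and `loading_fractions` are hypothetical and the code models only ordinary random loading, not the self-limiting loading described herein.

```python
import math
from typing import Tuple


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k beads in a well for a given mean occupancy."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def loading_fractions(mean: float) -> Tuple[float, float, float]:
    """Return the fractions of wells that are empty, singly, and multiply loaded."""
    empty = poisson_pmf(0, mean)
    single = poisson_pmf(1, mean)
    multiple = 1.0 - empty - single  # two or more beads
    return empty, single, multiple


for mean in (1.0, 0.3):
    empty, single, multiple = loading_fractions(mean)
    print(f"mean={mean}: empty={empty:.1%}, single={single:.1%}, multiple={multiple:.1%}")
```

At a mean occupancy of 1, roughly 37% of wells hold one bead and roughly 26% hold two or more; at a mean occupancy of 0.3, roughly 74% of wells are empty and fewer than 4% hold two or more beads, which illustrates the trade-off that self-limiting loading is intended to relax.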
In some embodiments, microwells and/or filter holes (e.g.,
In some embodiments, microwells and filter holes are blended together, e.g., “funnel” shaped (
In some embodiments, the opening at one end is bigger than the opening at another end. The beads can enter into the microwells through the bigger opening, but cannot exit the microwells through the smaller opening.
The microwell arrays can be used to screen nucleic acid constructs, peptides, and proteins, e.g., enzymes, for specific functional activities at ultra-high throughput.
The microwell arrays can be used with various types of affinity beads (120), including beads for chemical or protein conjugation or for nucleic acid hybridization. These beads can have various properties, e.g., non-magnetic or magnetic beads, affinity beads (e.g., beads with chemical or protein conjugate, or with nucleic acids for hybridization), or beads that are detectable via fluorescent or other markers or reporting agents, as described in more detail below. The beads have a diameter that can be greater than the diameter of the filter holes (112) and smaller than the diameter of microwells (110). In some embodiments, the bead diameter is greater than 0.5 μm, 1 μm, 2 μm, 3 μm, 4 μm, 5 μm, 10 μm, 20 μm, 30 μm, or 35 μm. In some embodiments, the diameter of beads is less than 40 μm, 30 μm, 20 μm, 10 μm, 5 μm, or 1 μm, but greater than the diameter of filter holes (112).
These beads can provide a convenient way to separate reaction products (or reaction agents) from other undesired contents (e.g., reaction byproducts). In different embodiments, the reaction agents or reaction products (e.g., nucleic acids, DNA, RNA, oligonucleotides, proteins, and peptides) can attach to the beads, for example via a functional group, e.g., an antibody or one member of a binding pair, e.g., a chemical or ligand binding pair. Because the beads cannot pass through the filter holes, the reaction agents or reaction products that are attached to the beads will remain in the microwells, and the other agents in the one or more liquids (e.g., buffers, water, reaction byproducts, waste liquid) can be removed, e.g., through the filter holes (108), by various means, e.g., pressure, vacuum, or centrifugal force.
The beads used herein can be fabricated from materials known in the art. Examples of such materials include inorganics, natural polymers, and synthetic polymers, e.g., cellulose, cellulose derivatives, acrylic resins, glass, silica gels, polystyrene, gelatin, polyvinyl pyrrolidone, co-polymers of vinyl and acrylamide, polystyrene cross-linked with divinylbenzene or the like, polyacrylamides, latex gels, dextran, rubber, silicon, plastics, nitrocellulose, natural sponges, metals, cross-linked dextrans (e.g., Sephadex™), agarose gel (Sepharose™), or other materials known to those of skill in the art. In some embodiments, the beads can be streptavidin polymer beads, streptavidin-coated magnetic particles (Spherotech, Lake Forest, Ill.), AMPure® beads (Beckman Coulter, Brea, Calif.), Dynabeads® M270 (Thermo Fisher Scientific, Waltham, Mass.), or SPRI® beads (Agencourt AMPure® XP beads, Beckman Coulter, Brea, Calif.; Cat. No. A63881). In some embodiments, the beads are magnetic, paramagnetic, or superparamagnetic beads.
The platforms described herein can be used to image multiple microwells simultaneously and/or individually. In some embodiments, more than 10,000 microwells, more than 50,000 microwells, 100,000 microwells, 200,000 microwells, 300,000 microwells, 400,000 microwells, or 500,000 microwells can be imaged simultaneously.
The systems described herein also isolate each microwell from other microwells, eliminating crosstalk or contamination between microwells. In some embodiments, oil can be used to seal the microwells. In some embodiments, the oil sealing can be maintained for at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, or 55 minutes, or for 1, 2, 3, 4, 5, or more hours, e.g., 10, 15, 20, 24 hours, 3 days, or 1 week. In some embodiments, a movable plate can be used to seal one or more of the top openings (106) and/or one or more of the filter holes (108). In some embodiments, the movable plate can have a hydrophobic surface.
The microwell arrays can be made by many methods known in the art, e.g., etching, photodeposition, additive manufacturing (e.g., 3-D printing), photolithography, thin film deposition, UV-LIGA (Lithographie, Galvanoformung, and Abformung) imprinting, injection molding, embossing, particle blasting, and laser cutting.
At the handle-side, the photoresist layer (606) has a pattern that matches the desired pattern of microwell arrays. The top opening (106) of the microwells is not covered by the photoresist layer (606). Thus, the photoresist layer (606) serves as a mask and protects the area under the photoresist layer from the subsequent etching process. A wet and/or dry etching process can be used to etch the silicon oxide and/or silicon layer. An example of a dry etching technique is deep reactive ion etching (deep RIE) using C4F8 and/or SF6 gas. Since the deep RIE process etches the silicon oxide layer much more slowly than the silicon layer, the etching process can be effectively stopped before reaching the device-side of the SOI wafer, because of the silicon oxide layer between the handle side (602) and the device side (604).
A similar process can be used on the device-side to make the filter holes. The process applies a second photoresist layer (608) to the device side (604) with uncovered areas for filter holes (108). In some embodiments, a mask aligner equipped with an IR light source can perform through-wafer registration using registration markers to fabricate an opening at the locations for filter holes (108). After both microwells and filter holes are fabricated, photoresist and oxide can be stripped away through a standard photolithography procedure. The general methods of making microwell arrays are described in detail, for example, in U.S. Pat. No. 9,409,139B2, and US20160310927A1; each of which is incorporated herein by reference in its entirety.
In some other embodiments, a layer of silicon dioxide, silicon nitride, silicon carbide, diamond, or sapphire is deposited or grown on both sides of the silicon wafer. In other embodiments, a substrate that can be etched selectively relative to silicon (e.g., silicon dioxide, silicon nitride, silicon carbide, diamond, and sapphire) is deposited or grown on one side, and another substrate is deposited or grown on the other side of the silicon wafer.
The microwell arrays described herein can allow a series of reactions to occur in the same microwell. These reactions include, e.g., nucleic acid synthesis, nucleic acid assembly, in vitro transcription and translation, and protein functional assays. It is emphasized that these methods for loading reagents can be used to both load the nucleic acid sequences containing the constructs of interest, as well as the reagents needed for their manipulation and reactants and substrates needed for downstream assays. Usually, the reagents for these reactions need to be added to the microwells and once the reactions are completed, the reagents need to be properly removed.
There are many different ways to add liquids (e.g., various reaction reagents) into microwells or remove liquids from the microwells.
As shown in
This disclosure also provides various methods to add one or more liquids to the entire microwell array.
Excess amounts of the one or more liquids on the top of the microwell array (800) are removed by spinning the microwell array (800) at a relatively high speed (840), e.g., 1000, 1500, 2000, 3000, 4000, 5000, 6000, or 7000 rotations per minute. Once the excess liquid is removed, the liquid in each microwell is isolated from the liquid in the other microwells, thus forming individual isolated reaction chambers. In some embodiments, beads are suspended in the liquids either before or after the one or more liquids are added, or before or after a specific liquid is added.
In some embodiments, the chip is placed in a humid chamber to prevent evaporation. In some embodiments, oil (e.g., fluorinated oil, mineral oil, hydrocarbon liquid) is added to cover the openings of the microwells to prevent evaporation. In some embodiments, the chip is immersed in an immiscible liquid to prevent evaporation.
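The spin speeds listed above can be related to the acceleration experienced at a given distance from the axis of rotation. The following Python sketch converts rotations per minute into relative centrifugal force (a multiple of standard gravity) using the standard relation a = ω²r. The function name `relative_centrifugal_force` and the 5 cm radius in the example are assumptions for illustration only; no spin radius is specified in this passage.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def relative_centrifugal_force(rpm: float, radius_m: float) -> float:
    """Centripetal acceleration at radius_m, expressed as a multiple of g."""
    if rpm < 0 or radius_m < 0:
        raise ValueError("rpm and radius must be non-negative")
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega**2 * radius_m / STANDARD_GRAVITY


# Hypothetical radius: a point 5 cm from the spin axis.
for rpm in (1000, 3000, 7000):
    print(rpm, round(relative_centrifugal_force(rpm, radius_m=0.05)))
```

Because the force scales with the square of the rotation rate, doubling the spin speed quadruples the force available to clear excess liquid from the surface of the array.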
In some embodiments, the systems shown in
The systems and methods described herein use a process for producing peptides or peptide derivatives by using a reaction system that transcribes a DNA sequence construct into an RNA and then translates the RNA into a polypeptide. Cell-free protein synthesis is typically simpler than in vivo methods, and requires only the addition of a template DNA or mRNA to a reaction mixture and then incubation for a sufficient time (e.g., several hours) to yield the desired protein. Thus, cell-free protein synthesis provides an effective approach for the high-throughput protein biodiscovery platforms described herein. Moreover, reaction conditions, such as the temperature or accessory factors, can be carefully controlled in cell-free systems.
In some embodiments, nucleic acid constructs are directly loaded, e.g., automatically, into a microwell array. For example, nucleic acid constructs can be dispensed from one donor microwell array into an acceptor microwell array that aligns with the wells of the donor array, as displayed in
In some embodiments, each bead can have more than one affinity binding molecule (e.g., streptavidin, antibodies, or oligonucleotides). However, the concentration of affinity-tagged nucleic acid constructs is titrated such that at most one nucleic acid construct is attached or conjugated to each bead. Each bead may have two or more different types of affinity binding molecules attached to its surface. For example, one type of affinity binding molecule can be used to attach the target gene, and another type of affinity binding molecule on the beads can be used to capture proteins generated through IVTT.
Beads are distributed across microwell arrays. In some embodiments, each microwell receives one or more beads. In some embodiments, each microwell receives at most one bead. In some embodiments, some or many microwells do not receive any beads. In some embodiments, all microwells receive the same type of bead. In other embodiments, different microwells receive different beads, or groups of microwells receive the same beads, and other groups of microwells receive different beads.
In some embodiments, the beads are not distributed randomly but instead are pre-localized on a bead-array as shown in
As shown in
A single copy of DNA can be amplified, e.g., using PCR or isothermal amplification methods (420). In vitro transcription/translation reagents and/or substrates are then added to the microwells (430). The in vitro transcription/translation reagents and substrates can be added before, at the same time as, or after the beads are added to the microwells (430). The microwells can then be sealed, e.g., with oil or other hydrophobic liquid, or a physical cap structure (260), e.g., a glass, silicone rubber sheet, or some appropriate cover.
The nucleic acid constructs can include a sequence encoding an affinity tag, such as a His-tag, FLAG-tag, or SNAP-tag. The polypeptides generated during IVTT (440) can be immobilized on protein-binding beads, e.g., through affinity binding. In some other embodiments, the surfaces of the microwells are functionalized using affinity tags or commonly used antibodies to tags such as FLAG or SNAP, to enable immobilization of the synthesized polypeptides directly in the microwells.
Many cell-free protein synthesis reagents have been developed (for a comparison of common commercial cell-free systems, see, e.g., Chong, “Overview of cell-free protein synthesis: historic landmarks, commercial systems, and expanding applications,” Curr. Protoc. Mol. Biol., 2014 Oct. 1; 108:16.30.1-11. doi: 10.1002/0471142727.mb1630s108). In some embodiments, the system used in the present methods is a cell-extract based cell-free protein synthesis system, such as TnT® Quick Coupled Transcription/Translation System (Promega, Madison, Wis.) or other similar cell-extract based systems.
In some embodiments, the system used in the present methods is the PURExpress® system (New England Biolabs, Ipswich, Mass.) or other similar systems that are composed of recombinant or purified components and provide minimal contaminating background activities for direct downstream biological assays. In the PURExpress system, mRNA is translated into protein using aminoacyl tRNA intermediates and ribosomes consisting of dozens of proteins and three ribosomal RNAs in prokaryotes. To complete the translation of one open reading frame (ORF) encoded in the mRNA sequence, three reaction steps proceed on the ribosome: initiation, elongation, and termination. These reaction steps are followed by a ribosome recycling step to re-initiate translation. Several translation factors take part in each translation step: three initiation factors (IF1, IF2, and IF3), three elongation factors (EF-G, EF-Tu, and EF-Ts), three release (termination) factors (RF1, RF2, and RF3), and ribosome recycling factor (RRF). In addition, three other reactions are added to facilitate protein synthesis: transcription to synthesize mRNA, aminoacylation of tRNAs, and energy source regeneration. Thus, T7 RNA polymerase, pyrophosphatase, aminoacyl-tRNA synthetases, creatine kinase, myokinase, and nucleoside-diphosphate kinase are also incorporated into the system.
All factors in the PURExpress system are individually purified to remove contaminating activities such as nuclease and protease activities, and thus can significantly decrease the background signals for many downstream assays. DNA, RNA, and protein molecules are additionally more stable in such purified cell-free systems, which can increase the sensitivity of in vitro platforms that couple gene synthesis with protein synthesis and direct functional assays. Altogether, recombinant and synthetic cell-free IVTT systems such as PURExpress enable the same speed and high throughput of synthesis as cell-lysate based IVTT systems but allow greater experimenter control over reaction cofactors and decreased background for more sensitive readouts. A detailed description of this system is provided, e.g., in Shimizu et al., “Protein synthesis by pure translation systems,” Methods, 36.3:299-304 (2005) and in U.S. Pat. No. 9,371,598, which are incorporated herein by reference in their entireties.
Following incubation of the reaction constructs, in vitro transcription/translation reagents, and substrates for a sufficient period of time, various methods can be used to detect and/or quantify the bioactivities of polypeptides, e.g., by fluorometric readout.
Once the RNA and polypeptides are synthesized, the design of the microwell system described herein, with its ability for fluid exchange, enables a versatile set of reactions. While there are many possible reactions, the preferred embodiments are those that have a functional readout that is converted to a signal capable of high throughput readouts compatible with the scale of the microwell arrays.
In some aspects, these may utilize fluorometric readouts that provide a fluorescent signal that is proportional to the amount of desired reactants that formed. Substrates that are to be modified can be tagged with fluorescent/quencher pairs that in the unmodified state of the substrate do not fluoresce, but upon modification, the separation of the fluorescent dye from the quencher enables a readout of reaction progress. Additionally, the presence of IVTT solution enables synthesis of fluorescent proteins as a potential readout, as previously demonstrated (Cui, N. et al., “A mix-and-read drop-based in vitro two-hybrid method for screening high-affinity peptide binders,” Sci. Rep. 6, 22575; doi: 10.1038/srep22575 (2016)).
For detection of DNA modification, aspects of the aforementioned fluorescent assays can be adapted as follows: a DNA or RNA fragment can be labeled with a fluorophore and quencher, which upon cleavage will dissociate and generate a fluorescent signal. This enables the detection of RNA nuclease, ssDNA nuclease, DNA nicking, dsDNA nuclease, as well as insertion/deletion activities. For insertion/deletion activities, nucleic acid segments containing quencher or fluorophore elements are inserted into a targeted sequence, either enhancing or disrupting activity. Additionally, these fluorophore-modifying readout probes may be delivered in cis or trans configuration with the original nucleic acid fragments encoding the construct(s). The cis targeting can be enabled by having the synthesized nucleic acid construct contain the fluorescent/quencher target as well. This potentially enables testing different substrates for each construct, or enables an all-in-one delivery of a gene encoding a protein effector and its potential substrate on a single nucleic acid construct (
The utilization of IVTT to synthesize a fluorescent protein in response to a DNA modification event provides additional possibilities to the readout of DNA modifications. In reactions classified as fluorescence restoration assays, a DNA/RNA encoding a fluorescent protein is either disrupted or restored using a DNA/RNA modifying effector, thus revealing a detectable change in protein activity. In one aspect, there are two constructs expressing different fluorescent proteins that can be spectrally distinguished; one channel acts as an internal control of nucleic acid loading and expression levels, while the other serves as the readout.
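One way to use the internal control channel is to divide the readout intensity by the control intensity for each well, so that differences in nucleic acid loading and expression level are normalized out. The following Python sketch illustrates this under stated assumptions: the function name `normalized_readout`, the intensity units, and the minimum control intensity are hypothetical and would have to be chosen for a particular instrument and assay.

```python
from typing import Optional


def normalized_readout(
    readout: float, control: float, min_control: float = 100.0
) -> Optional[float]:
    """Return the readout/control ratio, or None if the control is too weak.

    A weak control channel suggests an empty or poorly expressing well,
    so no activity value is reported for that well.
    """
    if control < min_control:
        return None
    return readout / control


# Made-up intensities (arbitrary units) for three wells.
well_a = normalized_readout(readout=450.0, control=900.0)   # well expressed, active
well_b = normalized_readout(readout=45.0, control=900.0)    # well expressed, weak activity
well_c = normalized_readout(readout=450.0, control=20.0)    # control too weak to trust
```

In this sketch, a well with a strong readout but a weak control channel is excluded rather than scored, since its signal cannot be attributed to a properly loaded and expressed construct.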
Beyond fluorescence, an alternative readout capable of scale compatible with ultrahigh throughput protein discovery is next generation sequencing. In one aspect, a construct encoding a nucleic acid target substrate and effector gene expressing the polypeptide/RNA of interest is immobilized in a microwell or to a microbead. Successful reconstitution of the effector system and nucleic acid modification results in release of the construct sequence encoding the modified target into the solution for collection and identification by next generation sequencing. The nucleic acid target substrate can either be located on the same gene synthesis product (cis), as suggested in
Example 1 below demonstrates the power of an all-in-one DNA cleavage readout assay that starts with a synthesized DNA product and proceeds to the DNA cleavage readout in a single reaction chamber. This reaction can be further miniaturized and the readout can be converted to released nucleic acids or a fluorometric readout for use in microwells.
In another aspect, all microwells and/or microbeads are loaded with nucleic acid barcode sequences that can be released in response to an external stimulus, such as light activated nuclease or other chemistries that enable a localized release of nucleic acids. In this embodiment, the reaction of interest is indirectly coupled to the nucleic acid release mechanism, such that positive readouts activate methods such as a scanning laser to selectively excite and release nucleic acids for identification.
These non-limiting embodiments serve to highlight the versatility and potential of the platform. Additional assays and readouts that are compatible with the ultrahigh throughput protein discovery platform are expected and can be modular with the other components described herein.
Various methods are available to determine the sequences of the nucleic acid constructs on the blotting plate or nucleic acid construct array (540). All of these different methods can be computer-controlled and automated. Without wishing to be limited, there are three broad embodiments of methods and systems for selecting and identifying the nucleic acid construct. The first utilizes pre-decoded or registered construct arrays, in which the exact spatial location of each construct is known prior to the reaction, so that a subsequent positive signal can immediately be associated with the sequence that gave rise to it. The second is identifying and collecting the constructs of interest directly in the microwell plate, whether by using technologies such as robotic picking of the loaded beads or by eluting cleaved fragments of nucleic acids. The third takes specific advantage of the reagent exchange capabilities of the advanced microwells described herein to elute a separate duplicate “blotting plate” whose identities can be read out separately from the actual reaction.
In the first embodiment utilizing pre-decoded arrays, numerous aspects exist. One instance is a microwell array that contains oligonucleotide sequences directly synthesized onto the functionalized surface of the individual microwells. Fluid jetting systems can be utilized to synthesize oligonucleotides in specific wells so that the sequence identity of each well is well-defined. Thus, when a gene pool containing complementary barcode sequences is flowed over the array to load the microwells, specific full-length constructs will hybridize to their pre-defined locations. In another instance, a randomly loaded array such as a microwell or bead array is first decoded utilizing methods, such as those described in U.S. Pat. No. 6,620,584B1, prior to hybridization with full-length genes of interest and performing the subsequent protein synthesis and functional assays.
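The barcode-directed loading described above can be modeled as a lookup in which each construct is placed in the well whose surface oligonucleotide is the reverse complement of the construct's barcode. The following Python sketch is a simplified model that assumes perfect, full-length complementarity and ignores mismatches and hybridization kinetics; the function names and the example sequences are hypothetical.

```python
from typing import Dict, Tuple

Well = Tuple[int, int]  # (row, column); hypothetical coordinate scheme

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def assign_constructs_to_wells(
    well_oligos: Dict[Well, str], construct_barcodes: Dict[str, str]
) -> Dict[Well, str]:
    """Map each well to the construct whose barcode pairs with the well's oligo."""
    oligo_to_well = {oligo: well for well, oligo in well_oligos.items()}
    placement = {}
    for name, barcode in construct_barcodes.items():
        well = oligo_to_well.get(reverse_complement(barcode))
        if well is not None:  # constructs without a matching well stay unplaced
            placement[well] = name
    return placement


# Made-up surface oligos and construct barcodes.
well_oligos = {(0, 0): "AACCGG", (0, 1): "TTGACA"}
barcodes = {"gene_1": "CCGGTT", "gene_2": "TGTCAA", "gene_3": "GGGGGG"}
placement = assign_constructs_to_wells(well_oligos, barcodes)
```

Because the oligonucleotide in each well is known in advance, a positive signal at a given coordinate can be traced back to a construct through this mapping without any post-assay sequencing of that well.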
In the second embodiment, the reaction array can be directly manipulated to yield the identity of the signals of interest. In one aspect, this can be performed by mechanical methods such as miniaturized robotics or piezoelectric actuators (Alogla et al. “Micro-tweezers: Design, fabrication, simulation and testing of a pneumatically actuated micro-gripper for micromanipulation and microtactile sensing.” Sensors and Actuators A: Physical 236 (2015): 394-404) to select the beads from wells of interest. In other embodiments, optical or laser based selection methods (Chen et al., “High-throughput analysis and protein engineering using microcapillary arrays,” Nature Chem. Biol., 12.2 (2016): 76) are able to select and transfer samples of interest for collection and downstream analysis.
Other methods of direct manipulation include the direct release of DNA fragments or barcodes for sequencing, as described in the “Protein and Nucleic Acid Assays” section.
In some embodiments, the constructs in the wells can be directly assayed and then sequenced on a flow cell. Without wishing to be limited, in the instance in which sequencing by synthesis is performed as the readout, the microwells and the nucleic acid sequence constructs used have the necessary adaptor sequences on the microwell surface and at the gene terminals, respectively. After functional assays, the nucleic acid constructs can be directly sequenced on the patterned microwell array to reveal their identity.
In the third embodiment, the nucleic acid contents of the microwells are partially transferred to a ‘blotting plate’ that carries the same spatial localization of nucleic acids as the main reaction microwell array, but enables greater flexibility in reading out the identity of individual locations.
In some embodiments, the nucleic acid construct (e.g., DNA) arrays can be decoded by sequentially hybridizing different fluorescence probes to the nucleic acid construct. Methods of decoding nucleic acid constructs using fluorescence probes are described in detail, e.g., in U.S. Pat. No. 6,620,584, which is incorporated herein by reference in its entirety.
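As an illustration of why sequential hybridization scales to large arrays, the following Python sketch assumes a combinatorial scheme in which each construct is assigned a unique sequence of colors across hybridization rounds, so that the number of distinguishable constructs grows as the number of colors raised to the number of rounds. This scheme, the function names, and the example codebook are assumptions made for the sketch and are not a description of the method of the cited patent.

```python
from typing import Dict, Optional, Sequence, Tuple


def rounds_needed(num_constructs: int, num_colors: int) -> int:
    """Smallest number of rounds r such that num_colors**r >= num_constructs."""
    if num_constructs < 1 or num_colors < 2:
        raise ValueError("need at least one construct and two colors")
    rounds, capacity = 0, 1
    while capacity < num_constructs:
        capacity *= num_colors
        rounds += 1
    return rounds


def decode_location(
    observed_colors: Sequence[str], codebook: Dict[Tuple[str, ...], str]
) -> Optional[str]:
    """Look up the construct whose color code matches the observed sequence."""
    return codebook.get(tuple(observed_colors))


# Hypothetical two-round, two-color codebook.
codebook = {("red", "green"): "construct_A", ("green", "red"): "construct_B"}
identity = decode_location(["green", "red"], codebook)
rounds_for_large_array = rounds_needed(num_constructs=500_000, num_colors=4)
```

Under this assumed scheme, four spectrally distinct probes and ten hybridization rounds would be enough to assign unique codes to 500,000 locations.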
In some embodiments, light can be used to retain or release specific nucleic acid constructs (e.g., DNA) at selected locations. As shown in
In some other embodiments, all nucleic acid constructs (e.g., DNA) are trapped by a polymer, and light is used to break polymers and thus selectively release nucleic acid constructs at specific locations. In some embodiments, nucleic acid constructs (e.g., DNA) are captured on the blotting plate by light-sensitive binding, and light can be used to disrupt the binding and selectively release the nucleic acid constructs.
In some embodiments, a fluid jetting system (e.g., inkjets) can selectively add hydrophobic materials (e.g., wax) to seal the nucleic acid constructs at appropriate locations. The nucleic acid constructs at other locations (unsealed by hydrophobic material) can be rinsed out and analyzed.
In some embodiments, laser microdissection can be used to cut out areas in the blotting plate and nucleic acid constructs at the cut-out locations or at the remaining locations can be selectively released and sequenced. In some other embodiments, a robotic arm is used to collect nucleic acid constructs mechanically on the blotting plate.
As shown in
When the selected areas are exposed to light, the light causes the surface charge to switch from positive to negative, e.g., by switching between two different chemical structures, by breaking chemical bonds, or by inducing a pH change. Nucleic acid constructs, which are typically negatively charged, at the exposed area are then repelled and released from the surface and can be further manipulated, analyzed, or sequenced. In some embodiments, the unreleased nucleic acid constructs are also sequenced.
In some embodiments, nucleic acid constructs that encode polypeptides with desired properties are released and sequenced (“positive selection”). In some other embodiments, nucleic acid constructs that encode polypeptides without desired properties are released and washed away first, and the remaining nucleic acid constructs on the blotting plate are then released (e.g., by a stronger wash buffer) and sequenced (“negative selection”).
In some embodiments, the protein to be screened (e.g., a nuclease) can cleave bonds or otherwise trigger a release. Positive hits among these proteins can thereby automatically release the polynucleotides used to make them. After IVTT and incubation, all the liquid in the microwells can be pooled together to purify, sequence, or analyze the polynucleotides in the liquid. The polynucleotides identified in this way are positive hits.
In some embodiments, target contents (e.g., beads, proteins, or nucleic acid constructs) can be selectively collected by inducing polymerization of a polymer solution in microwells. As shown in
In some embodiments, the contents of no interest are trapped in the microwells by the polymer. Then the target contents of interest can be collected by methods described herein.
In some embodiments, the contents that are of no interest can be washed away first, while the target contents of interest are retained in the microwells due to polymerization of the polymer solution. Thereafter, the target contents of interest can be collected from the microwells.
The disclosure also provides various methods to functionalize microwell surfaces, and can be adapted for a wide range of uses.
In some embodiments, a layer of hydrophobic material (e.g., silane, thiol) can be spin-coated on a flat surface, such as PDMS, silicon, glass, or gold (
In some embodiments, the microwell surface or part thereof can be functionalized using a microstructured surface, e.g., polydimethylsiloxane (PDMS) (
In some embodiments, the chip can first be flooded with hydrophilic material to cover the surface, and a method (e.g., polishing) can then be used to selectively remove the coating from the outside surface. In some embodiments, proteins (e.g., receptors, ligands, or antibodies) can be attached to surfaces inside the microwells. In some embodiments, oligo-conjugated antibodies can be used to functionalize flow-cell surfaces or oligos bound to microwell surfaces. In some embodiments, the protein binding can be used to sequester proteins in the microwells. In some embodiments, the protein binding can be detected, e.g., by Surface Plasmon Resonance (SPR).
Numerous methods of attaching proteins to microwell surfaces are known in the art. Oligonucleotide-protein conjugates can be used in numerous applications for diagnostic and therapeutic purposes. Proteins (e.g., antibody molecules) can include a number of functional groups suitable for modification or conjugation purposes. In some embodiments, oligonucleotide-protein constructs can be cross-linked through lysine ϵ-amine and N-terminal α-amine groups. In some embodiments, the protein is hydrazine-activated through a reaction between the amine group and the Sulfo-S-HyNic crosslinker. The S-HyNic (succinimidyl-6-hydrazino-nicotinamide) hetero-bifunctional crosslinker is used in Chromalink™ technology. Sulfo-S-HyNic is a water soluble analog of S-HyNic. The S-HyNic analog reacts with primary amines on proteins (amino group of lysine) or amino-modified oligonucleotides or surfaces, introducing a HyNic (6-hydrazino-nicotinamide) linker that forms stable covalent conjugates with biomolecules possessing 4FB (4-formylbenzamide) incorporated linkers. Sulfo-S-HyNic can also be used for incorporating HyNic linkers on amino-surfaces or biomolecules. The hydrazine-activated protein (e.g., antibody) is then linked to an aldehyde-activated oligonucleotide.
In some embodiments, the amine group in the protein can react with a maleimide-activated biopolymer, thus linking the protein with the biopolymer (e.g., oligonucleotides, polypeptides, and polysaccharides). In some embodiments, carboxylate groups can also be used to couple with another molecule using the C-terminal end, or with aspartic acid and/or glutamic acid residues.
In some embodiments, the protein is an antibody. Amine and carboxylate groups are as plentiful in antibodies as they are in most proteins, and the distribution of these functional groups is nearly uniform on the surface of antibodies. Thus, if some of the modified or conjugated residues are located on the antigen binding sites, the methods may produce oligonucleotide-antibody conjugates that are only partially active or inactive and thus may not bind to the antigen. In such cases, an alternative conjugation method can be used that involves a thiol reactive group by selectively cleaving an antibody with a reducing agent to create two half-antibody molecules, or using smaller antibody fragments such as Fab′ fragments.
In some embodiments, conjugation done using hinge area-SH groups can orient the attached oligonucleotide away from the antigen binding regions, thus preventing blockage of these sites and preserving activity. Reduction in a hinge region by a reducing agent, e.g., tris(2-carboxyethyl)phosphine (TCEP), dithiothreitol (DTT) or mercaptoethylamine (MEA), yields two half antibody molecules containing sulfhydryls. The sulfhydryl group can react with maleimide-activated biopolymers, forming an antibody-oligo conjugate through a thioether bond.
Other alternative methods of site-directed conjugation of antibody molecules can take place at carbohydrate chains, e.g., at the CH2 domain within the Fc region. Upon periodate oxidation an aldehyde group can be introduced to the antibody, which can react with an amine-modified oligonucleotide.
In other embodiments, the biopolymers can bind to the surfaces of microwells through complementary binding between oligonucleotides, thus attaching the proteins to the microwells.
Furthermore, products from the nucleic acid constructs can be immobilized on inner surfaces of microwells by nucleic acid conjugated antibodies that specifically bind to the gene products.
Nucleic acid constructs (DNA or RNA constructs) can be synthesized based on the nucleotide sequences from metagenome mining or engineered non-naturally occurring sequences. As used herein, a nucleic acid construct is an artificially constructed segment of nucleic acid molecule that can be transcribed and/or translated into a peptide, polypeptide, or protein. The nucleic acid constructs described herein can include a promoter sequence, followed by a desired coding sequence, and a transcription termination or polyadenylation signal sequence. The nucleic acid constructs can either be obtained pre-synthesized as full-length constructs in either pooled or arrayed forms, or can be directly synthesized in the system.
The present disclosure provides a method for the synthesis, sequencing, and optionally, the isolation of individual target DNA construct molecules resulting from the synthesis of one or more target DNA constructs. The present disclosure also provides a method for the sequencing, and optionally, the isolation of individual target DNA constructs from a homogeneous or heterogeneous population of circular or linear DNA molecules.
In one aspect, the disclosure provides a method for the assembly of one or more target DNA sequences, such that individual target DNA molecules can be fully or partially sequence verified, and isolated. The method includes or consists of multiple seed oligonucleotides or DNA fragments composing one or more target DNA constructs; a target subgroup barcode specifying subgroups of one or more target DNA constructs; and a unique molecular identifier specifying individual target DNA molecules resulting from assembly methods, such as polymerase chain assembly (PCA), Gibson Assembly® (Synthetic Genomics Inc.), or Golden Gate Assembly (tUMI). Polymerase chain assembly can be used to assemble complete oligonucleotides with complementary overlapping regions. Alternatively, Gibson Assembly or Golden Gate Assembly can be used to assemble multiple double stranded DNA fragments with either overlapping ends or complementary type-IIS restriction sites mediating the assembly of the target DNA construct.
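The role of the overlapping regions in assembly can be illustrated with a minimal sketch. The sketch makes simplifying assumptions that are not part of the disclosure: all fragments are written on the same strand and supplied in assembly order, and each fragment shares an exact overlap with the next. Actual PCA uses alternating sense and antisense oligonucleotides extended by a polymerase. The function names and example sequences are hypothetical.

```python
def overlap_length(left: str, right: str, min_overlap: int) -> int:
    """Return the length of the longest suffix of `left` that equals a
    prefix of `right`, or 0 if it is shorter than `min_overlap`."""
    max_len = min(len(left), len(right))
    for n in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:n]):
            return n
    return 0


def assemble(fragments: list[str], min_overlap: int = 4) -> str:
    """Join ordered, same-strand fragments through their shared overlaps."""
    assembled = fragments[0]
    for frag in fragments[1:]:
        n = overlap_length(assembled, frag, min_overlap)
        if n == 0:
            raise ValueError(f"no overlap of at least {min_overlap} nt with {frag!r}")
        # Append only the part of the fragment that extends past the overlap.
        assembled += frag[n:]
    return assembled


if __name__ == "__main__":
    # Hypothetical fragments with 4-nt overlaps (AGCT, then ACCA).
    seeds = ["ATGGCTAGCT", "AGCTTGACCA", "ACCAGGTTAA"]
    print(assemble(seeds))
```

A fragment lacking the expected overlap raises an error rather than being silently appended, which mirrors the requirement that the overlaps be designed in advance.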
In another aspect, the disclosure provides a method for sequencing of one or more subregions of interest within a population of homogeneous or heterogeneous DNA molecules. The method includes or consists of multiple DNA molecules composing one or more target DNA sequences of interest; a target subgroup barcode specifying subgroups of one or more target DNA constructs; and a unique molecular identifier specifying individual target DNA molecules resulting from PCA (tUMI).
In some embodiments, the seed oligonucleotides composing one or more target DNA constructs are assembled in a single pooled PCA reaction.
In some embodiments, the PCA seed oligonucleotides contain target subgroup barcodes for hybridization or binding to beads or surface-immobilized oligonucleotides.
In some embodiments, surface immobilized oligonucleotides are contained within compartments, such as microwells or interspersed hydrophilic regions separated by hydrophobic regions. In such cases, each bead or surface compartment contains one or multiple oligonucleotides complementary to one or multiple target subgroup barcodes or a binding moiety that specifically binds and sequesters different target subgroup barcodes.
In some embodiments, individual beads containing bound seed oligonucleotides corresponding to one or multiple target subgroups are encapsulated in individual emulsion droplets or microwells. In some embodiments, individual surface compartments containing bound seed oligonucleotides corresponding to one or multiple target subgroups are encapsulated as sequestered reaction chambers based on physical or chemical properties of the surface (e.g., microwells, interspersed hydrophilic regions, or localized regions within a single reaction chamber).
In some embodiments, once encapsulated in an emulsion droplet or sequestered reaction chambers, the seed oligonucleotides are released from the bead or surface. In some embodiments, the target subgroup barcodes are cleaved from the seed oligonucleotides prior to PCA. In some embodiments, the pooled PCA reaction or sequestered PCA reactions also contain terminal primers for amplification of fully assembled target DNA sequences. For example, application WO2012154201 describes the synthesis of multiple target DNA constructs in a single pooled reaction, and is incorporated herein by reference in its entirety. Application US20150051117 describes the sequestration of seed oligonucleotides for a single target DNA construct on beads, encapsulation of individual beads in emulsion droplets, and performance of PCA in droplets, and is incorporated herein by reference in its entirety.
DNA Synthesis within the Microwell Arrays
These beads are then loaded, manually or automatically, into the microwell array (316). The concentration of the beads is sufficiently low so that at most one bead is loaded into one microwell, i.e., each microwell then contains zero or one bead. Double-stranded oligonucleotides (e.g., DNA segments) are released into the microwell by restriction enzyme digestion (318). These oligonucleotides can together form a full-length nucleic acid sequence of interest (e.g., a gene sequence), and are linked together because of pre-designed overhanging sequences. These oligonucleotides are then automatically assembled by polymerase cycling assembly (PCA). The one or more reaction solutions in the microwells are then emptied (322), and the nucleic acid constructs in the solution are pooled together (324). Nucleic acid constructs with appropriate lengths are selected, e.g., by gel electrophoresis (326).
The constructs with appropriate lengths are then used to prepare a next generation sequencing (NGS) library (328), and are automatically sequenced (330). Sequencing results are then analyzed to identify correct gene assembly (332). PCR primers can be designed to select and amplify only the correct gene assembly (334). The PCR products can then be used individually or pooled together for further use, e.g., screening sequences encoding proteins with desired properties.
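The requirement that the bead concentration be low enough for each microwell to receive zero or one bead can be made quantitative with a short sketch. The sketch assumes that bead loading follows a Poisson distribution with a mean of `lam` beads per well. This is a common modeling assumption for random loading and is not stated in the disclosure. The function names and the example mean values are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a well receives exactly k beads at a mean of lam beads per well."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def loading_summary(lam: float) -> dict[str, float]:
    """Fractions of wells that are empty, singly loaded, or multiply loaded."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    p_multi = 1.0 - p_empty - p_single
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multi,
        # Share of occupied wells that hold more than one bead.
        "multi_among_occupied": p_multi / (1.0 - p_empty),
    }


if __name__ == "__main__":
    for lam in (0.05, 0.1, 0.3):
        s = loading_summary(lam)
        print(f"lam={lam}: empty={s['empty']:.3f} single={s['single']:.3f} "
              f"multiple={s['multiple']:.4f}")
```

Under this model, lowering the mean loading reduces the fraction of wells with multiple beads at the cost of leaving more wells empty.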
In some embodiments, either or both of the tUMI and the target subgroup barcode are attached to terminal primers complementary to terminal target DNA sequences that are used for amplification of assembled fragments during PCA. In some embodiments, terminal primers containing either one or both of the tUMI and target subgroup barcodes also contain a primer binding sequence 5′ of the tUMI or target subgroup barcode on one or both of the terminal primers. In some embodiments, primers for one or more primer binding sequences 5′ of the tUMI or target subgroup barcode are added to the PCA reaction. In some embodiments, primers for one or more primer binding sequences 5′ of the tUMI or target subgroup barcode may be added at 10¹, 10², 10³, 10⁴, 10⁵, or 10⁶ molar excess to PCA terminal primers containing the tUMI. Importantly, this approach can be used to create a sparse set of tUMI labeled products that are further amplified by a primer 5′ to the tUMI.
Labeling of Target DNA Molecules from Circular or Linear DNA Populations
In some embodiments, either or both of the tUMI and the target subgroup barcode are attached to terminal primers complementary to sequences flanking a target DNA region of interest within a homogeneous or heterogeneous population of circular or linear DNA molecules. In some embodiments, terminal primers containing either one or both of the tUMI and target subgroup barcodes also contain a primer binding sequence 5′ of the tUMI or target subgroup barcode on one or both of the terminal primers. In some embodiments, primers for one or more primer binding sequences 5′ of the tUMI or target subgroup barcode are added to the PCA reaction. In some embodiments, primers for one or more primer binding sequences 5′ of the tUMI or target subgroup barcode may be added at 10¹, 10², 10³, 10⁴, 10⁵, or 10⁶ molar excess of PCA terminal primers containing the tUMI. Importantly, this approach can be used to create a sparse set of tUMI labeled products that are further amplified by a primer 5′ to the tUMI.
In some embodiments, one or more target DNA constructs resulting from a pooled PCA reaction or multiple sequestered PCA reactions will each contain a target unique molecular identifier and optionally a target subgroup barcode. In some embodiments, one or more regions of interest amplified from circular or linear DNA will each contain a target unique molecular identifier and optionally a target subgroup barcode. In some embodiments, each DNA fragment molecule labeled with a target UMI (tUMI) has been amplified.
In some embodiments, multiple subregions of each tUMI-labeled DNA fragment are amplified using a primer that binds a site 5′ of the tUMI and multiple different primers amplifying from the opposing side of the DNA fragment, such that each subregion contains the tUMI-containing end of the fragment and stepwise truncations from the opposite side of the fragment.
Association of Target DNA UMI with Target DNA Subregions
In some embodiments, the 5′ most regions of the primer pairs used to amplify target DNA subregions contain complementary restriction sites. The resulting complementary restriction sites on either side of the amplified subregions can be used to circularize the subregions using restriction and ligation. In some embodiments, restriction sites are not included, and the blunt ends of amplified subregions are ligated to create a circular product.
Circularized subregion products contain a tUMI immediately proximal to subregions tiling the target DNA construct. In some embodiments, a primer 5′ to the tUMI, and another primer approximately up to 100 nt, 200 nt, 300 nt, 400 nt, or 500 nt 3′ to the tUMI are used to amplify short sections of the target DNA subregions attached to the tUMI. In some embodiments, these primers contain additional handle sequences for next generation sequencing (
Next generation sequencing provides, at a minimum, the sequence of a tUMI specifying a particular target DNA construct molecule and the sequences of one or more subregions tiling the fragment. Grouping of subregions by tUMI sequence can provide the complete reconstruction of the target DNA construct (
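The grouping of subregion reads by tUMI can be sketched as follows. The sketch assumes that each read has already been parsed into a tuple of (tUMI, start position, subregion sequence), where the start position is inferred from the truncation primer that produced the read. This input format, the function names, and the example sequences are hypothetical and are not specified in the disclosure. Positions not covered by any subregion are reported as `N`.

```python
from collections import defaultdict


def group_by_tumi(reads):
    """Collect (start, sequence) tiles for each tUMI."""
    groups = defaultdict(list)
    for tumi, start, seq in reads:
        groups[tumi].append((start, seq))
    return dict(groups)


def reconstruct(tiles):
    """Overlay tiles at their start positions; uncovered bases stay 'N'.

    Where tiles overlap, the tile with the larger start position
    overwrites the earlier one (no error correction is attempted).
    """
    length = max(start + len(seq) for start, seq in tiles)
    bases = ["N"] * length
    for start, seq in sorted(tiles):
        for i, base in enumerate(seq):
            bases[start + i] = base
    return "".join(bases)


if __name__ == "__main__":
    # Hypothetical parsed reads: (tUMI, start, subregion sequence).
    reads = [
        ("AAGT", 0, "ATGGC"),
        ("AAGT", 4, "CTTAA"),
        ("CCTA", 0, "ATGGA"),
        ("AAGT", 8, "AGGT"),
    ]
    for tumi, tiles in group_by_tumi(reads).items():
        print(tumi, reconstruct(tiles))
```

Because each tUMI labels a single molecule, two molecules of the same target construct are reconstructed separately, so a synthesis error in one does not contaminate the other.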
Nanopore Sequencing of tUMI-Labeled Target DNAs
Nanopore sequencing provides long reads capable of sequencing long target DNA constructs labeled with tUMIs in a single read. Although the error rate of sequencing-by-synthesis based next generation sequencing is much lower than that of nanopore sequencing, amplification of target DNA construct molecules labeled with individual tUMIs could provide multiple coverage of a single tUMI labeled molecule by nanopore sequencing, enabling reconstruction of a consensus sequence for each labeled molecule (
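The consensus step for reads sharing one tUMI can be illustrated with a per-position majority vote. The sketch assumes that the reads have already been aligned to a common coordinate system and padded to equal length, with `-` marking gaps. The alignment itself, which matters because nanopore errors are often insertions or deletions, is outside the sketch. The function name and the example reads are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads: list[str]) -> str:
    """Majority-vote consensus over equal-length, pre-aligned reads.

    Columns whose most common symbol is the gap character '-' are
    dropped. Ties are resolved in favor of the symbol seen first.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    out = []
    for column in zip(*aligned_reads):
        base, _count = Counter(column).most_common(1)[0]
        if base != "-":
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical aligned reads for a single tUMI-labeled molecule.
    reads = ["ATGCA", "ATGGA", "ATGCA", "TTGCA"]
    print(consensus(reads))
```

With more reads per tUMI, isolated read errors at a given position are outvoted, which is the sense in which multiple coverage compensates for a higher per-read error rate.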
In some embodiments, the tUMI sequence and optionally flanking sequence can serve as a probe or primer binding sequence that can be used to uniquely isolate a specific tUMI-labeled target DNA construct. In some embodiments, multiplexing of such primers or probes can be used to select for isolation of one or more target DNA constructs in a single reaction. In some embodiments, one or more probes specific to one or more target DNA constructs are used to affinity purify the fragments. In some embodiments, one or more primer pairs specific to one or more target DNA constructs are used to amplify and enrich for the fragments (334).
The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.
This example demonstrates an all-in-one DNA cleavage readout assay utilizing IVTT that starts with a synthesized DNA product and proceeds to the DNA cleavage readout rapidly and in a single reaction chamber. In this example, the protein of interest is a CRISPR-Cas9 nuclease from S. pyogenes (SpCas9). The DNA fragment for the assay contains, from 5′ to 3′, the target sequence, a T7 promoter, a bacterial-codon optimized SpCas9 effector protein with a mH6 tag at the N terminus, a T7 terminator, and then a second T7 promoter to express the noncoding single guide RNA for SpCas9 (
The nuclease reaction is directly read out from the reaction well by gel electrophoresis, displaying a short DNA fragment that is newly formed as the result of the cleavage activity of the SpCas9-sgRNA effector complex on the template DNA strand for IVTT (
This demonstrates the full versatility of the proprietary IVTT reagent: all three macromolecular elements of the Central Dogma are observed. Protein and noncoding RNA are expressed and then complex together into a functional nuclease complex, enabling cleavage of the original DNA target. This reaction can be further miniaturized, and the readout can be converted to released nucleic acids or a fluorometric readout for use in microwells.
The PURExpress® system was used to express green fluorescent protein (GFP) in microwells. Nucleic acid encoding GFP was coupled to magnetic beads (9 μm diameter) through a streptavidin-biotin linkage. The magnetic beads also contain a red fluorescent dye for easy observation. An array of microwells (100 μm well diameter, 300 μm spacing) was fabricated from a silicon wafer by a method shown in
It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application claims the benefit of U.S. Provisional Application No. 62/641,940, filed on Mar. 12, 2018. The entire contents of the foregoing are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US19/21765 | 3/12/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62641940 | Mar 2018 | US |