The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. The copy of the Sequence Listing, created on Aug. 21, 2018, is named P2826US00_SeqList_ST25.txt and is 6,661 bytes in size. This application contains a partial sequence list in Table 5.
The present invention generally relates to the identification of nucleic acids specifically bound to organic targets. More specifically, the present invention relates to: 1) microfluidic devices which are capable of selective isolation and purification of nucleic acids specifically bound to protein targets; 2) methods for preparation, on-chip processing and recovery of nucleic acids that are specifically bound to protein targets and are directly compatible with high-throughput sequencing.
Identification of nucleic acids specifically bound to organic targets occupies an important niche in academic and industrial research. Discovery and development of non-traditional drugs based on innate, non-toxic and degradable organic compounds, such as nucleic acids, is an actively evolving branch of the pharmaceutical and biotechnology industry. At the same time, knowing which nucleic acid sequences in a genome are recognized by which proteins is crucial for understanding the gene regulatory networks underlying various biological processes and still remains an important challenge of fundamental science. To this end, both academia and industry are constantly looking for innovative technologies that aim to decrease costs and increase efficiency of isolation and identification of nucleic acid ligands selectively bound to specific biological targets.
Currently this type of screening is typically done through the use of one of the two technologies: SELEX (Systematic Evolution of Ligands by Exponential Enrichment) or microarrays. Despite the fact that several nucleic acid ligands have been identified through these technologies, the tedious procedure associated with screens as well as the cost of the screen remain a big issue for the massive integration of these standard technologies into any academic or drug developmental toolkit.
Here we present a novel technology, MITOMI-seq, aimed to increase efficiency and throughput as well as to enhance the quality of such screens. MITOMI-seq is based on two innovations:
1. Integrative parallel microfluidic platform for isolation of nucleic acids specifically bound to protein targets. To perform selection assays on this platform, only minute amounts of biological material are required.
2. Method for rapid unbiased single-step on-chip selection of nucleic acids specifically bound to protein targets from a pool of randomized DNA or RNA.
Knowing which sites in a genome are recognized by which Transcription Factors (TFs) is crucial for understanding the gene regulatory networks underlying various biological processes. The characterization of the binding preferences of TFs and TF complexes remains an important challenge of molecular biology. However, there are still a large number of factors for which the DNA binding sites are not yet known. Moreover, none of the existing techniques, able to detect TF-DNA interactions, has been used to explore the comprehensive mapping of DNA binding specificities of TF heterodimers or even larger complexes.
In this invention we aimed to tackle the challenge of robust identification of nucleic acid sequences specifically bound to organic targets within a wide affinity range. We demonstrated the power of this novel microfluidics-based technology that exploits the power of next-generation sequencing, by characterizing the DNA binding preferences of a variety of TFs and TF complexes.
Microfluidics is the science and technology of manipulating fluids within the networks of micro channels. Microfluidic devices offer an ability to work with volumes that range from micro- to femtoliters and a possibility of parallel sample operation. Within the last ten years, the number of biological applications that involve microfluidics has significantly increased, mostly due to the development of novel micro components and techniques for introducing, mixing (Hong and Quake, 2003; Weibel et al., 2005), pumping (Laser and Santiago, 2004) and storing fluids in microfluidic channels. Currently, microfluidic devices hold a great promise of integrating an entire laboratory onto a single chip (i.e. lab-on-a-chip device).
Until the 1980s, when micro molding of polymers was introduced, microfluidic devices were mainly fabricated on silicon and glass substrates. This required specialized facilities, time and cost investments. Even so, devices fabricated of glass or silicon are mostly poorly suited for analyses of biological samples in solutions. Silicon, in particular, is expensive and opaque to visible and ultraviolet light, so it cannot be used with conventional optical methods of sample detection. The introduction of polymer molding, or so-called soft lithography, enabled fabrication of cheap microfluidic devices that were found to be compatible with multiple biological assays and were quickly adopted by academic laboratories. Currently, the fabrication of microfluidic devices tailored toward biological applications is predominantly based on micro molding of polydimethylsiloxane (PDMS). PDMS is transparent, biocompatible and permeable to gas. Taken together, these unique properties created a strong interest of the scientific community in using this material for microfluidics-based devices. It was also established that multiple patterned layers of PDMS could be easily bonded together, creating complex integrative microfluidic circuits (Unger, 2000).
A good example of an integrative microfluidic device for biological assays is the MITOMI (Mechanically Induced Trapping of Molecular Interactions) platform, which was originally developed to characterize protein-DNA and protein-protein interactions. The physical trapping of molecular interactions on a microchip was one of the foremost technologies that could carefully isolate and quantify molecular complexes. This allowed MITOMI to detect molecular interactions at an unprecedented resolution. It was first applied to study the energy of TF-DNA interactions (Maerkl and Quake, 2007) and later was expanded to measure molecular interaction kinetics (Geertz et al., 2012) and to perform immunoassays on a chip (Garcia-Cordero and Maerkl, 2014).
1. WO_2010019969_A1 Device for rapid identification of nucleic acids for binding to specific chemical targets
Innovative step compared to this patent application:
Programming microfluidic devices with molecular information and
Mechanically induced trapping of molecular interactions
Innovative step compared to this patent application:
The present invention concerns a microfluidic device according to claim 1, a dispenser according to claim 6, a method for isolation of specifically bound nucleic acids to target molecules according to claim 10 and uses of said method according to claims 21 to 24.
Other advantages are provided by the features of the dependent claims.
An immediate and straightforward application of MITOMI-seq is a comprehensive identification of DNA sites bound by transcription factor monomers and dimers. As described in the methods, the technology could already be applied to robustly identify the genomic targets of multiple proteins at a time in a high-throughput and time-effective manner. This could be particularly interesting for academic research focused on any cellular process implicated in health or disease. It has been widely acknowledged that understanding the biological mechanisms behind any physiological or pathological condition, such as cancer, stem cell renewal or cellular differentiation, comes down to the identification of key transcriptional regulators and their respective DNA targets. MITOMI-seq has the potential to become an indispensable tool for screening protein-DNA, DNA-RNA and protein-RNA interactions, outperforming previously available technologies such as DNA microarrays, EMSA and DNA pull-down in cost and throughput.
Another field of application for MITOMI-seq is clinical research. There, MITOMI-seq could be applied to rapidly identify the influence of drugs or modifications on the ability of DNA-binding proteins to recognize their potential target sites. Direct applications of MITOMI-seq could include drug screening and evaluation.
Another application of MITOMI-seq could be aptamer screening.
Aptamers are a class of molecules with a great potential to rival poly- and monoclonal antibodies in therapeutic, diagnostic, analytical as well as basic research applications. Despite the fact that the aptamer technology has been known already for a decade, the identification of novel aptamers specific to the target still remains tedious and cost ineffective. Typically, SELEX technology is used for this purpose. However, as we already showed, MITOMI-seq offers a robust, cost- and time-effective alternative to standard methods. In particular, using MITOMI-seq one can perform de novo identification of aptamers specific to a target in a parallel and rapid fashion using minute amounts of biological material.
MITOMI-seq holds great promise for application in rapidly evolving and medically relevant single-cell analysis technologies. MITOMI-seq is a sensitive technique that requires very small amounts of starting material, and each of its steps can be tightly controlled through the time course of the screen. These properties could be particularly useful when analyzing biological interactions at the single-cell level. Taking into account the fact that MITOMI-seq, similarly to the available cutting-edge technologies aiming for various single-cell analyses, is implemented on a microfluidic device, it could potentially be integrated into other, more sophisticated devices.
The above object, features and other advantages of the present invention will be best understood from the following detailed description in conjunction with the accompanying drawings, in which:
We first thought of using original MITOMI devices and an established protocol to perform an on-chip selection assay. Initial experiments revealed, however, that MITOMI devices suffered from a small and uncontrolled carry-over between neighboring units. Such cross talk between units is typically not a problem for standard MITOMI applications that require a basic fluorescence-based read out. However, if one wants to recover bound DNA material from the device and subsequently analyze it with an extremely sensitive method like HT-sequencing, even small amounts of cross-contamination between samples may skew data interpretation. Therefore, a device that can perform mechanically induced trapping of interactions while also allowing a controlled isolation of individual units is one of the key components of a successful on-chip selection procedure.
Thus, we decided to develop a cross talk-devoid device as part of an assay that would allow us to perform a robust on-chip selection of TF binding sequences from a size-unbiased pool of randomized DNA requiring minimal amounts of biological material.
We first designed a micro device that accommodates all the desired features in CleWin software. For convenience we restricted the size of the device to fit a standard 75 by 25 mm glass substrate compatible with an available fluorescent scanner. We designed our device to contain 64 units and set the diameter of the reaction chamber within each unit to 300 μm (
We first prepared two masters: one for the flow and another for the control layer. Both masters were printed on chrome plates using standard photolithography techniques.
We then used the chrome masters as masks for fabricating two types of wafers: flow and control. Flow wafers were typically fabricated with AZ9260 positive resist at a thickness of 14 μm (Table 1) and control wafers were fabricated with SU-8 negative photo resist at a thickness of 10 μm (Table 1). Next, we used the fabricated wafers to mold two layers of a PDMS device. Finally, two PDMS layers were aligned and bonded together; we punched holes using a manual-punching machine (Syneo, USA).
The device, which allows individual access to each working unit, has the advantage of manipulating heterogeneous samples simultaneously without any risk of contamination or crosstalk. Parallel loading of multiple samples on the chip in this case requires several external sources of pressure. Commercially available tools, such as pneumatic manifolds, typically can branch an external source of compressed air, creating several daughter sources. Most suppliers currently provide pneumatic manifolds made of metal or plastic that can split the pressure source into 8 to 12 parallel outputs. But for manipulating 64 samples simultaneously one would need to use several manifolds connected in a massive control unit. To avoid this complex construction, save space and facilitate the dispensing of samples into individual chambers, we designed and fabricated a “passive” dispenser that is intended to replace the pneumatic manifold. The aim of this device is to evenly distribute the input compressed air between 64 outlets. Similarly to the previously mentioned micro device, we fabricated the dispensers using soft lithography. But unlike the two-layer patterned devices, the dispenser requires only one molded part that is bonded to non-patterned PDMS.
We exploited the principle of MITOMI-based affinity selection of bound sites from the randomized DNA library to characterize TF DNA binding specificities. We have designed the target DNA library by randomly introducing all four nucleotides at each position of a DNA sequence. This library of random DNA sequences can then be exposed to a TF of interest, after which bound sequences are collected and decoded by sequencing. This approach was adopted by several techniques that aim to identify TF binding preferences, including HT-SELEX and B1H. Unlike de Bruijn sequences, the design of a random library is relatively simple and synthesis of target oligos is not cost-prohibitive. At the same time, the length of the random site is not limited and one can easily design a library that would cover all possible 10-, 20- or even 30-, 40- or 50-mers. This might be especially useful when identifying the binding preferences of homo- and heterodimers. And finally, randomized DNA libraries can be easily multiplexed and used for the characterization of binding specificities of several factors simultaneously.
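As a small illustration of the random library design described above, the following Python sketch counts the distinct sequences available for a random site of length k and draws one random k-mer. It assumes uniform, independent incorporation of the four nucleotides at each position; the function names `sequence_space` and `random_kmer` are hypothetical and are not part of the described method.

```python
import random

def sequence_space(k):
    """Number of distinct DNA k-mers: four choices at each of k positions."""
    return 4 ** k

def random_kmer(k, rng=random):
    """Draw one k-mer, choosing each base uniformly and independently
    (an idealization of randomized oligo synthesis)."""
    return "".join(rng.choice("ACGT") for _ in range(k))

if __name__ == "__main__":
    # Site lengths mentioned in the text above.
    for k in (10, 20, 30, 40, 50):
        print(f"{k}-mers: {sequence_space(k):.3e} possible sequences")
    print(random_kmer(30))
```

The count quadruples with every added position.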
Each MITOMI-seq experiment starts with the in vitro expression of TFs of interest and generation of target DNA libraries. We express TFs of interest in 6 μl of the TnT® SP6 High-Yield Wheat Germ (Promega) protein expression system. To make this expression system compatible with the Gateway cloning format and to allow the fluorescence-based detection of TFs, we shuttle the open-reading frame (ORF) of the TF of interest into one of the custom-made pMARE expression vectors (pMARE-eGFP or pMARE-mCherry (Hens et al., 2011)) (
The target DNA libraries are constructed from single stranded synthetic oligos by an enzymatic second strand synthesis. Meanwhile, the surface of the microfluidic device is functionalized to capture tagged TFs (the procedure is similar to the one established for regular MITOMI chips, see Methods).
The expressed TFs are then mixed with target random DNA libraries and the mixtures are immediately loaded on the micro device. After 40 minutes of incubation, newly formed TF-DNA complexes are trapped under a flexible “button” membrane, unbound material is removed from the device by washing and bound DNA is collected by continuous elution (for a detailed protocol, please see Methods). Collected DNA is then amplified and sequenced in one lane of HiSeq sequencer (Illumina).
To overcome all the difficulties related to the preparation of samples for HT sequencing, we decided to incorporate sequencing adapters directly within the random DNA library design. To do so we designed target DNA libraries that already contain Solexa (Illumina) sequencing-compatible adapters at the 5′- and 3′-prime ends of the barcoded randomized 30 bp fragment (
On the left of
For each MITOMI-seq experiment, we mix the DNA libraries with expressed TFs and load these mixtures onto the microfluidic chip (
The resulting sequencing data is then de-multiplexed using the barcodes and reads are trimmed to the random 30 bp fragment that is located between the two barcodes corresponding to each sample. These 30 bp fragments are subsequently sorted according to their frequency, from high to low, and identical reads are collapsed into one. We then extract the top 1500 reads from each sample and use them to generate representative binding motifs with MEME (Bailey and Elkan, 1994).
First we used the extended random library to perform MITOMI-seq on several well-studied TFs for which DNA binding preferences are known. We also included the now well-characterized PPARγ:RXRα heterodimer in the tested set. Indeed, the binding specificity of the PPARγ:RXRα heterodimer was previously characterized through ChIP-seq data analysis (Nielsen et al., 2008) but importantly, it has never been probed by any of the high- or medium-throughput in vitro techniques such as PBM, HT-SELEX or B1H.
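The read-processing steps above can be sketched in Python as follows. This is a minimal illustration and not the custom scripts used in the study: the barcode layout (a left and a right barcode directly flanking the 30 bp insert), exact barcode matching and the function name `demultiplex_and_rank` are assumptions made for the example.

```python
from collections import Counter

def demultiplex_and_rank(reads, barcodes, insert_len=30, top_n=1500):
    """Assign reads to samples, trim to the random insert, collapse and rank.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict: sample name -> (left_barcode, right_barcode);
                this two-barcode layout is an assumption of the sketch
    Returns a dict: sample name -> list of unique inserts, most frequent first.
    """
    counts = {sample: Counter() for sample in barcodes}
    for read in reads:
        for sample, (left, right) in barcodes.items():
            start = read.find(left)          # first exact match of left barcode
            if start == -1:
                continue
            ins_start = start + len(left)
            ins_end = ins_start + insert_len
            # Accept the read only if the right barcode follows the insert.
            if read[ins_end:ins_end + len(right)] == right:
                counts[sample][read[ins_start:ins_end]] += 1
                break
    # Counter.most_common sorts by frequency (high to low); identical reads
    # are already collapsed because they share one Counter key.
    return {
        sample: [seq for seq, _ in counter.most_common(top_n)]
        for sample, counter in counts.items()
    }
```

The ranked inserts for each sample could then be written to a FASTA file and passed to a motif discovery tool such as MEME.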
Motifs obtained by MITOMI-seq for the selected factors closely match those identified previously (Table 2). Interestingly, for PPARγ:RXRα heterodimer, we recovered a motif similar to the one identified by ChIP-seq (Nielsen et al. 2008).
Over a span of just a couple of weeks, we have already been able to process 40 different individual TFs or TF dimers originating from different species (M. musculus, D. melanogaster and H. sapiens) in three separate MITOMI-seq experiments (24 TFs per experiment). Some factors were processed twice. We were able to retrieve 24 high-confidence motifs with E-values ranging from 1.70E-14 to 4.3E-435 (
We next tried to estimate how well MITOMI-seq performs in comparison to the two other commonly used in vitro technologies that enable TF DNA binding measurements: PBM and HT-SELEX. For this purpose, we focused on NFKB1 DNA binding data sets since data are available from all three technologies. Specifically, we compared the HT-SELEX experimental data from selection cycles 2, 3 and 4 (Dolma et al., 2013), normalized PBM probe data (Siggers et al., 2011) and two MITOMI-seq datasets from independent experiments. Note however that we anticipate including additional comparisons involving other factors in the ultimate manuscript. The NFKB1-related analyses have to therefore be considered as a proof of principle.
First, we asked how well MITOMI-seq-derived PWM models predict HT-SELEX binding models and whether there would be a difference in the data obtained by the two technologies. We used the representation of a ChIP-seq-derived NFKB1 binding model within MITOMI-seq or HT-SELEX data as an estimate of the performance of each technique. For the five HT-SELEX and MITOMI-seq datasets, we estimated the number of unique reads obtained after HT sequencing. Next, we used FIMO (Grant et al., 2011) to calculate enrichment of the NFKB1 TRANSFAC motif (ChIP-seq-derived V$NFKB_Q6_01 (Matys et al., 2006)) within unique reads from all five datasets. We found that one MITOMI-seq data set was significantly enriched with sites mapping to the motif (23.70%) as compared to the three datasets obtained by HT-SELEX. A second MITOMI-seq dataset did not, however, show a similarly high motif enrichment, as the percentage of unique reads that contained the TRANSFAC-derived NFKB1 motif was close to the one found for the 2nd and the 3rd cycle data sets obtained using HT-SELEX. Thus, it appears that the motif enrichment in the data obtained by both technologies is comparable, at least for NFKB1, with MITOMI-seq being potentially in greater agreement with ChIP-seq data than HT-SELEX. We speculate that differences in the enrichment found between the two MITOMI-seq data sets could be due to variability in the on-chip selection process between the two experiments. We already observed that some experiments result in a wider representation of non-specific DNA binders in sequencing data than others. Nevertheless, we would like to argue that this technical bias does not hinder overall motif discovery.
A comprehensive understanding of protein-DNA binding properties is of central importance to gene regulation. Several high-throughput technologies have therefore been developed to study TF-DNA binding. The most popular technology allowing in vivo DNA profiling is ChIP-seq (Johnson et al., 2007). However, it is well appreciated that ChIP-seq-derived DNA binding properties might not provide an accurate picture of TF DNA binding specificities. First, the accessible sites in a particular genome might not cover all possible k-mers. Second, in vivo binding is affected by additional factors, such as chromatin structure, nucleosome positioning and co-factors, and thus observed DNA binding in vivo may not even be direct. Thus, in contrast to in vivo studies, in vitro DNA binding assays are valuable because they enable the assessment of direct DNA binding properties, allowing the sampling of the full spectrum of DNA k-mers. In the past, in vitro binding models of TFs were defined based on low-throughput techniques and thus had low resolution and limited accuracy. With technological developments, the ability to measure and predict binding sites has improved. A large leap came in the form of PBMs and HT-SELEX. These two high-throughput technologies produced DNA binding specificity data covering hundreds of TFs. But despite these significant technological advances, all available in vitro binding models taken together currently explain the specificities of only about one third of the total number of known TFs. Moreover, the DNA binding properties of most TF homo- and heterodimers, let alone larger complexes, still remain largely unexplored.
In this study, we aimed to address this problem by presenting a novel platform for the robust characterization of DNA binding specificities of TF monomers and dimers. The platform itself is based on two core technologies, MITOMI and HT-sequencing, which in combination enable us to examine TF DNA binding from a new perspective and to determine the binding specificities at an unprecedented resolution. MITOMI-seq combines robust selection of sequences bound to a certain TF from a pool of k-mers with subsequent identification of the bound DNA by deep sequencing. As part of these efforts, we also developed an integrative microfluidic device that allows several MITOMI-seq assays to be run simultaneously and 64 TFs or TF combinations to be processed in parallel. The device is based on the MITOMI principle and performs physical trapping of TF-DNA complexes, thereby reducing the loss of bound DNA during the washing step to a minimum. This, in turn, highlights the unique property of the assay, namely, the ability to preserve and analyze interactions over a wide affinity range. Unlike PBM and HT-SELEX, the two most popular in vitro technologies that aim to identify TF binding specificities, MITOMI-seq operates at micro scale and requires minute amounts of biological material. For example, to perform MITOMI-seq on one TF, one needs only a few nanograms of protein, which can be easily produced through available in vitro expression systems. Thus, tedious bacterial or mammalian protein expression and purification is no longer required, which significantly shortens the time needed for one experiment (Table 4) and even allows the analysis of TFs (such as KRAB ZFPs) that are otherwise difficult to study because of in cello expression issues.
We demonstrated that MITOMI-seq-derived specificity models generally agree with the TF binding models identified by ChIP-seq or by available in vitro methods (Table 2). We also showed that MITOMI-seq could identify DNA binding preferences of not only monomers or homodimers but also of TF heterodimers. In particular, using MITOMI-seq data, we were able to generate relevant binding models for PPARγ:RXRα and Clk:Cyc heterodimers through de novo motif discovery. In addition, for several factors we identified binding motifs that had never been reported before. Two identified motifs (for ZEB1 and ZNF282) were later confirmed by ChIP-seq.
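The enrichment measure used above, the percentage of unique reads that contain a motif match, can be illustrated with a simplified Python sketch. It is a stand-in and not FIMO: FIMO reports p-value-based matches, whereas this example applies a plain log-odds score threshold. The function names, the uniform background, the pseudocount and the threshold are all assumptions made for illustration.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def log_odds(pfm, background=0.25, pseudocount=0.01):
    """Convert per-position base probabilities into log2-odds scores.

    pfm -- list of dicts, one per motif position, mapping base -> probability
    """
    return [
        {b: math.log2((col[b] + pseudocount) / (1 + 4 * pseudocount) / background)
         for b in BASES}
        for col in pfm
    ]

def best_score(seq, pwm):
    """Highest PWM score over all windows on both strands.

    Assumes seq contains only A, C, G and T.
    """
    width = len(pwm)
    reverse_complement = seq.translate(COMPLEMENT)[::-1]
    best = -math.inf
    for strand in (seq, reverse_complement):
        for i in range(len(strand) - width + 1):
            score = sum(pwm[j][strand[i + j]] for j in range(width))
            best = max(best, score)
    return best

def motif_enrichment(reads, pwm, threshold):
    """Percentage of unique reads with at least one window scoring >= threshold."""
    unique = set(reads)
    if not unique:
        return 0.0
    hits = sum(1 for read in unique if best_score(read, pwm) >= threshold)
    return 100.0 * hits / len(unique)
```

Collapsing to unique reads before counting mirrors the analysis described above, so that a sequence selected many times contributes to the percentage only once.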
To understand the potential of MITOMI-seq data and how similar its performance is to PBM or HT-SELEX data, we also compared the output of MITOMI-seq to that of the other two in vitro technologies. A close comparison of NFKB1 datasets generated from HT-SELEX and MITOMI-seq revealed that unlike SELEX, MITOMI does not result in an over-selection of the same DNA sequences but instead provides a larger fraction of informative sites that contribute to a specificity model. Thus, binding models generated from MITOMI-seq data may potentially be more accurate and comprehensive compared to HT-SELEX models. At the same time, we showed that, similar to observations with the PBM approach, the DNA binding data from MITOMI-seq can guide DNA binding affinity estimates. We also showed that the sequences that were enriched in the NFKB1 MITOMI-seq data set were also ranked as highest NFKB1 affinity sites in the PBM data. This suggests that, in principle, MITOMI-seq and PBM can produce comparable data. But unlike PBM, MITOMI-seq allows the probing of a much larger sequence space making it an ideal platform for the identification of sites bound by TF dimers or by factors that recognize long DNA sequences.
One common concern about the DNA binding data obtained by various in vitro technologies, including MITOMI-seq, is how relevant in vitro models are to in vivo binding and whether each in vitro technology has certain advantages or disadvantages with respect to the others. We are currently investigating this problem, trying to assess how well MITOMI-seq data can predict in vivo TF DNA binding in general and whether it has any advantage compared to PBM and HT-SELEX. As a first attempt to evaluate the “goodness” of the motifs derived by MITOMI-seq and HT-SELEX and of the motif retrieved from the JASPAR database (ChIP-seq based), we quantified the occurrence of each of the motifs within the ChIP-seq peaks corresponding to NFKB1 binding in lymphoblastoid cells. We used sensitivity at a 1% false-positive rate and the area under the receiver operating characteristic curve to gauge the binding prediction (see Orenstein and Shamir, 2014 for details). We found the area under curve (AUC) value to be higher for the MITOMI-seq-derived motif compared to the HT-SELEX and JASPAR motifs (
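The two measures named above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and does not reproduce the procedure of Orenstein and Shamir (2014): each sequence is assumed to carry a single motif score, positives are assumed to be ChIP-seq peak sequences and negatives are assumed to be control sequences, and the function names are hypothetical.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive scores higher than a randomly chosen negative
    (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def sensitivity_at_fpr(pos_scores, neg_scores, max_fpr=0.01):
    """Fraction of positives scoring above a threshold chosen so that at most
    max_fpr of the negatives score above it."""
    ranked = sorted(neg_scores, reverse=True)
    allowed = int(max_fpr * len(ranked))  # negatives permitted above threshold
    if allowed >= len(ranked):
        return 1.0  # every negative may be called positive, so no cut-off applies
    threshold = ranked[allowed]
    return sum(1 for p in pos_scores if p > threshold) / len(pos_scores)
```

The pairwise AUC computation is quadratic in the number of sequences; for large peak sets a rank-based formulation would be the more practical choice.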
To enable the expression of TFs and their immobilization and fluorescence-based detection, we explored different strategies. We found that the wheat germ (WG) in vitro transcription translation expression system containing translation enhancer (TE) sequences from the barley yellow dwarf virus (BYDV) (Promega) yielded the most robust and reproducible protein expression (data not shown). To make this expression system compatible with Gateway TF ORF clone format and to allow the fluorescence-based detection of TFs, we generated several novel vectors, pMARE, that differ by fluoro C-terminal fusions (
Randomized extended DNA libraries were ordered as single stranded oligos from IDT. The adapter sequences and barcodes used for each library are listed in Table 5. The oligo containing a Cy5 5′-fusion, /5Cy5/CAA GCA GAA GAC GGC ATA CG (SEQ ID NO 9), was used as a primer for the complementary strand synthesis by means of a Klenow (exo-) extension reaction (NEB Cat No M0212). Detailed reaction conditions are described in the MITOMI-seq procedure. The libraries were then purified using the MinElute PCR purification kit (Qiagen) and diluted with ddH2O at a ratio of 1:10. Then 50 ng of poly-dIdC (Sigma) were added to each 10 μl of the diluted library.
1. Sample Preparation:
1.1. Set Up the Expression Mix for the TFs as Follows:
1.2. Synthesis of the dsDNA Libraries:
2. MITOMI:
2.1. Surface Chemistry:
2.2. Sample Loading and MITOMI:
3. Elution and Library Preparation for HT-Sequencing
Detailed information about the primers, barcodes and libraries used in this study can be found in supplementary Table S1.
Raw Illumina reads were processed using custom Perl scripts and FASTX-tools. Read statistics and HMM were implemented using custom scripts. De novo motif discovery was done with MEME.
Number | Date | Country | Kind |
---|---|---|---|
PCT/IB2014/065418 | Oct 2014 | IB | international |
This application claims the benefit of international patent application PCT/IB2014/065418, filed Oct. 17, 2014, the entire contents of which are incorporated herein by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/IB2015/058032 | 10/19/2015 | WO | 00 |