Next-generation sequencing (NGS) is increasingly being used in medicine for diagnosis of inherited genetic disease and for selection of appropriate therapies in oncology. Since many inherited disease and actionable cancer loci are single nucleotide polymorphisms (SNPs) within protein-coding sequences, most clinical sequencing (i.e., the Illumina and Ion Torrent platforms) is carried out using short-read sequencing in combination with hybridization-capture targeted sample preparation (Agilent SureSelect, Roche Nimblegen, IDT xGen).
However, short-read NGS methods are not well suited to long-range genomic analyses, such as haplotype determination and detection of structural variation (“SV”), defined here as rearrangements involving deletions, duplications, inversions, translocations, and repeat expansions greater than 100 bp in length. Short-read methods are particularly disadvantaged when SV break points and haplotypes involve regions of repeated sequence.
This presents a significant problem for the clinical sequencing field since recent studies on large patient cohorts suggest that some types of cancer contain driver mutations caused by SVs rather than by SNPs (4). Another population study of cancer (Pan-Cancer Analysis of Whole Genomes) found significantly recurrent SVs at 52 different loci (10). Similarly, there is growing awareness that a significant (but as yet unknown) fraction of inherited disease is caused by deletions, repeat expansions, and other SVs which cannot be detected by short-read methods (11, 12). Finally, there is accumulating evidence that haplotype structures extending across the entire 4 Mb MHC region will be important for deciphering the many complex immune disorders (reviewed in 13). Here again, short-read methods can identify discrete points of polymorphism, but linkage between them can only be inferred (indirectly) from analyses of families.
Reliable detection of SVs and extended haplotypes requires long-read single molecule sequencing (or optical mapping methods) such as those developed by Pacific Biosciences, Oxford Nanopore, 10× Genomics, Bionano Genomics, and Genomic Vision (2-12). As in the case of short-read gene panels and exome sequencing, it is likely that use of long-read sequencing in clinical settings will be limited to targeted sequencing, for economic reasons. Unfortunately, targeted sample preparation for these long-read sequencing methods remains laborious, inefficient, and expensive compared with conventional targeted short-read sequencing.
Various apparatuses, systems, and methods are described herein. In some embodiments, a molecule retention cassette for retaining molecules during electrophoresis includes a housing and a lane configured within the housing. The lane may have a first elongate edge and a second elongate edge. An elution module may be configured to be received in the lane and may divide the lane into a first chamber and a second chamber. A first buffer reservoir may be positioned adjacent the first elongate edge, and a second buffer reservoir may be positioned adjacent the second elongate edge. A first side of the elution module facing the first chamber may comprise a porous sterile filtration membrane, and a second side of the elution module facing the second chamber may comprise an ultrafiltration membrane that has a pore size to retain molecules during electrophoresis.
In some embodiments, the cassette may further comprise at least one electrode configured within the first chamber and at least one electrode configured within the second chamber.
The ultrafiltration membrane may be a 15 kDa ultrafilter. The pore size of the ultrafiltration membrane may be configured to retain DNA during electrophoresis.
The elution module may be centrally positioned between the first buffer reservoir and the second buffer reservoir. The elution module may be positioned between the first buffer reservoir and the second buffer reservoir. In some embodiments, the elution module may serve as both the elution module and the sample well. The elution module may be configured to receive a sample.
The cassette may further comprise agarose gel. The agarose gel may be cast next to the porous sterile filtration membrane. The agarose gel may be cast to form a gel column, and dimensions of the gel column are configured to minimize loss of target molecules into the first chamber.
In some embodiments, a molecule retention cassette for retaining molecules during electrophoresis comprises a housing and a plurality of lanes configured within the housing. The plurality of lanes each has a first elongate edge and a second elongate edge. A plurality of elution modules may each be configured to be received in a lane of the plurality of lanes so as to divide each lane into a first chamber and a second chamber. A first buffer reservoir may be positioned adjacent the first elongate edge of each lane, and a second buffer reservoir may be positioned adjacent the second elongate edge of each lane. A first side of the elution module facing the first chamber of each lane comprises a porous sterile filtration membrane, and a second side of the elution module facing the second chamber of each lane comprises an ultrafiltration membrane. The ultrafiltration membrane has a pore size to retain molecules during electrophoresis.
In some embodiments, a method for isolating and collecting target segments of target particles may include receiving a sample in a sample well of an elution module. An SDS-containing lysis buffer may be received in a first buffer chamber, the first buffer chamber being configured along a first side of the elution module. A first electrophoresis voltage may be applied to migrate components of the sample towards a second buffer chamber that is configured along a second side of the elution module, such that target particles are immobilized in a gel segment configured along the second side of the elution module between the elution module and the second buffer chamber, and non-target particles pass through the gel segment and into the second buffer chamber. The first buffer chamber, the second buffer chamber, and the elution module may be washed and filled with a Cas9 reaction buffer. The elution module may be emptied and refilled with a Cas9 enzyme mix to cleave sections of the target particles immobilized in the gel segment. An SDS stop solution may be loaded into the elution module, and a second electrophoresis voltage may be applied to release the Cas9 from the target particles and migrate Cas9 into the second buffer chamber. The first buffer chamber, the second buffer chamber, and the elution module may be washed and filled with elution buffer. A third electrophoresis voltage may be applied in a reverse direction to migrate the cleaved sections of the target particles from the gel segment and into the elution module.
The first side of the elution module may comprise an ultrafilter and the second side of the elution module may comprise a porous sterile filter. The porous sterile filter may prevent the cleaved sections of the target particles from leaving the elution module during and after the application of the third electrophoresis voltage.
The target particles may be DNA and the cleaved sections of the DNA may comprise desired genomic targets.
In some embodiments, application of the second electrophoresis voltage may be shorter than the first electrophoresis voltage and/or the third electrophoresis voltage.
Application of the first electrophoresis voltage may migrate SDS through the elution module to lyse the sample and coat the non-target particles of the sample such that the non-target particles pass through the gel segment and into the second buffer chamber. Application of the second electrophoresis voltage migrates particles smaller than the target particles into the second buffer chamber.
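The sequence of loading, washing, and electrophoresis steps described above can be summarized as an ordered step list. The following is a minimal illustrative sketch only, not an instrument control program: the `Step` class, the step names, and the `+1`/`-1` direction encoding are hypothetical conventions introduced here, and no voltages or durations are given because the summary above does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One workflow step (hypothetical representation, for illustration)."""
    name: str
    action: str          # "load", "wash_fill", or "electrophoresis"
    direction: int = 0   # +1 = toward second buffer chamber, -1 = reversed, 0 = no voltage


# Step order follows the method summary; names are paraphrases.
CATCH_STEPS = [
    Step("receive sample in elution-module sample well", "load"),
    Step("receive SDS lysis buffer in first buffer chamber", "load"),
    Step("first voltage: lyse sample, immobilize targets in gel segment",
         "electrophoresis", +1),
    Step("wash and fill chambers and module with Cas9 reaction buffer", "wash_fill"),
    Step("empty module and refill with Cas9 enzyme mix", "load"),
    Step("load SDS stop solution into elution module", "load"),
    Step("second voltage: release Cas9 and move it to second buffer chamber",
         "electrophoresis", +1),
    Step("wash and fill chambers and module with elution buffer", "wash_fill"),
    Step("third voltage (reversed): move cleaved targets into elution module",
         "electrophoresis", -1),
]


def voltage_directions(steps):
    """Return the direction of each electrophoresis step, in order."""
    return [s.direction for s in steps if s.action == "electrophoresis"]


def check_workflow(steps):
    """True if there are two forward runs followed by one reversed run."""
    return voltage_directions(steps) == [1, 1, -1]


if __name__ == "__main__":
    for number, step in enumerate(CATCH_STEPS, start=1):
        print(number, step.action, "-", step.name)
    print("valid order:", check_workflow(CATCH_STEPS))
```

Encoding the direction explicitly makes the one distinguishing feature of the final step, the reversed field that returns cleaved targets to the elution module, easy to check.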
An electrophoretic instrument system may include an electrophoresis station and a drawer configured to receive at least one electrophoresis cassette, such as those described herein. The system may also comprise a liquid handling robot and a lateral extension arm that is configured to move the drawer laterally, such that moving the drawer in a first lateral direction exposes a first side of the at least one cassette to the liquid handling robot, and moving the drawer in a second lateral direction inserts the drawer into the electrophoresis station. The electrophoresis station houses electrodes that correspond to the at least one cassette, wherein the electrodes are configured to apply electrophoresis voltages. The system may also include at least one cold storage compartment and at least one room temperature storage compartment.
An integrated, high-throughput, automated sample prep system for targeted long-read sequencing is developed. The proposed system is intended for robust walk-away automated processing in a clinical diagnostic setting.
A semi-automated research instrument system (
As shown in
The SageHLS system can be used with customized Cas9 nucleases to isolate specific large genomic DNA targets that can be efficiently used to prepare targeted DNA libraries in the 10× Genomics Chromium system.
The overall workflow for targeted sequencing of a 200 kb genomic region encompassing the human BRCA1 gene has been demonstrated. Using 1.5 million human diploid cultured cells as the input material, approximately 30,000-50,000 copies of the BRCA1 fragment were recovered from the peak elution module, with an enrichment of ~25-fold over a non-targeted control gene (RNaseP) as measured by Taqman qPCR. After preparation of libraries on the 10× Genomics Chromium system, they were sequenced on an Illumina NextSeq500 system. The run produced 155× coverage of the targeted BRCA1 region, with only 4.4× coverage of the non-targeted remainder of the genome (
In parallel experiments, it has been demonstrated that HLS-CATCH targeting can also be used to detect large SVs, as gRNAs targeting 40 known SVs in the well-studied cell line GM12878 were designed. The targeted regions were isolated in a single multiplex CATCH procedure targeting 40 distinct 100 kb genomic fragments. Targets were localized in the HLS cassette output by qPCR, and subjected to 10× Genomics library prep and Illumina sequencing as for the BRCA1 gene. Some preliminary data demonstrating detection of a homozygous 40 kb deletion on GM12878 chr1 is shown in
The data of
The diagnostic tests based on the HLS-CATCH/10× Genomics workflow outlined above may have compelling advantages in genetic testing and oncology. The targeted nature of the workflow offers substantial decreases in Illumina sequencing costs ($8000 in ILMN reagents for 100× phased Chromium whole genome sequence vs $500 in ILMN reagents for the 100× phased targeted Chromium coverage of BRCA1 in
It may be advantageous for systems used for such applications to have high sample throughput, such that the number of samples processed is increased and/or maximized. Additionally, it may be advantageous to achieve efficient, specific Cas9 target cleavage in a reproducible fashion. In approximately 30 HLS-CATCH experiments with mouse WBCs, and human cultured cells (lymphoblastoid and HEK293 cells), CATCH target recovery has varied widely, ranging between 2 and 50%. Similarly, target enrichment has varied between 15 and 20-fold. Fortunately, the 10× Chromium library prep works quite well with samples at the low ends of these ranges. For instance, the high-coverage targeted data of
It may also be advantageous to modify and extend HLS-CATCH to work with very low cell/nuclei inputs. This is a key issue for genotyping diagnostic samples such as biopsy samples, which may be very limited in size and cell number. Success in using low cell/nuclei inputs will depend on the success in optimizing the Cas9 digestion conditions, as discussed above.
Furthermore, it may be important to fully evaluate and extend the capabilities of the HLS-CATCH for using a variety of tissue types, including buffy coat, frozen blood, fresh and fresh frozen solid tissues including tumor biopsy materials of various types.
In Phase I, a cassette may be designed to accomplish HLS-CATCH large target enrichment. The new design may eliminate the two-dimensional electroelution step of the original HLS cassette, and thereby allow manufacture of cassettes capable of processing between 6 and 12 samples per cassette (in a 96 well plate footprint). The new cassette type is referred to herein as “CATCH-1D” where 1D stands for “one-dimensional.”
Phase I may include using a sample of 750,000 diploid mammalian cells (human or mouse) to achieve a recovery of at least 20% (300,000 copies) of a single-copy 200,000 bp genomic target fragment in the CATCH-1D prototype. The copy number may be measured by qPCR. The enrichment of the 200,000 bp genomic target may be at least 20-fold over non-targeted genomic sequence background, and enrichment may be measured by qPCR. Additionally, it may be demonstrated that the CATCH-1D prototype produces CATCH targets that can be sequenced efficiently in the 10× Genomics Chromium/Illumina workflow.
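The Phase I recovery figure follows from simple copy-number arithmetic: 750,000 diploid cells carry 1,500,000 copies of a single-copy locus, and 20% of that is 300,000 copies. The sketch below works through that calculation. The function names are hypothetical, and the fold-enrichment helper assumes that the target and the non-targeted control are both single-copy loci, so that the ratio of their recovered copy numbers (as measured by qPCR) can be read directly as enrichment.

```python
def input_copies(cells, ploidy=2):
    """Copies of a single-copy locus present in the input cells."""
    return cells * ploidy


def recovered_copies(cells, recovery_fraction, ploidy=2):
    """Target copies recovered at a given fractional recovery."""
    return int(round(input_copies(cells, ploidy) * recovery_fraction))


def fold_enrichment(target_copies, control_copies):
    """Ratio of recovered target copies to recovered control copies.

    Assumes both loci are single-copy, so input amounts are equal.
    """
    if control_copies <= 0:
        raise ValueError("control_copies must be positive")
    return target_copies / control_copies


if __name__ == "__main__":
    cells = 750_000
    target = recovered_copies(cells, 0.20)
    print("input copies:", input_copies(cells))
    print("recovered at 20%:", target)
    # A hypothetical control reading of 15,000 copies would correspond
    # to the 20-fold enrichment goal.
    print("fold enrichment:", fold_enrichment(target, 15_000))
```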
Phase II may include development of a CATCH-1D instrument with liquid handling capabilities for full high-throughput walk-away automated CATCH target preparation. Cas9 digestion may be optimized for CATCH-1D operation. Phase II may also include adapting the CATCH-1D cassette or workflows for use with low cell or nuclei input and adapting CATCH-1D cassette or workflows for use with diagnostically important tissue types.
The basic concept of the CATCH-1D cassette and its operation is shown in
In some embodiments, the elution module 620 is centrally positioned, approximately centrally positioned, or otherwise situated between the buffer reservoirs 635, 640. On one side of the elution module 620, a porous sterile filtration membrane 645 is attached. A small segment of agarose gel 660 is cast against the outside surface of the sterile filter 645. On the other side of the elution module 620, an ultrafiltration membrane 650 is attached. The ultrafilter 650 has a pore size which will retain DNA during electrophoresis.
In the CATCH-1D workflow, the elution module 620 serves both as the sample well 665 and the elution module 620. As illustrated in
The design of the CATCH-1D cassette is much simpler than the original SageHLS cassette, and fabrication of the prototype can be performed. A Sage Science PippinHT instrument (such as that shown in Appendix A) with customized electrode arrays for CATCH-1D prototype testing can be used. The PippinHT instrument can be modified to accept cassettes with up to 12 channels. All liquid handling steps may be performed manually (e.g., during Phase I) or may be automated (e.g., during Phase II).
There are several reagent contamination risks in the CATCH-1D workflow which are different from the original HLS workflow. One example comes from using the elution module as the input sample well. Some components of the input sample may adhere to the elution module surfaces and prove hard to remove during lysis and washing steps. There is also some risk that some residual amount of SDS may remain in the elution module following initial lysis or Cas9 cleanup steps. (The elution modules of the original HLS cassette are not exposed to input sample or concentrated SDS.)
Another possible challenge is adjusting the gel column dimensions so that electrophoretic Cas9 cleanup with SDS can be accomplished without loss of CATCH target into the right buffer reservoir. It is not anticipated that this will be a serious challenge for CATCH targets greater than 100 kb, and target loss can be measured by qPCR of the right chamber buffer (after Cas9 cleanup electrophoresis).
Another significant risk to the CATCH-1D workflow is the absence of a size-selection electrophoresis step. In the original HLS-CATCH workflow, a size-selection electrophoresis is performed to increase the purity of the CATCH product(s) just before elution. The proposed CATCH-1D workflow omits size-selection electrophoresis, although the electrophoretic Cas9 cleanup step offers the possibility to remove low molecular weight DNA that migrates significantly faster than the CATCH target DNA. In any case, it is likely some non-specifically cleaved HMW DNA will be eluted along with the CATCH products in the final elution, and purity of the CATCH products will not be as high as in the HLS-CATCH workflow. This may not be problematic, however, as excellent 10× sequencing results can be obtained with modest enrichment factors (~15-25-fold). In addition, further optimization of Cas9 digestion conditions, better gRNA design, and new mutant Cas9 enzymes (or similar programmable endonucleases) could reduce the importance of size-selection.
In Phase I, feasibility of the CATCH-1D system is demonstrated. Phase II includes development of the fully automated high-throughput CATCH-1D system.
Phase II features a detailed systems engineering study on integration of liquid handling functions with electrophoresis in an automated CATCH-1D system. An exemplary automated instrument is illustrated in
A prototype multichannel CATCH-1D cassette may be fabricated, which may include determining an optimum number of samples processed per cassette and the sample input size. Additionally, the efficiency of DNA extraction and Cas9 digestion as a function of channel/elution module dimensions (as explained below) may be determined. The channel of the CATCH-1D cassette may be small, so that many samples can be processed in each cassette. A determination may also be made to ensure that enough CATCH target is extracted to get ample 10× Genomics linked-read coverage for diagnostic utility.
After evaluating channel dimension effects on extraction and CATCH recovery, a second, substantially optimized CATCH-1D cassette may also be built. An initial instrument prototype that can perform a fully automated test with the version 2 cassette may also be built. Production versions of the CATCH-1D cassette and automated instrument may also be developed. Manufacturing protocols may be determined for the final production versions of cassette and instrument, and the final system may be tested.
Optimization of HMW DNA extraction in CATCH-1D
The overall efficiency of the CATCH-1D process is a composite of two values. The first is the efficiency of the initial genomic DNA extraction from the input sample. The second is the efficiency of recovery of Cas9-digested CATCH targets. It is believed that the efficiency of the initial genomic DNA extraction in CATCH-1D is a function of: 1) the surface area of the agarose gel sample well where cell lysis and DNA immobilization occurs (larger SA is better), and 2) the amount of detergent used per cell during extraction (more is better). To examine this issue, multichannel CATCH-1D prototypes may be fabricated with different agarose gel cross-sectional areas, and extraction efficiency may be evaluated as a function of sample input. Extraction efficiency may be measured by total DNA recovery after digesting the immobilized DNA with a frequent-cutting restriction enzyme and electroeluting the digest. Extraction efficiency for long genomic DNA fragments (100-200 kb) may be measured by digesting the immobilized DNA with rare-cutting restriction enzymes and testing the electroeluted yield of specific genomic DNA restriction fragments of known length. Recovery of specific genomic DNA fragments may be measured by qPCR. The maximum input load evaluated may be 750,000 diploid human cells, since Phase I work may demonstrate that that input will produce ample CATCH target (at least 300,000 copies) for high coverage in the 10× Chromium sequencing workflow.
Once the scaling of extraction efficiency with channel/elution module cross-section is understood, studies on the efficiency of Cas9 digestion may be initiated to determine how it scales with input sample load and channel dimensions. Since many kinds of clinical samples will have a low number of input cells, performance of the cassette prototypes may be evaluated and optimized at low cell inputs. It is possible to get good 10× Genomics sequence coverage (~100×) in single target CATCH experiments with as few as 50,000 targets recovered. In several experiments, we have achieved a 50% recovery of a Cas9 target (the 200 kb BRCA1 locus target). At this recovery value, an input of only 50,000 intact cells (or nuclei) would be sufficient for equivalent (~100×) Chromium library sequencing coverage. When higher target recoveries are achieved, the required cell input would be lowered proportionally, which may be significant because many clinically relevant samples will have very low cell inputs. With human cell inputs in the low 10,000's, 100× targeted sequencing coverage may be achieved.
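The relationship between target recovery and required cell input can be made explicit. The sketch below inverts the copy-number arithmetic: given the number of recovered targets needed (about 50,000 for roughly 100× coverage, per the text above) and a fractional recovery, it returns the minimum number of diploid cells. The function name is hypothetical, and the calculation assumes a single-copy target and ignores any losses other than those captured by the recovery fraction.

```python
import math


def required_cells(targets_needed, recovery_fraction, ploidy=2):
    """Minimum cell input that yields targets_needed recovered copies.

    Assumes a single-copy locus (ploidy copies per cell) and that
    recovery_fraction is the overall fraction of input copies recovered.
    """
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery_fraction must be in (0, 1]")
    return math.ceil(targets_needed / (ploidy * recovery_fraction))


if __name__ == "__main__":
    for recovery in (0.25, 0.50, 1.00):
        cells = required_cells(50_000, recovery)
        print(f"recovery {recovery:.0%}: {cells:,} cells")
```

At 50% recovery this gives the 50,000-cell figure quoted above. Even perfect recovery would still require 25,000 cells under these assumptions, which is consistent with inputs in the low 10,000's.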
In addition to optimizing the proper cassette channel dimensions for efficient extraction and CATCH target recovery, gRNA design, gRNA type, Cas9 enzyme choice, and in-cassette Cas9 reaction conditions may also be addressed.
Cas9 reaction conditions (enzyme concentration and buffer) are the dominant factors for CATCH target recovery; gRNA type, gRNA design, and Cas9 type have less effect. Almost every gRNA tested works to some degree on its intended target, and 75% of the guides show complete digestion of PCR products at equimolar target/Cas9 ratios (concentrations in the 0.1-0.4 nM range). Moreover, a number of different gRNA design tools (guidescan.com, Feng Zhang lab website at MIT, SureDesign tool from Agilent) may be used successfully. Furthermore, similar cutting efficiency may be seen using two-part synthetic gRNAs and T7pol in vitro-transcribed single gRNAs (although we feel the synthetic RNAs are easier and more reliable to use). Commercially available Cas9 mutants have not been found to be more specific than wt-Cas9 enzyme in our in vitro assays.
Despite these initial conclusions, the CRISPR nuclease family is constantly expanding, and new mutants may be engineered that may be useful for our process. Thus, more enzymes for enhanced performance in the CATCH-1D cassette may be screened. Similarly, full-length synthetic single guide RNAs which are recently becoming commercially available may also be tested. The single guides may simplify the Cas9 assembly workflow. For example, Synthego may be used, as the supplier has advertised that their synthetic single guides have significantly improved specificity versus in vitro transcribed single gRNAs.
In the HLS-CATCH process, the best target recoveries have been achieved using 2-3 gRNAs per cut site window (multiple guides per cut site are used to avoid the possibility that a SNP in a single gRNA recognition sequence will eliminate a cut site), a cut site window of around 2 kb, and a total concentration of 1-4 µM Cas9 enzyme in the sample well (assembled at 1:1 ratio of Cas9:total gRNA concentration). In the SageHLS process, target recovery is also significantly enhanced by electrophoretically “injecting” the Cas9 enzyme into the wall of the sample well where the HMW DNA is immobilized. A 1 min period of electrophoresis at ~50 V may be used. Enzyme concentration and electrophoretic injection are the two parameters that have the largest effect on CATCH target yield.
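The 1:1 Cas9:total gRNA assembly rule can be turned into a small planning calculation. The sketch below is illustrative only: the function name is hypothetical, the example assumes two cut site windows (one flanking each end of the target), and the 80 µL reaction volume is a placeholder that does not come from the text. It relies on the unit identity that 1 µM equals 1 pmol/µL.

```python
def cas9_mix_plan(total_cas9_uM, windows, guides_per_window, volume_uL):
    """Per-guide concentration and total amounts for a 1:1 Cas9:gRNA mix.

    total_cas9_uM     -- total Cas9 concentration in the well (text: 1-4 uM)
    windows           -- number of cut site windows (assumed 2 per target)
    guides_per_window -- gRNAs per window (text: 2-3)
    volume_uL         -- reaction volume; hypothetical, not from the text
    """
    n_guides = windows * guides_per_window
    if n_guides <= 0:
        raise ValueError("at least one guide is required")
    total_grna_uM = total_cas9_uM          # 1:1 Cas9 : total gRNA
    return {
        "n_guides": n_guides,
        "per_guide_uM": total_grna_uM / n_guides,
        "cas9_pmol": total_cas9_uM * volume_uL,   # 1 uM == 1 pmol/uL
        "total_grna_pmol": total_grna_uM * volume_uL,
    }


if __name__ == "__main__":
    plan = cas9_mix_plan(total_cas9_uM=3.0, windows=2,
                         guides_per_window=3, volume_uL=80.0)
    for key, value in plan.items():
        print(key, value)
```

Because the total gRNA concentration is fixed by the Cas9 concentration, adding guides per window for SNP tolerance lowers the concentration of each individual guide.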
The HLS workflows may be adapted for a variety of materials including whole blood, frozen whole blood, buffy coat, fresh solid tissues, fresh frozen tissue, tumor biopsies, and nuclei obtained from all of the above sources. For the CATCH process, it may be critical that the extracted DNA is nearly full chromosome length, or >>2 Mb in size, for efficient immobilization in the agarose gel to occur. This means that the CATCH process may work best with fresh material, or nuclei prepared from fresh material. It also means that CATCH processes are unlikely to work on formalin-fixed tissues.
SageHLS development was accomplished with mammalian WBCs or human cell lines, or isolated nuclei from those sources. The SageHLS system has been used to isolate HMW DNA from ~50 µl of whole blood, which should contain around 2.5 µg of genomic DNA, which is sufficient input for the CATCH process. In addition, SageHLS has successfully been used with frozen human tissue culture cells (HEK293) for isolation of HMW DNA. Given these two pieces of data, protocols for processing buffy coat, frozen buffy coat, and frozen whole blood may be developed.
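The genomic DNA content of a small blood sample can be estimated from the white blood cell count. The sketch below is a rough back-of-the-envelope calculation: the default of 7,000 WBCs per µl is an assumed mid-range value for healthy adult blood, and 6.6 pg is a commonly used approximation for the mass of a diploid human genome. Neither number comes from the text, and real samples vary considerably.

```python
PG_PER_DIPLOID_GENOME = 6.6   # approximate mass of a diploid human genome (assumption)


def blood_dna_ug(volume_ul, wbc_per_ul=7_000):
    """Estimated genomic DNA (in micrograms) in a whole blood sample.

    Only nucleated white blood cells are counted; mature red blood
    cells and platelets carry no nuclear genome.
    """
    cells = volume_ul * wbc_per_ul
    return cells * PG_PER_DIPLOID_GENOME / 1e6   # pg -> ug


def blood_genome_copies(volume_ul, wbc_per_ul=7_000, ploidy=2):
    """Estimated copies of a single-copy locus in the sample."""
    return volume_ul * wbc_per_ul * ploidy


if __name__ == "__main__":
    print(f"DNA in 50 ul: {blood_dna_ug(50):.2f} ug")
    print(f"single-copy locus copies: {blood_genome_copies(50):,}")
```

Under these assumptions, 50 µl of blood yields roughly 2.3 µg of DNA, in line with the figure of around 2.5 µg given above, and about 700,000 copies of a single-copy target before any recovery losses.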
For tissue work, there have been several instrument+reagent systems for tissue dissociation developed by suppliers of cell cytometers and FACS instruments (BD Biosciences and Miltenyi Biotech). These involve a combination of mechanical and enzymatic treatments followed by selective filtration to produce suspensions of single cells or nuclei. These systems may perform well for CATCH-1D input with fresh materials.
Based on limited success with frozen tissue culture cells, the most viable clinical tissue sample for CATCH will be fresh-frozen tissue. Efforts may be focused around that sample type, and studies using the BD and Miltenyi instrumentation systems cited above may be initiated.
The combination CATCH-1D+10× Genomics workflow may have broad applications in many areas of clinical sequencing. For this reason, development and optimization of the CATCH-1D system includes establishing optimal development paths for rapid production of new CATCH reagent kits. This may include integration of bioinformatics for genomic cut site selection, gRNA selection within cut site windows, streamlined pipeline for gRNA ordering and QC, and trackable product numbering, stocking, and packaging.
Clinical sequencing markets may be surveyed to select key diagnostic areas which may benefit from the CATCH-1D approach. For example, CATCH-1D may be used to enable long-range sequencing in the MHC region. Additionally, there are other attractive assay areas in inherited disease testing and oncology.
An optimized workflow for design of gRNAs that can be applied to create CATCH assays for any region (or set of regions) of the human genome may be provided. Some assays in inherited disease testing may only involve a single gene along with flanking regions. Other tests may involve a panel of cancer genes (perhaps low 100's) frequently involved in SVs (10). Still others may involve design of assays producing CATCH products that tile across a broad genomic region like the MHC (14).
Modifications, customization, or new versions of 10× Genomics Long Ranger software designed specifically for analysis of targeted CATCH-1D+10× Chromium sequencing results may be assessed.
A 10× Chromium sequencing workflow and QC method for validating performance of new kits may be developed. While qPCR is an inexpensive and fast method for assessing recovery and enrichment, a conical sequencing method may be developed.
Any and all references to publications or other documents, including but not limited to, patents, patent applications, articles, webpages, books, etc., presented in the present application, are herein incorporated by reference in their entirety.
Example embodiments of the devices, systems and methods have been described herein. As noted elsewhere, these embodiments have been described for illustrative purposes only and are not limiting. Other embodiments are possible and are covered by the disclosure, which will be apparent from the teachings contained herein. Thus, the breadth and scope of the disclosure should not be limited by any of the above-described embodiments but should be defined only in accordance with claims supported by the present disclosure and their equivalents. Moreover, embodiments of the subject disclosure may include methods, systems and devices which may further include any and all elements from any other disclosed methods, systems, and devices, including any and all elements corresponding to molecular processing. In other words, elements from one or another disclosed embodiments may be interchangeable with elements from other disclosed embodiments. In addition, one or more features/elements of disclosed embodiments may be removed and still result in patentable subject matter (and thus, resulting in yet more embodiments of the subject disclosure). Correspondingly, some embodiments of the present disclosure may be patentably distinct from one and/or another reference/prior art by specifically lacking one or more elements/features of a system, device and/or method disclosed in such prior art. In other words, claims to certain embodiments may contain negative limitation to specifically exclude one or more elements/features resulting in embodiments which are patentably distinct from the prior art which include such features/elements.
This application claims priority to and the benefit of U.S. Provisional Application No. 62/614,239, filed Jan. 5, 2018, and entitled, “A Semi-Automated Research Instrument System.” The present application expressly herein incorporates by reference the disclosure of the above-referenced application in its entirety.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/012416 | 1/4/2019 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62614239 | Jan 2018 | US |