1. Field
The present disclosure generally relates to the field of image processing, and more particularly, to systems and methods for imaging densely populated particles or features associated with high-throughput DNA sequencing devices.
2. Description of the Related Art
Many biological analysis processes utilize microscopic imaging techniques. For example, fluorescent-based sequencing processes may utilize an array or population comprising a large number of microscopic structures or features that facilitate or serve as a substrate or support for interactions between analytes and reagents. Imaging of such features to detect, for example, fluorescent signals resultant from molecular interactions can yield useful information about the composition of the analytes. In one exemplary application, such images can be used to resolve the sequences of nucleic acid samples.
In imaging applications such as used in connection with fluorescent-based sequencing analysis, the number of features or structures can influence the throughput of sequencing analysis. Generally, greater throughput and/or reduced reagent consumption can be achieved using a larger number of such microscopic features or structures. One way to achieve larger feature numbers is to increase their density within a given area. However, such increases in density can create imaging and signal resolution issues, for example, due to the close proximity of features with respect to one another.
Various embodiments of an analyte imaging system are provided herein. In one embodiment, the present teachings set forth a method for microparticle identification used in sequencing processes, comprising: for one or more of a plurality of subsets of microparticles contained within a set of microparticles distributed within an area, obtaining a subset image representative of the microparticles and based on signals resulting from a detectable characteristic associated with the subset of microparticles distributed within the area; identifying the microparticles in each subset image; and combining the microparticle identifications obtained from the subset images so as to yield a combined set of microparticle identifications.
In another embodiment, the present teachings describe a method for imaging in a biological analysis process, comprising: providing a set of particles to an area, the set of particles distributed over the area and configured to facilitate a plurality of reactions between analytes associated with the particles and reagents introduced to the particles, the set of particles comprising N subsets of particles, the quantity N being greater than one, each subset having particles distributed over the area and capable of emitting signals detectably different than signals from another subset of particles during at least some of the plurality of reactions; generating an enhanced list of identified particles of the set by: obtaining N images of the area, each image corresponding to signals from each of the N subsets of particles; for each of the N images, identifying at least some of the subset of particles; and combining the identified particles of the subsets; and for a given reaction, obtaining N images corresponding to the detectably different signals from the area and identifying particles in the N images based on the enhanced list of identified particles.
In a still further embodiment, the present teachings describe a biological analysis system, comprising a flow cell configured to receive a population of microparticles distributed in an area and facilitate a sequence of reactions between analytes coupled to the microparticles and reagents flowing selectively through the flow cell, each of the microparticles being in one of N sub-populations based on type of signal emitted during a selected portion of the sequence of reactions, each sub-population of microparticles distributed in the area; an assembly of optical elements configured to form an image of the area; an imaging detector configured to detect the image and generate a signal representative of the image; and a processor configured to induce imaging of each of the N sub-populations of microparticles based on the type of signal, the processor further configured to process the N images to identify microparticles therein and combine the identified microparticles from the N images to yield an enhanced list of identified microparticles.
In another embodiment, the present teachings provide a storage medium having a computer-readable instruction, the instruction comprising: obtaining data representative of N images, each of the N images corresponding to detection of microparticles having a distinguishable fluorescence characteristic; and for each of the N images: identifying microparticles in the image; aligning the image to a common image such that the N images share a substantially common frame of reference; and ranking the identified microparticles based on fluorescent intensity so as to provide preference to identified microparticles having higher intensity values.
These and other aspects, advantages, and novel features of the present teachings will become apparent upon reading the following detailed description and upon reference to the accompanying drawings. In the drawings, similar elements have similar reference numerals.
The present disclosure generally relates to systems and methods for imaging signals associated with analytical devices such as biological analysis devices. By way of a non-limiting example, such biological analysis devices can include a genetic analysis device such as a nucleic acid sequencer.
In many such devices, excitation energy such as electromagnetic energy in the fluorescent, ultraviolet, and/or visible light spectrum may be provided or transmitted to an interrogation region or analyte detection zone where samples or probes interact with analytes. In various embodiments, the samples, probes and/or analytes can be tagged with detectable labels such as fluorescent markers, dyes or molecules which are responsive to the excitation energy and produce a detectable signal arising from an interaction between the sample, probe and/or the analyte. In the context of certain nucleic acid sequencing techniques, such labeled sample/probe/analyte interactions can take place in connection with particles which may serve as a support or carrier for discrete interactions. For example, particles may be associated with specific fragments or portions of a nucleic acid strand or sample that may be desirably analyzed, for example, to determine its composition or sequence. Excitation of the attached labels results in detectable signals being emitted and detected, thereby allowing characterization of the sequence of the DNA sample.
While various embodiments of the present teachings describe imaging techniques associated with detecting signal emissions arising from or induced by excitation energy of selected wavelengths, it will be appreciated by one of skill in the art that these teachings may also be applied in other contexts. For example, not only signals arising from fluorescent labels or markers may be detected and analyzed, but other types of signal generating markers which do not require an excitation source may also be utilized. For example, the present teachings may be readily adapted for use with chemiluminescent markers or radioactive labels which do not necessarily require an excitation source for signal detection. Similarly, different types and/or classes of markers may be utilized in a particular analysis, such as using multiple fluorophores responsive to different wavelengths of excitation energy or mixed fluorescent/chemiluminescent/radioactive markers. Accordingly, the various embodiments described herein are illustrative, and it will be recognized that the invention may be adapted for use in numerous contexts; as such, the disclosed embodiments are not intended to limit the scope of the present teachings.
In certain applications, nucleic acid strands (such as DNA or RNA) being analyzed can be attached to or associated with particles such as beads. Such beads can in turn be disposed on a substrate and signals arising from or associated with the beads imaged. In various embodiments, these imaging operations capture or record signals resultant from analytical reactions such as sequencing analysis or operations occurring for example as nucleotides are incorporated into template nucleic acid strand(s) undergoing analysis. The beads can be disposed on the substrate in a number of ways. For example, beads, particles, and sample analytes can be deposited on a surface of a substrate such as a slide or flow cell which is exposed to various reagents and conditions which permit detection of the label/marker/tag. In another example, sample-containing beads or particles can be deposited on structures such as ends of densely packed fibers to form an array or collection of discrete samples that may be simultaneously imaged.
In still other embodiments, the sample undergoing analysis may not utilize a carrier such as a bead or particle but be deposited directly on a substrate surface or formed on/within a substrate so as to generate a plurality of closely packed features from which signals arising from the tags/markers/labels are desirably detected and distinguished from one another. Imaging operations whether used to detect signals associated with collections of sample-containing beads/microparticles or to detect closely spaced arrangements/clusters/lawns of sample are desirably configured to efficiently resolve the sample signals. It will be appreciated that the present teachings may be adapted for use in sample imaging operations applicable to a variety of different contexts where signals arising from high density sample features are desirably identified and resolved from one another.
In configurations where sample nucleic acid strands or fragments reside on a substrate in one of the foregoing manners, it is desirable to be able to accurately identify the beads or features that anchor/contain/localize the sample at various positions on the substrate so as to monitor where detected interactions occur. In certain situations, however, such identification of beads or features can become difficult or challenging for a number of reasons.
For example, as the density of beads or features increases or as bead or feature size decreases, identification efficiency can be adversely affected due to, for example, resolving capability of optics, decreased signal intensity, limitations of bead or feature-finding algorithms, signal crosstalk or some combination thereof. As described herein, increasing the density of sample-containing beads or features and/or decreasing the size of the beads or features can be an important factor that contributes to an increase in throughput of sequencing analysis.
In various embodiments, the present disclosure can improve bead or feature finding capability. In certain embodiments, such improvement can include improved bead or feature finding capabilities as well as improved bead or feature resolution both of which may occur at higher densities so as to facilitate increases in analytical throughput.
In various embodiments, methods and systems of the present disclosure may be applied to numerous different types and classes of photo and signal detection methodologies. In certain embodiments, the detector 106 may comprise a CCD or CMOS based detector which is configured to capture signals arising from the beads or features. Signal capture may be facilitated by the optics 104 which may include various filters, lenses, and other components which direct and condition the signals associated with the beads or features such that they may be captured and/or registered by the detector 106. Additionally, although various embodiments of the present disclosure are described in the context of sequence analysis, these methods may be readily adapted to other devices/instrumentations and used for purposes other than biological analysis.
As previously described and in various embodiments, the methods and systems of the present disclosure may be applied to numerous different types and classes of excitation/signal emission methodologies and are not necessarily limited to excitation by light or laser-based excitation systems. Such systems may include those which do not utilize an excitation source but rather employ self-emitting tags or markers such as radioactive or chemiluminescent labels.
In the context of sequence analysis, the analyzer 100 can include a detection zone 102 where sequencing reactions occur. In certain embodiments, the detection zone 102 can include clonally amplified nucleic acid strands anchored to particles such as microbeads. Such beads can populate a detection platform such as a slide. As previously described, use of beads is not required, however, and the sample to be interrogated may be secured or retained on the substrate in various manners. Although the description provided herein discusses imaging processes in the context of beads deposited on slides, it will be understood that in other arrangements or embodiments different sample substrates or manners of sample analysis may be used including both microparticle (e.g. bead-based) approaches as well as approaches for which image features are not necessarily associated with beads but nonetheless can benefit from one or more features of the present disclosure.
As shown in
As shown in
As shown in
In certain embodiments, the analysis of data (e.g., base calling in sequencing analysis) may be performed by the processor 108. The processor 108 may further be configured to operate in conjunction with one or more other processors. The processor's components may include, but are not limited to, software or hardware components, modules such as software modules, object-oriented software components, class components and task components, processes, methods, functions, attributes, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. Furthermore, the processor 108 may output a processed signal or analysis results to other devices or instrumentation where further processing may take place.
Further details of exemplary systems and methods suitable for use with the present teachings include Assignee's PCT Application Publication No. WO 2006/084132, entitled “Reagents, Methods, And Libraries for Bead-Based Sequencing,” U.S. patent application Ser. No. 12/873,194 entitled “Low-Volume Sequencing Systems and Method of Use,” filed Aug. 31, 2010, and Ser. No. 12/873,132 entitled “Fast-Indexing Filter Wheel and Method of Use” filed Aug. 31, 2010, the entireties of which are incorporated herein by reference thereto.
For the purpose of description, particles such as the beads 112 do not necessarily need to have generally spherical or bead-like shapes. In many situations where one or more features of the present disclosure can be applied, images of “particles” generally result from relatively small point or point-like signal (e.g., fluorescence) sources. Thus, it will be understood that the present disclosure can also be applied to situations where it is desirable to be able to identify and discriminate densely packed signal sources that may behave as point or point-like objects.
When a group of particles or signal sources are disposed in a given area, some of such objects can be either clustered together or be overlapped. For example,
Such limitations in particle-identifying capability can become more significant as the overall density of particles increases and/or the size of the particles decreases. In certain situations, and as shown in
As described herein, one or more features of the present disclosure can facilitate and extend the detectable particle density limit of conventional imaging systems. Such beneficial capabilities can be manifested as a detected particle density curve 122 that extends beyond the plateau of the curve 120 and provide improved data quality, throughput, and efficiency compared to conventional systems.
In
As one can appreciate, such capability can be beneficial, since a particle-finding algorithm can be utilized at a particle-density value that is a fraction of the full set 130. In the example shown in
Images of beads 158 facilitating such interactions can be obtained via an optics assembly 160 and a detector 162. The SOLiD System, available from Life Technologies/Applied Biosystems, is a non-limiting example of a high-throughput sequencing device that utilizes a flow cell similar to that described in reference to
In the example panel 172, the beads 158 are shown to be deposited in a substantially random manner. It will be understood, however, that one or more features of the present disclosure can be applied to semi-ordered or ordered arrays of beads or other particles.
Images such as those shown in
As described herein, two or more different images of the exemplary area represented by the panel can be obtained, where each image includes distinguishably detectable signals from a subset of the beads in the panel. Thus, for each image, a lesser number of beads may be identified, using for example different markers/labels, than in a monochromatic image (where all of the signal emitting beads emit the same type of signal, for example resultant from the same marker/label). As described herein, such two or more different images can be used to facilitate identification of higher densities of beads.
For example, four sample images (178a-178d) shown in
In certain embodiments, one or more features of the present disclosure can be implemented in ligation-based high-throughput DNA sequencing applications such as that associated with the Applied Biosystems SOLiD System.
In certain embodiments, the 5′ end of the template portion of the strand 180 can be attached to the four unique primers 182 (denoted as 4×P1) via a universal primer (denoted as P1). As shown also, the 3′ end of the template can be attached to a universal P2 primer 188 (vertical line pattern). Thus, a common probe 194 (vertical line pattern) having a common label 196 (in this example, a green light emitting label) can be hybridized to each of the four P2 primers 188. As described herein, such common probes can be utilized to obtain a monochromatic focalmap image of the beads in a given panel. In certain embodiments as described herein, such a focalmap image can provide a basis for constructing a reference map using four unique-probe based images.
As described herein, a reference map 216 can be based on the combination of the P1 images 210. Because such a reference map is based on the less dense P1 images 210, there is likely a greater number of beads represented in the map 216 than in a reference map based on the P2 image alone.
As shown in
As described herein, use of different colored images (e.g. using multiple tags/markers/labels for selected subpopulations of beads) obtained during ligation cycles allows finding of beads in a less crowded (and hence less crosstalk) environment. In principle, a set of different colored ligation images or differentially labeled sub-populations of nucleic acid templates, beads, or features can be combined to generate and/or update a reference map. In certain embodiments, colored images or similarly labeled templates, beads, or features from the same ligation cycle may be desirable due to the commonality in operating and imaging conditions. In certain embodiments, earlier ligation cycles can be preferable and give rise to less crosstalk among the beads. Thus in certain embodiments, images obtained from the earliest ligation cycle can be used for generating and/or updating a reference map. In the context of the example configuration shown in
In certain embodiments as described in reference to
In certain embodiments, the panel coordinate system can be based on the segmented nature of images formed on segmented detectors.
In certain embodiments, process blocks 288 to 302 can be performed for each of a plurality of ligation images (e.g., four P1 images) (loop 286). Once all ligation images are processed, the process 280 can proceed to updating of the temporary set T.
For each image I of a set of ligation images S (process block 288), a set of found beads B(I) can be obtained by identifying beads in the image I in a process block 290. Such identified beads can have one or more attributes, including positions. In a process block 292, fluorescent intensity of the found beads in the set B(I) can be obtained. In a process block 294, the image I can be aligned with the focalmap image F. In certain embodiments, such alignment can be achieved using a known technique where a quantity representative of overlapping of beads in the two images is maximized. Based on such alignment, offset values between the two images I and F can be obtained. In a process block 296, pixel-equivalent integer values of the offset values can be obtained. In a process block 298, positions of the found beads in the set B(I) can be translated by the pixel-equivalent offset values. In certain situations, such translation of the bead positions may result in one or more beads that fall outside of a boundary defining the focalmap image F. In certain embodiments, such beads can be discarded in process block 298. In a process block 300, the remaining translated beads in the set B(I) can be ranked based on fluorescent intensity. In a process block 302, the intensity-ranked beads can be added to the temporary set T.
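The translation, boundary check, and ranking of process blocks 296 to 300 can be illustrated with the following minimal sketch. It is not the disclosed implementation: the `Bead` type, the function name `translate_and_rank`, and the use of rounding to obtain the pixel-equivalent integer offsets are all assumptions made for illustration. Bead finding (block 290) and image alignment (block 294) are taken as already done, so the found beads and the offset are simply passed in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bead:
    # Hypothetical record for a found bead: pixel position plus fluorescent intensity.
    x: float
    y: float
    intensity: float


def translate_and_rank(beads, offset, width, height):
    """Shift beads by integer pixel offsets, drop those that leave the
    focalmap image bounds, and rank the remainder by descending intensity."""
    # Block 296: pixel-equivalent integer offsets (rounding is an assumption).
    dx, dy = round(offset[0]), round(offset[1])
    kept = []
    for b in beads:
        # Block 298: translate, then discard beads outside the focalmap boundary.
        x, y = b.x + dx, b.y + dy
        if 0 <= x < width and 0 <= y < height:
            kept.append(Bead(x, y, b.intensity))
    # Block 300: rank by fluorescent intensity, brightest first.
    return sorted(kept, key=lambda b: b.intensity, reverse=True)


found = [Bead(1, 1, 10.0), Bead(5, 5, 50.0), Bead(9, 9, 99.0)]
ranked = translate_and_rank(found, offset=(1.6, -0.4), width=10, height=10)
```

In this usage example the brightest bead is shifted to column 11 of a 10-pixel-wide image and is therefore discarded, leaving two beads ranked by intensity for addition to the temporary set T.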
Upon processing of all the ligation images, the process 280 can rank the beads in the set T (that now includes intensity-ranked beads from all of the ligation images) based on fluorescent intensity in a process block 304. In a process block 306, ranked beads in the set T can be added to a bead mask image M if an area at or about the bead's position is unoccupied in the mask image M. In certain embodiments, the mask image M can be based on the focalmap image F. Examples of generating and populating the mask image M (also referred to as reference map herein) are described in greater detail in reference to
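A minimal sketch of the mask-population step of process blocks 304 and 306 follows, under stated assumptions. The mask image M is modeled here as a set of occupied pixel coordinates, and "an area at or about the bead's position" is modeled as a square neighborhood whose half-width `radius` is a hypothetical parameter; neither choice is specified by the disclosure.

```python
def populate_mask(occupied, candidates, radius=1):
    """Add candidate beads to the mask, brightest first, skipping any bead
    whose neighborhood already contains an occupied pixel.

    occupied   -- set of (col, row) pixels already in the mask image M
    candidates -- iterable of (col, row, intensity) beads from the set T
    radius     -- half-width of the neighborhood checked (an assumption)
    """
    added = []
    # Block 304: rank all beads in T by descending fluorescent intensity.
    for col, row, intensity in sorted(candidates, key=lambda b: b[2], reverse=True):
        neighborhood = (
            (col + dc, row + dr)
            for dc in range(-radius, radius + 1)
            for dr in range(-radius, radius + 1)
        )
        # Block 306: add the bead only if the area around it is unoccupied.
        if any(p in occupied for p in neighborhood):
            continue
        occupied.add((col, row))
        added.append((col, row, intensity))
    return added


mask = {(5, 5)}  # one bead already found in the focalmap image F
temp_set = [(11, 10, 70.0), (5, 6, 90.0), (20, 20, 60.0), (10, 10, 80.0)]
new_beads = populate_mask(mask, temp_set)
```

Because higher-intensity beads are considered first, a dimmer bead that lands next to an already placed bead is rejected, which gives preference to the stronger signal.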
In certain embodiments, a manner in which the mask image M is populated can depend on factors such as bead density, bead dimension, and/or pixel dimension. In various example data and results described herein, pixels on the imaging detector are approximately ⅓ µm-side squares, and beads have an average diameter of approximately 0.9 µm. Thus, a given bead's diameter spans approximately 2.7 pixels. It will be understood that the examples of generating and populating the mask image M in
In certain embodiments, and as shown in
As shown in
In a process block 332, pixel coordinate for each of the beads in the set T is obtained. In the example temporary set T (342), beads corresponding to the four labels B, G, Y, and R (192 (unfilled pattern, slanted line pattern, cross hatch pattern, shaded pattern) in
At this stage, the mask image M 340a contains a map of found beads B from the focalmap image F. In the process block 332, beads from the set T can be added to the found beads B if the image mask M is unoccupied at the pixel coordinate of the beads from T. In the example shown in
In the example shown in
As with the process block 314 of
In many nucleic acid analyzers, beads can be coated with or features can contain a number of same or substantially same sample or fragment nucleic acid strands. Within a certain range of densities of such sample or strands per bead/feature, more strands associated with a given bead or feature may generally yield a better (e.g. stronger/more distinct) detectable signal. Thus, it may be generally preferable to use such strong-signal generating beads than beads that yield lower intensity or quality signals.
As described herein in reference to the process 280 of
In certain embodiments, it may be desirable to quantify the extent of duplicate matches of beads. For example, placement or rejection of newly found beads in the process block 332 (
Thus, a loop 352 loops through one or more of such bead separation distances. In certain embodiments, such distances are selected to exclude larger separations where duplicate identification is unlikely. For a given distance, the process 350 can determine in a decision block whether the current bead B is a duplicate. If the answer is “Yes,” information about the current duplicate bead can be updated in a process block 356. For example, a duplicate count for the current distance can be incremented. If the answer is “No,” the distance between the current bead B and the next bead C can be obtained in a process block 360.
In a decision block 362, the process 350 can determine whether the B-C separation distance is less than or equal to the current distance. If the answer is “Yes,” the next bead C can be designated as a duplicate to bead B in a process block 364. The process 350 can then loop through other distances.
Once completed, one or more parameters associated with duplicate bead finding phenomenon can be analyzed. For example, a distribution of separation distances (obtained via the process block 356) can yield information about, for example, image alignment performance.
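The duplicate-quantification idea can be sketched as follows. This is a simplified illustration rather than the process 350 itself: the function name `count_duplicates`, the representation of beads as (x, y) tuples, and the pairwise loop order are assumptions, since the full flow of the process is not reproduced here. For each candidate separation distance, a bead is flagged as a duplicate when it lies within that distance of an earlier, unflagged bead.

```python
import math


def count_duplicates(beads, distances):
    """Return {distance: number of beads flagged as duplicates at that distance}.

    beads     -- list of (x, y) positions, in the order they were found
    distances -- separation thresholds to evaluate (same units as positions)
    """
    counts = {}
    for d in distances:
        is_duplicate = [False] * len(beads)
        for i, (bx, by) in enumerate(beads):
            if is_duplicate[i]:
                # A bead already marked as a duplicate is not used as a reference.
                continue
            for j in range(i + 1, len(beads)):
                if is_duplicate[j]:
                    continue
                cx, cy = beads[j]
                # Designate bead C as a duplicate of bead B when they are close enough.
                if math.hypot(cx - bx, cy - by) <= d:
                    is_duplicate[j] = True
        counts[d] = sum(is_duplicate)
    return counts


positions = [(0.0, 0.0), (0.5, 0.0), (3.0, 0.0), (3.0, 1.5), (10.0, 10.0)]
duplicate_counts = count_duplicates(positions, [1.0, 2.0])
```

Plotting such counts against the separation distance gives the kind of distribution that can indicate, for example, how well the images were aligned.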
As described herein, relying on a single image (such as a focalmap image) for identifying beads can lead to a limitation in bead density, beyond which identification of additional beads becomes difficult or impossible for a given bead-finding algorithm. An example of such saturation of bead identification performance can be seen in Table 1, where bead identification is performed by a given bead-finding algorithm using a monochromatic focalmap image.
As described herein, a panel is a square having an approximately 750 µm side. Thus, a value D1 expressed in beads/panel can be converted to a value D2 expressed in beads/cm² by dividing the quantity D1 by a quantity 0.005625 (the panel area in cm², i.e., 0.075 cm × 0.075 cm). Thus, 130,000 beads/panel = 23.1×10⁶ beads/cm², and so on.
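The conversion can be checked with a short calculation. The sketch below assumes the panel side is 750 micrometers, which is consistent with the stated divisor of 0.005625 cm²; the function and constant names are hypothetical.

```python
PANEL_SIDE_UM = 750.0  # panel side length in micrometers (assumed unit)


def beads_per_cm2(beads_per_panel, side_um=PANEL_SIDE_UM):
    """Convert a bead density in beads/panel to beads/cm^2."""
    side_cm = side_um * 1e-4       # 1 micrometer = 1e-4 cm
    panel_area_cm2 = side_cm ** 2  # 0.075 cm x 0.075 cm = 0.005625 cm^2
    return beads_per_panel / panel_area_cm2


density = beads_per_cm2(130_000)
print(f"{density / 1e6:.1f} x 10^6 beads/cm^2")
```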
For the example bead-finding performance listed in Table 1, the maximum density of identifiable beads (for the given bead-finding algorithm) appears to be somewhere between 140,000 and 160,000 beads/panel. As described herein, bead-finding performance can be enhanced—even with the same (or substantially same) bead-finding algorithm—by implementing one or more features of the present disclosure.
In a set of runs using a SOLiD Version 3 System, the following data were obtained: 50 bp DH10b mate-pair on duplicate quad slides with bead densities of approximately 120K, 140K, 160K, and 180K beads per panel; and 50 bp single tag on duplicate quad slides with bead densities of approximately 130K, 160K, 190K, and 220K beads per panel. Based on such data,
For the purpose of description herein, total identified beads refers to total number of beads that can be identified and used in every ligation cycle (see, for example,
For the purpose of description herein, total matched beads refers to total number of beads having 0 to 6 mismatches when assessing bead performance using a reference genome having known and fixed number of base pairs. The matched beads are part of the identified beads.
For the purpose of description herein, perfectly matched beads refer to matched beads having zero mismatch. Thus, perfectly matched beads are part of the matched beads.
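The nesting of these three categories can be made concrete with a small sketch. The input format is an assumption: each identified bead is represented by its mismatch count against the reference genome, or `None` when no alignment was obtained. The function name and sample values are illustrative only and are not data from the runs described herein.

```python
def summarize_beads(mismatch_counts, max_mismatches=6):
    """Return (identified, matched, perfectly_matched) bead counts.

    mismatch_counts -- one entry per identified bead: the number of
                       mismatches, or None if the bead did not align
    max_mismatches  -- upper bound for a bead to count as matched
    """
    identified = len(mismatch_counts)
    # Matched beads have 0 to max_mismatches mismatches.
    matched = sum(
        1 for m in mismatch_counts if m is not None and 0 <= m <= max_mismatches
    )
    # Perfectly matched beads are the matched beads with zero mismatches.
    perfect = sum(1 for m in mismatch_counts if m == 0)
    return identified, matched, perfect


example = [0, 0, 3, 6, 7, None]  # illustrative values only
totals = summarize_beads(example)
```

By construction, perfectly matched beads are a subset of matched beads, which are in turn a subset of identified beads.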
As shown in
However, the total identified beads for P1C1_Enhanced case continues to increase beyond V3's upper limitation of about 130K beads per panel. Such an increase continues to the upper limit of density (about 220K per panel) evaluated in the example data.
For the scatter plot 410 shown in
For the scatter plot 420 shown in
Table 2 summarizes overall statistics for data corresponding to bead densities in the 120K-to-140K per panel range:
In Table 2, V3 column represents bead finding and matching achieved via use of monochromatic focalmap image alone, and P1C1E (P1C1_Enhanced) column represents the same via use of four color P1 images.
As shown in Table 2, bead finding performance improves by approximately 27%, matching performance by approximately 19%, and perfect bead matching performance by approximately 14%. Thus, one can see significant improvements in performance via use of four color P1 images at a bead density region near the upper limit associated with use of monochrome focalmap image alone.
A more dramatic performance improvement can be achieved for bead densities that are above the upper limit associated with use of monochrome focalmap image alone. Table 3 summarizes overall statistics for data corresponding to bead densities in the 160K-to-220K per panel range:
As shown in Table 3, bead finding performance improves by approximately 45%, matching performance by approximately 33%, and perfect bead matching performance by approximately 24%. Thus, one can see even more dramatic improvements in performance via use of four color P1 images at bead densities beyond the upper limit associated with use of monochrome focalmap image alone.
In certain embodiments, an increase in identifiable bead density generally yields an increase in throughput in sequencing processes. For example,
As shown, the increase in throughput is generally proportional to the increase in bead density, with the maximum throughput being about 8 Gb. For the example data shown in
The example data 460 shown in
In various embodiments, the above-described approaches comparing bead, particle, and/or feature finding information obtained using subset imaging and comparison to a common image may be altered and/or rearranged as desired. For example, bead finding may be initiated in the common image and augmented using information/bead identifications from the subset images. Likewise, bead finding may be initiated in a selected subset image or images, compared with other subset images, and/or compared to a common image. Alternative analytical approaches utilizing, for example, the subset images independent of the common image or in connection with the common image, irrespective of the order or manner of analysis, are understood to be other embodiments of the present teachings.
Based on the foregoing, it is estimated that throughput increases similar to projections of
Although the above-disclosed embodiments have shown, described, and pointed out the fundamental novel features of the invention as applied to the above-disclosed embodiments, it should be understood that various omissions, substitutions, and changes in the form of the detail of the devices, systems, and/or methods shown may be made by those skilled in the art without departing from the scope of the invention. Consequently, the scope of the invention should not be limited to the foregoing description, but should be defined by the appended claims.
All publications and patent applications mentioned in this specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
This application claims priority pursuant to 35 U.S.C. §119(e) to U.S. Provisional Patent Application Ser. No. 61/240,977, filed on Sep. 9, 2009, entitled “Systems and Methods for Identifying Microparticles,” the entirety of this application being incorporated herein by reference thereto.
Number | Date | Country
---|---|---
61240977 | Sep 2009 | US