The present disclosure relates to integrated devices and related instruments that can perform massively-parallel analyses of samples by providing short optical pulses to tens of thousands of sample wells or more simultaneously and receiving fluorescent signals from the sample wells for sample analyses. The instruments may be useful for point-of-care genetic sequencing and for personalized medicine.
Photodetectors are used to detect light in a variety of applications. Integrated photodetectors have been developed that produce an electrical signal indicative of the intensity of incident light. Integrated photodetectors for imaging applications include an array of pixels to detect the intensity of light received from across a scene. Examples of integrated photodetectors include charge coupled devices (CCDs) and Complementary Metal Oxide Semiconductor (CMOS) image sensors.
Instruments that are capable of massively-parallel analyses of biological or chemical samples are typically limited to laboratory settings because of several factors that can include their large size, lack of portability, the need for a skilled technician to operate the instrument, power requirements, the need for a controlled operating environment, and cost. When a sample is to be analyzed using such equipment, a common paradigm is to extract a sample at a point of care or in the field, send the sample to the lab, and wait for results of the analysis. The wait time for results can range from hours to days.
Some embodiments provide for a method, comprising: determining information about a sample that emits emission light in response to excitation light based on at least one of pulse duration and interpulse duration and at least two of wavelength, intensity, and lifetime of the emission light, wherein the sample comprises a reagent configured to be coupled to a luminescent label, and wherein a shielding element is disposed between the reagent and the luminescent label.
Some embodiments provide for an integrated circuit, comprising: at least one photodetection region configured to generate charge carriers responsive to incident photons emitted from a sample; at least one charge storage region configured to receive the charge carriers from the photodetection region; and at least one component configured to obtain information about the incident photons from a sample comprising a reagent configured to be coupled to a luminescent label, wherein a shielding element is disposed between the reagent and the luminescent label, the information comprising: pulse duration and at least one of intensity, wavelength, luminescence lifetime, or interpulse duration; interpulse duration and at least one of intensity, wavelength, luminescence lifetime or pulse duration; wavelength and at least one of interpulse duration, pulse duration, or luminescence lifetime; intensity and at least one of interpulse duration, pulse duration, or luminescence lifetime; or luminescence lifetime and at least one of interpulse duration, pulse duration, wavelength, or intensity.
Some embodiments provide for an integrated circuit comprising: at least one photodetection region configured to generate charge carriers responsive to incident photons emitted from a sample; at least one charge storage region configured to receive the charge carriers from the photodetection region; and at least one component configured to obtain information about the incident photons from a sample comprising a reagent configured to be coupled to a luminescent label, wherein a shielding element is disposed between the reagent and the luminescent label, the information comprising at least three members selected from a group comprising wavelength information, luminescence lifetime information, intensity information, pulse duration information, and interpulse duration information.
The features and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings. When describing embodiments in reference to the drawings, directional references (“above,” “below,” “top,” “bottom,” “left,” “right,” “horizontal,” “vertical,” etc.) may be used. Such references are intended merely as an aid to the reader viewing the drawings in a normal orientation. These directional references are not intended to describe a preferred or only orientation of features of an embodied device. A device may be embodied using other orientations.
I. Introduction
Aspects of the present disclosure relate to techniques for performing sequencing, such as nucleic acid sequencing (e.g., DNA, RNA) and/or peptide sequencing, with an integrated device.
For example, aspects of the present disclosure relate to integrated devices, instruments and related systems capable of analyzing samples in parallel, including identification of analytes in sequencing applications. Such an instrument may be compact, easy to carry, and easy to operate, allowing a physician or other provider to readily use the instrument and transport the instrument to a desired location where care may be needed. Analysis of a sample may include labeling the sample with one or more luminescent labels, which may be used to detect the sample and/or identify single molecules of the sample (e.g., individual nucleotide identification as part of nucleic acid sequencing).
In accordance with embodiments described herein, sequencing methods can be carried out by illuminating a surface-immobilized polypeptide with excitation light, and detecting luminescence produced by a label attached to a reagent that associates with the polypeptide during a sequencing reaction. In some cases, radiative and/or non-radiative decay of the label can result in photodamage to the polypeptide and/or a surface linkage group attached thereto. Accordingly, in some embodiments, the disclosure provides labeled reagents comprising a shielding element which mitigates label-induced photodamage during a sequencing reaction. In some embodiments, the shielding element forms a linkage group between the label and the reagent. In embodiments related to nucleic acid sequencing, the surface-immobilized polypeptide is a polymerizing enzyme, and the labeled reagent is a nucleotide comprising a luminescent label, where the nucleotide is bound by the polymerizing enzyme during a sequencing reaction. In embodiments related to polypeptide sequencing, the surface-immobilized polypeptide is a polypeptide sample, and the labeled reagent is an amino acid recognition molecule comprising a luminescent label, where the amino acid recognition molecule associates with (e.g., binds to) a terminus of the polypeptide during a sequencing reaction.
As used herein, a luminescent label is a molecule that absorbs one or more photons and may subsequently emit one or more photons after one or more time durations. In some embodiments, the term is used interchangeably with “label” or “luminescent molecule” depending on context. A luminescent label in accordance with certain embodiments described herein may refer to a luminescent label of a labeled recognition molecule, a luminescent label of a labeled nucleotide, a luminescent label of a labeled peptidase (e.g., a labeled exopeptidase, a labeled non-specific exopeptidase), a luminescent label of a labeled peptide, a luminescent label of a labeled cofactor, or another labeled composition described herein. In some embodiments, a luminescent label in accordance with the disclosure refers to a labeled amino acid of a labeled polypeptide comprising one or more labeled amino acids.
In some embodiments, a luminescent label may comprise a first and second chromophore. In some embodiments, an excited state of the first chromophore is capable of relaxation via an energy transfer to the second chromophore. In some embodiments, the energy transfer is a Förster resonance energy transfer (FRET). Such a FRET pair may be useful for providing a luminescent label with properties that make the label easier to differentiate from among a plurality of luminescent labels in a mixture. In yet other embodiments, a FRET pair comprises a first chromophore of a first luminescent label and a second chromophore of a second luminescent label. In certain embodiments, the FRET pair may absorb excitation energy in a first spectral range and emit luminescence in a second spectral range. In general, a donor chromophore is selected whose emission spectrum substantially overlaps the excitation spectrum of the acceptor chromophore. Furthermore, it may also be desirable in certain applications that the donor have an excitation maximum near a laser wavelength such as Helium-Cadmium 442 nm, Argon 488 nm, Nd:YAG 532 nm, He-Ne 633 nm, etc. In such applications, the use of intense laser light can serve as an effective means to excite the donor fluorophore.
In some embodiments, an acceptor chromophore of a FRET label has a substantial overlap of its excitation spectrum with the emission spectrum of a donor chromophore of the FRET label. In some embodiments, the wavelength maximum of the emission spectrum of the acceptor chromophore is preferably at least 10 nm greater than the wavelength maximum of the excitation spectrum of the donor chromophore. Additional examples of useful FRET labels include, e.g., those described in U.S. Pat. Nos. 5,654,419, 5,688,648, 5,853,992, 5,863,727, 5,945,526, 6,008,373, 6,150,107, 6,177,249, 6,335,440, 6,348,596, 6,479,303, 6,545,164, 6,849,745, 6,696,255, and 6,908,769 and Published U.S. Patent Application Nos. 2002/0168641, 2003/0143594, and 2004/0076979, the disclosures of which are incorporated herein by reference for all purposes.
In some embodiments, a luminescent label refers to a fluorophore or a dye. Typically, a luminescent label comprises an aromatic or heteroaromatic compound and can be a pyrene, anthracene, naphthalene, naphthylamine, acridine, stilbene, indole, benzindole, oxazole, carbazole, thiazole, benzothiazole, benzoxazole, phenanthridine, phenoxazine, porphyrin, quinoline, ethidium, benzamide, cyanine, carbocyanine, salicylate, anthranilate, coumarin, fluorescein, rhodamine, xanthene, or other like compound.
In some embodiments, a luminescent label comprises a dye selected from one or more of the following: 5/6-Carboxyrhodamine 6G, 5-Carboxyrhodamine 6G, 6-Carboxyrhodamine 6G, 6-TAMRA, Abberior® STAR 440SXP, Abberior® STAR 470SXP, Abberior® STAR 488, Abberior® STAR 512, Abberior® STAR 520SXP, Abberior® STAR 580, Abberior® STAR 600, Abberior® STAR 635, Abberior® STAR 635P, Abberior® STAR RED, Alexa Fluor® 350, Alexa Fluor® 405, Alexa Fluor® 430, Alexa Fluor® 480, Alexa Fluor® 488, Alexa Fluor® 514, Alexa Fluor® 532, Alexa Fluor® 546, Alexa Fluor® 555, Alexa Fluor® 568, Alexa Fluor® 594, Alexa Fluor® 610-X, Alexa Fluor® 633, Alexa Fluor® 647, Alexa Fluor® 660, Alexa Fluor® 680, Alexa Fluor® 700, Alexa Fluor® 750, Alexa Fluor® 790, AMCA, ATTO 390, ATTO 425, ATTO 465, ATTO 488, ATTO 495, ATTO 514, ATTO 520, ATTO 532, ATTO 542, ATTO 550, ATTO 565, ATTO 590, ATTO 610, ATTO 620, ATTO 633, ATTO 647, ATTO 647N, ATTO 655, ATTO 665, ATTO 680, ATTO 700, ATTO 725, ATTO 740, ATTO Oxa12, ATTO Rho101, ATTO Rho11, ATTO Rho12, ATTO Rho13, ATTO Rho14, ATTO Rho3B, ATTO Rho6G, ATTO Thio12, BD Horizon™ V450, BODIPY® 493/501, BODIPY® 530/550, BODIPY® 558/568, BODIPY® 564/570, BODIPY® 576/589, BODIPY® 581/591, BODIPY® 630/650, BODIPY® 650/665, BODIPY® FL, BODIPY® FL-X, BODIPY® R6G, BODIPY® TMR, BODIPY® TR, CAL Fluor® Gold 540, CAL Fluor® Green 510, CAL Fluor® Orange 560, CAL Fluor® Red 590, CAL Fluor® Red 610, CAL Fluor® Red 615, CAL Fluor® Red 635, Cascade® Blue, CF™350, CF™405M, CF™405S, CF™488A, CF™514, CF™532, CF™543, CF™546, CF™555, CF™568, CF™594, CF™620R, CF™633, CF™633-V1, CF™640R, CF™640R-V1, CF™640R-V2, CF™660C, CF™660R, CF™680, CF™680R, CF™680R-V1, CF™750, CF™770, CF™790, Chromeo™ 642, Chromis 425N, Chromis 500N, Chromis 515N, Chromis 530N, Chromis 550A, Chromis 550C, Chromis 550Z, Chromis 560N, Chromis 570N, Chromis 577N, Chromis 600N, Chromis 630N, Chromis 645A, Chromis 645C, Chromis 645Z, Chromis 678A, Chromis 678C, Chromis 678Z, Chromis 770A, Chromis 770C, Chromis 800A, Chromis 800C, Chromis 830A, Chromis 830C, Cy®3, Cy®3.5, Cy®3B, Cy®5, Cy®5.5, Cy®7, DyLight® 350, DyLight® 405, DyLight® 415-Col, DyLight® 425Q, DyLight® 485-LS, DyLight® 488, DyLight® 504Q, DyLight® 510-LS, DyLight® 515-LS, DyLight® 521-LS, DyLight® 530-R2, DyLight® 543Q, DyLight® 550, DyLight® 554-R0, DyLight® 554-R1, DyLight® 590-R2, DyLight® 594, DyLight® 610-B1, DyLight® 615-B2, DyLight® 633, DyLight® 633-B1, DyLight® 633-B2, DyLight® 650, DyLight® 655-B1, DyLight® 655-B2, DyLight® 655-B3, DyLight® 655-B4, DyLight® 662Q, DyLight® 675-B1, DyLight® 675-B2, DyLight® 675-B3, DyLight® 675-B4, DyLight® 679-05, DyLight® 680, DyLight® 683Q, DyLight® 690-B1, DyLight® 690-B2, DyLight® 696Q, DyLight® 700-B1, DyLight® 730-B1, DyLight® 730-B2, DyLight® 730-B3, DyLight® 730-B4, DyLight® 747, DyLight® 747-B1, DyLight® 747-B2, DyLight® 747-B3, DyLight® 747-B4, DyLight® 755, DyLight® 766Q, DyLight® 775-B2, DyLight® 775-B3, DyLight® 775-B4, DyLight® 780-B1, DyLight® 780-B2, DyLight® 780-B3, DyLight® 800, DyLight® 830-B2, Dyomics-350, Dyomics-350XL, Dyomics-360XL, Dyomics-370XL, Dyomics-375XL, Dyomics-380XL, Dyomics-390XL, Dyomics-405, Dyomics-415, Dyomics-430, Dyomics-431, Dyomics-478, Dyomics-480XL, Dyomics-481XL, Dyomics-485XL, Dyomics-490, Dyomics-495, Dyomics-505, Dyomics-510XL, Dyomics-511XL, Dyomics-520XL, Dyomics-521XL, Dyomics-530, Dyomics-547, Dyomics-547P1, Dyomics-548, Dyomics-549, Dyomics-549P1, Dyomics-550, Dyomics-554, Dyomics-555, Dyomics-556, Dyomics-560, Dyomics-590, Dyomics-591, Dyomics-594, Dyomics-601XL, Dyomics-605, Dyomics-610, Dyomics-615, Dyomics-630, Dyomics-631, Dyomics-632, Dyomics-633, Dyomics-634, Dyomics-635, Dyomics-636, Dyomics-647, Dyomics-647P1, Dyomics-648, Dyomics-648P1, Dyomics-649, Dyomics-649P1, Dyomics-650, Dyomics-651, Dyomics-652, Dyomics-654, Dyomics-675, Dyomics-676, Dyomics-677, Dyomics-678, Dyomics-679P1, Dyomics-680, Dyomics-681, Dyomics-682, Dyomics-700, Dyomics-701, Dyomics-703, Dyomics-704, Dyomics-730, Dyomics-731, Dyomics-732, Dyomics-734, Dyomics-749, Dyomics-749P1, Dyomics-750, Dyomics-751, Dyomics-752, Dyomics-754, Dyomics-776, Dyomics-777, Dyomics-778, Dyomics-780, Dyomics-781, Dyomics-782, Dyomics-800, Dyomics-831, eFluor® 450, Eosin, FITC, Fluorescein, HiLyte™ Fluor 405, HiLyte™ Fluor 488, HiLyte™ Fluor 532, HiLyte™ Fluor 555, HiLyte™ Fluor 594, HiLyte™ Fluor 647, HiLyte™ Fluor 680, HiLyte™ Fluor 750, IRDye® 680LT, IRDye® 750, IRDye® 800CW, JOE, LightCycler® 640R, LightCycler® Red 610, LightCycler® Red 640, LightCycler® Red 670, LightCycler® Red 705, Lissamine Rhodamine B, Naphthofluorescein, Oregon Green® 488, Oregon Green® 514, Pacific Blue™, Pacific Green™, Pacific Orange™, PET, PF350, PF405, PF415, PF488, PF505, PF532, PF546, PF555P, PF568, PF594, PF610, PF633P, PF647P, Quasar® 570, Quasar® 670, Quasar® 705, Rhodamine 123, Rhodamine 6G, Rhodamine B, Rhodamine Green, Rhodamine Green-X, Rhodamine Red, ROX, Seta™ 375, Seta™ 470, Seta™ 555, Seta™ 632, Seta™ 633, Seta™ 650, Seta™ 660, Seta™ 670, Seta™ 680, Seta™ 700, Seta™ 750, Seta™ 780, Seta™ APC-780, Seta™ PerCP-680, Seta™ R-PE-670, Seta™ 646, SeTa™ 380, SeTa™ 425, SeTa™ 647, SeTa™ 405, Square 635, Square 650, Square 660, Square 672, Square 680, Sulforhodamine 101, TAMRA, TET, Texas Red®, TMR, TRITC, Yakima Yellow™, Zenon®, Zy3, Zy5, Zy5.5, and Zy7.
A luminescent label may become excited in response to illuminating the luminescent label with excitation light (e.g., light having a characteristic wavelength that may excite the luminescent label to an excited state) and, if the luminescent label becomes excited, emit emission light (e.g., light having a characteristic wavelength emitted by the luminescent label by returning to a ground state from an excited state). Detection of the emission light may allow for identification of the luminescent label, and thus, the sample or a molecule of the sample labeled by the luminescent label. According to some embodiments, the instrument may be capable of massively-parallel sample analyses and may be configured to handle tens of thousands of samples or more simultaneously.
The inventors have recognized and appreciated that an integrated device, having sample wells configured to receive the sample and integrated optics formed on the integrated device, and an instrument configured to interface with the integrated device may be used to achieve analysis of this number of samples. The instrument may include one or more excitation light sources, and the integrated device may interface with the instrument such that the excitation light is delivered to the sample wells using integrated optical components (e.g., waveguides, optical couplers, optical splitters) formed on the integrated device. The optical components may improve the uniformity of illumination across the sample wells of the integrated device and may reduce the number of external optical components that might otherwise be needed. Furthermore, the inventors have recognized and appreciated that integrating photodetectors (e.g., photodiodes) on the integrated device may improve detection efficiency of fluorescent emissions from the sample wells and reduce the number of light-collection components that might otherwise be needed.
The inventors have further recognized that certain characteristics of fluorescent emissions from the sample wells may be measured and analyzed for use in a number of applications. For example, certain characteristics of emitted light may enable identification of the sample being analyzed (e.g., identification of a luminescent label) which can facilitate genetic sequencing applications, such as DNA, RNA, and/or protein sequencing. Multiple characteristics of emission light may be obtained, including information regarding intensity, wavelength, lifetime, pulse duration, interpulse duration and any combination thereof, to enable multi-dimensional discrimination techniques for analyzing a chemical or biological sample. In some embodiments, the device is configured to obtain measurements for characteristics of emitted light to enable techniques for 2-D, 3-D, 4-D, and 5-D discrimination of one or more samples under analysis.
For example, the inventors have developed techniques for obtaining spectral information such as wavelength of the incident light emitted from a sample well. For instance, in some aspects, a pixel may include one or more charge storage regions configured to receive charge carriers generated responsive to incident photons from a light source, with charge carriers stored in the charge storage region(s) indicative of spectral and timing information. In one example, two charge storage regions may receive charge carriers generated responsive to incident light at different wavelengths, such that a difference in power spectral density of the incident light is indicated in the accumulated charge in the charge storage regions. Alternatively or additionally, in some aspects, a pixel may include regions having different depths, each configured to generate charge carriers responsive to incident photons. For instance, in one example, the pixel may include two or more photodetection regions having different depths (e.g., along the optical axis) such that charge carriers are generated in the different photodetection regions responsive to incident photons of different wavelengths. Alternatively or additionally, in some aspects, a pixel may include multiple charge storage regions having different depths, and one or more of the charge storage regions may be configured to receive the incident photons and generate charge carriers therein. Another of the charge storage regions may be configured to receive charge carriers generated in the photodetection region(s) of the pixel. In some aspects, a pixel may alternatively or additionally include an optical sorting element configured to direct at least some incident photons to one charge storage region and other incident photons to another charge storage region. For instance, in one example, the optical sorting element may include an at least partially refractive, diffractive, scattering, and/or plasmonic element. 
The inventors have recognized that wavelength information may be used as one degree of discrimination in some embodiments of 2-D, 3-D, 4-D and/or 5-D discrimination sample analysis techniques.
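As an illustration of how photodetection regions at different depths can carry wavelength information, the following minimal sketch applies Beer-Lambert absorption to a shallow and a deep region. The absorption coefficients are rough, assumed values for silicon, and the region depths and the function name `shallow_to_deep_ratio` are hypothetical; none of these are taken from the present disclosure.

```python
import math

# Rough, assumed absorption coefficients for silicon, in 1/micrometer.
# Illustrative values only, not device data.
ALPHA_PER_UM = {500: 1.1, 600: 0.4, 700: 0.2}


def shallow_to_deep_ratio(wavelength_nm, shallow_depth_um=1.0, total_depth_um=4.0):
    """Ratio of charge generated in a shallow region to charge generated in a
    deeper region, assuming exponential (Beer-Lambert) absorption."""
    alpha = ALPHA_PER_UM[wavelength_nm]
    # Fraction of photons absorbed between the surface and shallow_depth_um.
    shallow = 1.0 - math.exp(-alpha * shallow_depth_um)
    # Fraction absorbed between shallow_depth_um and total_depth_um.
    deep = math.exp(-alpha * shallow_depth_um) - math.exp(-alpha * total_depth_um)
    return shallow / deep


ratios = {wl: shallow_to_deep_ratio(wl) for wl in ALPHA_PER_UM}
for wl, ratio in ratios.items():
    print(f"{wl} nm: shallow/deep charge ratio = {ratio:.2f}")
```

Under these assumptions the ratio falls as wavelength increases, because longer wavelengths penetrate deeper before being absorbed, so the ratio of accumulated charge may serve as a proxy for wavelength.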
In addition, the inventors have developed methods for obtaining, separately and in any combination, lifetime, pulse duration, interpulse duration, and intensity information for a sample under analysis. In particular, time-gating techniques may be used to obtain measurements of fluorescence lifetime, pulse width/duration, and/or interpulse duration of an emission from a sample under analysis. In some embodiments, one or more measurements for intensity of emission light are obtained by collecting and quantifying charge carriers generated by incident photons in one or more charge storage regions. The inventors have recognized that such fluorescence lifetime, pulse duration, interpulse duration, and/or intensity information may be used as degrees of discrimination in some embodiments of 2-D, 3-D, 4-D and/or 5-D discrimination sample analysis techniques in addition or alternative to wavelength information. For example, in 2-D discrimination techniques, discrimination of a sample may be based on any two types of information used in combination, such as wavelength and intensity, lifetime and intensity, etc. For 3-D discrimination techniques, discrimination of a sample may be performed based on any three types of information used in combination, such as wavelength, lifetime, and intensity, etc. Likewise, 4-D discrimination techniques may be performed based on any four types of information used in combination and 5-D discrimination techniques may be performed based on any five types of information used in combination.
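The groupings available at each level of discrimination can be enumerated directly. The sketch below is only an illustration; the tuple `CHARACTERISTICS` and the helper `groupings` are hypothetical names introduced here.

```python
from itertools import combinations

# The five characteristics of emission light named in the description.
CHARACTERISTICS = (
    "wavelength",
    "lifetime",
    "intensity",
    "pulse duration",
    "interpulse duration",
)


def groupings(n_dims):
    """All ways to choose n_dims characteristics for n_dims-D discrimination."""
    return list(combinations(CHARACTERISTICS, n_dims))


counts = {n: len(groupings(n)) for n in range(2, 6)}
for n, count in counts.items():
    print(f"{n}-D discrimination: {count} possible grouping(s)")
```

With five characteristics there are 10 two-way groupings, 10 three-way groupings, 5 four-way groupings, and a single five-way grouping.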
Thus, the inventors have recognized that the techniques described herein for obtaining wavelength, lifetime, pulse duration, interpulse duration, and intensity information and/or any other suitable characteristic of emission light from a sample may be used to facilitate multi-dimensional analysis of a biological or chemical sample using any combination of characteristics described herein. In some embodiments, the multi-dimensional analysis may be used for identifying the particular sample from which emission light is collected and analyzed, for example, identifying a particular amino acid or nucleotide. The inventors have recognized that multi-dimensional analysis of a sample can provide for more accurate identification of a molecule as opposed to single dimensional analysis. In addition, techniques using more dimensions can provide for more accurate identification of a molecule compared to techniques which use fewer dimensions for analysis.
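To illustrate why additional dimensions may improve identification, the following sketch classifies a measurement by its normalized distance to a set of label signatures. The label names, signature values, and per-dimension scales are all hypothetical, and a nearest-signature rule is used here only as a simple stand-in for whatever classifier an embodiment might employ.

```python
import math

# Hypothetical signatures: (wavelength in nm, lifetime in ns, relative intensity).
SIGNATURES = {
    "label_A": (570.0, 2.0, 1.0),
    "label_B": (650.0, 2.0, 1.0),
    "label_C": (570.0, 4.0, 0.5),
}
# Assumed measurement spread in each dimension, used to normalize distances.
SCALES = (50.0, 1.0, 0.25)


def classify(measurement, dims):
    """Return the label(s) nearest to the measurement using only the selected
    dimensions; more than one label is returned when they tie."""
    def distance(signature):
        return math.sqrt(sum(
            ((measurement[d] - signature[d]) / SCALES[d]) ** 2 for d in dims
        ))

    distances = {name: distance(sig) for name, sig in SIGNATURES.items()}
    best = min(distances.values())
    return sorted(
        name for name, dist in distances.items()
        if math.isclose(dist, best, abs_tol=1e-9)
    )


measurement = (648.0, 2.1, 0.95)
print(classify(measurement, dims=(1, 2)))     # lifetime and intensity only
print(classify(measurement, dims=(0, 1, 2)))  # wavelength added
```

In this example, label_A and label_B share the same lifetime and intensity, so a 2-D analysis on those two characteristics cannot separate them, while adding wavelength as a third dimension resolves the measurement to label_B.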
In some embodiments, a two-dimensional discrimination technique is used for analyzing and identifying a sample based on characteristics of emission light associated with the sample. Any suitable grouping of characteristics may be used in such two-dimensional techniques, for example, lifetime and wavelength information, and/or wavelength and intensity information. In some embodiments, a three-dimensional discrimination technique is used for analyzing and identifying a sample, for example, using wavelength, lifetime, and intensity information, using any three of wavelength, pulse duration, interpulse duration, and lifetime information, and/or any other suitable grouping of characteristics. In some embodiments, a four-dimensional discrimination technique is used for analyzing and identifying a sample, for example, using any four of wavelength, lifetime, intensity, pulse duration, and interpulse duration information of collected emission light associated with a sample under analysis. In some embodiments, the integrated device is configured for massively parallel sample analysis and thus the multi-dimensional analysis techniques can be used for analyzing and identifying a high volume of samples at a time.
It should be appreciated that integrated devices described herein may incorporate any or all techniques described herein alone or in combination.
II. Integrated Device and Related Aspects
The multi-dimensional signal analysis techniques described herein may, in some embodiments, be implemented using an integrated device, such as integrated device 1-102 shown in
The directionality of the emission light from a sample well 1-108 may depend on the positioning of the sample in the sample well 1-108 relative to metal layer(s) 1-106 because metal layer(s) 1-106 may act to reflect emission light. In this manner, a distance between metal layer(s) 1-106 and a luminescent label positioned in a sample well 1-108 may impact the efficiency of photodetector(s) 1-110, that are in the same pixel as the sample well, to detect the light emitted by the luminescent label. The distance between metal layer(s) 1-106 and the bottom surface of a sample well 1-108, which is proximate to where a sample may be positioned during operation, may be in the range of 100 nm to 500 nm, or any value or range of values in that range. In some embodiments the distance between metal layer(s) 1-106 and the bottom surface of a sample well 1-108 is approximately 300 nm.
The distance between the sample and the photodetector(s) may also impact efficiency in detecting emission light. By decreasing the distance light has to travel between the sample and the photodetector(s), detection efficiency of emission light may be improved. In addition, smaller distances between the sample and the photodetector(s) may allow for pixels that occupy a smaller area footprint of the integrated device, which can allow for a higher number of pixels to be included in the integrated device. The distance between the bottom surface of a sample well 1-108 and photodetector(s) may be in the range of 5 μm to 15 μm, or any value or range of values in that range. It should be appreciated that, in some embodiments, emission light may be provided through other means than an excitation light source and a sample well. Accordingly, some embodiments may not include sample well 1-108.
Photonic structure(s) 1-230 may be positioned between sample wells 1-108 and photodetectors 1-110 and configured to reduce or prevent excitation light from reaching photodetectors 1-110, which may otherwise contribute to signal noise in detecting emission light. As shown in
Coupling region 1-201 may include one or more optical components configured to couple excitation light from an external excitation source. Coupling region 1-201 may include grating coupler 1-216 positioned to receive some or all of a beam of excitation light. Examples of suitable grating couplers are described in U.S. patent application Ser. No. 15/844,403 titled “OPTICAL COUPLER AND WAVEGUIDE SYSTEM,” filed Dec. 15, 2017 under Attorney Docket No. R0708.70021US01 which is hereby incorporated by reference herein in its entirety. Grating coupler 1-216 may couple excitation light to waveguide 1-220, which may be configured to propagate excitation light to the proximity of one or more sample wells 1-108. Alternatively, coupling region 1-201 may comprise other well-known structures for coupling light into a waveguide.
Components located off of the integrated device may be used to position and align the excitation source 1-106 to the integrated device. Such components may include optical components including lenses, mirrors, prisms, windows, apertures, attenuators, and/or optical fibers. Additional mechanical components may be included in the instrument to allow for control of one or more alignment components. Such mechanical components may include actuators, stepper motors, and/or knobs. Examples of suitable excitation sources and alignment mechanisms are described in U.S. patent application Ser. No. 15/161,088 titled “PULSED LASER AND SYSTEM,” filed May 20, 2016 under Attorney Docket Number R0708.70010US02 which is hereby incorporated by reference in its entirety. Another example of a beam-steering module is described in U.S. patent application Ser. No. 15/842,720 titled “COMPACT BEAM SHAPING AND STEERING ASSEMBLY,” filed Dec. 14, 2017 under Attorney Docket No. R0708.70024US01 which is hereby incorporated herein by reference.
A sample to be analyzed may be introduced into sample well 1-108 of pixel 1-112. The sample may be a biological sample or any other suitable sample, such as a chemical sample. The sample may include multiple molecules and the sample well may be configured to isolate a single molecule. In some instances, the dimensions of the sample well may act to confine a single molecule within the sample well, allowing measurements to be performed on the single molecule. Excitation light may be delivered into the sample well 1-108, so as to excite the sample or at least one luminescent label attached to the sample or otherwise associated with the sample while it is within an illumination area within the sample well 1-108.
In operation, parallel analyses of samples within the sample wells are carried out by exciting some or all of the samples within the wells using excitation light and detecting signals from sample emission with the photodetectors. Emission light from a sample may be detected by a corresponding photodetector and converted to at least one electrical signal. Information regarding various characteristics of the emission light (e.g., wavelength, fluorescence lifetime, intensity, pulse duration and/or any other suitable characteristic) may be collected and used for subsequent analysis, as described herein. The electrical signals may be transmitted along conducting lines (e.g., metal layers 1-240) in the circuitry of the integrated device, which may be connected to an instrument interfaced with the integrated device. The electrical signals may be subsequently processed and/or analyzed. Processing or analyzing of electrical signals may occur on a suitable computing device either located on or off the instrument.
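As one possible form of the subsequent processing, pulse duration and interpulse duration could be extracted from a per-frame intensity trace by thresholding. The sketch below assumes that a pulse is a run of consecutive frames at or above a threshold and that interpulse duration is the gap between the end of one pulse and the start of the next; the function name `find_pulses`, the threshold, and the frame period are hypothetical.

```python
def find_pulses(trace, threshold, frame_period_s):
    """Return (pulse_durations, interpulse_durations), both in seconds, for a
    sequence of per-frame intensity values."""
    pulses = []  # (start_frame, end_frame_exclusive) for each detected pulse
    start = None
    for i, value in enumerate(trace):
        if value >= threshold and start is None:
            start = i                    # rising edge: a pulse begins
        elif value < threshold and start is not None:
            pulses.append((start, i))    # falling edge: the pulse ends
            start = None
    if start is not None:
        pulses.append((start, len(trace)))  # pulse still active at end of trace

    durations = [(end - begin) * frame_period_s for begin, end in pulses]
    gaps = [
        (pulses[k + 1][0] - pulses[k][1]) * frame_period_s
        for k in range(len(pulses) - 1)
    ]
    return durations, gaps


trace = [0, 0, 5, 6, 5, 0, 0, 0, 0, 7, 7, 0]
durations, gaps = find_pulses(trace, threshold=3, frame_period_s=0.01)
print(durations, gaps)
```

For this trace the sketch finds two pulses lasting three and two frames, separated by a four-frame gap.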
In some embodiments, operation of pixel 1-112 may include one or more rejection (e.g., drain) periods and one or more collection periods. In one example, operation of pixel 1-112 in accordance with one or more pulses of an excitation light source may begin with a rejection period, such as to discard charge carriers generated in pixel 1-112 (e.g., in photodetection region PD) responsive to excitation photons from the light source. For instance, the excitation photons may arrive at pixel 1-112 prior to the arrival of fluorescence emission photons from the sample well. A drain gate for the drain region may be biased to have high conductivity in a drain channel between the photodetection region and the drain region, facilitating draining of charge carriers from the photodetection region to the drain region. Transfer gates for any charge storage regions coupled to the photodetection region may be biased to have low conductivity in the charge transfer channels coupling the charge storage regions to the photodetection region, such that charge carriers are not transferred to or accumulated in the charge storage regions during the rejection period.
Following the rejection period, a collection period may occur in which charge carriers generated responsive to the incident photons are transferred to one or more charge storage regions. During the collection period, the incident photons may include fluorescent emission photons, resulting in accumulation of fluorescent emission charge carriers in the charge storage region(s). For instance, a transfer gate for one of the charge storage regions may be biased to have high conductivity between the photodetection region and the charge storage region, facilitating accumulation of charge carriers in the charge storage region. Any drain gates coupled to the photodetection region may be biased to have low conductivity between the photodetection region and the drain region such that charge carriers are not discarded during the collection period. Some embodiments may include multiple collection periods, such as a second collection period following a first collection period, for multiple charge storage regions to accumulate charge carriers at different times. For instance, during one of multiple collection periods, one of the transfer gates may be biased to facilitate accumulation of charge carriers in the corresponding charge storage region, and the other transfer gates may be biased to block accumulation of charge carriers in the other charge storage regions. In some embodiments, multiple charge storage regions may accumulate charge carriers during a single collection period. In some embodiments, operation of the pixel may include as many collection periods as charge storage regions. In some embodiments, operation as described herein may be repeated for each pulse of the excitation light source. In some embodiments, collection periods for the various charge storage regions may be separated by rejection periods. 
For example, in some embodiments, each pulse of the excitation light source may be followed by one rejection period and one collection period (e.g., having accumulation in a single charge storage region).
As described further herein, the rejection and/or collection periods may be controlled using one or more control signals from a control circuit of the integrated circuit, such as by providing the control signal(s) to drain and/or transfer gates of the pixel(s) of the integrated circuit. It should be appreciated that, in some embodiments, operation of pixels described herein may occur as described in this section.
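The rejection and collection sequence described above can be sketched as a simple gate schedule. The following is a minimal illustration only, assuming one rejection period followed by one collection period per charge storage region for each excitation pulse; the names GateState and pulse_schedule are hypothetical and do not correspond to any control circuit described herein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateState:
    """Hypothetical snapshot of gate biasing during one period."""
    period: str          # "rejection" or "collection"
    drain_gate_on: bool  # True = high conductivity toward the drain region
    transfer_on: tuple   # one flag per charge storage region


def pulse_schedule(num_storage_regions: int) -> list:
    """Gate states for one excitation pulse (assumed ordering):
    one rejection period, then one collection period per storage region."""
    if num_storage_regions < 1:
        raise ValueError("need at least one charge storage region")
    all_off = (False,) * num_storage_regions
    # Rejection: drain gate conducts, every transfer gate blocks.
    schedule = [GateState("rejection", True, all_off)]
    for i in range(num_storage_regions):
        # Collection: drain gate blocks, exactly one transfer gate conducts.
        flags = tuple(j == i for j in range(num_storage_regions))
        schedule.append(GateState("collection", False, flags))
    return schedule


if __name__ == "__main__":
    for state in pulse_schedule(2):
        print(state)
```

Other orderings described above, such as multiple charge storage regions accumulating during a single collection period or rejection periods separating collection periods, would require a different schedule.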
For example,
In some embodiments, some components of pixels described herein may be disposed and/or formed on one or more substrate layers of an integrated circuit. In some embodiments, the substrate layer(s) may alternatively or additionally include one or more auxiliary layers (e.g., epitaxial layers) disposed above and/or below the other substrate layer(s). In some embodiments, some components of pixels described herein may be formed by etching away at least a portion of the substrate and/or auxiliary layer(s). In some embodiments, transfer and/or drain gates described herein may be formed using a semiconductor material such as polysilicon, which may be at least partially opaque.
III. Techniques for Obtaining Lifetime Information
According to an aspect of the technology described herein, the inventors have developed techniques for obtaining information regarding multiple characteristics of emission light from a sample well to facilitate sample analysis including sample identification with multiple dimensions of discrimination. The inventors have recognized that luminescent labels used to label biological or chemical samples, when excited by incident light, fluoresce with a characteristic lifetime (e.g., a characteristic emission decay time period), such that analyzing the lifetime information of emission light may facilitate identification of the particular sample to which the luminescent label is attached (e.g., bonded). Fluorescence lifetime, also referred to herein as simply “lifetime”, is a measure of the time which a luminescent label spends in the excited state before returning to a ground state and emitting a photon. In some embodiments, fluorescence lifetime information may be obtained through techniques for time binning charge carriers generated by incident photons.
For example,
A second fluorescent molecule may have a decay profile pB(t) that is exponential, but has a measurably different lifetime τ2, as depicted for curve B in
Differences in fluorescent emission lifetimes can be used to discern between the presence or absence of different fluorescent molecules and/or to discern between different environments or conditions to which a fluorescent molecule is subjected. In some cases, discerning fluorescent molecules based on lifetime (rather than emission wavelength, for example) can simplify aspects of an analytical instrument. As an example, wavelength-discriminating optics (such as wavelength filters, dedicated detectors for each wavelength, dedicated pulsed optical sources at different wavelengths, and/or diffractive optics) can be reduced in number or eliminated when discerning fluorescent molecules based on lifetime. In some cases, a single pulsed optical source operating at a single characteristic wavelength can be used to excite different fluorescent molecules that emit within a same wavelength region of the optical spectrum but have measurably different lifetimes. An analytic system that uses a single pulsed optical source, rather than multiple sources operating at different wavelengths, to excite and discern different fluorescent molecules emitting in a same wavelength region can be less complex to operate and maintain, more compact, and can be manufactured at lower cost.
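As an illustration of why two lifetimes can be told apart, the following assumes single-exponential decay for both molecules. The symbols P_{A0} and P_{B0} (initial emission probabilities) and the window bounds t_a and t_b are introduced here for illustration and are assumptions rather than quantities defined above.

```latex
p_A(t) = P_{A0}\, e^{-t/\tau_1}, \qquad p_B(t) = P_{B0}\, e^{-t/\tau_2}

% Fraction of all emitted photons that falls in a window [t_a, t_b]
% after excitation, for a molecule with lifetime \tau:
F(t_a, t_b; \tau) = e^{-t_a/\tau} - e^{-t_b/\tau}
```

Because F depends on the lifetime, two molecules that emit in the same wavelength region but have different lifetimes distribute their photons differently across time windows, which is the property a lifetime-based measurement can exploit.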
Although analytic systems based on fluorescent lifetime analysis can have certain benefits, the amount of information obtained by an analytic system and/or detection accuracy can be increased by allowing for additional detection techniques. For example, some analytic systems can additionally be configured to discern one or more properties of a sample based on fluorescent wavelength, pulse duration/width, and/or fluorescent intensity as described herein.
Referring again to
For a single molecule or a small number of molecules, however, the emission of fluorescent photons occurs according to the statistics of curve B in
Examples of a time-binning photodetector 1-322 are described in U.S. patent application Ser. No. 14/821,656, filed Aug. 7, 2015, titled “Integrated Device for Temporal Binning of Received Photons” under Attorney Docket No. R0708.70002US02 and in U.S. patent application Ser. No. 15/852,571, filed Dec. 22, 2017, titled “Integrated Photodetector with Direct Binning Pixel,” under Attorney Docket No. R0708.70017US01 both of which are hereby incorporated herein by reference in their entireties. For explanation purposes, a non-limiting embodiment of a time-binning photodetector is depicted in
In operation, a portion of an excitation pulse from a pulsed optical source (e.g., a mode-locked laser) is delivered to a reaction chamber over the time-binning photodetector 1-322. Initially, some excitation radiation photons 1-901 may arrive at the photon-absorption/carrier-generation region 1-902 and produce carriers (shown as light-shaded circles). There can also be some fluorescent emission photons 1-903 that arrive with the excitation radiation photons 1-901 and produce corresponding carriers (shown as dark-shaded circles). Initially, the number of carriers produced by the excitation radiation can be very large compared to the number of carriers produced by the fluorescent emission. The initial carriers produced during a time interval |te-t1| can be rejected by gating them into a carrier-discharge channel 1-906 with a first transfer gate 1-920, for example.
At a later time, mostly fluorescent emission photons 1-903 arrive at the photon-absorption/carrier-generation region 1-902 and produce carriers (indicated as dark-shaded circles) that provide a useful and detectable signal that is representative of fluorescent emission from the reaction chamber 1-330. According to some detection methods, a second electrode 1-921 and third electrode 1-923 can be gated at a later time to direct carriers produced at a later time (e.g., during a second time interval |t1-t2|) to a first carrier-storage region 1-908a. Subsequently, a fourth electrode 1-922 and fifth electrode 1-924 can be gated at a later time (e.g., during a third time interval |t2-t3|) to direct carriers to a second carrier-storage region 1-908b. Charge accumulation can continue in this manner for a large number of excitation pulses to accumulate an appreciable number of carriers and signal level in each carrier-storage region 1-908a, 1-908b. At a later time, the signal can be read out from the bins. In some implementations, the time intervals corresponding to each storage region are at the sub-nanosecond time scale, though longer time scales can be used in some embodiments (e.g., in embodiments where fluorophores have longer decay times).
The process of generating and time-binning carriers after an excitation event (e.g., excitation pulse from a pulsed optical source) can occur once after a single excitation pulse or be repeated multiple times after multiple excitation pulses during a single charge-accumulation cycle for the time-binning photodetector 1-322. After charge accumulation is complete, carriers can be read out of the storage regions via the read-out channel 1-910. For example, an appropriate biasing sequence can be applied to electrodes 1-923, 1-924 and at least to electrode 1-940 to remove carriers from the storage regions 1-908a, 1-908b. The charge accumulation and read-out processes can occur in a massively parallel operation on an optoelectronic chip resulting in frames of data.
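The time-binning behavior described above can be modeled in software as sorting photon arrival times into intervals. The sketch below is only an illustrative model, not the circuit itself: the function name bin_arrivals and the representation of the intervals as a list of edge times [t1, t2, t3, ...] measured from the excitation pulse are assumptions.

```python
from bisect import bisect_right


def bin_arrivals(arrival_times_ns, edges_ns):
    """Count arrival times (ns after the excitation pulse) into time bins.

    edges_ns = [t1, t2, ..., tK] defines K-1 bins [t1, t2), [t2, t3), ...
    Arrivals before t1 fall in the rejection interval and are counted as
    rejected; arrivals at or after tK are ignored.
    """
    edges = list(edges_ns)
    if len(edges) < 2 or edges != sorted(edges):
        raise ValueError("edges_ns must be sorted and contain at least two times")
    counts = [0] * (len(edges) - 1)
    rejected = 0
    for t in arrival_times_ns:
        if t < edges[0]:
            rejected += 1  # modeled as discharged to the drain
        elif t < edges[-1]:
            counts[bisect_right(edges, t) - 1] += 1
    return counts, rejected


if __name__ == "__main__":
    # Hypothetical arrival times accumulated over many excitation pulses.
    times = [0.1, 0.5, 1.0, 2.0, 3.4, 3.5, 9.0]
    print(bin_arrivals(times, [0.5, 2.0, 3.5]))
```

Accumulating the counts over many excitation pulses before reading them out corresponds to a single charge-accumulation cycle in this model.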
Although the described example in connection with
Regardless of how charge accumulation is carried out for different time intervals after excitation, signals that are read out can provide a histogram of bins that are representative of the fluorescent emission decay characteristics, for example. An example process is illustrated in
In some implementations, only a single photon may be emitted from a fluorophore following an excitation event, as depicted in
In some implementations, there may not be a fluorescent photon emitted and/or detected after each excitation pulse received at a reaction chamber. In some cases, there can be as few as one fluorescent photon that is detected at a reaction chamber for every 10,000 excitation pulses delivered to the reaction chamber. One advantage of implementing a mode-locked laser as the pulsed excitation source is that a mode-locked laser can produce short optical pulses having high intensity and quick turn-off times at high pulse-repetition rates (e.g., between 50 MHz and 250 MHz). With such high pulse-repetition rates, the number of excitation pulses within a 10 millisecond charge-accumulation interval can be 500,000 to 2,500,000, so that detectable signal can be accumulated.
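The relationship between pulse-repetition rate, charge-accumulation interval, and expected number of detected photons is simple arithmetic, sketched below. The function names are hypothetical, and the calculation assumes a constant repetition rate and a fixed average number of excitation pulses per detected photon.

```python
def pulses_in_interval(rep_rate_hz: float, interval_s: float) -> int:
    """Number of excitation pulses delivered during one accumulation interval."""
    return round(rep_rate_hz * interval_s)


def expected_detections(rep_rate_hz: float, interval_s: float,
                        pulses_per_photon: float) -> float:
    """Average number of detected photons, assuming one detection per
    `pulses_per_photon` excitation pulses on average."""
    return pulses_in_interval(rep_rate_hz, interval_s) / pulses_per_photon


if __name__ == "__main__":
    for rate_hz in (50e6, 250e6):
        n = pulses_in_interval(rate_hz, 10e-3)
        photons = expected_detections(rate_hz, 10e-3, 10_000)
        print(f"{rate_hz / 1e6:.0f} MHz: {n} pulses, about {photons:.0f} detected photons")
```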
After a large number of excitation events and carrier accumulations, the carrier-storage regions of the time-binning photodetector 1-322 can be read out to provide a multi-valued signal (e.g., a histogram of two or more values, an N-dimensional vector, etc.) for a reaction chamber. The signal values for each bin can depend upon the decay rate of the fluorophore. For example and referring again to
To further aid in understanding the signal analysis, the accumulated, multi-bin values can be plotted as a histogram, as depicted in
Thus, lifetime information obtained by the time binning techniques described herein may facilitate analysis and identification of a sample in a sample well. It should be appreciated that suitable techniques for time binning charge carriers and/or otherwise obtaining fluorescence lifetime information other than the techniques described herein may be implemented, and aspects of the technology are not limited in this respect.
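As one illustration of how binned values may relate to lifetime, the sketch below estimates a lifetime from two adjacent bins of equal width. It assumes single-exponential decay, negligible background, and equal bin widths, in which case the ratio of counts is n1/n2 = exp(width/τ). The function names and the table of reference lifetimes are hypothetical.

```python
import math


def lifetime_from_two_bins(n1: float, n2: float, bin_width_ns: float) -> float:
    """Estimate lifetime (ns) from counts in two adjacent equal-width bins.

    Assumes single-exponential decay, so n1 / n2 = exp(bin_width / tau).
    """
    if n2 <= 0 or n1 <= n2:
        raise ValueError("need n1 > n2 > 0 for a decaying signal")
    return bin_width_ns / math.log(n1 / n2)


def closest_label(tau_ns: float, known_lifetimes_ns: dict) -> str:
    """Return the reference label whose lifetime is nearest the estimate."""
    return min(known_lifetimes_ns,
               key=lambda name: abs(known_lifetimes_ns[name] - tau_ns))


if __name__ == "__main__":
    # Hypothetical bin counts and reference lifetimes, for illustration only.
    tau = lifetime_from_two_bins(1000, 368, bin_width_ns=1.5)
    print(tau, closest_label(tau, {"label_A": 1.5, "label_B": 4.0}))
```

With low photon counts the ratio is noisy, so a practical analysis would likely use more bins or a statistical fit rather than this closed-form estimate.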
IV. Techniques for Obtaining Wavelength Information
According to another aspect of the technology described herein, the inventors have developed techniques for discriminating spectral information (e.g., wavelength information) of incident light. For instance, in addition to or as an alternative to using time-gating techniques for discriminating timing information (e.g., lifetime information), devices described herein may be configured to determine spectral information to enhance the data that can be obtained from a sample. The inventors have recognized that, similar to lifetime, emission light from a particular luminescent label may have a characteristic wavelength such that analyzing wavelength information of emission light may facilitate identification of the sample to which the luminescent label is attached. Thus, in some embodiments, emission light wavelength is an additional dimension of discrimination used in the multi-dimensional discrimination techniques described herein.
The inventors have recognized that an integrated device, such as integrated device 1-102, may obtain wavelength measurements of emission light through a variety of techniques described further herein. For example, in some embodiments, wavelength measurements may be obtained through techniques incorporating photodetector regions having different depths. In other embodiments, wavelength measurements may be obtained using techniques incorporating charge transfer channels of different depths. In some embodiments, wavelength information may be obtained using techniques incorporating optical shielding elements. Further, in some embodiments, wavelength measurements of emission light may be obtained using one or more optical sorting elements. It should be appreciated that any suitable technique in addition to or as an alternative to the techniques described herein may be used to obtain wavelength measurements of emission light to facilitate multi-dimensional discrimination of samples, and aspects of the technology are not limited in this respect.
a. Techniques Incorporating Regions of Different Depths
In some aspects, spectral information may be obtained using pixels having regions of differing depth (e.g., semiconductor junction depth). For example, in some embodiments, one or more charge storage regions are configured having different depths. In some embodiments, one or more photodetection regions may be configured having different depths in addition to or in the alternative to configuring one or more charge storage regions with different depths.
In some embodiments, pixel 2-112 may include one or more lightly p-doped substrate layers. Photodetection region PPD and charge storage regions SD0 and SD1 may be formed and/or disposed in or on the substrate layer(s) using n-type doping techniques. One or more barriers may be formed and/or disposed in or on the substrate layer(s) using p-type doping techniques. Transfer gate TG0 may be formed using a more opaque material than the substrate layer(s), such as polysilicon. It should be appreciated that, in some embodiments, the substrate layer(s) may be lightly n-doped, and the photodetection region PPD and charge storage regions SD0 and SD1 may be formed and/or disposed using p-type doping techniques. In such embodiments, the barrier(s) may be formed and/or disposed using n-type doping techniques.
In some embodiments, some regions of pixel 2-112 may have greater depth than other regions of the pixel. For instance, in the example of
In some embodiments, differences between the charge carriers collected in charge storage region SD0 and the charge carriers collected in charge storage region SD1 may indicate spectral and/or timing (e.g., lifetime) information of the incident light. For instance, in some embodiments, a sum and/or difference of the number of charge carriers collected in the charge storage regions may indicate a fluorescence lifetime and/or wavelength of the fluorescence emissions received from the sample well. In one example, higher wavelength photons may contribute more substantially to the number of charge carriers collected in charge storage region SD0, and lower wavelength photons may contribute more substantially to the number of charge carriers collected in charge storage region SD1. In this example, lower wavelength photons may have higher energy than higher wavelength photons, causing many of the lower wavelength photons to continue traveling beyond photodetection region PPD to charge storage region SD1, whereas higher wavelength photons may predominantly terminate at photodetection region PPD (e.g., due to attenuation in the bulk of pixel 2-112). As a result, the higher wavelength photons may generate more charge carriers to be collected in charge storage region SD0 during the collection period, and lower wavelength photons may generate more charge carriers in charge storage region SD1. Accordingly, a sum and/or difference of charge carriers accumulated in charge storage regions SD0 and SD1 may indicate spectral information of the incident light, such as a wavelength of the incident light. In some embodiments, a depth of charge storage region SD0 and/or a depth of charge storage region SD1 may be configured such that each charge storage region predominantly collects incident photons having a particular wavelength and/or range of wavelengths. 
Alternatively or additionally, in some embodiments, the difference in depth between charge storage region(s) SD0 and/or SD1 and photodetection region PPD may be configured such that each charge storage region predominantly collects incident photons having a particular wavelength and/or range of wavelengths.
In some embodiments, one or more processors (e.g., microprocessors, field programmable gate arrays (FPGAs), and/or application specific integrated circuits (ASICs), part or each of which may be integrated with the integrated device, etc.) coupled to pixel 2-112 may be configured to determine lifetime and/or spectral information based on the number of charge carriers accumulated in charge storage region(s) SD0 and/or SD1. It should be appreciated that, alternatively or additionally, the number of charge carriers accumulated in charge storage region SD0 and/or SD1 may indicate a fluorescence lifetime of the incident light. In some embodiments, charge carriers collected in one of the charge storage regions may indicate timing information, and charge carriers collected in another of the charge storage regions may indicate spectral information.
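The use of a sum and/or difference of accumulated charge carriers can be illustrated with a normalized difference. This is a minimal sketch under stated assumptions: it follows the example above in which higher wavelength photons contribute more to SD0 and lower wavelength photons contribute more to SD1, and the function names and the calibration threshold are hypothetical.

```python
def spectral_ratio(sd0_count: float, sd1_count: float) -> float:
    """Normalized difference (SD0 - SD1) / (SD0 + SD1), in the range [-1, 1]."""
    total = sd0_count + sd1_count
    if total <= 0:
        raise ValueError("no charge carriers accumulated")
    return (sd0_count - sd1_count) / total


def classify_wavelength(sd0_count: float, sd1_count: float,
                        threshold: float = 0.0) -> str:
    """Label the emission as 'higher' or 'lower' wavelength.

    `threshold` is a hypothetical calibration value that would be
    determined for a particular pixel design.
    """
    return "higher" if spectral_ratio(sd0_count, sd1_count) > threshold else "lower"


if __name__ == "__main__":
    print(spectral_ratio(300, 100), classify_wavelength(300, 100))
    print(spectral_ratio(100, 300), classify_wavelength(100, 300))
```

Normalizing by the sum makes the ratio insensitive to overall signal level, so that the sum can separately serve as an intensity measure.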
It should be appreciated that integrated circuits described herein may be configured to discriminate among incident photons having various optical wavelengths and/or ranges of optical wavelengths. In some embodiments, the higher wavelength photons of the above example may have a wavelength greater than 600 nm, and the lower wavelength photons may have a wavelength less than 600 nm. In some embodiments, the higher wavelength photons of the above example may have a wavelength greater than 700 nm, and the lower wavelength photons may have a wavelength less than 600 nm. In some embodiments, the higher wavelength photons of the above example may have a wavelength greater than 700 nm, and the lower wavelength photons may have a wavelength less than 700 nm. In some embodiments, the higher wavelength photons of the above example may have a wavelength greater than 600 nm, and the lower wavelength photons may have a wavelength less than 550 nm. In some embodiments, the higher wavelength photons of the above example may have a wavelength greater than 550 nm, and the lower wavelength photons may have a wavelength less than 550 nm. In some embodiments, pixels described herein may have an area less than or equal to 40 square microns.
As in
In some embodiments, the difference in depth between charge storage regions SD0 and SD1, and/or photodetection region PPD, may cause charge carriers accumulated in the charge storage regions to indicate different timing and/or spectral information of the incident light. For instance, in some embodiments, a depth of charge storage region SD0, a depth of charge storage region SD1, and/or a depth of photodetection region PPD may be configured such that each charge storage region predominantly collects incident photons having a particular wavelength and/or range of wavelengths. However, unlike in
It should be appreciated that, in embodiments having more than two photodetection regions, some (or all) of the photodetection regions may have a same depth. Alternatively or additionally, in some embodiments having multiple photodetection regions (e.g., having different depths), the charge storage regions may have different depths. By including more regions of different depths, such as illustrated in
b. Techniques Incorporating Charge Transfer Channels of Different Depths
Regions of pixel 3-112 may have different depths. In some embodiments, charge storage regions SD0 and SD3 may have a same depth, and charge transfer channels coupling respective photodetection regions PPD0 and PPD1 to charge storage regions SD0 and SD3 may have different depths, such as shown in
c. Techniques Incorporating Optical Shielding
It should be appreciated that, in some embodiments, spectral information may be obtained using a pixel that does not incorporate regions of different depths. For instance,
As shown in
d. Techniques Incorporating One or More Optical Sorting Elements
In accordance with various embodiments, optical sorting element OSE may be configured as at least partially refractive, diffractive, scattering, and/or plasmonic. For instance, in some embodiments, the optical sorting element OSE may include a micro-disk, a micro-lens, and/or a prism configured to refract incident light towards the charge storage regions SD0 and SD1, such as depending on a wavelength of the incident light. In some embodiments, the optical sorting element OSE may include a linear grating element, a curved grating element, a zone plate, and/or a photonic crystal configured to diffract the incident light towards the charge storage regions SD0 and SD1, such as depending on the wavelength of the incident light. In some embodiments, the optical sorting element OSE may include a scattering element, such as having multiple elements with different refractive indices. In some embodiments, the optical sorting element OSE may include a plasmonic element, such as nano-holes and/or an extraordinary optical transmission element. Because the optical sorting element OSE may direct incident photons having different frequencies towards the different photodetection regions PPD0 and PPD1, charge carriers accumulated in the charge storage regions SD0 and SD1 may be indicative of different spectral information of the incident photons, such as different wavelength information.
It should be appreciated that, in some embodiments, one or more of the charge storage regions may be configured to receive incident photons from the optical sorting element OSE and generate and store charge carriers in response.
V. Techniques for Obtaining Pulse Duration Information
According to a further aspect of the technology described herein, the inventors have developed techniques for obtaining pulse duration information for emission light from a sample in a sample well. For example, luminescent labels bound to molecules in a sample may have characteristic pulse and interpulse durations such that obtaining measurements of pulse and interpulse duration of emission light from a particular luminescent label facilitates identifying the luminescent label from which light is emitted. Pulse duration, also referred to herein as pulse width, refers to the interval of time measured across a pulse, in some embodiments, at the full width half maximum of a pulse. Interpulse duration, also referred to herein as interpulse width, refers to the interval of time between pulses.
Thus, in some embodiments, the integrated device described herein may be configured to implement techniques for obtaining pulse duration information, such as the techniques described in U.S. patent application Ser. No. 16/686,028 titled “METHODS AND COMPOSITIONS FOR PROTEIN SEQUENCING,” filed Nov. 15, 2019 under Attorney Docket No. R0708.70042US02 and PCT Application No. PCT/US19/61831 titled “METHODS AND COMPOSITIONS FOR PROTEIN SEQUENCING,” filed Nov. 15, 2019 under Attorney Docket No. R0708.70042WO00, both of which are incorporated by reference in their entireties. For example, as described herein, molecules in a sample may be labeled with luminescent labels. One or more luminescent labels may attach (e.g., bond) to a molecule, and, upon being excited by excitation light, may each emit a photon, collectively referred to as a fluorescence event. Emitted photons generated by many such fluorescent events due to repeated excitation of the luminescent labels may be referred to as a signal pulse. Each signal pulse comprises a pulse duration (“pd”) corresponding to an association event between a recognition molecule of the luminescent labels and the sample molecule under analysis.
For example, without wishing to be bound by theory, labeled affinity reagent of a luminescent label selectively binds with a recognition molecule according to a binding affinity (KD) defined by an association rate, or an “on” rate, of binding (kon) and a dissociation rate, or an “off” rate, of binding (koff). When the fluorescent molecule is bound to the recognition molecule, the label may fluoresce and emit a photon, while, when the luminescent label is unbound, the label may not fluoresce even when receiving excitation light and entering and exiting an excited state, as described herein. The rate constants koff and kon are the critical determinants of pulse duration (e.g., the time corresponding to a detectable binding event) and interpulse duration (e.g., the time between detectable binding events), respectively. In some embodiments, these rates can be engineered to achieve pulse durations and pulse rates (e.g., the frequency of signal pulses) that give the best sequencing accuracy.
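The dependence of pulse duration and interpulse duration on the rate constants can be written compactly under a simple kinetic model. The relations below assume a single-step, reversible binding reaction with exponentially distributed dwell times; the symbol c, denoting the concentration of the freely diffusing binding partner, is introduced here for illustration and is an assumption.

```latex
K_D = \frac{k_{\mathrm{off}}}{k_{\mathrm{on}}}, \qquad
\langle pd \rangle \approx \frac{1}{k_{\mathrm{off}}}, \qquad
\langle ipd \rangle \approx \frac{1}{k_{\mathrm{on}}\, c}
```

Under this model, a slower dissociation rate lengthens the mean pulse duration, while a faster association rate or a higher concentration shortens the mean interpulse duration.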
Thus, in some embodiments, the pulse duration is characteristic of a dissociation rate of binding. In addition, each signal pulse of a characteristic pattern is separated from another signal pulse of the characteristic pattern by an interpulse duration (“ipd”). In some embodiments, the interpulse duration is characteristic of an association rate of binding. In some embodiments, a change in magnitude (“ΔM”) can be determined for a signal pulse based on a difference between baseline and the peak of a signal pulse. In some embodiments, a characteristic pattern is determined based on pulse duration. In some embodiments, a characteristic pattern is determined based on pulse duration and interpulse duration. In some embodiments, a characteristic pattern is determined based on any one or more of pulse duration, interpulse duration, and change in magnitude. In some embodiments, the series of pulses provide a pulsing pattern (e.g., a characteristic pattern) which may be diagnostic of the identity of the sample under analysis.
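The quantities pd, ipd, and ΔM can be illustrated by extracting them from a sampled intensity trace. This is a minimal sketch assuming a fixed threshold, a known baseline, and uniformly spaced samples; the function names are hypothetical, and pulse-calling on real data would likely need noise handling not shown here.

```python
def extract_pulses(trace, threshold):
    """Return (start, end) sample indices for each run of samples above threshold.
    `end` is exclusive."""
    pulses, start = [], None
    for i, value in enumerate(trace):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            pulses.append((start, i))
            start = None
    if start is not None:  # trace ended while still inside a pulse
        pulses.append((start, len(trace)))
    return pulses


def pulse_statistics(trace, threshold, frame_s, baseline=0.0):
    """Pulse durations, interpulse durations, and change in magnitude."""
    pulses = extract_pulses(trace, threshold)
    pd = [(end - start) * frame_s for start, end in pulses]
    ipd = [(nxt[0] - cur[1]) * frame_s for cur, nxt in zip(pulses, pulses[1:])]
    delta_m = [max(trace[start:end]) - baseline for start, end in pulses]
    return pd, ipd, delta_m


if __name__ == "__main__":
    # Hypothetical trace sampled every 0.5 s, for illustration only.
    trace = [0, 0, 5, 6, 5, 0, 0, 0, 4, 4, 0]
    print(pulse_statistics(trace, threshold=2, frame_s=0.5))
```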
As described herein, signal pulse information may be used to identify a molecule, such as an amino acid, based on a characteristic pattern in a series of signal pulses. In some embodiments, a characteristic pattern comprises a plurality of signal pulses, each signal pulse comprising a pulse duration. In some embodiments, the plurality of signal pulses may be characterized by a summary statistic (e.g., mean, median, time decay constant) of the distribution of pulse durations in a characteristic pattern. In some embodiments, the mean pulse duration of a characteristic pattern is between about 1 millisecond and about 10 seconds (e.g., between about 1 ms and about 1 s, between about 1 ms and about 100 ms, between about 1 ms and about 10 ms, between about 10 ms and about 10 s, between about 100 ms and about 10 s, between about 1 s and about 10 s, between about 10 ms and about 100 ms, or between about 100 ms and about 500 ms). In some embodiments, different characteristic patterns corresponding to different types of amino acids in a single polypeptide may be distinguished from one another based on a statistically significant difference in the summary statistic. For example, in some embodiments, one characteristic pattern may be distinguishable from another characteristic pattern based on a difference in mean pulse duration of at least 10 milliseconds (e.g., between about 10 ms and about 10 s, between about 10 ms and about 1 s, between about 10 ms and about 100 ms, between about 100 ms and about 10 s, between about 1 s and about 10 s, or between about 100 ms and about 1 s). It should be appreciated that, in some embodiments, smaller differences in mean pulse duration between different characteristic patterns may require a greater number of pulse durations within each characteristic pattern to distinguish one from another with statistical confidence.
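A comparison of two characteristic patterns by mean pulse duration can be sketched as follows. The example computes the difference in means together with Welch's t statistic as one possible indicator of separation; the choice of Welch's statistic, the function name, and the sample values are assumptions made for illustration, and a complete analysis would convert the statistic to a significance level.

```python
from math import sqrt
from statistics import mean, variance


def compare_patterns(pd_a, pd_b):
    """Return (difference in mean pulse duration, Welch's t statistic).

    pd_a and pd_b are lists of pulse durations in seconds, each with at
    least two entries so that a sample variance can be computed.
    """
    if len(pd_a) < 2 or len(pd_b) < 2:
        raise ValueError("each pattern needs at least two pulse durations")
    diff = mean(pd_b) - mean(pd_a)
    stderr = sqrt(variance(pd_a) / len(pd_a) + variance(pd_b) / len(pd_b))
    if stderr == 0:
        raise ValueError("pulse durations have zero variance")
    return diff, diff / stderr


if __name__ == "__main__":
    # Hypothetical pulse durations (seconds) for two patterns.
    pattern_a = [0.10, 0.12, 0.11, 0.09]
    pattern_b = [0.50, 0.52, 0.48, 0.50]
    diff, t_stat = compare_patterns(pattern_a, pattern_b)
    print(f"mean difference {diff:.3f} s, t = {t_stat:.1f}")
```

Consistent with the discussion above, the standard error shrinks as the number of pulse durations in each pattern grows, so smaller differences in mean require more pulses to separate.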
As described herein, fluorescence labels may be configured such that the fluorescence labels only fluoresce when attached to a molecule of interest in the sample, and may not fluoresce during periods of disassociation. For example, in some embodiments, a fluorescence resonance energy transfer (FRET) technique is used such that fluorescence of labels occurs selectively when attached to a molecule of interest in the sample. Labeled affinity reagent for a luminescent label may comprise a label having binding-induced luminescence. For example, in some embodiments, a labeled aptamer comprises a donor label and an acceptor label. Labeled aptamer as a free molecule may adopt a conformation in which donor label and acceptor label are separated by a distance that limits detectable FRET between the labels (e.g., about 10 nm or more). Labeled aptamer as a selectively bound molecule adopts a conformation in which donor label and acceptor label are within a distance that promotes detectable FRET between the labels (e.g., about 10 nm or less). In yet other embodiments, labeled aptamer comprises a quenching moiety and functions analogously to a molecular beacon, wherein luminescence of labeled aptamer is internally quenched as a free molecule and restored as a selectively bound molecule (see, e.g., Hamaguchi, et al. (2001) Analytical Biochemistry 294, 126-131). Without wishing to be bound by theory, it is thought that these and other types of mechanisms for binding-induced luminescence may advantageously reduce or eliminate background luminescence to increase overall sensitivity and accuracy of the methods described herein.
The inventors have recognized that obtaining one or more measurements for pulse duration and/or interpulse duration to facilitate the discrimination techniques described herein may be accomplished using the integrated device 1-102 and the time binning techniques described herein, particularly with respect to Section III of this disclosure. In some embodiments, pulse duration and/or interpulse duration may be used as dimensions for discriminating a sample in a sample well under analysis. For example, as described herein, particular molecules may have a characteristic pulse duration and/or interpulse duration and a sample may be identified by comparing measured pulse durations and/or interpulse duration with known characteristic durations.
VI. Techniques for Obtaining Intensity Information
According to another aspect of the technology described herein, the inventors have developed techniques for obtaining one or more measurements of emission light intensity which can, in some embodiments, facilitate multi-dimensional discrimination techniques of a sample in a sample well. The inventors have recognized that some fluorophores may emit at significantly different intensities or have a significant difference in their probabilities of excitation (e.g., at least a difference of about 35%) even though their decay rates may be similar. By referencing binned signals (e.g., by measuring the accumulation of charge carriers) to measured excitation energy and/or other acquired signals, it can be possible to distinguish different fluorophores based on intensity levels. Thus, it should be appreciated that an integrated device, such as integrated device 1-102, may be configured to measure intensity based on the accumulation of charge carriers in a storage bin through various read-out periods as described herein, and measurements of intensity may be used to distinguish the particular sample under analysis.
In some embodiments, different numbers of fluorophores of the same type may be linked to different molecules in a sample, so that each molecule may be identified based on luminescence intensity. For example, two fluorophores may be linked to a first labeled molecule and four or more fluorophores may be linked to a second labeled molecule. Because of the different numbers of fluorophores, there may be different excitation and fluorophore emission probabilities associated with the different molecules. For example, there may be more emission events for the second labeled molecule during a signal accumulation interval, so that the apparent intensity of the bins is significantly higher than for the first labeled molecule. Thus, the inventors have recognized that, in some embodiments, controlling the number of fluorophores which are linked to a particular molecule in a sample may facilitate identification of the sample. In some embodiments, intensity is therefore at least one of a number of characteristics used in multi-dimensional discrimination techniques for sample analysis.
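Intensity-based discrimination by fluorophore count can be illustrated as follows. The sketch references the summed bin values to a measured excitation energy and then picks the candidate fluorophore count whose expected intensity is closest. It assumes that intensity scales linearly with the number of fluorophores, which is a simplification, and the function names and numeric values are hypothetical.

```python
def normalized_intensity(bin_counts, excitation_energy: float) -> float:
    """Summed bin values referenced to measured excitation energy."""
    if excitation_energy <= 0:
        raise ValueError("excitation energy must be positive")
    return sum(bin_counts) / excitation_energy


def closest_fluorophore_count(intensity: float,
                              per_fluorophore_intensity: float,
                              candidates=(2, 4)) -> int:
    """Pick the candidate count whose expected intensity is nearest.

    Assumes expected intensity = count * per_fluorophore_intensity.
    """
    return min(candidates,
               key=lambda n: abs(n * per_fluorophore_intensity - intensity))


if __name__ == "__main__":
    # Hypothetical bin counts, excitation energy, and calibration value.
    intensity = normalized_intensity([60, 40], excitation_energy=2.0)
    print(intensity, closest_fluorophore_count(intensity, 12.0))
```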
VII. Sequencing Applications
Having thus described multiple techniques for acquiring information regarding various characteristics (e.g., lifetime, wavelength, intensity, pulse duration, and/or interpulse duration) of emission light from a sample, example applications of the multi-dimensional discrimination techniques will now be described. For example, the inventors have recognized that one or more molecules in a sample under analysis may be identified using the multi-dimensional techniques described herein. In particular, measurements for one or more characteristics of emission light may be obtained by a device, such as the integrated device described herein, and the collected measurements may be compared to known values of the measured characteristics for a luminescent label to determine which luminescent label is the most likely source of the emission light. In turn, by identifying the luminescent label, the identity of the molecule to which the luminescent label is attached can be known based on the particular type of molecule to which the luminescent label is known to attach.
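As a non-limiting illustration, comparing collected measurements to known characteristic values to find the most likely luminescent label might be sketched as follows. The sketch assumes independent Gaussian measurement noise in each dimension, which is a modeling assumption not stated in the disclosure, and the label names and reference values are hypothetical.

```python
import math

# Hypothetical reference table: (mean, standard deviation) per characteristic.
LABELS = {
    "label_A": {"lifetime_ns": (1.0, 0.2), "intensity": (100.0, 20.0)},
    "label_B": {"lifetime_ns": (2.5, 0.3), "intensity": (100.0, 20.0)},
    "label_C": {"lifetime_ns": (1.0, 0.2), "intensity": (220.0, 30.0)},
}


def log_likelihood(measurement, reference):
    """Sum per-dimension Gaussian log-likelihoods over the measured dimensions."""
    total = 0.0
    for name, value in measurement.items():
        mean, std = reference[name]
        z = (value - mean) / std
        total += -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))
    return total


def most_likely_label(measurement, labels=LABELS):
    """Return the label with the highest log-likelihood for the measurement."""
    return max(labels, key=lambda name: log_likelihood(measurement, labels[name]))


print(most_likely_label({"lifetime_ns": 1.1, "intensity": 210.0}))  # prints "label_C"
```

In this made-up table, label_A and label_C share the same lifetime and are separated only by intensity, which is the kind of case in which a second dimension is needed to resolve the source of the emission light.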
Any suitable combination of characteristics may be combined and used in the multi-dimensional techniques described herein. For example, in some embodiments, a two-dimensional discrimination technique may identify a sample of interest based on information of any two of lifetime, wavelength, pulse duration, interpulse duration, and intensity of emission light associated with the sample. In some embodiments, a two-dimensional discrimination technique for identifying a sample of interest is based on wavelength information and lifetime information of emission light associated with the sample. In some embodiments, a two-dimensional discrimination technique for identifying a sample of interest is based on lifetime information and intensity information of emission light associated with the sample. In some embodiments, a three-dimensional discrimination technique may identify a sample of interest based on information of any three of lifetime, wavelength, pulse duration, interpulse duration, and intensity of emission light associated with the sample. In some embodiments, a three-dimensional discrimination technique for identifying a sample of interest is based on lifetime information, wavelength information, and intensity information of emission light associated with the sample. In some embodiments, a three-dimensional discrimination technique for identifying a sample of interest is based on any two of wavelength information, lifetime information, and intensity information, and any one of pulse duration information and interpulse duration information of emission light associated with the sample.
According to another aspect of the technology described herein, a four-dimensional discrimination technique is used to identify a sample of interest based on information of any four of lifetime, wavelength, pulse duration, interpulse duration, and intensity of emission light associated with the sample. In some embodiments, a four-dimensional discrimination technique for identifying a sample of interest is based on lifetime information, wavelength information, intensity information and one of interpulse duration information and pulse duration information of emission light associated with the sample. According to another aspect of the technology described herein, a five-dimensional discrimination technique is used to identify a sample of interest based on information of lifetime, wavelength, intensity, pulse duration, and interpulse duration of emission light associated with the sample.
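As a non-limiting illustration, the possible combinations of characteristics for a discrimination technique of a given dimensionality can be enumerated as follows. The function name and the tuple of characteristic names are hypothetical conveniences for this sketch.

```python
from itertools import combinations

CHARACTERISTICS = (
    "lifetime",
    "wavelength",
    "intensity",
    "pulse duration",
    "interpulse duration",
)


def discrimination_sets(dimensions):
    """List every way to choose `dimensions` of the five characteristics."""
    return list(combinations(CHARACTERISTICS, dimensions))


for k in range(2, 6):
    print(k, len(discrimination_sets(k)))
# prints:
# 2 10
# 3 10
# 4 5
# 5 1
```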
In some embodiments, a two-dimensional discrimination technique for identifying a sample of interest is based on measurements of any two of wavelength, lifetime, intensity, pulse duration and interpulse duration of emission light associated with the sample. In some embodiments, a two-dimensional discrimination technique for identifying a sample of interest is based on measurements of lifetime and intensity of emission light associated with a sample. In some embodiments, a two-dimensional discrimination technique for identifying a sample of interest is based on measurements of wavelength and intensity of emission light associated with a sample.
For example,
Analysis of the signal at the various time intervals provides information for various characteristics of incident emission light. For example, intensity of the incident emission light may be analyzed best by comparing relative signal measurements for the SNA ping interval, which is the initial interval of charge carrier read out. As shown in the graph and accompanying table, pulses 1-4 have a relatively lower signal than pulses 5-8, and thus, pulses 1-4 can be considered as having a low intensity while pulses 5-8 can be considered as having a relatively higher intensity. Lifetime information can be analyzed by viewing the relative difference in signal between the SNA ping interval and the SNA pong interval. For example, where the SNA pong interval signal read out is approximately the same as the SNA ping interval, much of the signal was generated by charge carriers generated during the SNA ping interval, which is the earlier interval, and therefore for those pulses, lifetime is relatively short. Where the SNA pong interval signal read out is larger than the SNA ping interval, more charge carriers were generated at a later time interval and thus lifetime is relatively long. In the illustrated figure, odd pulses are shown to have short lifetimes while even pulses have longer lifetimes. Finally, wavelength information can be determined based on the arrangement of the storage nodes SNA and SNB. When the storage nodes are configured with an optical blocking element as described with respect to
The inventors have recognized that discrimination techniques having a relatively higher number of dimensions (e.g., three dimensions vs. two dimensions, four dimensions vs. three dimensions, etc.) may improve the ability to accurately identify the source of emission light and therefore to accurately identify the sample under analysis. As described herein, in the application of the multidimensional discrimination techniques to protein sequencing, in order to accurately identify 20 different amino acids as well as post-translational modifications, it is advantageous to use discrimination techniques having higher degrees of dimensionality (e.g., three, four, and/or five dimensions for discriminating a sample). In particular, higher dimensional techniques may facilitate accurate identification of an emission source (e.g., a luminescent label and/or a sample to which the luminescent label is attached) even when the received signal is relatively low (i.e., the emission light intensity is relatively low). The inventors have recognized that developing discrimination techniques which function even with relatively low signal may extend read length for the sequencing applications described herein and improve the scalability of the technology to larger assay sizes.
Although it may be advantageous to increase the dimensionality of a discrimination technique used to identify a sample, doing so may increase the complexity of the integrated device from which the discrimination information is obtained. In addition, providing luminescent labels optimized for obtaining information on multiple characteristics (e.g., wavelength and lifetime) may require increased effort and complexity of the luminescent label and/or integrated device. The inventors have recognized, however, that certain types of information may be obtained without increasing the complexity of the integrated device, as described herein. For example, pulse duration information, interpulse duration information, and pulse intensity information may be obtained using an existing device configured for obtaining wavelength information and/or lifetime information with minimal to no changes to the device required. Thus, the inventors have recognized that increasing the dimensionality of a discrimination technique by using pulse intensity information, pulse duration information, and/or interpulse duration information to identify a sample may improve identification accuracy without significantly increasing the complexity of the integrated device.
As described herein, the characteristics described above, e.g., wavelength information, intensity information, lifetime information, pulse duration information, and interpulse duration information, may be obtained based on charge carriers stored in at least one charge storage region. A component, which, in some embodiments, is part of the integrated device, is configured for obtaining the information regarding one or more of the characteristics described herein. For example, the component may be a hardware module (e.g., one or more controllers, one or more processors, circuitry implemented via one or more Field Programmable Gate Arrays (FPGAs), one or more application-specific integrated circuits (ASICs)) and/or any other suitable component configured to perform the functions of the component described herein.
In some embodiments, the integrated device is configured having one component capable of obtaining wavelength information, intensity information, lifetime information, pulse duration information, and interpulse duration information. In some embodiments, the integrated device may be configured having multiple components capable of obtaining wavelength information, intensity information, lifetime information, pulse duration information, and interpulse duration information. For example, in some embodiments, each type of information (e.g., wavelength information, lifetime information, etc.) may be obtained by a different component. In some embodiments, a first component may be configured to obtain some but not all types of information, and one or more other components may be configured to obtain the other types of information not obtained by the first component. For example, interpulse duration information and pulse duration information may be obtained by a first component, while wavelength information, luminescence lifetime information, and intensity information may be obtained by one or more other components. In some embodiments, interpulse duration information, pulse duration information, and luminescence lifetime information may be obtained by a first component, and wavelength information and luminescence lifetime information may be obtained by one or more other components. In some embodiments, multiple components may be configured to obtain a single type of information (e.g., wavelength information).
The inventors have appreciated that the multi-dimensional discrimination techniques can be implemented in a variety of applications, two non-limiting examples of which are DNA and/or RNA sequencing applications and protein sequencing applications, each of which is described further herein.
a. DNA and/or RNA Sequencing Applications
The inventors have recognized that, in some embodiments, the techniques described herein for multi-dimensional discrimination of a sample may be used in DNA and/or RNA sequencing applications, as one non-limiting example. For example, an analytic system described herein may include an integrated device and an instrument configured to interface with the integrated device. The integrated device may include an array of pixels, where a pixel includes a reaction chamber and at least one photodetector. A surface of the integrated device may have a plurality of reaction chambers, where a reaction chamber is configured to receive a sample from a suspension placed on the surface of the integrated device. A suspension may contain multiple samples of a same type, and in some embodiments, different types of samples. In this regard, the phrase “sample of interest” as used herein can refer to a plurality of samples of a same type that are dispersed in a suspension, for example. Similarly, the phrase “molecule of interest” as used herein can refer to a plurality of molecules of a same type that are dispersed in a suspension. The plurality of reaction chambers may have a suitable size and shape such that at least a portion of the reaction chambers receive one sample from a suspension. In some embodiments, samples may be distributed among the reaction chambers such that some reaction chambers contain one sample while others contain zero, two, or more samples.
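As a non-limiting illustration, the fraction of reaction chambers containing zero, one, or two or more samples might be estimated as follows. The sketch assumes that loading follows Poisson statistics with a given mean number of samples per chamber. That is a common modeling assumption and is not stated in the disclosure, and the function name is hypothetical.

```python
import math


def occupancy_fractions(mean_samples_per_chamber):
    """Poisson estimate of the fraction of chambers holding 0, 1, or >= 2 samples."""
    lam = mean_samples_per_chamber
    if lam < 0:
        raise ValueError("mean_samples_per_chamber must be non-negative")
    zero = math.exp(-lam)
    one = lam * math.exp(-lam)
    return {"zero": zero, "one": one, "two_or_more": 1.0 - zero - one}


fractions = occupancy_fractions(1.0)
print({name: round(value, 3) for name, value in fractions.items()})
```

Under this assumption, no choice of mean loading places a single sample in every chamber, which is consistent with only a portion of the reaction chambers receiving one sample.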
In some embodiments, a suspension may contain multiple single-stranded DNA templates, and individual reaction chambers on a surface of an integrated device may be sized and shaped to receive a sequencing template. Sequencing templates may be distributed among the reaction chambers of the integrated device such that at least a portion of the reaction chambers of the integrated device contain a sequencing template. The suspension may also contain labeled nucleotides which then enter in the reaction chamber and may allow for identification of a nucleotide as it is incorporated into a strand of DNA complementary to the single-stranded DNA template in the reaction chamber. In some embodiments, the suspension may contain sequencing templates and labeled nucleotides may be subsequently introduced to a reaction chamber as nucleotides are incorporated into a complementary strand within the reaction chamber. In this manner, timing of incorporation of nucleotides may be controlled by when labeled nucleotides are introduced to the reaction chambers of an integrated device.
Excitation light is provided from an excitation source located separate from the pixel array of the integrated device. The excitation light is directed at least in part by elements of the integrated device towards one or more pixels to illuminate an illumination region within the reaction chamber. A label may then emit emission light when located within the illumination region and in response to being illuminated by excitation light. In some embodiments, one or more excitation sources are part of the instrument of the system where components of the instrument and the integrated device are configured to direct the excitation light towards one or more pixels.
Emission light emitted from a reaction chamber (e.g., by a luminescent label) may then be detected by one or more photodetectors within a pixel of the integrated device. Characteristics of the detected emission light may provide an indication for identifying the label associated with the emission light. Such characteristics may include any suitable type of characteristic, including an arrival time of photons detected by a photodetector, an amount of photons accumulated over time by a photodetector, and/or a distribution of photons across two or more photodetectors. In some embodiments, a photodetector may have a configuration that allows for the detection of one or more characteristics associated with emission light, such as timing characteristics (e.g., fluorescence lifetime), wavelength, pulse duration, interpulse duration, and/or intensity. As one example, the photodetector may detect a distribution of photon arrival times after a pulse of excitation light propagates through the integrated device, and the distribution of arrival times may provide an indication of a timing characteristic of the emission light (e.g., a proxy for fluorescence lifetime, pulse duration, and/or interpulse duration). In some embodiments, the one or more photodetectors provide an indication of the probability of emission light emitted by the label (e.g., fluorescence intensity). In some embodiments, a plurality of photodetectors may be sized and arranged to capture a spatial distribution of the emission light (e.g., wavelength). Output signals from the one or more photodetectors may then be used to distinguish a label from among a plurality of labels, where the plurality of labels may be used to identify a sample or its structure. 
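As a non-limiting illustration, a proxy for fluorescence lifetime might be derived from a distribution of photon arrival times as follows. The sketch assumes single-exponential decay and two adjacent, equal-width time bins that begin after the excitation pulse, so that the ratio of late to early counts equals exp(-bin_width / lifetime). The function name and the binning scheme are assumptions and do not describe the readout of any particular device.

```python
import math


def lifetime_from_two_bins(early_count, late_count, bin_width_ns):
    """Estimate lifetime (ns) from counts in two adjacent, equal-width time bins."""
    if early_count <= 0 or late_count <= 0:
        raise ValueError("both bins need a positive count")
    if late_count >= early_count:
        raise ValueError("late bin must hold fewer counts than the early bin")
    # For exponential decay, late / early = exp(-bin_width / lifetime).
    return bin_width_ns / math.log(early_count / late_count)


# Synthetic counts generated for a 2 ns lifetime and 2 ns bins.
early = 1000.0
late = early * math.exp(-1.0)
print(round(lifetime_from_two_bins(early, late, bin_width_ns=2.0), 6))  # prints 2.0
```

With real photon counts the estimate is noisy, so counts would typically be accumulated over many excitation pulses before the ratio is formed.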
In some embodiments, a sample may be excited by multiple excitation energies, and emission light and/or timing characteristics of the emission light from the reaction chamber in response to the multiple excitation energies may distinguish a label from a plurality of labels.
In some embodiments, a system and integrated device similar to the technology previously described herein may be implemented to facilitate DNA and/or RNA sequencing applications. For example, a schematic overview of the system 5-100 is illustrated in
A pixel 5-112 has a reaction chamber 5-108 configured to receive a single sample of interest and a photodetector 5-110 for detecting emission light emitted from the reaction chamber in response to illuminating the sample and at least a portion of the reaction chamber 5-108 with excitation light provided by the excitation source 5-106. In some embodiments, reaction chamber 5-108 may retain the sample in proximity to a surface of integrated device 5-102, which may ease delivery of excitation light to the sample and detection of emission light from the sample or a reaction component (e.g., a labeled nucleotide).
Optical elements for coupling excitation light from excitation light source 5-106 to integrated device 5-102 and guiding excitation light to the reaction chamber 5-108 are located both on integrated device 5-102 and the instrument 5-104. Source-to-chamber optical elements may comprise one or more grating couplers located on integrated device 5-102 to couple excitation light to the integrated device and waveguides to deliver excitation light from instrument 5-104 to reaction chambers in pixels 5-112. One or more optical splitter elements may be positioned between a grating coupler and the waveguides. The optical splitter may couple excitation light from the grating coupler and deliver excitation light to at least one of the waveguides. In some embodiments, the optical splitter may have a configuration that allows for delivery of excitation light to be substantially uniform across all the waveguides such that each of the waveguides receives a substantially similar amount of excitation light. Such embodiments may improve performance of the integrated device by improving the uniformity of excitation light received by reaction chambers of the integrated device.
Reaction chamber 5-108, a portion of the excitation source-to-chamber optics, and the reaction chamber-to-photodetector optics are located on integrated device 5-102. Excitation source 5-106 and a portion of the source-to-chamber components are located in instrument 5-104. In some embodiments, a single component may play a role in both coupling excitation light to reaction chamber 5-108 and delivering emission light from reaction chamber 5-108 to photodetector 5-110. Examples of suitable components to include in an integrated device, for coupling excitation light to a reaction chamber and/or directing emission light to a photodetector, are described in U.S. patent application Ser. No. 14/821,688, filed Aug. 7, 2015, titled “INTEGRATED DEVICE FOR PROBING, DETECTING AND ANALYZING MOLECULES,” and U.S. patent application Ser. No. 14/543,865, filed Nov. 17, 2014, titled “INTEGRATED DEVICE WITH EXTERNAL LIGHT SOURCE FOR PROBING, DETECTING, AND ANALYZING MOLECULES,” both of which are incorporated by reference in their entirety.
Pixel 5-112 is associated with its own individual reaction chamber 5-108 and at least one photodetector 5-110. The plurality of pixels of integrated device 5-102 may be arranged to have any suitable shape, size, and/or dimensions. Integrated device 5-102 may have any suitable number of pixels. The number of pixels in integrated device 5-102 may be in the range of approximately 10,000 pixels to 1,000,000 pixels or any value or range of values within that range. In some embodiments, the pixels may be arranged in an array of 512 pixels by 512 pixels. Integrated device 5-102 may interface with instrument 5-104 in any suitable manner. In some embodiments, instrument 5-104 may have an interface that detachably couples to integrated device 5-102 such that a user may attach integrated device 5-102 to instrument 5-104 for use of integrated device 5-102 to analyze at least one sample of interest in a suspension and remove integrated device 5-102 from instrument 5-104 to allow for another integrated device to be attached. The interface of instrument 5-104 may position integrated device 5-102 to couple with circuitry of instrument 5-104 to allow for readout signals from one or more photodetectors to be transmitted to instrument 5-104. Integrated device 5-102 and instrument 5-104 may include multi-channel, high-speed communication links for handling data associated with large pixel arrays (e.g., more than 10,000 pixels).
A cross-sectional schematic of integrated device 5-102 illustrating a row of pixels 5-112 is shown in
The directionality of the emission light from a reaction chamber 5-108 may depend on the positioning of the sample in the reaction chamber 5-108 relative to metal layer(s) 5-116 because metal layer(s) 5-116 may act to reflect emission light. In this manner, a distance between metal layer(s) 5-116 and a luminescent label positioned in a reaction chamber 5-108 may impact the efficiency of photodetector(s) 5-110 that are in the same pixel as the reaction chamber to detect the light emitted by the luminescent label. The distance between metal layer(s) 5-116 and the bottom surface of a reaction chamber 5-108, which is proximate to where a sample may be positioned during operation, may be in the range of 100 nm to 500 nm, or any value or range of values in that range. In some embodiments, the distance between metal layer(s) 5-116 and the bottom surface of a reaction chamber 5-108 is approximately 300 nm.
The distance between the sample and the photodetector(s) may also impact efficiency in detecting emission light. By decreasing the distance light has to travel between the sample and the photodetector(s), detection efficiency of emission light may be improved. In addition, smaller distances between the sample and the photodetector(s) may allow for pixels that occupy a smaller area footprint of the integrated device, which can allow for a higher number of pixels to be included in the integrated device. The distance between the bottom surface of a reaction chamber 5-108 and photodetector(s) may be in the range of 1 μm to 15 μm, or any value or range of values in that range.
Photonic structure(s) 5-230 may be positioned between reaction chambers 5-108 and photodetectors 5-110 and configured to reduce or prevent excitation light from reaching photodetectors 5-110, which may otherwise contribute to signal noise in detecting emission light. As shown in
Coupling region 5-201 may include one or more optical components configured to couple excitation light from an external excitation source. Coupling region 5-201 may include grating coupler 5-216 positioned to receive some or all of a beam of excitation light. Examples of suitable grating couplers are described in U.S. patent application Ser. No. 15/844,403, filed Dec. 15, 2017, titled “OPTICAL COUPLER AND WAVEGUIDE SYSTEM,” under Attorney Docket No. R0708.70021US01 which is hereby incorporated by reference in its entirety. Grating coupler 5-216 may couple excitation light to waveguide 5-220, which may be configured to propagate excitation light to the proximity of one or more reaction chambers 5-108. Alternatively, coupling region 5-201 may comprise other well-known structures for coupling light into a waveguide.
Components located off of the integrated device may be used to position and align the excitation source 5-106 to the integrated device. Such components may include optical components including lenses, mirrors, prisms, windows, apertures, attenuators, and/or optical fibers. Additional mechanical components may be included in the instrument to allow for control of one or more alignment components. Such mechanical components may include actuators, stepper motors, and/or knobs. Examples of suitable excitation sources and alignment mechanisms are described in U.S. patent application Ser. No. 15/161,088, filed May 20, 2016, titled “PULSED LASER AND SYSTEM,” under Attorney Docket No. R0708.70010US02, which is hereby incorporated by reference in its entirety. Another example of a beam-steering module is described in U.S. patent application Ser. No. 15/842,720, filed Dec. 14, 2017, under Attorney Docket No. R0708.70024US01, which is hereby incorporated herein by reference.
A sample to be analyzed may be introduced into reaction chamber 5-108 of pixel 5-112. The sample may be a biological sample or any other suitable sample, such as a chemical sample. In some cases, the suspension may include multiple molecules of interest and the reaction chamber may be configured to isolate a single molecule. In some instances, the dimensions of the reaction chamber may act to confine a single molecule within the reaction chamber, allowing measurements to be performed on the single molecule. Excitation light may be delivered into the reaction chamber 5-108, so as to excite the sample or at least one luminescent label attached to the sample or otherwise associated with the sample while it is within an illumination area within the reaction chamber 5-108.
In operation, parallel analyses of samples within the reaction chambers are carried out by exciting some or all of the samples within the reaction chambers using excitation light and detecting signals with the photodetectors that are representative of emission light from the reaction chambers. Emission light from a sample or reaction component (e.g., luminescent label) may be detected by a corresponding photodetector and converted to at least one electrical signal. The electrical signals may be transmitted along conducting lines (e.g., metal layers 5-240) in the circuitry of the integrated device, which may be connected to an instrument interfaced with the integrated device. The electrical signals may be subsequently processed and/or analyzed. Processing or analyzing of electrical signals may occur on a suitable computing device either located on or off the instrument.
Instrument 5-104 may include a user interface for controlling operation of instrument 5-104 and/or integrated device 5-102. The user interface may be configured to allow a user to input information into the instrument, such as commands and/or settings used to control the functioning of the instrument. In some embodiments, the user interface may include buttons, switches, dials, and a microphone for voice commands. The user interface may allow a user to receive feedback on the performance of the instrument and/or integrated device, such as proper alignment and/or information obtained by readout signals from the photodetectors on the integrated device. In some embodiments, the user interface may provide feedback using a speaker to provide audible feedback. In some embodiments, the user interface may include indicator lights and/or a display screen for providing visual feedback to a user.
In some embodiments, instrument 5-104 may include a computer interface configured to connect with a computing device. The computer interface may be a USB interface, a FireWire interface, or any other suitable computer interface. The computing device may be any general purpose computer, such as a laptop or desktop computer. In some embodiments, the computing device may be a server (e.g., cloud-based server) accessible over a wireless network via a suitable computer interface. The computer interface may facilitate communication of information between instrument 5-104 and the computing device. Input information for controlling and/or configuring the instrument 5-104 may be provided to the computing device and transmitted to instrument 5-104 via the computer interface. Output information generated by instrument 5-104 may be received by the computing device via the computer interface. Output information may include feedback about performance of instrument 5-104, performance of integrated device 5-102, and/or data generated from the readout signals of photodetector 5-110.
In some embodiments, instrument 5-104 may include a processing device configured to analyze data received from one or more photodetectors of integrated device 5-102 and/or transmit control signals to excitation source(s) 5-106. In some embodiments, the processing device may comprise a general purpose processor or a specially-adapted processor (e.g., a central processing unit (CPU) such as one or more microprocessor or microcontroller cores, a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a custom integrated circuit, a digital signal processor (DSP), or a combination thereof). In some embodiments, the processing of data from one or more photodetectors may be performed by both a processing device of instrument 5-104 and an external computing device. In other embodiments, an external computing device may be omitted and processing of data from one or more photodetectors may be performed solely by a processing device of integrated device 5-102.
Referring to
In some cases, the analytic instrument 5-100 is configured to receive a removable, packaged, bio-optoelectronic or optoelectronic chip 5-140 (also referred to as a “disposable chip”). The disposable chip can include a bio-optoelectronic chip, for example, that comprises a plurality of reaction chambers, integrated optical components arranged to deliver optical excitation energy to the reaction chambers, and integrated photodetectors arranged to detect fluorescent emission from the reaction chambers. In some implementations, the chip 5-140 can be disposable after a single use, whereas in other implementations the chip 5-140 can be reused two or more times. When the chip 5-140 is received by the instrument 5-100, it can be in electrical and optical communication with the pulsed optical source 5-106 and with apparatus in the analytic system 5-160. Electrical communication may be made through electrical contacts on the chip package, for example.
In some embodiments and referring to
According to some embodiments, the pulsed optical source 5-106 comprises a compact mode-locked laser module 5-113. The mode-locked laser can comprise a gain medium 5-105 (which can be solid-state material in some embodiments), an output coupler 5-111, and a laser-cavity end mirror 5-119. The mode-locked laser's optical cavity can be bound by the output coupler 5-111 and end mirror 5-119. An optical axis 5-125 of the laser cavity can have one or more folds (turns) to increase the length of the laser cavity and provide a desired pulse repetition rate. The pulse repetition rate is determined by the length of the laser cavity (e.g., the time for an optical pulse to make a round-trip within the laser cavity).
In some embodiments, there can be additional optical elements (not shown in
When the laser 5-113 is mode locked, an intracavity pulse 5-120 can circulate between the end mirror 5-119 and the output coupler 5-111, and a portion of the intracavity pulse can be transmitted through the output coupler 5-111 as an output pulse 5-122. Accordingly, a train of output pulses 5-122, as depicted in the graph of
The output pulses 5-122 can be separated by regular intervals T. For example, T can be determined by a round-trip travel time between the output coupler 5-111 and cavity end mirror 5-119. According to some embodiments, the pulse-separation interval T can be between about 1 ns and about 30 ns. In some cases, the pulse-separation interval T can be between about 5 ns and about 20 ns, corresponding to a laser-cavity length (an approximate length of the optical axis 5-125 within the laser cavity) between about 0.7 meter and about 3 meters. In embodiments, the pulse-separation interval corresponds to a round trip travel time in the laser cavity, so that a cavity length of 3 meters (round-trip distance of 6 meters) provides a pulse-separation interval T of approximately 20 ns.
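As a non-limiting illustration, the relationship between laser-cavity length and pulse-separation interval T can be checked numerically as follows. The sketch assumes a linear cavity in which the pulse travels twice the cavity length per round trip and an effective refractive index of 1 along the optical axis. The function names and the refractive_index parameter are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def pulse_separation_ns(cavity_length_m, refractive_index=1.0):
    """Round-trip travel time (ns) for a linear cavity of the given length."""
    round_trip_m = 2.0 * cavity_length_m * refractive_index
    return round_trip_m / SPEED_OF_LIGHT_M_PER_S * 1e9


def cavity_length_for_separation_m(separation_ns, refractive_index=1.0):
    """Cavity length (m) that yields the requested pulse-separation interval."""
    return separation_ns * 1e-9 * SPEED_OF_LIGHT_M_PER_S / (2.0 * refractive_index)


print(round(pulse_separation_ns(3.0), 1))             # prints 20.0
print(round(cavity_length_for_separation_m(5.0), 2))  # prints 0.75
```

The pulse repetition rate is the reciprocal of T, so an interval of about 20 ns corresponds to a repetition rate of about 50 MHz.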
According to some embodiments, a desired pulse-separation interval T and laser-cavity length can be determined by a combination of the number of reaction chambers on the chip 5-140, fluorescent emission characteristics, and the speed of data-handling circuitry for reading data from the optoelectronic chip 5-140. In embodiments, different fluorophores can be distinguished by their different fluorescent decay rates or characteristic lifetimes. Accordingly, the pulse-separation interval T needs to be long enough to collect adequate statistics for the selected fluorophores to distinguish between their different decay rates. Additionally, if the pulse-separation interval T is too short, the data-handling circuitry cannot keep up with the large amount of data being collected by the large number of reaction chambers. A pulse-separation interval T between about 5 ns and about 20 ns is suitable for fluorophores that have decay lifetimes up to about 2 ns and for handling data from between about 60,000 and 10,000,000 reaction chambers.
According to some implementations, a beam-steering module 5-150 can receive output pulses from the pulsed optical source 5-106 and is configured to adjust at least the position and incident angles of the optical pulses onto an optical coupler (e.g., grating coupler) of the optoelectronic chip 5-140. In some cases, the output pulses 5-122 from the pulsed optical source 5-106 can be operated on by a beam-steering module 5-150 to additionally or alternatively change a beam shape and/or beam rotation at an optical coupler on the optoelectronic chip 5-140. In some implementations, the beam-steering module 5-150 can further provide focusing and/or polarization adjustments of the beam of output pulses onto the optical coupler. One example of a beam-steering module is described in U.S. patent application Ser. No. 15/161,088 titled “Pulsed Laser and Bioanalytic System,” filed May 20, 2016, which is incorporated herein by reference. Another example of a beam-steering module is described in a separate U.S. patent application Ser. No. 15/842,720, filed Dec. 14, 2017, which is incorporated herein by reference.
Referring to
Each waveguide 5-312 can include a tapered portion 5-315 below the reaction chambers 5-330 to equalize optical power coupled to the reaction chambers along the waveguide. The reducing taper can force more optical energy outside the waveguide's core, increasing coupling to the reaction chambers and compensating for optical losses along the waveguide, including losses for light coupling into the reaction chambers. A second grating coupler 5-317 can be located at an end of each waveguide to direct optical energy to an integrated photodiode 5-324. The integrated photodiode can detect an amount of power coupled down a waveguide and provide a detected signal to feedback circuitry that controls the beam-steering module 5-150, for example.
The reaction chambers 5-330 can be aligned with the tapered portion 5-315 of the waveguide and recessed in a tub 5-340. There can be photodetectors 5-322 located on the semiconductor substrate 5-305 for each reaction chamber 5-330. In some embodiments, a semiconductor absorber (shown in
There can be a plurality of rows of waveguides, reaction chambers, and time-binning photodetectors on the optoelectronic chip 5-140. For example, there can be 128 rows, each having 512 reaction chambers, for a total of 65,536 reaction chambers in some implementations. Other implementations may include fewer or more reaction chambers, and may include other layout configurations. Optical power from the pulsed optical source 5-106 can be distributed to the multiple waveguides via one or more star couplers or multi-mode interference couplers, or by any other means, located between an optical coupler 5-310 to the chip 5-140 and the plurality of waveguides 5-312.
A non-limiting example of a biological reaction taking place in a reaction chamber 5-330 is depicted in
When a labeled nucleotide or nucleotide analog 5-610 is incorporated into a growing strand of complementary nucleic acid, as depicted in
Techniques for time binning charge carriers generated by incident emission light to facilitate obtaining timing information of the emission light (e.g., fluorescent lifetime, pulse duration, interpulse duration) described herein, for example with respect to Sections III and V, may be applied to DNA and/or RNA sequencing applications. For example, according to some embodiments, an advanced analytic instrument 5-100 that is configured to analyze samples based on fluorescent emission characteristics can detect differences in fluorescent lifetimes and/or intensities between different fluorescent molecules, and/or differences between lifetimes and/or intensities of the same fluorescent molecules in different environments. By way of explanation,
A second fluorescent molecule may have a decay profile pB(t) that is exponential, but has a measurably different lifetime τ2, as depicted for curve B in
Differences in fluorescent emission lifetimes can be used to discern between the presence or absence of different fluorescent molecules and/or to discern between different environments or conditions to which a fluorescent molecule is subjected. In some cases, discerning fluorescent molecules based on lifetime (rather than emission wavelength, for example) can simplify aspects of an analytical instrument 5-100. As an example, wavelength-discriminating optics (such as wavelength filters, dedicated detectors for each wavelength, dedicated pulsed optical sources at different wavelengths, and/or diffractive optics) can be reduced in number or eliminated when discerning fluorescent molecules based on lifetime. In some cases, a single pulsed optical source operating at a single characteristic wavelength can be used to excite different fluorescent molecules that emit within a same wavelength region of the optical spectrum but have measurably different lifetimes. An analytic system that uses a single pulsed optical source, rather than multiple sources operating at different wavelengths, to excite and discern different fluorescent molecules emitting in a same wavelength region can be less complex to operate and maintain, more compact, and can be manufactured at lower cost.
Although analytic systems based on fluorescent lifetime analysis can have certain benefits, the amount of information obtained by an analytic system and/or detection accuracy can be increased by allowing for additional detection techniques. For example, some analytic systems 5-160 can additionally be configured to discern one or more properties of a sample based on fluorescent wavelength, pulse duration/width, interpulse duration, and/or fluorescent intensity as described herein.
Referring again to
For a single molecule or a small number of molecules, however, the emission of fluorescent photons occurs according to the statistics of curve B in
Examples of a time-binning photodetector 5-322 are described in U.S. patent application Ser. No. 14/821,656, filed Aug. 7, 2015, titled “Integrated Device for Temporal Binning of Received Photons” and in U.S. patent application Ser. No. 15/852,571, filed Dec. 22, 2017, titled “Integrated Photodetector with Direct Binning Pixel,” which are both incorporated herein by reference in their entirety. For explanation purposes, a non-limiting embodiment of a time-binning photodetector is depicted in
In operation, a portion of an excitation pulse 5-122 from a pulsed optical source 5-106 (e.g., a mode-locked laser) is delivered to a reaction chamber 5-330 over the time-binning photodetector 5-322. Initially, some excitation radiation photons 5-901 may arrive at the photon-absorption/carrier-generation region 5-902 and produce carriers (shown as light-shaded circles). There can also be some fluorescent emission photons 5-903 that arrive with the excitation radiation photons 5-901 and produce corresponding carriers (shown as dark-shaded circles). Initially, the number of carriers produced by the excitation radiation can be too large compared to the number of carriers produced by the fluorescent emission. The initial carriers produced during a time interval |te-t1| can be rejected by gating them into a carrier-discharge channel 5-906 with a first transfer gate 5-920, for example.
At a later time, mostly fluorescent emission photons 5-903 arrive at the photon-absorption/carrier-generation region 5-902 and produce carriers (indicated as dark-shaded circles) that provide useful and detectable signal that is representative of fluorescent emission from the reaction chamber 5-330. According to some detection methods, a second electrode 5-921 and third electrode 5-923 can be gated at a later time to direct carriers produced at a later time (e.g., during a second time interval |t1-t2|) to a first carrier-storage region 5-908a. Subsequently, a fourth electrode 5-922 and fifth electrode 5-924 can be gated at a later time (e.g., during a third time interval |t2-t3|) to direct carriers to a second carrier-storage region 5-908b. Charge accumulation can continue in this manner over a large number of excitation pulses to accumulate an appreciable number of carriers and signal level in each carrier-storage region 5-908a, 5-908b. At a later time, the signal can be read out from the bins. In some implementations, the time intervals corresponding to each storage region are at the sub-nanosecond time scale, though longer time scales can be used in some embodiments (e.g., in embodiments where fluorophores have longer decay times).
The process of generating and time-binning carriers after an excitation event (e.g., excitation pulse from a pulsed optical source) can occur once after a single excitation pulse or be repeated multiple times after multiple excitation pulses during a single charge-accumulation cycle for the time-binning photodetector 5-322. After charge accumulation is complete, carriers can be read out of the storage regions via the read-out channel 5-910. For example, an appropriate biasing sequence can be applied to electrodes 5-923, 5-924 and at least to electrode 5-940 to remove carriers from the storage regions 5-908a, 5-908b. The charge accumulation and read-out processes can occur in a massively parallel operation on the optoelectronic chip 5-140 resulting in frames of data.
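For explanation purposes only, the gating sequence described above can be modeled numerically. The following sketch assumes single-exponential emission, ideal gates, and hypothetical bin edges (t1, t2, t3); it is not a description of the circuitry itself. Carriers arriving before t1 are treated as rejected, carriers arriving in [t1, t2) are counted in a first bin, and carriers arriving in [t2, t3) are counted in a second bin.

```python
import math
import random


def bin_counts(lifetime_ns, edges_ns, n_photons, seed=0):
    """Count simulated emission times falling into two time bins.

    edges_ns = (t1, t2, t3). Times before t1 are discarded (rejection interval),
    [t1, t2) is the first bin, and [t2, t3) is the second bin.
    """
    rng = random.Random(seed)
    t1, t2, t3 = edges_ns
    counts = [0, 0]
    for _ in range(n_photons):
        # expovariate takes a rate, so 1/lifetime gives a mean equal to the lifetime.
        t = rng.expovariate(1.0 / lifetime_ns)
        if t1 <= t < t2:
            counts[0] += 1
        elif t2 <= t < t3:
            counts[1] += 1
    return counts


def bin_ratio(counts):
    """Ratio of second-bin counts to first-bin counts."""
    return counts[1] / counts[0]


def expected_ratio(lifetime_ns, edges_ns):
    """Analytic bin ratio for an ideal single-exponential decay."""
    t1, t2, t3 = edges_ns
    first = math.exp(-t1 / lifetime_ns) - math.exp(-t2 / lifetime_ns)
    second = math.exp(-t2 / lifetime_ns) - math.exp(-t3 / lifetime_ns)
    return second / first


if __name__ == "__main__":
    edges = (0.5, 1.5, 2.5)  # hypothetical bin edges in ns
    for tau in (1.0, 2.0):   # hypothetical lifetimes in ns
        counts = bin_counts(tau, edges, 20_000)
        print(tau, counts, round(bin_ratio(counts), 3), round(expected_ratio(tau, edges), 3))
```

In this model, a fluorophore with a longer lifetime places a larger share of its carriers in the later bin, so the ratio of the two bins can serve as a simple lifetime-dependent quantity.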
Although the described example in connection with
Regardless of how charge accumulation is carried out for different time intervals after excitation, signals that are read out can provide a histogram of bins that are representative of the fluorescent emission decay characteristics, for example. An example process is illustrated in
In some implementations, only a single photon may be emitted from a fluorophore following an excitation event, as depicted in
In some implementations, there may not be a fluorescent photon emitted and/or detected after each excitation pulse received at a reaction chamber 5-330. In some cases, there can be as few as one fluorescent photon that is detected at a reaction chamber for every 10,000 excitation pulses delivered to the reaction chamber. One advantage of implementing a mode-locked laser 5-113 as the pulsed excitation source 5-106 is that a mode-locked laser can produce short optical pulses having high intensity and quick turn-off times at high pulse-repetition rates (e.g., between 50 MHz and 250 MHz). With such high pulse-repetition rates, the number of excitation pulses within a 10 millisecond charge-accumulation interval can be 500,000 to 2,500,000, so that detectable signal can be accumulated.
After a large number of excitation events and carrier accumulations, the carrier-storage regions of the time-binning photodetector 5-322 can be read out to provide a multi-valued signal (e.g., a histogram of two or more values, an N-dimensional vector, etc.) for a reaction chamber. The signal values for each bin can depend upon the decay rate of the fluorophore. For example and referring again to
To further aid in understanding the signal analysis, the accumulated, multi-bin values can be plotted as a histogram, as depicted in
In some implementations, fluorescent intensity can be used additionally or alternatively to distinguish between different fluorophores. For example, some fluorophores may emit at significantly different intensities or have a significant difference in their probabilities of excitation (e.g., at least a difference of about 35%) even though their decay rates may be similar. By referencing binned signals (bins 1-3) to measured excitation energy and/or other acquired signals, it can be possible to distinguish different fluorophores based on intensity levels.
In some embodiments, different numbers of fluorophores of the same type can be linked to different nucleotides or nucleotide analogs, so that the nucleotides can be identified based on fluorophore intensity. For example, two fluorophores can be linked to a first nucleotide (e.g., “C”) or nucleotide analog and four or more fluorophores can be linked to a second nucleotide (e.g., “T”) or nucleotide analog. Because of the different numbers of fluorophores, there may be different excitation and fluorophore emission probabilities associated with the different nucleotides. For example, there may be more emission events for the “T” nucleotide or nucleotide analog during a signal accumulation interval, so that the apparent intensity of the bins is significantly higher than for the “C” nucleotide or nucleotide analog.
Distinguishing nucleotides or any other biological or chemical specimens based on fluorophore decay rates and/or fluorophore intensities enables a simplification of the optical excitation and detection systems in an analytical instrument 5-100. For example, optical excitation can be performed with a single-wavelength source (e.g., a source producing one characteristic wavelength rather than multiple sources or a source operating at multiple different characteristic wavelengths). Additionally, wavelength-discriminating optics and filters may not be needed in the detection system to distinguish between fluorophores of different wavelengths. Also, a single photodetector can be used for each reaction chamber to detect emission from different fluorophores. However, in some embodiments, it may be advantageous to add additional dimensions of discrimination for identifying a particular molecule by using two or more of intensity, lifetime, wavelength, pulse duration, and/or interpulse duration to distinguish a sample.
The phrase “characteristic wavelength” or “wavelength” is used to refer to a central or predominant wavelength within a limited bandwidth of radiation (e.g., a central or peak wavelength within a 20 nm bandwidth output by a pulsed optical source). In some cases, “characteristic wavelength” or “wavelength” may be used to refer to a peak wavelength within a total bandwidth of radiation output by a source.
Fluorophores having emission wavelengths in a range between about 560 nm and about 900 nm can provide adequate amounts of fluorescence to be detected by a time-binning photodetector (which can be fabricated on a silicon wafer using CMOS processes). These fluorophores can be linked to biological molecules of interest, such as nucleotides or nucleotide analogs for genetic sequencing applications. Fluorescent emission in this wavelength range can be detected with higher responsivity in a silicon-based photodetector than fluorescence at longer wavelengths. Additionally, fluorophores and associated linkers in this wavelength range may not interfere with incorporation of the nucleotides or nucleotide analogs into growing strands of DNA. In some implementations, fluorophores having emission wavelengths in a range between about 560 nm and about 660 nm can be optically excited with a single-wavelength source. An example fluorophore in this range is Alexa Fluor 647, available from Thermo Fisher Scientific Inc. of Waltham, Mass. Excitation energy at shorter wavelengths (e.g., between about 500 nm and about 650 nm) may be used to excite fluorophores that emit at wavelengths between about 560 nm and about 900 nm. In some embodiments, the time-binning photodetectors can efficiently detect longer-wavelength emission from the reaction chambers, e.g., by incorporating other materials, such as Ge, into the photodetectors' active regions.
b. Protein Sequencing Applications
Some aspects of the present disclosure may be useful for protein sequencing. For example, some aspects of the present disclosure are useful for determining amino acid sequence information from polypeptides (e.g., for sequencing one or more polypeptides) such as by applying the multi-dimensional discrimination techniques using wavelength, lifetime, intensity, pulse duration and/or interpulse duration measurements to identify a particular sample. In some embodiments, amino acid sequence information can be determined for single polypeptide molecules. In some embodiments, one or more amino acids of a polypeptide are labeled (e.g., directly or indirectly) and the relative positions of the labeled amino acids in the polypeptide are determined. In some embodiments, the relative positions of amino acids in a protein are determined using a series of amino acid labeling and cleavage steps. In particular, the multi-dimensional discrimination techniques described herein may be implemented with the protein sequencing methods described in U.S. patent application Ser. No. 16/686,028 titled “METHODS AND COMPOSITIONS FOR PROTEIN SEQUENCING,” filed Nov. 15, 2019 under Attorney Docket No. R0708.70042US02 and PCT Application No. PCT/US19/61831 titled “METHODS AND COMPOSITIONS FOR PROTEIN SEQUENCING,” filed Nov. 15, 2019 under Attorney Docket No. R0708.70042WO00, both of which are hereby incorporated by reference in their entireties.
For example,
In some embodiments, the identity of a terminal amino acid (e.g., an N-terminal or a C-terminal amino acid) is assessed, after which the terminal amino acid is removed and the identity of the next amino acid at the terminus is assessed, and this process is repeated until a plurality of successive amino acids in the polypeptide are assessed. In some embodiments, assessing the identity of an amino acid comprises determining the type of amino acid that is present. In some embodiments, determining the type of amino acid comprises determining the actual amino acid identity, for example by determining which of the naturally-occurring 20 amino acids is the terminal amino acid (e.g., using a recognition molecule that is specific for an individual terminal amino acid). However, in some embodiments assessing the identity of a terminal amino acid type can comprise determining a subset of potential amino acids that can be present at the terminus of the polypeptide. In some embodiments, this can be accomplished by determining that an amino acid is not one or more specific amino acids (and therefore could be any of the other amino acids). In some embodiments, this can be accomplished by determining which of a specified subset of amino acids (e.g., based on size, charge, hydrophobicity, binding properties) could be at the terminus of the polypeptide (e.g., using a recognition molecule that binds to a specified subset of two or more terminal amino acids).
Amino acids of a polypeptide can be indirectly labeled, for example, using amino acid recognition molecules that selectively bind one or more types of amino acids on the polypeptide. Amino acids of a polypeptide can be directly labeled, for example, by selectively modifying one or more types of amino acid side chains on the polypeptide with uniquely identifiable labels. Methods of selective labeling of amino acid side chains and details relating to the preparation and analysis of labeled polypeptides are known in the art (see, e.g., Swaminathan, et al. PLoS Comput Biol. 2015, 11(2):e1004080). Accordingly, in some embodiments, the one or more types of amino acids are identified by detecting binding of one or more amino acid recognition molecules that selectively bind the one or more types of amino acids. In some embodiments, the one or more types of amino acids are identified by detecting labeled polypeptide.
In some embodiments, the relative position of labeled amino acids in a protein can be determined without removing amino acids from the protein but by translocating a labeled protein through a pore (e.g., a protein channel) and detecting a signal (e.g., a Förster resonance energy transfer (FRET) signal) from the labeled amino acid(s) during translocation through the pore in order to determine the relative position of the labeled amino acids in the protein molecule.
As used herein, sequencing a polypeptide refers to determining sequence information for a polypeptide. In some embodiments, this can involve determining the identity of each sequential amino acid for a portion (or all) of the polypeptide. However, in some embodiments, this can involve assessing the identity of a subset of amino acids within the polypeptide (e.g., and determining the relative position of one or more amino acid types without determining the identity of each amino acid in the polypeptide). Alternatively, in some embodiments, amino acid content information can be obtained from a polypeptide without directly determining the relative position of different types of amino acids in the polypeptide. The amino acid content alone may be used to infer the identity of the polypeptide that is present (e.g., by comparing the amino acid content to a database of polypeptide information and determining which polypeptide(s) have the same amino acid content).
In some embodiments, sequence information for a plurality of polypeptide products obtained from a longer polypeptide or protein (e.g., via enzymatic and/or chemical cleavage) can be analyzed to reconstruct or infer the sequence of the longer polypeptide or protein. Accordingly, some embodiments provide compositions and methods for sequencing a polypeptide by sequencing a plurality of fragments of the polypeptide. In some embodiments, sequencing a polypeptide comprises combining sequence information for a plurality of polypeptide fragments to identify and/or determine a sequence for the polypeptide. In some embodiments, combining sequence information may be performed by computer hardware and software. The methods described herein may allow for a set of related polypeptides, such as an entire proteome of an organism, to be sequenced. In some embodiments, a plurality of single molecule sequencing reactions may be performed in parallel (e.g., on a single chip). For example, in some embodiments, a plurality of single molecule sequencing reactions are each performed in separate sample wells on a single chip.
In some embodiments, methods provided herein may be used for the sequencing and identification of an individual protein in a sample comprising a complex mixture of proteins. Some embodiments provide methods of uniquely identifying an individual protein in a complex mixture of proteins. In some embodiments, an individual protein is detected in a mixed sample by determining a partial amino acid sequence of the protein. In some embodiments, the partial amino acid sequence of the protein is within a contiguous stretch of approximately 5 to 50 amino acids.
Without wishing to be bound by any particular theory, it is believed that most human proteins can be identified using incomplete sequence information with reference to proteomic databases. For example, simple modeling of the human proteome has shown that approximately 98% of proteins can be uniquely identified by detecting just four types of amino acids within a stretch of 6 to 40 amino acids (see, e.g., Swaminathan, et al. PLoS Comput Biol. 2015, 11(2):e1004080; and Yao, et al. Phys. Biol. 2015, 12(5):055003). Therefore, a complex mixture of proteins can be degraded (e.g., chemically degraded, enzymatically degraded) into short polypeptide fragments of approximately 6 to 40 amino acids, and sequencing of this polypeptide library would reveal the identity and abundance of each of the proteins present in the original complex mixture. Compositions and methods for selective amino acid labeling and identifying polypeptides by determining partial sequence information are described in detail in U.S. patent application Ser. No. 15/510,962, filed Sep. 15, 2015, titled “SINGLE MOLECULE PEPTIDE SEQUENCING,” which is hereby incorporated by reference in its entirety.
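As a non-limiting illustration of identification from partial sequence information, the following sketch reduces each sequence to the positions of a few detectable amino acid types and searches a reference set for the observed pattern. The protein names, sequences, and choice of labeled residues are hypothetical placeholders, not data from any proteomic database, and the function names are introduced here for illustration only.

```python
def project(sequence, labeled):
    """Replace every residue that is not one of the labeled types with 'X'."""
    return "".join(residue if residue in labeled else "X" for residue in sequence)


def matching_proteins(observed, database, labeled):
    """Return names of database entries whose projected sequence contains the pattern."""
    return [
        name
        for name, sequence in database.items()
        if observed in project(sequence, labeled)
    ]


if __name__ == "__main__":
    labeled = set("KCYW")              # hypothetical choice of four detectable types
    database = {                       # hypothetical toy reference sequences
        "protA": "MKCLYWAKDE",
        "protB": "MKALYCAKDW",
    }
    observed = "KCXYW"                 # partial read: labeled residues plus unknowns
    print(matching_proteins(observed, database, labeled))
```

A pattern that matches exactly one entry identifies that entry within the reference set, whereas a pattern that matches several entries would call for a longer read or additional labeled amino acid types.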
Sequencing in accordance with some embodiments can involve immobilizing a polypeptide on a surface of a substrate or solid support, such as a chip or integrated device. In some embodiments, a polypeptide can be immobilized on a surface of a sample well (e.g., on a bottom surface of a sample well) on a substrate. In some embodiments, a first terminus of a polypeptide is immobilized to a surface, and the other terminus is subjected to a sequencing reaction as described herein. For example, in some embodiments, a polypeptide is immobilized to a surface through a C-terminal end, and terminal amino acid recognition and degradation proceeds from an N-terminal end of the polypeptide toward the C-terminal end. In some embodiments, the N-terminal amino acid of the polypeptide is immobilized (e.g., attached to the surface). In some embodiments, the C-terminal amino acid of the polypeptide is immobilized (e.g., attached to the surface). In some embodiments, one or more non-terminal amino acids are immobilized (e.g., attached to the surface). The immobilized amino acid(s) can be attached using any suitable covalent or non-covalent linkage, for example as described herein. In some embodiments, a plurality of polypeptides are attached to a plurality of sample wells (e.g., with one polypeptide attached to a surface, for example a bottom surface, of each sample well), for example in an array of sample wells on a substrate.
Some aspects of the present disclosure provide a method of sequencing a polypeptide by detecting luminescence of a labeled polypeptide which is subjected to repeated cycles of terminal amino acid modification and cleavage. For example,
As shown in the example depicted in
In some embodiments, the method comprises repeating steps (1) through (2) for a plurality of cycles, during which luminescence of the labeled polypeptide is detected, and cleavage events corresponding to the removal of a labeled amino acid from the terminus may be detected as a decrease in detected signal. In some embodiments, no change in signal following step (2) as shown in
Some aspects of the present disclosure provide methods of polypeptide sequencing in real-time by evaluating binding interactions of terminal amino acids with labeled amino acid recognition molecules and a labeled cleaving reagent (e.g., a labeled exopeptidase).
Without wishing to be bound by theory, labeled amino acid recognition molecule 5-1410 selectively binds according to a binding affinity (KD) defined by an association rate of binding (kon) and a dissociation rate of binding (koff). The rate constants koff and kon are the critical determinants of pulse duration (e.g., the time corresponding to a detectable binding event) and interpulse duration (e.g., the time between detectable binding events), respectively. In some embodiments, these rates can be engineered to achieve pulse durations and pulse rates that give the best sequencing accuracy.
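By way of a non-limiting illustration, the relationship between the rate constants and the observed timing can be sketched under a simple one-step reversible binding model, which is an assumption made here for explanation. In that model, the mean pulse duration is the reciprocal of koff, and the mean interpulse duration is the reciprocal of kon multiplied by the concentration of the labeled recognition molecule, which is assumed to be in excess. The numerical values below are hypothetical.

```python
def dissociation_constant_molar(k_on, k_off):
    """K_D = k_off / k_on, with k_on in 1/(M*s) and k_off in 1/s."""
    return k_off / k_on


def mean_pulse_duration_s(k_off):
    """Mean bound-state dwell time (pulse duration) for first-order dissociation."""
    return 1.0 / k_off


def mean_interpulse_duration_s(k_on, concentration_molar):
    """Mean waiting time between binding events at a fixed free concentration."""
    return 1.0 / (k_on * concentration_molar)


if __name__ == "__main__":
    k_on = 1.0e6           # hypothetical association rate constant, 1/(M*s)
    k_off = 2.0            # hypothetical dissociation rate constant, 1/s
    concentration = 5e-7   # hypothetical recognition-molecule concentration, M

    print("K_D (M):", dissociation_constant_molar(k_on, k_off))
    print("mean pulse duration (s):", mean_pulse_duration_s(k_off))
    print("mean interpulse duration (s):", mean_interpulse_duration_s(k_on, concentration))
```

Under this model, lowering koff lengthens pulses without changing their spacing, while raising kon or the concentration shortens the interval between pulses, which is one way the two timing dimensions can be tuned separately.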
As shown in the inset panel, a sequencing reaction mixture further comprises a labeled cleaving reagent 5-1420 comprising a detectable label that is different than that of labeled amino acid recognition molecule 5-1410. In some embodiments, labeled cleaving reagent 5-1420 is present in the mixture at a concentration that is less than that of labeled amino acid recognition molecule 5-1410. In some embodiments, labeled cleaving reagent 5-1420 displays broad specificity such that it cleaves most or all types of terminal amino acids.
As illustrated by the progress of signal output 5-1400, in some embodiments, terminal amino acid cleavage by labeled cleaving reagent 5-1420 gives rise to a uniquely identifiable signal pulse, and these events occur with lower frequency than the binding pulses of a labeled amino acid recognition molecule 5-1410. In this way, amino acids of a polypeptide can be counted and/or identified in a real-time sequencing process. As further illustrated in signal output 5-1400, in some embodiments, a labeled amino acid recognition molecule 5-1410 is engineered to bind more than one type of amino acid with different binding properties corresponding to each type, which produces uniquely identifiable pulsing patterns. In some embodiments, a plurality of labeled amino acid recognition molecules may be used, each with a diagnostic pulsing pattern, including a characteristic wavelength, lifetime, intensity, pulse duration and/or interpulse duration, which may be used to identify a corresponding terminal amino acid.
VIII. Sequencing Methods
As described herein, the multi-dimensional discrimination techniques may be used in combination with one or more sequencing applications (e.g., peptide sequencing, nucleic acid sequencing).
Polypeptide Sequencing
In addition to methods of identifying a terminal amino acid of a polypeptide, the disclosure provides methods of sequencing polypeptides using labeled recognition molecules. In some embodiments, methods of sequencing may involve subjecting a polypeptide terminus to repeated cycles of terminal amino acid detection and terminal amino acid cleavage. For example, in some embodiments, the disclosure provides a method of determining an amino acid sequence of a polypeptide comprising contacting a polypeptide with one or more labeled recognition molecules described herein and subjecting the polypeptide to Edman degradation.
As described herein, in some aspects, the disclosure provides compositions and methods for polypeptide sequencing.
As shown in
In some embodiments, as shown in
In some embodiments, the method further comprises identifying the terminal amino acid of polypeptide 5100 by detecting labeled amino acid recognition molecule 5102 during an association event between labeled amino acid recognition molecule 5102 and the terminal amino acid of polypeptide 5100. In some embodiments, detecting comprises detecting a luminescence from labeled amino acid recognition molecule 5102. In some embodiments, the luminescence is uniquely associated with labeled amino acid recognition molecule 5102, and the luminescence is thereby associated with the type of amino acid to which labeled amino acid recognition molecule 5102 binds. As such, in some embodiments, the type of amino acid is identified by determining one or more luminescence properties of labeled amino acid recognition molecule 5102.
In some embodiments, polypeptide sequencing proceeds by (2) removing the terminal amino acid by contacting polypeptide 5100 with a cleaving reagent 5104 that binds and cleaves the terminal amino acid of polypeptide 5100. In some embodiments, cleaving reagent 5104 is a peptidase (e.g., an exopeptidase). Upon removal of the terminal amino acid by cleaving reagent 5104, polypeptide sequencing proceeds by (3) subjecting polypeptide 5100 (having n−1 amino acids) to additional cycles of terminal amino acid recognition and cleavage. In some embodiments, steps (1) through (3) occur in the same reaction mixture, e.g., as in a dynamic peptide sequencing reaction. In some embodiments, steps (1) through (3) may be carried out using other methods known in the art, such as peptide sequencing by Edman degradation.
In some embodiments, peptide sequencing can be carried out in a dynamic peptide sequencing reaction. In some embodiments, referring again to
In some embodiments, dynamic polypeptide sequencing is carried out in real-time by evaluating binding interactions of labeled amino acid recognition molecules with a terminus of a polypeptide while the polypeptide is being degraded by a cleaving reagent.
As further shown in the inset panel (left) of
In some embodiments, dynamic peptide sequencing is performed by observing different association events, e.g., association events between an amino acid recognition molecule and an amino acid at a terminal end of a peptide, wherein each association event produces a change in magnitude of a signal, e.g., a luminescence signal, that persists for a duration of time. In some embodiments, these association events are observed during a peptide degradation process. In some embodiments, a transition from one characteristic signal pattern to another is indicative of amino acid cleavage (e.g., amino acid cleavage resulting from peptide degradation). In some embodiments, amino acid cleavage refers to the removal of at least one amino acid from a terminus of a polypeptide (e.g., the removal of at least one terminal amino acid from the polypeptide). In some embodiments, amino acid cleavage is determined by inference based on a time duration between characteristic signal patterns. In some embodiments, amino acid cleavage is determined by detecting a change in signal produced by association of a labeled cleaving reagent with an amino acid at the terminus of the polypeptide. As amino acids are sequentially cleaved from the terminus of the polypeptide during degradation, a series of changes in magnitude, or a series of signal pulses, is detected.
Methods and compositions for performing dynamic sequencing are described more fully in PCT International Application No. PCT/US2019/061831, filed Nov. 15, 2019, and PCT International Application No. PCT/US2021/033493, filed May 20, 2021, each of which is incorporated herein by reference in its entirety.
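By way of non-limiting illustration, the series of signal pulses described above can be reduced to pulse durations and interpulse durations with a simple threshold. The following is a minimal sketch only: the function names, the fixed threshold, the frame time, and the example trace are hypothetical and are not taken from the disclosure, and a practical pulse caller would likely need noise handling that is omitted here.

```python
from typing import List, Tuple


def find_pulses(trace: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of runs where the signal exceeds
    the threshold. The end index is exclusive."""
    pulses = []
    start = None
    for i, value in enumerate(trace):
        if value > threshold and start is None:
            start = i                      # rising edge: a pulse begins
        elif value <= threshold and start is not None:
            pulses.append((start, i))      # falling edge: the pulse ends
            start = None
    if start is not None:                  # trace ended while still in a pulse
        pulses.append((start, len(trace)))
    return pulses


def pulse_statistics(pulses: List[Tuple[int, int]], frame_time_s: float):
    """Convert pulse indices into pulse durations and interpulse durations."""
    durations = [(end - start) * frame_time_s for start, end in pulses]
    gaps = [
        (pulses[i + 1][0] - pulses[i][1]) * frame_time_s
        for i in range(len(pulses) - 1)
    ]
    return durations, gaps


# Hypothetical intensity trace (arbitrary units), one value per frame.
trace = [0, 0, 5, 5, 5, 0, 0, 0, 0, 6, 6, 0]
pulses = find_pulses(trace, threshold=2.5)
durations, gaps = pulse_statistics(pulses, frame_time_s=0.01)
print(pulses)
```

In this sketch a change in the typical pulse duration or interpulse duration from one stretch of the trace to the next would be the kind of transition between characteristic signal patterns that is described above as indicative of amino acid cleavage.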
Nucleic Acid Sequencing
In accordance with embodiments described herein, nucleic acid sequencing methods can be carried out by illuminating a surface-immobilized polymerizing enzyme with excitation light, and detecting luminescence produced by a label attached to a nucleotide bound by the polymerizing enzyme. In some cases, radiative and/or non-radiative decay of the label can result in photodamage to the polymerizing enzyme and/or a surface linkage group attached thereto.
In accordance with embodiments described herein, single-molecule polypeptide sequencing methods can be carried out by illuminating a surface-immobilized polypeptide with excitation light, and detecting luminescence produced by a label attached to an amino acid recognition molecule. In some cases, radiative and/or non-radiative decay of the label can result in photodamage to the polypeptide.
Without wishing to be bound by theory, it is thought that a shielding element, positioned between a reagent (e.g., a nucleotide or an amino acid recognition molecule) and a label, can absorb, deflect, or otherwise block the effects or products of radiative and/or non-radiative decay produced by the label (i.e., to mitigate photodamage during a sequencing reaction). In some embodiments, the shielding element prevents, limits, or modulates the extent to which one or more labels (e.g., luminescent labels) interact with one or more reagents (e.g., one or more nucleotides or amino acid recognition molecules). In some embodiments, the shielding element prevents or limits the extent to which one or more labels interact with one or more surface-immobilized molecules associated with a reagent (e.g., a surface-immobilized polypeptide, such as a polymerizing enzyme, and/or a surface linkage group attached thereto). Accordingly, in some embodiments, the term shielding can generally refer to a protective or shielding effect that is provided by a covalent or non-covalent linkage group formed between a reagent and a label.
In some embodiments, a shielding element, which may generally be referred to as a shield herein, is attached to a reagent (e.g., a reagent component comprising a nucleotide or an amino acid recognition molecule) and to one or more labels (e.g., a label component). Each attachment between the shield and the reagent and the one or more labels may be a covalent or non-covalent attachment. In some embodiments, the reagent and label components are attached at non-adjacent sites on the shield. For example, a reagent can be attached to a first side of the shield, and one or more labels can be attached to a second side of the shield, where the first and second sides of the shield are distant from each other. In some embodiments, the attachment sites are on approximately opposite sides of the shield.
The distance between the site at which a shield is attached (e.g., covalently or non-covalently attached) to a reagent (e.g., a nucleotide or an amino acid recognition molecule) and the site at which the shield is attached (e.g., covalently or non-covalently attached) to a label can be a linear measurement through space or a non-linear measurement across the surface of the shield. The distance between the reagent and label attachment sites on a shield can be measured by modeling the three-dimensional structure of the shield. In some embodiments, this distance can be at least 2 nm, at least 4 nm, at least 6 nm, at least 8 nm, at least 10 nm, at least 12 nm, at least 15 nm, at least 20 nm, at least 30 nm, at least 40 nm, or more. Alternatively, the relative positions of the reagent and label on a shield can be described by treating the structure of the shield as a quadratic surface (e.g., ellipsoid, elliptic cylinder). In some embodiments, the reagent and label attachment sites are separated by a distance that is at least one eighth of the distance around an ellipsoidal shape representing the shield. In some embodiments, the reagent and label are separated by a distance that is at least one quarter of the distance around an ellipsoidal shape representing the shield. In some embodiments, the reagent and label are separated by a distance that is at least one third of the distance around an ellipsoidal shape representing the shield. In some embodiments, the reagent and label are separated by a distance that is one half of the distance around an ellipsoidal shape representing the shield.
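The two ways of measuring separation described above (a linear measurement through space and a measurement across the surface of the shield) can be compared numerically. The sketch below makes a simplifying assumption that is not part of the disclosure: the ellipsoidal shape is reduced to a sphere, so that a fraction f of the distance around the shape corresponds to an arc of a great circle. Under that assumption the surface distance is f·2πr and the straight-line distance is 2r·sin(πf). The function name and the example radius are hypothetical.

```python
import math


def surface_and_linear_distance(radius_nm: float, fraction_around: float):
    """For a spherical shield (an assumed simplification), return the
    distance across the surface and the straight-line distance through
    space between two attachment sites separated by the given fraction
    of the circumference."""
    if radius_nm <= 0:
        raise ValueError("radius must be positive")
    if not 0.0 <= fraction_around <= 0.5:
        raise ValueError("fraction must be between 0 and one half")
    arc = fraction_around * 2.0 * math.pi * radius_nm
    chord = 2.0 * radius_nm * math.sin(math.pi * fraction_around)
    return arc, chord


# Hypothetical shield modeled as a sphere of radius 5 nm.
for fraction in (1 / 8, 1 / 4, 1 / 3, 1 / 2):
    arc, chord = surface_and_linear_distance(5.0, fraction)
    print(f"fraction {fraction:.3f}: surface {arc:.2f} nm, linear {chord:.2f} nm")
```

For attachment sites on opposite sides of such a sphere (one half of the distance around), the linear distance equals the diameter, while the surface distance is longer by a factor of π/2.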
In embodiments relating to nucleic acid sequencing, the size of a shield should be such that a label is unable or unlikely to directly contact the polymerizing enzyme when the nucleotide is bound by the polymerizing enzyme. In embodiments relating to polypeptide sequencing, the size of a shield should be such that a label is unable or unlikely to directly contact the polypeptide when the amino acid recognition molecule is associated with the polypeptide. The size of a shield should also be such that an attached label is detectable when the nucleotide or amino acid recognition molecule is associated with the polymerizing enzyme or polypeptide, respectively. For example, the size should be such that an attached luminescent label is within an illumination volume to be excited.
It should be appreciated that there are a variety of parameters by which a practitioner could evaluate shielding effects. Generally, the effects of a shielding element can be evaluated by conducting a comparative assessment between a composition having the shielding element and a composition lacking the shielding element. For example, in embodiments relating to polypeptide sequencing, a shielding element can increase recognition time of an amino acid recognition molecule. In some embodiments, recognition time refers to the length of time in which association events between the recognition molecule and a polypeptide are observable in a polypeptide sequencing reaction as described herein. In some embodiments, recognition time is increased by about 10-25%, 25-50%, 50-75%, 75-100%, or more than 100%, for example by about 2-fold, 3-fold, 4-fold, 5-fold, or more, relative to a polypeptide sequencing reaction performed under the same conditions, with the exception that the amino acid recognition molecule lacks the shielding element but is otherwise similar or identical. In some embodiments, a shielding element can increase sequencing accuracy and/or sequence read length (e.g., by at least 5%, at least 10%, at least 15%, at least 25% or more, relative to a sequencing reaction performed under comparative conditions as described above).
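The comparative assessment described above can be expressed as a fold change and a percentage increase relative to the unshielded composition. The following sketch is illustrative only; the function name and the example recognition times are hypothetical and are not measured values.

```python
def relative_increase(shielded_value: float, unshielded_value: float):
    """Return (fold_change, percent_increase) of a shielded measurement
    relative to an otherwise identical unshielded measurement."""
    if unshielded_value <= 0:
        raise ValueError("the unshielded reference value must be positive")
    fold_change = shielded_value / unshielded_value
    percent_increase = (fold_change - 1.0) * 100.0
    return fold_change, percent_increase


# Hypothetical recognition times, in minutes, under the same conditions.
fold, percent = relative_increase(shielded_value=30.0, unshielded_value=20.0)
print(fold, percent)
```

A 2-fold change in this convention corresponds to a 100% increase, which is why the ranges above switch from percentages to fold values beyond that point.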
Accordingly, in some aspects, the disclosure provides shielded recognition molecules comprising at least one amino acid recognition molecule, at least one detectable label, and a shielding element that forms a covalent or non-covalent linkage group between the recognition molecule and label. In some embodiments, a shielding element is at least 2 nm, at least 5 nm, at least 10 nm, at least 12 nm, at least 15 nm, at least 20 nm, or more, in length (e.g., in an aqueous solution). In some embodiments, a shielding element is between about 2 nm and about 100 nm in length (e.g., between about 2 nm and about 50 nm, between about 10 nm and about 50 nm, between about 20 nm and about 100 nm).
In some embodiments, a shield (e.g., shielding element) forms a covalent or non-covalent linkage group between a reagent (e.g., a reagent component comprising a nucleotide or an amino acid recognition molecule) and one or more labels. As used herein, in some embodiments, covalent and non-covalent linkages or linkage groups refer to the nature of the attachments of the reagent and label components to the shield. In some embodiments, covalent and non-covalent linkages or linkage groups refer to the nature of the attachments of the chromophores within a label component (e.g., a FRET label) to the shield. In some embodiments, a covalent linkage, or a covalent linkage group, refers to a shield that is attached to each of the reagent and label components through a covalent bond or a series of contiguous covalent bonds. Covalent attachment to one or both components can be achieved by covalent conjugation methods known in the art. For example, in some embodiments, click chemistry techniques (e.g., copper-catalyzed, strain-promoted, copper-free click chemistry, etc.) can be used to attach one or both components to the shield. Such methods generally involve conjugating one reactive moiety to another reactive moiety to form one or more covalent bonds between the reactive moieties. Accordingly, in some embodiments, a first reactive moiety of a shield can be contacted with a second reactive moiety of a reagent or label component to form a covalent attachment. Examples of reactive moieties include, without limitation, reactive amines, azides, alkynes, nitrones, alkenes (e.g., cycloalkenes), tetrazines, tetrazoles, and other reactive moieties suitable for click reactions and similar coupling techniques.
In some embodiments, a non-covalent linkage, or a non-covalent linkage group, refers to a shield that is attached to one or both of the reagent and label components through one or more non-covalent coupling means, including but not limited to receptor-ligand interactions and oligonucleotide strand hybridization. Examples of receptor-ligand interactions are provided herein and include, without limitation, protein-protein complexes, protein-ligand complexes, protein-aptamer complexes, and aptamer-nucleic acid complexes. Various configurations and strategies for oligonucleotide strand hybridization are described herein and are known in the art (see, e.g., U.S. Patent Publication No. 2019/0024168).
In some embodiments, a shielding element comprises a polymer, such as a biomolecule or a dendritic polymer.
In some embodiments, a protein shield is a protein having a molecular weight of at least 10 kDa. For example, in some embodiments, a protein shield is a protein having a molecular weight of at least 10 kDa and up to 500 kDa (e.g., between about 10 kDa and about 250 kDa, between about 10 kDa and about 150 kDa, between about 10 kDa and about 100 kDa, between about 20 kDa and about 80 kDa, between about 15 kDa and about 100 kDa, or between about 15 kDa and about 50 kDa). In some embodiments, a protein shield is a protein comprising at least 25 amino acids. For example, in some embodiments, a protein shield is a protein comprising at least 25 and up to 1,000 amino acids (e.g., between about 100 and about 1,000 amino acids, between about 100 and about 750 amino acids, between about 500 and about 1,000 amino acids, between about 250 and about 750 amino acids, between about 50 and about 500 amino acids, between about 100 and about 400 amino acids, or between about 50 and about 250 amino acids).
In some embodiments, a protein shield is a polypeptide comprising one or more tag proteins. In some embodiments, a protein shield is a polypeptide comprising at least two tag proteins. In some embodiments, the at least two tag proteins are the same (e.g., the polypeptide comprises at least two copies of a tag protein sequence). In some embodiments, the at least two tag proteins are different (e.g., the polypeptide comprises at least two different tag protein sequences). Examples of tag proteins include, without limitation, Fasciola hepatica 8-kDa antigen (Fh8), Maltose-binding protein (MBP), N-utilization substance (NusA), Thioredoxin (Trx), Small ubiquitin-like modifier (SUMO), Glutathione-S-transferase (GST), Solubility-enhancer peptide sequences (SET), IgG domain B1 of Protein G (GB1), IgG repeat domain ZZ of Protein A (ZZ), Mutated dehalogenase (HaloTag), Solubility eNhancing Ubiquitous Tag (SNUT), Seventeen kilodalton protein (Skp), Phage T7 protein kinase (T7PK), E. coli secreted protein A (EspA), Monomeric bacteriophage T7 0.3 protein (Orc protein; Mocr), E. coli trypsin inhibitor (Ecotin), Calcium-binding protein (CaBP), Stress-responsive arsenate reductase (ArsC), N-terminal fragment of translation initiation factor IF2 (IF2-domain I), Stress-responsive proteins (e.g., RpoA, SlyD, Tsf, RpoS, PotD, Crr), and E. coli acidic proteins (e.g., msyB, yjgD, rpoD). See, e.g., Costa, S., et al. “Fusion tags for protein solubility, purification and immunogenicity in Escherichia coli: the novel Fh8 system.” Front Microbiol. 2014 Feb. 19; 5:63, the relevant content of which is incorporated herein by reference.
In some embodiments, protein shield 6-330 forms a non-covalent linkage group between reagent 6-331 and a label. For example, in some embodiments, protein shield 6-330 is a monomeric or multimeric protein comprising one or more ligand-binding sites. In some embodiments, a non-covalent linkage group is formed through one or more ligand moieties bound to the one or more ligand-binding sites. Additional examples of non-covalent linkages formed by protein shields are described elsewhere herein.
A second shielded construct 6-306 shows an example of a double-stranded nucleic acid shield comprising a first oligonucleotide strand 6-332 hybridized with a second oligonucleotide strand 6-334. As shown, in some embodiments, the double-stranded nucleic acid shield can comprise a reagent attached to first oligonucleotide strand 6-332, and a label attached to second oligonucleotide strand 6-334. In this way, the double-stranded nucleic acid shield forms a non-covalent linkage group between the reagent and the label through oligonucleotide strand hybridization. In some embodiments, a reagent and a label can be attached to the same oligonucleotide strand, which can provide a single-stranded nucleic acid shield or a double-stranded nucleic acid shield through hybridization with another oligonucleotide strand. In some embodiments, strand hybridization can provide increased rigidity within a linkage group to further enhance separation between the reagent and the label. Although a particular example of a double-stranded nucleic acid is shown in
Where a shielding element comprises a nucleic acid, the separation distance between a label and a reagent can be measured by the distance between attachment sites on the nucleic acid (e.g., direct attachment or indirect attachment, such as through one or more additional shield polymers). In some embodiments, the distance between attachment sites on a nucleic acid can be measured by the number of nucleotides within the nucleic acid that occur between the label and the reagent. It should be understood that the number of nucleotides can refer to either the number of nucleotide bases in a single-stranded nucleic acid or the number of nucleotide base pairs in a double-stranded nucleic acid.
Accordingly, in some embodiments, the attachment site of a reagent and the attachment site of a label can be separated by between 5 and 200 nucleotides (e.g., between 5 and 150 nucleotides, between 5 and 100 nucleotides, between 5 and 50 nucleotides, between 10 and 100 nucleotides). It should be appreciated that any position in a nucleic acid can serve as an attachment site for a reagent, a label, or one or more additional polymer shields. In some embodiments, an attachment site can be at or approximately at the 5′ or 3′ end, or at an internal position along a strand of the nucleic acid.
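A separation expressed as a number of nucleotides can be converted to an approximate physical distance. The sketch below assumes a rigid B-form double-stranded nucleic acid with a rise of about 0.34 nm per base pair; that value is a commonly cited approximation and is not stated in the disclosure. It does not apply to single-stranded nucleic acids, which are flexible. The function and constant names are hypothetical.

```python
# Assumed axial rise per base pair for B-form duplex nucleic acid, in nm.
RISE_PER_BASE_PAIR_NM = 0.34


def duplex_separation_nm(n_base_pairs: int) -> float:
    """Approximate end-to-end separation of two attachment sites that are
    n_base_pairs apart on a straight B-form duplex."""
    if n_base_pairs < 0:
        raise ValueError("number of base pairs cannot be negative")
    return n_base_pairs * RISE_PER_BASE_PAIR_NM


for n in (5, 50, 100, 200):
    print(f"{n} base pairs: about {duplex_separation_nm(n):.1f} nm")
```

Under this assumption, the range of 5 to 200 nucleotides corresponds to roughly 1.7 nm to 68 nm along the helix axis.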
The non-limiting configuration of second shielded construct 6-306 illustrates an example of a shield that forms a non-covalent linkage through strand hybridization. A further example of non-covalent linkage is illustrated by a third shielded construct 6-308 comprising an oligonucleotide shield 6-336. In some embodiments, oligonucleotide shield 6-336 is a nucleic acid aptamer that binds a reagent to form a non-covalent linkage. In some embodiments, the reagent is a nucleic acid aptamer, and oligonucleotide shield 6-336 comprises an oligonucleotide strand that hybridizes with the aptamer to form a non-covalent linkage.
A fourth shielded construct 6-310 shows an example of a dendritic polymer shield 6-338. As used herein, in some embodiments, a dendritic polymer refers generally to a polyol or a dendrimer. Polyols and dendrimers have been described in the art, and may include branched dendritic structures optimized for a particular configuration. In some embodiments, dendritic polymer shield 6-338 comprises polyethylene glycol, tetraethylene glycol, poly(amidoamine), poly(propyleneimine), poly(propyleneamine), carbosilane, poly(L-lysine), or a combination of one or more thereof.
A dendrimer, or dendron, is a repetitively branched molecule that is typically symmetric around the core and that may adopt a spherical three-dimensional morphology. See, e.g., Astruc et al. (2010) Chem. Rev. 110:1857. Incorporation of such structures into a shield of the disclosure can provide for a protective effect through the steric inhibition of contacts between a label and one or more biomolecules associated therewith. Refinement of the chemical and physical properties of the dendrimer through variation in primary structure of the molecule, including potential functionalization of the dendrimer surface, allows the shielding effects to be adjusted as desired. Dendrimers may be synthesized by a variety of techniques using a wide range of materials and branching reactions, as is known in the art. Such synthetic variation allows the properties of the dendrimer to be customized as necessary.
In some embodiments, a shielded reagent of the disclosure is an avidin-nucleic acid construct 6-314. In some embodiments, avidin-nucleic acid construct 6-314 includes a shield comprising an avidin protein 6-340 and a double-stranded nucleic acid. As described herein, avidin protein 6-340 may be used to form a non-covalent linkage between a reagent and one or more labels, either directly or indirectly, such as through one or more additional shield polymers described herein.
Avidin proteins are biotin-binding proteins, generally having a biotin binding site at each of four subunits of the avidin protein. Avidin proteins include, for example, avidin, streptavidin, traptavidin, tamavidin, bradavidin, xenavidin, and homologs and variants thereof. In some cases, the monomeric, dimeric, or tetrameric form of the avidin protein can be used. In some embodiments, the avidin protein of an avidin protein complex is streptavidin in a tetrameric form (e.g., a homotetramer). In some embodiments, the biotin binding sites of an avidin protein provide attachment sites for a reagent, one or more labels, and/or one or more additional shield polymers described herein.
An illustrative diagram of an avidin protein complex is shown in the inset panel of
Various further examples of avidin protein shield configurations are shown. A first avidin construct 6-316 shows an example of an avidin shield attached to a reagent through a bis-biotin linkage moiety and to two labels through separate biotin linkage moieties. A second avidin construct 6-318 shows an example of an avidin shield attached to two reagents through separate biotin linkage moieties and to a label through a bis-biotin linkage moiety. A third avidin construct 6-320 shows an example of an avidin shield attached to two reagents through separate biotin linkage moieties and to a labeled nucleic acid through a biotin linkage moiety of each strand of the nucleic acid. A fourth avidin construct 6-322 shows an example of an avidin shield attached to a reagent and to a labeled nucleic acid through separate bis-biotin linkage moieties. As shown, the label is further shielded from the reagent by a dendritic polymer between the label and nucleic acid. A fifth avidin construct 6-324 shows an example of an internal label 6-326 attached to two avidin-shielded reagents. As shown, each reagent is attached to a different avidin protein through a bis-biotin linkage moiety, and internal label 6-326 is attached to both avidin proteins through separate bis-biotin linkage moieties.
It should be appreciated that the example configurations of shielded reagents shown in
As shown at the top of
It should be appreciated that shielded reagents of the disclosure can comprise shielding element 6-352 attached to one or more reagents (e.g., one or more nucleotides or one or more amino acid recognition molecules) and one or more labels. Where reagent component 6-350 comprises more than one recognition molecule, each recognition molecule can be attached to shielding element 6-352 at one or more attachment sites on shielding element 6-352. In some embodiments, reagent component 6-350 comprises a single polypeptide fusion construct having tandem copies of two or more amino acid binding proteins. Where label component 6-354 comprises more than one label, each label can be attached to shielding element 6-352 at one or more attachment sites on shielding element 6-352. While label component 6-354 is generically shown as having a single attachment point, it is not limited in this respect. For example, in some embodiments, an internal label having more than one attachment point can be used to join more than one reagent component 6-350 and/or shielding element 6-352, as illustrated by avidin construct 6-324.
In some embodiments, shielding element 6-352 comprises a protein 6-360. In some embodiments, protein 6-360 is a monovalent or multivalent protein. In some embodiments, protein 6-360 is a monomeric or multimeric protein, such as a protein homodimer, protein heterodimer, protein oligomer, or other proteinaceous molecule. In some embodiments, shielding element 6-352 comprises a protein complex formed by a protein non-covalently bound to at least one other molecule. For example, in some embodiments, shielding element 6-352 comprises a protein-protein complex 6-362. In some embodiments, protein-protein complex 6-362 comprises one proteinaceous molecule specifically bound to another proteinaceous molecule. In some embodiments, protein-protein complex 6-362 comprises an antibody or antibody fragment (e.g., scFv) bound to an antigen. In some embodiments, protein-protein complex 6-362 comprises a receptor bound to a protein ligand. Additional examples of protein-protein complexes include, without limitation, trypsin-aprotinin, barnase-barstar, and colicin E9-Im9 immunity protein.
In some embodiments, shielding element 6-352 comprises a protein-ligand complex 6-364. In some embodiments, protein-ligand complex 6-364 comprises a monovalent protein and a non-proteinaceous ligand moiety. For example, in some embodiments, protein-ligand complex 6-364 comprises an enzyme bound to a small-molecule inhibitor moiety. In some embodiments, protein-ligand complex 6-364 comprises a receptor bound to a non-proteinaceous ligand moiety.
In some embodiments, shielding element 6-352 comprises a multivalent protein complex formed by a multivalent protein non-covalently bound to one or more ligand moieties. In some embodiments, shielding element 6-352 comprises an avidin protein complex formed by an avidin protein non-covalently bound to one or more biotin linkage moieties. Constructs 6-366, 6-368, 6-370, and 6-372 provide illustrative examples of avidin protein complexes, any one or more of which may be incorporated into shielding element 6-352.
In some embodiments, shielding element 6-352 comprises a two-way avidin complex 6-366 comprising an avidin protein bound to two bis-biotin linkage moieties. In some embodiments, shielding element 6-352 comprises a three-way avidin complex 6-368 comprising an avidin protein bound to two biotin linkage moieties and a bis-biotin linkage moiety. In some embodiments, shielding element 6-352 comprises a four-way avidin complex 6-370 comprising an avidin protein bound to four biotin linkage moieties.
In some embodiments, shielding element 6-352 comprises an avidin protein comprising one or two non-functional binding sites engineered into the avidin protein. For example, in some embodiments, shielding element 6-352 comprises a divalent avidin complex 6-372 comprising an avidin protein bound to a biotin linkage moiety at each of two subunits, where the avidin protein comprises a non-functional ligand-binding site 6-348 at each of two other subunits. As shown, in some embodiments, divalent avidin complex 6-372 comprises a trans-divalent avidin protein, although a cis-divalent avidin protein may be used depending on a desired implementation. In some embodiments, the avidin protein is a trivalent avidin protein. In some embodiments, the trivalent avidin protein comprises non-functional ligand-binding site 6-348 at one subunit and is bound to three biotin linkage moieties, or one biotin linkage moiety and one bis-biotin linkage moiety, at the other subunits.
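The avidin complexes described above can be checked by counting occupied binding sites. The sketch below assumes that a biotin linkage moiety occupies one binding site and that a bis-biotin linkage moiety occupies two binding sites on the same avidin protein; this reading is consistent with the two-way, three-way, and four-way complexes described above but is stated here as an assumption. The names used are hypothetical.

```python
# Assumed number of binding sites occupied by each kind of linkage moiety.
SITES_PER_LINKAGE = {"biotin": 1, "bis-biotin": 2}


def sites_occupied(linkages):
    """Total number of avidin binding sites occupied by the given linkages."""
    return sum(SITES_PER_LINKAGE[kind] for kind in linkages)


def fits_avidin(linkages, functional_sites=4):
    """True if the linkages fit within the functional binding sites of one
    avidin protein (four by default, fewer for engineered variants)."""
    return sites_occupied(linkages) <= functional_sites


two_way = ["bis-biotin", "bis-biotin"]
three_way = ["biotin", "biotin", "bis-biotin"]
four_way = ["biotin"] * 4
trivalent = ["biotin", "bis-biotin"]

print(sites_occupied(two_way), sites_occupied(three_way), sites_occupied(four_way))
print(fits_avidin(trivalent, functional_sites=3))
```

With two non-functional sites, as in the divalent avidin complex, the same check would accept two biotin linkage moieties and reject any combination that needs more than two sites.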
In some embodiments, shielding element 6-352 comprises a dendritic polymer 6-374. In some embodiments, dendritic polymer 6-374 is a polyol or a dendrimer, as described elsewhere herein. In some embodiments, dendritic polymer 6-374 is a branched polyol or a branched dendrimer. In some embodiments, dendritic polymer 6-374 comprises a monosaccharide-TEG, a disaccharide, an N-acetyl monosaccharide, a TEMPO-TEG, a trolox-TEG, or a glycerol dendrimer. Examples of polyols useful in accordance with shielded reagents of the disclosure include polyether polyols and polyester polyols, e.g., polyethylene glycol, polypropylene glycol, and similar such polymers well known in the art. In some embodiments, dendritic polymer 6-374 comprises a compound of the following formula: —(CH2CH2O)n—, where n is an integer from 1 to 500, inclusive. In some embodiments, dendritic polymer 6-374 comprises a compound of the following formula: —(CH2CH2O)n—, wherein n is an integer from 1 to 100, inclusive.
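To give a sense of scale for the repeating unit (CH2CH2O)n, the mass contributed by the repeat units can be estimated from standard atomic masses. The sketch below is an illustration only: it ignores end groups and any attached reagent or label, and the function name is hypothetical.

```python
# Standard atomic masses in g/mol (rounded).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# One ethylene oxide repeat unit, CH2CH2O, contains C2 H4 O1.
REPEAT_UNIT = {"C": 2, "H": 4, "O": 1}


def repeat_unit_mass() -> float:
    """Mass of a single CH2CH2O repeat unit in g/mol."""
    return sum(ATOMIC_MASS[el] * count for el, count in REPEAT_UNIT.items())


def chain_mass(n: int) -> float:
    """Mass of n repeat units in g/mol, with end groups ignored."""
    if not 1 <= n <= 500:
        raise ValueError("n is expected to be an integer from 1 to 500")
    return n * repeat_unit_mass()


for n in (1, 100, 500):
    print(f"n = {n}: about {chain_mass(n):.1f} g/mol")
```

Each repeat unit contributes about 44 g/mol, so the upper bounds of n = 100 and n = 500 correspond to roughly 4.4 kDa and 22 kDa of repeat units, respectively.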
In some embodiments, shielding element 6-352 comprises a nucleic acid. In some embodiments, the nucleic acid is single-stranded. In some embodiments, label component 6-354 is attached directly or indirectly to one end of the single-stranded nucleic acid (e.g., the 5′ end or the 3′ end) and reagent component 6-350 is attached directly or indirectly to the other end of the single-stranded nucleic acid (e.g., the 3′ end or the 5′ end). For example, the single-stranded nucleic acid can comprise a label attached to the 5′ end of the nucleic acid and a reagent attached to the 3′ end of the nucleic acid.
In some embodiments, shielding element 6-352 comprises a double-stranded nucleic acid 6-376. As shown, in some embodiments, double-stranded nucleic acid 6-376 can form a non-covalent linkage between reagent component 6-350 and label component 6-354 through strand hybridization. However, in some embodiments, double-stranded nucleic acid 6-376 can form a covalent linkage between reagent component 6-350 and label component 6-354 through attachment to the same oligonucleotide strand. In some embodiments, label component 6-354 is attached directly or indirectly to one end of the double-stranded nucleic acid and reagent component 6-350 is attached directly or indirectly to the other end of the double-stranded nucleic acid. For example, the double-stranded nucleic acid can comprise a label attached to the 5′ end of one strand and an amino acid recognition molecule attached to the 5′ end of the other strand.
In some embodiments, shielding element 6-352 comprises a nucleic acid that forms one or more structural motifs which can be useful for increasing steric bulk of the shield. Examples of nucleic acid structural motifs include, without limitation, stem-loops, three-way junctions (e.g., formed by two or more stem-loop motifs), four-way junctions (e.g., Holliday junctions), and bulge loops.
In some embodiments, shielding element 6-352 comprises a nucleic acid that forms a stem-loop 6-378. A stem-loop, or hairpin loop, is an unpaired loop of nucleotides on an oligonucleotide strand that is formed when the oligonucleotide strand folds and forms base pairs with another section of the same strand. In some embodiments, the unpaired loop of stem-loop 6-378 comprises three to ten nucleotides. Accordingly, stem-loop 6-378 can be formed by two regions of an oligonucleotide strand having inverted complementary sequences that hybridize to form a stem, where the two regions are separated by the three to ten nucleotides that form the unpaired loop. In some embodiments, the stem of stem-loop 6-378 can be designed to have one or more G/C nucleotides, which can provide added stability with the additional hydrogen bonding interaction that forms compared to A/T nucleotides. In some embodiments, the stem of stem-loop 6-378 comprises G/C nucleotides immediately adjacent to an unpaired loop sequence. In some embodiments, the stem of stem-loop 6-378 comprises G/C nucleotides within the first 2, 3, 4, or 5 nucleotides adjacent to an unpaired loop sequence. In some embodiments, an unpaired loop of stem-loop 6-378 comprises one or more attachment sites. In some embodiments, an attachment site occurs at an abasic site in the unpaired loop. In some embodiments, an attachment site occurs at a base of the unpaired loop.
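The relationship between the stem regions and the unpaired loop can be illustrated by constructing a stem-loop sequence from one stem region and a loop. The following is a minimal sketch using DNA bases; the function names and the example sequences are hypothetical and are not sequences from the disclosure. The sketch does not predict folding and does not check whether the loop itself could form base pairs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_stem_loop(stem_5p: str, loop: str) -> str:
    """Join a 5' stem region, an unpaired loop of three to ten nucleotides,
    and the inverted complementary 3' stem region."""
    if not 3 <= len(loop) <= 10:
        raise ValueError("loop must contain three to ten nucleotides")
    return stem_5p + loop + reverse_complement(stem_5p)


def gc_adjacent_to_loop(stem_5p: str, window: int = 1) -> bool:
    """True if the stem positions within `window` nucleotides of the loop
    are all G or C (checked on the 5' stem region; the paired positions on
    the 3' stem region are then G or C as well)."""
    return all(base in "GC" for base in stem_5p[-window:])


stem = "ATGCGC"    # hypothetical 5' stem region
loop = "TTTT"      # hypothetical four-nucleotide unpaired loop
hairpin = build_stem_loop(stem, loop)
print(hairpin)
print(gc_adjacent_to_loop(stem, window=2))
```

Because the 3' stem region is generated as the reverse complement of the 5' stem region, the two regions are inverted complementary sequences by construction.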
In some embodiments, stem-loop 6-378 is formed by a double-stranded nucleic acid. As described herein, in some embodiments, the double-stranded nucleic acid can form a non-covalent linkage group through strand hybridization of first and second oligonucleotide strands. However, in some embodiments, shielding element 6-352 comprises a single-stranded nucleic acid that forms a stem-loop motif, e.g., to provide a covalent linkage group. In some embodiments, shielding element 6-352 comprises a nucleic acid that forms two or more stem-loop motifs. For example, in some embodiments, the nucleic acid comprises two stem-loop motifs. In some embodiments, a stem of one stem-loop motif is adjacent to the stem of the other such that the motifs together form a three-way junction. In some embodiments, shielding element 6-352 comprises a nucleic acid that forms a four-way junction 6-380. In some embodiments, four-way junction 6-380 is formed through hybridization of two or more oligonucleotide strands (e.g., 2, 3, or 4 oligonucleotide strands).
In some embodiments, shielding element 6-352 comprises one or more polymers selected from 6-360, 6-362, 6-364, 6-366, 6-368, 6-370, 6-372, 6-374, 6-376, 6-378, and 6-380 of
In some aspects, the disclosure provides a labeled reagent of Formula (II):
A-(Y)n-D (II),
wherein: A is a reagent component comprising at least one reagent; each instance of Y is a polymer that forms a covalent or non-covalent linkage group; n is an integer from 1 to 10, inclusive; and D is a label component comprising at least one detectable label. In some embodiments, the at least one reagent is at least one nucleotide. In some embodiments, the at least one reagent is at least one amino acid recognition molecule. In some embodiments, the disclosure provides a composition comprising a soluble labeled reagent of Formula (II).
In some embodiments, A comprises a plurality of reagents, such as a plurality of nucleotides or a plurality of amino acid recognition molecules. In some embodiments, each reagent of the plurality is attached to a different attachment site on Y. In some embodiments, at least two reagents of the plurality are attached to a single attachment site on Y.
In some embodiments, the detectable label is a luminescent label or a conductivity label. In some embodiments, the luminescent label comprises at least one fluorophore dye molecule. In some embodiments, D comprises 20 or fewer fluorophore dye molecules. In some embodiments, the ratio of the number of fluorophore dye molecules to the number of reagents (e.g., nucleotides or amino acid recognition molecules) is between 1:1 and 20:1. In some embodiments, the luminescent label comprises at least one FRET pair comprising a donor label and an acceptor label. In some embodiments, the ratio of the donor label to the acceptor label is 1:1, 2:1, 3:1, 4:1, or 5:1. In some embodiments, the ratio of the acceptor label to the donor label is 1:1, 2:1, 3:1, 4:1, or 5:1.
In some embodiments, D is less than 200 Å in diameter. In some embodiments, —(Y)n— is at least 2 nm in length. In some embodiments, —(Y)n— is at least 5 nm in length. In some embodiments, —(Y)n— is at least 10 nm in length. In some embodiments, each instance of Y is independently a biomolecule, a polyol, or a dendrimer. In some embodiments, the biomolecule is a nucleic acid, a polypeptide, or a polysaccharide.
In some embodiments, the labeled reagent is of one of the following formulae: A-Y1—(Y)m-D or A-(Y)m—Y1-D, wherein: Y1 is a nucleic acid or a polypeptide; and m is an integer from 0 to 10, inclusive. In some embodiments, the nucleic acid comprises a first oligonucleotide strand. In some embodiments, the nucleic acid comprises a second oligonucleotide strand hybridized with the first oligonucleotide strand. In some embodiments, the nucleic acid forms a covalent linkage through the first oligonucleotide strand. In some embodiments, the nucleic acid forms a non-covalent linkage through the hybridized first and second oligonucleotide strands. In some embodiments, the polypeptide is a monovalent or multivalent protein. In some embodiments, the monovalent or multivalent protein forms at least one non-covalent linkage through a ligand moiety attached to a ligand-binding site of the monovalent or multivalent protein. In some embodiments, A, Y, or D comprises the ligand moiety.
In some embodiments, the labeled reagent is of one of the following formulae: A-(Y)m—Y2-D or A-Y2—(Y)m-D, wherein: Y2 is a polyol or dendrimer; and m is an integer from 0 to 10, inclusive. In some embodiments, the polyol or dendrimer comprises polyethylene glycol, tetraethylene glycol, poly(amidoamine), poly(propyleneimine), poly(propyleneamine), carbosilane, poly(L-lysine), or a combination of one or more thereof.
In some aspects, the disclosure provides a labeled reagent of Formula (III):
A-Y1-D (III),
wherein: A is a reagent component comprising at least one reagent; Y1 is a nucleic acid or a polypeptide; and D is a label component comprising at least one detectable label. In some embodiments, when Y1 is a nucleic acid, the nucleic acid forms a covalent or non-covalent linkage group. In some embodiments, when Y1 is a polypeptide, the polypeptide forms a non-covalent linkage group characterized by a dissociation constant (KD) of less than 50×10⁻⁹ M. In some embodiments, the at least one reagent is at least one nucleotide. In some embodiments, the at least one reagent is at least one amino acid recognition molecule.
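To give a sense of scale for the dissociation constant threshold above, the following sketch evaluates the standard single-site binding isotherm, fraction bound = [L] / ([L] + KD). It assumes a simple 1:1 equilibrium with the free ligand concentration known; the function name and the example concentrations are hypothetical and do not come from the disclosure.

```python
# Illustrative sketch only: assumes simple 1:1 equilibrium binding.
KD_THRESHOLD_NM = 50.0  # 50 x 10^-9 M expressed in nM


def fraction_bound(free_ligand_nM, kd_nM):
    """Fraction of binding sites occupied at equilibrium for 1:1 binding."""
    if free_ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentrations must be non-negative and KD positive")
    return free_ligand_nM / (free_ligand_nM + kd_nM)


# Hypothetical free ligand concentrations, evaluated at the 50 nM threshold.
half_occupancy = fraction_bound(50.0, KD_THRESHOLD_NM)
high_occupancy = fraction_bound(450.0, KD_THRESHOLD_NM)
```

Under these assumptions, occupancy is one half when the free ligand concentration equals KD, and a smaller KD gives higher occupancy at any fixed ligand concentration.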
In some embodiments, Y1 is a nucleic acid comprising a first oligonucleotide strand. In some embodiments, the nucleic acid comprises a second oligonucleotide strand hybridized with the first oligonucleotide strand. In some embodiments, A is attached to the first oligonucleotide strand, and D is attached to the second oligonucleotide strand. In some embodiments, A is attached to a first attachment site on the first oligonucleotide strand, and D is attached to a second attachment site on the first oligonucleotide strand. In some embodiments, each oligonucleotide strand of the nucleic acid comprises fewer than 150, fewer than 100, or fewer than 50 nucleotides.
In some embodiments, Y1 is a monovalent or multivalent protein. In some embodiments, the monovalent or multivalent protein forms at least one non-covalent linkage through a ligand moiety attached to a ligand-binding site of the monovalent or multivalent protein. In some embodiments, at least one of A and D comprises the ligand moiety. In some embodiments, the polypeptide is an avidin protein (e.g., avidin, streptavidin, traptavidin, tamavidin, bradavidin, xenavidin, or a homolog or variant thereof). In some embodiments, the ligand moiety is a biotin moiety.
In some embodiments, the labeled reagent is of one of the following formulae: A-Y1—(Y)n-D or A-(Y)n—Y1-D, wherein: each instance of Y is a polymer that forms a covalent or non-covalent linkage group; and n is an integer from 1 to 10, inclusive. In some embodiments, each instance of Y is independently a biomolecule, a polyol, or a dendrimer.
In other aspects, the disclosure provides a labeled reagent comprising: a nucleic acid; at least one reagent attached to a first attachment site on the nucleic acid; and at least one detectable label attached to a second attachment site on the nucleic acid. In some embodiments, the nucleic acid forms a covalent or non-covalent linkage group between the at least one reagent and the at least one detectable label. In some embodiments, the at least one reagent is at least one nucleotide. In some embodiments, the at least one reagent is at least one amino acid recognition molecule.
In some embodiments, the nucleic acid is a double-stranded nucleic acid comprising a first oligonucleotide strand hybridized with a second oligonucleotide strand. In some embodiments, the first attachment site is on the first oligonucleotide strand, and the second attachment site is on the second oligonucleotide strand. In some embodiments, the at least one reagent is attached to the first attachment site through a protein that forms a covalent or non-covalent linkage group between the at least one reagent and the nucleic acid. In some embodiments, the at least one detectable label is attached to the second attachment site through a protein that forms a covalent or non-covalent linkage group between the at least one detectable label and the nucleic acid. In some embodiments, the first and second attachment sites are separated by between 5 and 100 nucleotide bases or nucleotide base pairs on the nucleic acid.
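The separation between attachment sites, stated above in nucleotide base pairs, can be related to an approximate physical distance. The sketch below assumes an idealized, straight B-form duplex with a rise of about 0.34 nm per base pair; that value, the function names, and the example inputs are assumptions for illustration and are not stated in the disclosure.

```python
import math

# Assumption: idealized B-form duplex, about 0.34 nm rise per base pair.
RISE_NM_PER_BP = 0.34


def duplex_length_nm(base_pairs):
    """Approximate end-to-end length of a straight duplex segment."""
    return base_pairs * RISE_NM_PER_BP


def min_base_pairs_for_length(length_nm):
    """Smallest whole number of base pairs spanning at least length_nm."""
    return math.ceil(length_nm / RISE_NM_PER_BP)


shortest_nm = duplex_length_nm(5)     # lower end of the 5 to 100 range
longest_nm = duplex_length_nm(100)    # upper end of the 5 to 100 range
bp_for_10_nm = min_base_pairs_for_length(10.0)
```

Under this approximation, a separation of 5 to 100 base pairs corresponds to roughly 1.7 nm to 34 nm, and a 10 nm span requires about 30 base pairs.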
In yet other aspects, the disclosure provides a labeled reagent comprising: a multivalent protein comprising at least two ligand-binding sites; at least one reagent attached to the protein through a first ligand moiety bound to a first ligand-binding site on the protein; and at least one detectable label attached to the protein through a second ligand moiety bound to a second ligand-binding site on the protein. In some embodiments, the at least one reagent is at least one nucleotide. In some embodiments, the at least one reagent is at least one amino acid recognition molecule.
In some embodiments, the multivalent protein is an avidin protein comprising four ligand-binding sites. In some embodiments, the ligand-binding sites are biotin binding sites, and the ligand moieties are biotin moieties. In some embodiments, at least one of the biotin moieties is a bis-biotin moiety, and the bis-biotin moiety is bound to two biotin binding sites on the avidin protein. In some embodiments, the at least one reagent is attached to the protein through a nucleic acid comprising the first ligand moiety. In some embodiments, the at least one detectable label is attached to the protein through a nucleic acid comprising the second ligand moiety.
In some aspects, the disclosure provides labeled reagents having a configuration as generally depicted in
In the example structure of the nucleotide scaffold shown in
In some aspects, the disclosure provides labeled reagents comprising a shielding element that protects a target molecule (e.g., a polypeptide sample, a polymerizing enzyme) from label-induced photodamage. In some embodiments, a labeled reagent comprises a structure of Formula (IVa):
Z-[S′-B′]m (IVa),
wherein: Z is a multivalent central core element comprising a luminescent label; each S′ is independently an intermediate chemical group, wherein at least one S′ comprises a shielding element; each B′ is independently a terminal chemical group, wherein at least one B′ comprises a binding element that binds a target molecule; and m is an integer from 2 to 24, inclusive.
In some embodiments, Z comprises a multivalent fluorescent dye element. In some embodiments, Z comprises a multivalent cyanine dye. In some embodiments, Z comprises a luminescent label other than a fluorescent dye. In some embodiments, Z comprises a FRET label (e.g., one or more chromophores of a FRET pair).
In some embodiments, m is an integer from 2 to 12, inclusive. In some embodiments, m is an integer from 2 to 8, inclusive. In some embodiments, m is an integer from 2 to 4, inclusive.
In some embodiments, a labeled reagent comprises a structure of Formula (IVb), (IVc), or (IVd):
wherein: X is a non-luminescent multivalent central core element; each instance of D is independently a luminescent label or a covalent bond, with the proviso that at least one instance of D is a luminescent label; each instance of W, if present, is a branching element; each S′ is independently an intermediate chemical group, wherein at least one S′ comprises a shielding element; each B′ is independently a terminal chemical group, wherein at least one B′ comprises a binding element that binds a target molecule; each instance of n is independently an integer from 2 to 6, inclusive; each instance of o is independently an integer from 1 to 4, inclusive; and each instance of p is independently an integer from 1 to 4, inclusive.
In some embodiments, X comprises a polyamine. In some embodiments, X comprises a tertiary amide. In some embodiments, X comprises a substituted triazine group (e.g., a trisubstituted triazine). In some embodiments, X comprises a substituted phenyl group (e.g., a disubstituted or trisubstituted phenyl). In some embodiments, X comprises a substituted carbocyclic group (e.g., a substituted cyclohexane). In some embodiments, X comprises a secondary, tertiary, or quaternary carbon atom.
In some embodiments, D comprises a fluorescent dye. In some embodiments, D comprises a FRET label (e.g., one or more chromophores of a FRET pair).
In some embodiments, W comprises the structure:
wherein each instance of x is independently an integer from 1 to 6, inclusive. In some embodiments, each instance of x is independently an integer from 1 to 4, inclusive.
Referring to Formulae (IVa)-(IVd) above, in some embodiments, the shielding element decreases photodamage of the binding element and/or of a target molecule associated with the binding element. In some embodiments, the shielding element decreases contact between the luminescent label and the binding element. In some embodiments, the shielding element decreases contact between the luminescent label and a target molecule associated with the binding element.
In some embodiments, the binding element comprises a biotin moiety. In some embodiments, the binding element comprises a nucleotide (e.g., a nucleoside polyphosphate). For example, in some embodiments, the binding element comprises a nucleotide, and the target molecule comprises a polymerizing enzyme (e.g., a DNA polymerase) that binds the nucleotide.
In some embodiments, the shielding element comprises a plurality of side chains. In some embodiments, at least one side chain has a molecular weight of at least 300 g/mol (e.g., at least 350, at least 400, at least 450, or at least 500 g/mol). In some embodiments, at least one side chain has a molecular weight of between about 300 and 1,000 g/mol (e.g., 350-1,000, 400-1,000, 450-1,000, or 500-1,000 g/mol). In some embodiments, all of the side chains have a molecular weight of at least 300 g/mol.
In some embodiments, the shielding element comprises at least one side chain comprising a dendrimer, a polyethylene glycol, or a negatively-charged component. In some embodiments, the negatively-charged component comprises a sulfonic acid. In some embodiments, the shielding element comprises at least one side chain comprising a substituted phenyl group. In some embodiments, the at least one side chain comprises the structure
wherein each instance of x is independently an integer from 1 to 6, inclusive. In some embodiments, each instance of x is independently an integer from 1 to 4, inclusive.
In some embodiments, the shielding element comprises the structure:
wherein each instance of y is independently an integer from 1 to 6, inclusive.
Labeled nucleotides comprising shielding elements were evaluated in DNA sequencing reactions to determine the effects of shielding elements on sequencing accuracy and efficiency. Labeled nucleotides without shielding elements (QTDN3) and with shielding elements (QTDN4) were produced. Each set of labeled nucleotides (QTDN3 and QTDN4) included four types of nucleotides (A, C, G, T) having different luminescent labels. Sequencing reactions were performed using the following conditions: 60 mM MOPS (pH 8.0), 50 mM KOAc, 10 mM Mg(OAc)2, 5 mM trolox, 6 mM nitrobenzoic acid, 20 mM PCA, 0.05% Tween-20, 10 mM NaCl, 2 μL PCD, and 2.5 μM of each nucleotide in a total volume of 40 μL.
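As a worked check on the reaction conditions listed above, the following sketch converts the stated final concentrations into absolute amounts in the 40 μL reaction. Only the concentrations and the volume come from the text; the variable and function names are hypothetical, and components given as percentages or volumes (Tween-20, PCD) are omitted.

```python
# Final concentrations and volume as stated in the text; names are illustrative.
REACTION_VOLUME_UL = 40.0

COMPONENTS_MM = {
    "MOPS": 60.0,
    "KOAc": 50.0,
    "Mg(OAc)2": 10.0,
    "trolox": 5.0,
    "nitrobenzoic acid": 6.0,
    "PCA": 20.0,
    "NaCl": 10.0,
}
NUCLEOTIDE_UM = 2.5  # each of the four labeled nucleotides


def nmol_from_mM(conc_mM, volume_uL):
    """mM x uL = nmol (1e-3 mol/L x 1e-6 L = 1e-9 mol)."""
    return conc_mM * volume_uL


def pmol_from_uM(conc_uM, volume_uL):
    """uM x uL = pmol (1e-6 mol/L x 1e-6 L = 1e-12 mol)."""
    return conc_uM * volume_uL


amounts_nmol = {
    name: nmol_from_mM(conc, REACTION_VOLUME_UL)
    for name, conc in COMPONENTS_MM.items()
}
nucleotide_pmol = pmol_from_uM(NUCLEOTIDE_UM, REACTION_VOLUME_UL)
```

With these values, each labeled nucleotide is present at 100 pmol per reaction, and magnesium acetate at 400 nmol.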
During incorporation of labeled nucleotides in a sequencing-by-synthesis reaction, each labeled nucleotide emits a detectable signal that is uniquely identifiable by emission intensity and lifetime (
The sequencing results demonstrated that both sets of labeled nucleotides (QTDN3 and QTDN4) provided data with favorable intensity-lifetime separation. The data were further evaluated to determine the average rates of incorporation during sequencing runs with both sets of labeled nucleotides. The average rate of incorporation for the QTDN3 set was determined to be 0.62 nucleotide bases per second, and the average rate of incorporation for the QTDN4 set was determined to be 1.80 nucleotide bases per second. Thus, the results from these experiments demonstrated that both nucleotide sets provide accurate sequencing results, with the QTDN4 set providing improved efficiency (an approximately 2.9-fold higher average rate of incorporation).
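The reported average incorporation rates can be compared directly. The sketch below computes the ratio of the two rates and the time each set would need to incorporate a given number of bases at its average rate. The rates are those stated in the text; the function names and the 100-base example are assumptions for illustration.

```python
# Average incorporation rates as reported in the text (bases per second).
RATE_QTDN3 = 0.62  # labeled nucleotides without shielding elements
RATE_QTDN4 = 1.80  # labeled nucleotides with shielding elements


def fold_change(rate_new, rate_ref):
    """Ratio of two incorporation rates."""
    return rate_new / rate_ref


def seconds_for_bases(n_bases, rate):
    """Time to incorporate n_bases at a constant average rate."""
    return n_bases / rate


rate_ratio = fold_change(RATE_QTDN4, RATE_QTDN3)
# Hypothetical read length of 100 bases, chosen only for illustration.
t_qtdn3 = seconds_for_bases(100, RATE_QTDN3)
t_qtdn4 = seconds_for_bases(100, RATE_QTDN4)
```

The ratio is about 2.9, so at these average rates a 100-base stretch would take roughly 161 seconds with the QTDN3 set and roughly 56 seconds with the QTDN4 set.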
IX. Alternatives and Scope
Having thus described several aspects and embodiments of the technology of the present disclosure, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those of ordinary skill in the art. Such alterations, modifications, and improvements are intended to be within the spirit and scope of the technology described herein. It is, therefore, to be understood that the foregoing embodiments are presented by way of example only and that, within the scope of the appended claims and equivalents thereto, inventive embodiments may be practiced otherwise than as specifically described. In addition, any combination of two or more features, systems, articles, materials, kits, and/or methods described herein, if such features, systems, articles, materials, kits, and/or methods are not mutually inconsistent, is included within the scope of the present disclosure.
Also, as described, some aspects may be embodied as one or more methods. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.
All definitions, as defined and used herein, should be understood to control over dictionary definitions, definitions in documents incorporated by reference, and/or ordinary meanings of the defined terms.
The indefinite articles “a” and “an,” as used herein in the specification and in the claims, unless clearly indicated to the contrary, should be understood to mean “at least one.”
The phrase “and/or,” as used herein in the specification and in the claims, should be understood to mean “either or both” of the elements so conjoined, i.e., elements that are conjunctively present in some cases and disjunctively present in other cases.
As used herein in the specification and in the claims, the phrase “at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase “at least one” refers, whether related or unrelated to those elements specifically identified.
In the claims, as well as in the specification above, all transitional phrases such as “comprising,” “including,” “carrying,” “having,” “containing,” “involving,” “holding,” “composed of,” and the like are to be understood to be open-ended, i.e., to mean including but not limited to. The transitional phrases “consisting of” and “consisting essentially of” shall be closed or semi-closed transitional phrases, respectively.
Number | Date | Country
---|---|---
63298969 | Jan 2022 | US