The present invention relates to the field of determining the copy number of proteins in a sample imaged with super resolution microscopy.
Single molecule localization microscopy has become an important tool for imaging intracellular structures and protein complexes with nanoscale spatial resolution (Oddone, A. et al., 2014). Recently, an immense effort has been dedicated to the quantification of super-resolution images (Durisic, N. et al., 2014, Deschout, H. et al., 2014). Among the different quantitative parameters that can be extracted, protein copy-number and stoichiometry have been of particular interest. Single-molecule-based super-resolution methods are uniquely positioned to determine protein copy-numbers, since the single molecule information can be exploited for counting. However, the exact quantification is ultimately impaired by the stochasticity of the labeling method and the complex photophysics of the fluorescent probes. Therefore, it is not surprising that a substantial effort has been dedicated to developing analytical approaches and calibration standards aimed to overcome this challenge. For example, the photophysics of photoactivatable and photoconvertible fluorescent proteins (FPs) have been extensively studied and nano-templates have been developed to calibrate the signal and count FP-tagged proteins (Fricke, F. et al., 2015). Since FPs provide a one to one labeling stoichiometry and have limited blinking or reactivation probability, they are desirable for quantitative imaging. However, a major limitation is imposed by their low photon budget, leading to images with a lower spatial resolution compared to small organic fluorophores, which are the probe of choice for a large number of super-resolution studies. Targeting these bright fluorophores to the protein of interest typically requires immunofluorescent labeling by primary and secondary antibodies. In this case, unfortunately, both the antibody labelling efficiency as well as the number of fluorophores conjugated to the primary or to the secondary antibody are highly stochastic. 
In addition, fluorophores might undergo repeated blinking or reactivation events. Combined together, these issues pose major challenges for protein-copy quantification. Partial solutions to these challenges have been reported. For example, the fluorophore photophysics can be modelled (Hummer, G. et al., 2016, Rollins, G. C. et al., 2015) or characterized using single fluorophores conjugated to antibodies or images of sparse spots on the sample (Ricci, M. A. et al., 2015, Ehmann, N. et al., 2014). In the case of DNA-PAINT approaches, which rely on “on-off” binding of fluorophore-labeled small oligos, the binding kinetics can be modeled and accounted for in the quantification (Jungmann, R. et al., 2016). Nonetheless, in all cases the unknown stoichiometry of antibody-based labeling, resulting from the stochasticity of fluorophore-antibody and antibody-target binding, largely affects the precision of the final quantification. Therefore, there is an urgent need for versatile calibration standards that take into account not only the fluorophore photophysics but also the antibody and fluorophore labeling stoichiometry. Although in other works ad hoc calibration standards have allowed quantifying complex structures such as nucleosomes (Ricci, M. A. et al., 2015), there is a lack of a general approach to this problem.
The development of methods able to access a precise molecular counting of protein copy numbers is essential, clearing the way to address several biological questions using super-resolution techniques based on single molecule localization.
In a first aspect, the invention relates to a method for obtaining a calibration curve for quantifying protein copy number in immunofluorescence-based super resolution microscopy which comprises
a) incubating a DNA origami immobilized on a support, wherein the DNA origami comprises handle oligonucleotides protruding from said DNA origami, said handle oligonucleotides being attached to the DNA origami at predetermined positions and at least one tag, with a protein of interest functionalized with oligonucleotides complementary to the handles protruding from said DNA origami, in conditions allowing the hybridization between the oligonucleotides attached to the DNA origami and the oligonucleotides attached to the protein of interest,
b) recording a super resolution image of the protein of interest which colocalizes with the tag of the DNA origami,
c) clustering the image obtained in step b) and identifying the clusters separated by the distance between the handles to obtain the number of clusters in said image obtained in step b),
d) fitting a generic probability distribution function depending on a set of parameters μ to the distribution of the number of localizations x for one predetermined cluster,
ƒ_1(μ; x)
and extending it iteratively to larger clusters by using the equation for n=2, 3 . . . Nmax
ƒ_n = ƒ_{n-1} ⊗ ƒ_1
where ⊗ represents the convolution with respect to the variable x between two functions and Nmax is a predetermined maximum number of clusters, and
e) obtaining a calibration curve by the parameters determined through the fitting procedure described in d).
In a second aspect, the invention relates to a method for quantifying protein copy number in a sample imaged with super resolution microscopy which comprises obtaining a statistical parameter of the number of localizations in a sample having the protein of interest and comparing it with the calibration curve obtained for said protein of interest according to the method of the invention.
In a third aspect, the invention relates to a method for determining the percentage of oligomeric state of a protein in a sample imaged with super-resolution microscopy which comprises fitting the overall distribution of the number of localizations obtained in the sample to a linear combination of the distributions ƒ_n,
Σ_{n=1}^{Nmax} α_n ƒ_n
where α_n represents the weight of the distribution of n-mers, being Σ_{n=1}^{Nmax} α_n = 1, and where
ƒ_n = ƒ_{n-1} ⊗ ƒ_1
is obtained for said protein of interest according to the method of the invention, wherein fittings are performed by optimization of an objective function.
In a fourth aspect, the invention relates to a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out the methods of the invention.
In a fifth aspect, the invention relates to a kit comprising
a) a DNA origami attachable to a support comprising handle sequences protruding from said DNA origami and at least one tag, optionally the DNA origami is protected from degradation,
b) reagents suitable for obtaining a super resolution image of a protein of interest, and
c) a computer-readable medium comprising instructions which, when executed by a computer, cause the computer to carry out the methods of the invention.
In a sixth aspect, the invention relates to the use of a kit of the invention for obtaining a calibration curve for quantifying protein copy number in immunofluorescence-based super resolution microscopy, for quantifying protein copy number in a sample imaged with super resolution microscopy and for determining the percentage of oligomeric state of a protein in a sample imaged with super-resolution microscopy.
The inventors have developed a method for quantifying protein copy number in immunofluorescence-based super-resolution microscopy using DNA origami. This calibration method is suitable for quantifying the average protein copy number in a cell and for determining the abundance of various oligomeric states.
In an aspect, the invention relates to a method for obtaining a calibration curve for quantifying protein copy number in immunofluorescence-based super resolution microscopy which comprises
a) incubating a DNA origami immobilized on a support, wherein the DNA origami comprises handle oligonucleotides protruding from said DNA origami, said handle oligonucleotides being attached to the DNA origami at predetermined positions and at least one tag, with a protein of interest functionalized with oligonucleotides complementary to the handles protruding from said DNA origami, in conditions allowing the hybridization between the oligonucleotides attached to the DNA origami and the oligonucleotides attached to the protein of interest,
b) recording a super resolution image of the protein of interest which colocalizes with the tag of the DNA origami,
c) clustering the image obtained in step b) and identifying the clusters separated by the distance between the handles to obtain the number of clusters in said image obtained in step b),
d) fitting a generic probability distribution function depending on a set of parameters μ to the distribution of the number of localizations x for one predetermined cluster,
ƒ_1(μ; x)
and extending it iteratively to larger clusters by using the equation for n=2, 3 . . . Nmax
ƒ_n = ƒ_{n-1} ⊗ ƒ_1
where ⊗ represents the convolution with respect to the variable x between two functions and Nmax is a predetermined maximum number of clusters, and
e) obtaining a calibration curve by the parameters determined through the fitting procedure described in d).
In a preferred embodiment, steps d) and/or e) are executed by a computer.
According to the method of the invention, any distribution function can be used. By way of illustrative, non-limitative example, the distribution function ƒ_1 is a two-parameter function, for instance a log-normal distribution, with parameters μ1 and μ2.
The values μ1 and μ2 are two free parameters which are determined by fitting the ƒ_1 function to the distribution of localizations obtained from one cluster corresponding to one “protein X” in the calibration experiment.
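The determination of the two free parameters can be sketched as follows. This is a minimal illustration, not the inventors' implementation: it assumes that ƒ_1 is a log-normal distribution in which μ1 and μ2 play the role of the mean and standard deviation of ln(x), it estimates them by maximum likelihood, and it uses synthetic localization counts. The function names `lognormal_pdf` and `fit_lognormal` are hypothetical.

```python
import math
import random


def lognormal_pdf(x, mu1, mu2):
    """Log-normal density; mu1 and mu2 are the mean and standard deviation of ln(x)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu1) / mu2
    return math.exp(-0.5 * z * z) / (x * mu2 * math.sqrt(2.0 * math.pi))


def fit_lognormal(counts):
    """Maximum-likelihood estimates of (mu1, mu2) from positive localization counts."""
    logs = [math.log(c) for c in counts]
    mu1 = sum(logs) / len(logs)
    mu2 = math.sqrt(sum((v - mu1) ** 2 for v in logs) / len(logs))
    return mu1, mu2


# Synthetic single-cluster localization counts (assumed data, not from the source).
rng = random.Random(0)
counts = [max(1, round(rng.lognormvariate(3.0, 0.5))) for _ in range(5000)]
mu1, mu2 = fit_lognormal(counts)
```

Any other parametric family could be substituted for the log-normal density, in which case the fit would typically be carried out by numerical optimization rather than in closed form.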
“Calibration curve”, as used herein, is a calibration standard that can be used to quantify protein copy number from super-resolution images obtained after immunofluorescence labeling. In particular, it can be used to extract average protein copy-numbers in a given image by comparing the median number of localizations obtained in the cellular context to the curve.
“Immunofluorescence-based super resolution microscopy”, as used herein, relates to a microscopic technique which allows obtaining an image with an axial and lateral resolution under 100 nm, allowing single molecule localization.
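The comparison of a measured median with the calibration curve can be sketched as a simple lookup. The example below is only an illustration under stated assumptions: the calibration points are invented, the curve is assumed to increase monotonically with copy number, and linear interpolation between neighbouring calibration points is one possible way to read the curve. The function name `estimate_copy_number` is hypothetical.

```python
def estimate_copy_number(median_locs, curve):
    """Interpolate a copy number from a median number of localizations.

    curve is a list of (copy number, median localizations) pairs, sorted by
    copy number, with strictly increasing medians.
    """
    if not curve[0][1] <= median_locs <= curve[-1][1]:
        raise ValueError("median lies outside the calibrated range")
    for (n0, m0), (n1, m1) in zip(curve, curve[1:]):
        if m0 <= median_locs <= m1:
            return n0 + (n1 - n0) * (median_locs - m0) / (m1 - m0)


# Hypothetical calibration points: (copy number, median number of localizations).
calibration = [(1, 20.0), (2, 41.0), (3, 60.0), (4, 82.0)]
estimate = estimate_copy_number(50.5, calibration)
```

Values outside the calibrated range are rejected rather than extrapolated, since the source does not state how the curve behaves beyond the measured points.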
In a preferred embodiment, the images obtained are characterized by a lateral (XY) resolution of approximately 20-30 nm and axial (Z) resolution of 50-60 nm.
The super resolution images can be obtained by any super resolution technique known in the art. Super-resolution techniques allow the capture of images with a higher resolution than the diffraction limit. They fall into two broad categories, “true” super-resolution techniques, which capture information contained in evanescent waves, and “functional” super-resolution techniques, which use clever experimental techniques and known limitations on the matter being imaged to reconstruct a super-resolution image. There are two major groups of methods for functional super-resolution microscopy:
1. Deterministic super-resolution: The most commonly used emitters in biological microscopy, fluorophores, show a nonlinear response to excitation, and this nonlinear response can be exploited to enhance resolution. These methods include without limitation STED, GSD, RESOLFT and SSIM.
2. Stochastical super-resolution: The chemical complexity of many molecular light sources gives them a complex temporal behaviour, which can be used to make several close-by fluorophores emit light at separate times and thereby become resolvable in time. These methods include without limitation SOFI and all single-molecule localization methods (SMLM) such as SPDM, SPDMphymod, PALM, FPALM, STORM and dSTORM.
In a preferred embodiment, the super resolution image is obtained by a stochastical super resolution technique, preferably STORM, PALM or FPALM, and more preferably by STORM. STORM combines two concepts: single molecule localization and fluorophore photoswitching. The first concept allows one to localize the position of a single fluorophore with nanometer precision. Photoswitching makes it possible to “turn off” most fluorophores into a dark state and “turn on” only a small subset of them at a time. As a result, the images of the “active” fluorophores are isolated in space and their positions can be localized with high precision. Once all the fluorophores are imaged and their positions are localized, a high-resolution image can be reconstructed from these localizations. To date, the spatial resolution achieved by this technique is ˜20 nm in the lateral dimensions and ˜50 nm in the axial dimension. More details of STORM technology are described in WO2013090360, WO2009085218 and EP2378343.
In step a) of the first method of the invention, a DNA origami immobilized on a support, wherein the DNA origami comprises handle oligonucleotides protruding from said DNA origami and at least one tag, said handle oligonucleotides being attached to the DNA origami at predetermined positions, is incubated with a protein of interest functionalized with oligonucleotides complementary to the handles protruding from said DNA origami, in conditions allowing the hybridization between the oligonucleotides attached to the DNA origami and the oligonucleotides attached to the protein of interest.
“DNA origami” as used herein relates to the nanoscale folding of DNA to create non-arbitrary two- and three-dimensional shapes at the nanoscale. The specificity of the interactions between complementary base pairs makes DNA a useful construction material, through design of its base sequences. DNA is a well-understood material that is suitable for creating scaffolds that hold other molecules in place or to create structures all on its own.
In general, the DNA origami process involves the folding of one or more long “scaffold” or “chassis” DNA strands into a particular shape using a plurality of rationally designed “staple” DNA strands. The sequences of the staple strands are designed such that they hybridize to particular portions of the scaffold strands and, in doing so, force the scaffold strands into a particular shape. This chassis serves as a skeleton for attaching additional components via the use of “handle” sequences that project outward. These handles provide site- and sequence-specific attachment points for single fluorophores as well as proteins of interest and allow testing of several different labeling strategies. In different embodiments, such strategies involve antibody, nanobody, Clip tag or Halo/SNAP tag labeling.
Methods useful in making of DNA origami structures are known by those skilled in the art. In some embodiments, the DNA origami device (or “robot” or “DNA robot” or “DNA nanorobot”) may include a scaffold strand and a plurality of rationally designed staple strands. The scaffold strand can have any sufficiently non-repetitive sequence. The sequences of the staple strands are selected such that the DNA origami device has at least one shape in which biologically active moieties can be sequestered.
In some embodiments, the DNA origami can be of any shape that has at least one inner surface and at least one outer surface. In general, an inner surface is any surface area of the DNA origami device that is sterically precluded from interacting with the surface of a cell, while an outer surface is any surface area of the DNA origami device that is not sterically precluded from interacting with the surface of a cell. In some embodiments, the DNA origami device has one or more openings (e.g., two openings), such that an inner surface of the DNA origami device can be accessed by sub-cellular sized particles. In another particular embodiment, the DNA origami can comprise several double helices; by way of example, the DNA origami can comprise 6 parallel double helices, 8 parallel double helices, 10 parallel double helices, 12 parallel double helices or 14 parallel double helices. In a more preferred embodiment, the DNA origami comprises 12 parallel double helices. In a preferred embodiment, the DNA origami comprises 6 inner helices and 6 outer helices.
In a preferred embodiment, the DNA origami chassis is the one described in Derr et al., Science 338, 662-665 (2012) or Goodman, B. S. et al., Meth. Enzymol. 540, 169-188 (2014).
By way of illustrative, non-limitative example, the DNA origami can be prepared using the p8064 scaffold and oligonucleotide staple sequences, by folding the 12-helix bundle DNA origami chassis structures in DNA origami folding buffer (for example 5 mM Tris [pH 8.0], 1 mM EDTA and 16 mM MgCl2) and mixing 100 nM p8064 scaffold with 600 nM core staples, 3.6 μM handle staples, 3.6 μM biotin staples, and 9 μM fluorophore anti-handles. A thermal folding cycle is then run, by rapid heating to 80° C. and cooling in single degree increments to 65° C. for 75 min, followed by cooling in single degree increments to 30° C. for 17.5 hr. The folded DNA origami chassis can be stored at either 4° C. or −20° C.
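The arithmetic behind this folding protocol can be checked with a short sketch. It assumes, which the text does not state explicitly, that the quoted durations (75 min and 17.5 hr) are the total times of the two cooling ramps and that each one-degree step is held for an equal time. The function name `ramp_steps` and the dictionary names are hypothetical.

```python
def ramp_steps(start_c, end_c, total_minutes):
    """Return (number of 1 degree steps, minutes held per step) for a cooling ramp."""
    steps = start_c - end_c
    return steps, total_minutes / steps


fast = ramp_steps(80, 65, 75)          # first ramp, 80 to 65 degrees C
slow = ramp_steps(65, 30, 17.5 * 60)   # second ramp, 65 to 30 degrees C

# Molar excess of each staple type over the 100 nM scaffold (concentrations in nM).
scaffold_nm = 100
staples_nm = {"core": 600, "handle": 3600, "biotin": 3600, "anti-handle": 9000}
excess = {name: conc / scaffold_nm for name, conc in staples_nm.items()}
```

Under these assumptions the first ramp corresponds to 15 steps of 5 min and the second to 35 steps of 30 min, with the staples present at a 6-fold to 90-fold molar excess over the scaffold.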
Alternatively the DNA origami chassis is commercially available.
The DNA origami used in the present invention is immobilized on a support. “Support”, as used herein, relates to any surface to which the DNA origami can be attached. By way of illustrative, non-limitative example, the support is a coverslip. Methods for immobilizing DNA origami structures on a solid substrate are not particularly limited and are known in the art. They include for example top-down patterning approaches such as ink-jet printing, DPN, polymer pen lithography and the like.
The DNA origami comprises handle oligonucleotides protruding from said DNA origami and at least one tag.
“Handle oligonucleotides”, as used herein, relates to sequences that project outward from the structure of the DNA origami and provide site- and sequence-specific attachment points for single fluorophores and proteins of interest, and that allow testing of several different labelling strategies.
“Tag”, as used herein, relates to any molecule which allows the identification of the DNA origami by a super resolution technique. In a preferred embodiment, the tag is a fluorescence tag, more preferably TAMRA. When applied to the protein of interest, said tag, named the second tag, is a peptide sequence which can be used to identify said protein of interest and which forms part of a fusion protein together with the protein of interest.
Preferably, the DNA origami nanostructures used in the method of the present invention carry one or more tags. Tags can be located anywhere on the DNA origami structure. In a particular embodiment, the tag is presented into the direction away from the solid substrate in the final assembly. Said tags are preferably selected from the group consisting of metal nanoparticles, semiconductor nanoparticles, proteins, peptides, nucleic acids, lipids, polysaccharides, small molecule organic compounds, colloids, and combinations thereof. Methods for attaching various types of tags to DNA origami structures are not particularly limited and are known in the art. Tags can be attached directly or indirectly. Indirect attachment can be effected by suitable binding pairs known in the art, e.g. Streptavidin-biotin, self-ligating linker proteins such as SNAP- or Halo-Tag, and antibodies, antibody mimetics or antibody fragments binding to their respective antigens. In the context of antibody fragments, single-chain antibody fragments (scFv) are particularly preferred. Each of these binding agents can be present on the DNA origami structure, with the respective binding partner present on the tags. Means for coupling said binding agents to the DNA origami structure and the tags are known in the art. Methods for the direct attachment of tags to DNA origami structures include expressed protein ligation, chemoenzymatic coupling (e.g. Sortase A coupling), coiled-coil peptide assembly, and the generation of oligonucleotide conjugates. Further, conventional homo- and heterospecific cross-linkers containing reactive groups directed against carboxyls, amines, thiols or orthogonal coupling pairs such as azide/alkyne cycloaddition or variants thereof can be used.
In a preferred embodiment, a tag is bound to the DNA origami through a sequence complementary to an oligonucleotide protruding from the DNA origami. In a preferred embodiment the tag is a fluorescence tag, and more preferably TAMRA.
The DNA origami can comprise several tags at any position. In a preferred embodiment, the DNA origami comprises at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 11 or more tags. In a more preferred embodiment, the tag is localized at position 14 of each of the outer helices.
“Protein of interest”, as used herein, relates to any protein. In a preferred embodiment the protein of interest forms part of a fusion protein together with any other protein such as a second tag.
Said tag is generally a peptide or amino acid sequence which can be used in the isolation, purification or detection of said fusion protein. Illustrative, non-limitative examples of tags are histidine tag (His-tag or HT), FLAG-tag, GFP, Arg-tag, Strep-tag, an epitope capable of being recognized by an antibody, such as c-myc-tag, HA tag, V5 tag, SBP-tag, S-tag, calmodulin binding peptide, cellulose binding domain, chitin binding domain, glutathione S-transferase-tag, maltose binding protein, NusA, TrxA, DsbA, Avi-tag, etc.
In a preferred embodiment, the protein of interest forms a fusion protein together with GFP.
In addition, a protein of interest must be functionalized with oligonucleotides complementary to the handles protruding from the DNA origami chassis. In a preferred embodiment, the protein of interest is functionalized with oligonucleotides complementary to the handles protruding at any position 0 to 14, particularly at positions 1, 7 and 13 of helix 0 of the DNA origami.
The person skilled in the art knows several methods to functionalize a protein with oligonucleotides, such as those disclosed in the experimental part of the present invention.
A person skilled in the art knows the conditions allowing the hybridization between the handle oligonucleotides protruding from the DNA origami and the oligonucleotides of the protein of interest.
As anybody skilled in the art knows, “conditions allowing hybridization” according to the method of the present invention are such that they (i) do not compromise the structure and integrity of the DNA origami structures, the first and second structural features, e.g. ssDNA strands, and the solid substrate, and (ii) allow for binding of the first and second structural features, e.g. for the Watson-Crick pairing between the protruding ssDNA strands of the origami structures and the ssDNA capture strands, so that immobilization of the origami structures on the solid substrate is achieved.
In a particular embodiment, the hybridization is performed under stringent conditions.
The phrase “stringent hybridization conditions” refers to conditions under which a probe will hybridize to its target subsequence, typically in a complex mixture of nucleic acid, but to no other sequences.
Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. See Tijssen S, “Overview of principles of hybridization and the strategy of nucleic acid assays”, Laboratory Techniques in Biochemistry and Molecular Biology (Elsevier Science Publishers B.V., Amsterdam, The Netherlands 1993). Generally, stringent conditions are selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50 percent of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm 50 percent of the probes are occupied at equilibrium). Stringent conditions will be those in which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g. 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g. greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For high stringency hybridization, a positive signal is at least two times background, preferably 10 times background hybridization. Exemplary high stringency or stringent hybridization conditions include: 50 percent formamide, 5×SSC and 1 percent SDS incubated at 42° C. or 5×SSC and 1 percent SDS incubated at 65° C., with a wash in 0.2×SSC and 0.1 percent SDS at 65° C.
Step b) of the first method of the invention comprises recording a super resolution image of the protein of interest which colocalizes with the tag of the DNA origami.
“Super resolution image” as used herein, refers to an image with an axial and/or lateral resolution below the diffraction limit, for example under 100 nm allowing single molecule localization.
Recording an image by means of an optical sensor aimed at the optically excited sample provides a bitmap image at a certain resolution carrying information about the localization and organization of the protein. Once all the fluorophores are imaged and their positions are localized, a high-resolution image can be reconstructed from these localizations. This information is used to provide characteristic length scales and densities of relevant structural parts of the protein, which allows identifying clusters of the protein of interest, particularly of a fluorescent probe.
“Colocalization” as used herein, relates to the observation of the spatial overlap between two different signals, one from the protein of interest and the other from the DNA origami.
As a person skilled in the art will understand, obtaining a super resolution image of the protein of interest requires labeling the protein of interest with a molecule that allows a signal from said protein to be detected, for example an antibody, nanobody, Halo, SNAP or Clip substrate, depending on the labeling strategy used, provided that the label comprises a molecule that can be detected by super-resolution microscopy. In a preferred embodiment, the super resolution image of the protein of interest is obtained by detecting said protein with a fluorophore, a fluorescent protein, an antibody, nanobody, clip tag or halo/snap tag.
By way of illustrative, non-limitative example, the protein of interest may be detected by a first antibody against an epitope of the protein of interest, or against a sequence expressed together with the protein of interest in a fusion protein, and by detecting the protein of interest:antibody binding with a secondary antibody having at least one photoswitchable fluorophore adapted to be optically excited at a certain wavelength λ1 and to emit light at a wavelength λ2 different from λ1. When the sample having the antibody:protein of interest complex is excited with optical energy, for instance by means of a laser beam of wavelength λ1, those locations of the antibody:protein of interest complex linked to the photoswitchable fluorophore emit light at the wavelength λ2.
Step c) of the first method of the invention comprises clustering the image obtained in step b) and identifying the clusters separated by the distance between the handles to obtain the number of clusters in said image obtained in step b).
The individual locations of a fluorophore and cluster information need to be identified over the image.
Analysis and reconstruction of super resolution images may be performed by obtaining fluorescent probe positions. Several known software packages can be used for obtaining fluorescent probe positions, an illustrative, non-limiting example being Insight 3, provided by Bo Huang, University of California, San Francisco. Molecules may be identified by a threshold and the radial positions x and y extracted by fitting with a simple Gaussian function. By way of illustrative, non-limitative example, the final image is obtained by plotting each identified molecule as a Gaussian spot with a width corresponding to the localization precision (9 nm) and is finally corrected for drift. Molecules appearing within a distance of 9 nm are merged and considered as the same molecule. Spatial clusters of localizations may be identified with a distance-based clustering algorithm, by means of custom-written Matlab code. The localization list may first be binned to images with a 20 nm pixel size, which are filtered with a square kernel (7×7 pixels²) and thresholded to obtain a binary mask. Specifically, a density map is built by 2-dimensional convolution of the localization images with a square kernel (7×7 pixels²), and a constant threshold is used to digitize the maps into binary images. The low-density areas, where the density is lower than the threshold value and a value of 0 is assigned, are discarded from further analysis. Only the components of the binary image where adjacent (6-connected neighbors) non-zero pixels are found are analysed. A peak finding routine provides the number of clusters and the corresponding centroid coordinates from the maxima of the density map in the connected regions. Molecular localizations lying over connected regions of the mask are assigned to each cluster using a distance-based algorithm, depending on their proximity to the cluster centroids.
For each cluster, the centroid position is iteratively re-calculated and saved for further analysis until convergence of the sum of the squared distances between localizations and the associated cluster is reached. The cluster centroid positions, the number of localizations per cluster and the cluster size may thus be obtained.
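Two of the steps described above, the density mask and the iterative centroid refinement, can be sketched in a few lines. This is a simplified illustration and not the custom Matlab code referred to in the text: the connected-component labelling and the peak finding routine are omitted, so the initial centroids are supplied by hand, and the threshold value and the synthetic coordinates are assumptions. All function names are hypothetical.

```python
import math
from collections import Counter

PIXEL_NM = 20.0  # pixel size quoted in the text
KERNEL = 7       # 7 x 7 pixel square kernel quoted in the text


def pixel_of(loc):
    return (int(loc[0] // PIXEL_NM), int(loc[1] // PIXEL_NM))


def density_map(locs):
    """Bin localizations to pixels and sum the counts over a KERNEL x KERNEL window."""
    counts = Counter(pixel_of(p) for p in locs)
    half = KERNEL // 2
    density = Counter()
    for (i, j), c in counts.items():
        for di in range(-half, half + 1):
            for dj in range(-half, half + 1):
                density[(i + di, j + dj)] += c
    return density


def keep_dense(locs, threshold):
    """Discard localizations lying on pixels whose density is below the threshold."""
    density = density_map(locs)
    return [p for p in locs if density[pixel_of(p)] >= threshold]


def refine_clusters(locs, centroids, tol=1e-9, max_iter=100):
    """Assign each localization to its nearest centroid and re-compute the centroids
    until the sum of squared distances stops changing."""
    previous = math.inf
    for _ in range(max_iter):
        members = [[] for _ in centroids]
        total = 0.0
        for p in locs:
            d2 = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            k = d2.index(min(d2))
            members[k].append(p)
            total += d2[k]
        centroids = [
            (sum(q[0] for q in m) / len(m), sum(q[1] for q in m) / len(m)) if m else c
            for m, c in zip(members, centroids)
        ]
        if abs(previous - total) <= tol:
            break
        previous = total
    return centroids, [len(m) for m in members]


def ring(cx, cy, n):
    """Synthetic cluster: n localizations within 15 nm of (cx, cy). Assumed data."""
    return [(cx + (3 + 3 * (k % 5)) * math.cos(2 * math.pi * k / n),
             cy + (3 + 3 * (k % 5)) * math.sin(2 * math.pi * k / n)) for k in range(n)]


# Two clusters 85 nm apart plus five isolated background localizations.
background = [(1000.0 + 300.0 * k, 1000.0) for k in range(5)]
locs = ring(100.0, 100.0, 30) + ring(185.0, 100.0, 50) + background
dense = keep_dense(locs, threshold=5)  # threshold value is an assumption
centroids, sizes = refine_clusters(dense, [(90.0, 110.0), (200.0, 95.0)])
```

The refinement loop is the standard k-means update, chosen here because it matches the description of re-calculating centroids until the sum of squared distances converges. The number of localizations per cluster is returned alongside the centroids.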
In the particular case that 2 molecules of protein of interest are attached to one DNA origami, then single clusters and double clusters will be obtained. In the particular case that 3 molecules of protein of interest are attached to one DNA origami, then single, double and triple clusters will be obtained.
As the person skilled in the art will understand, several preliminary steps can be performed before carrying out the method for obtaining a calibration curve according to the invention. By way of example, the efficiency of hybridization of the anti-handle oligos to the complementary handle oligos in the DNA origami can be determined. Said determination can be performed as described in the experimental part. In addition, the number of tags successfully conjugated to the DNA origami can be determined, by way of illustrative, non-limitative example, by single-step photobleaching or STORM.
In a preferred embodiment, the clusters analyzed in step c) are separated by a distance shorter than 200 nm, such as between 85±7 nm and 157±17 nm.
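Identifying clusters separated by the expected handle distances can be illustrated with a pairwise distance check. The sketch below treats the quoted values (85±7 nm and 157±17 nm) as a nominal distance plus a tolerance, which is an assumption about how those figures are meant to be used; the centroid coordinates, the labels "short" and "long" and the function name `classify_pairs` are hypothetical.

```python
import math

# (nominal distance, tolerance) in nm, taken from the values quoted in the text.
EXPECTED_NM = {"short": (85.0, 7.0), "long": (157.0, 17.0)}


def classify_pairs(centroids):
    """Label each pair of centroids whose separation matches an expected distance."""
    matches = []
    for a in range(len(centroids)):
        for b in range(a + 1, len(centroids)):
            d = math.dist(centroids[a], centroids[b])
            for label, (nominal, tol) in EXPECTED_NM.items():
                if abs(d - nominal) <= tol:
                    matches.append((a, b, label))
    return matches


# Hypothetical cluster centroids in nm.
centroids = [(0.0, 0.0), (84.0, 0.0), (160.0, 0.0), (500.0, 500.0)]
pairs = classify_pairs(centroids)
```

Here clusters 0 and 1 are 84 nm apart and clusters 0 and 2 are 160 nm apart, so both pairs are retained, while the 76 nm separation between clusters 1 and 2 falls outside both windows.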
Step d) of the first method of the invention comprises fitting a generic probability distribution function depending on a set of parameters μ to the distribution of the number of localizations x for one predetermined cluster,
ƒ_1(μ; x)
and extending it iteratively to larger clusters by using the equation for n=2, 3 . . . Nmax
ƒ_n = ƒ_{n-1} ⊗ ƒ_1
where ⊗ represents the convolution with respect to the variable x between two functions and Nmax is a predetermined maximum number of clusters.
In a preferred embodiment step d) is executed by a computer.
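The iterative extension by convolution can be sketched numerically. The example assumes that the single-cluster distribution is a log-normal function discretized on integer numbers of localizations and renormalized; the parameter values (3.0 and 0.5), the truncation at 200 localizations and the function names are illustrative assumptions only.

```python
import math


def lognormal_pmf(mu1, mu2, x_max):
    """Discretized, renormalized log-normal on x = 0..x_max (index = localizations)."""
    pmf = [0.0] * (x_max + 1)
    for x in range(1, x_max + 1):
        z = (math.log(x) - mu1) / mu2
        pmf[x] = math.exp(-0.5 * z * z) / (x * mu2 * math.sqrt(2.0 * math.pi))
    total = sum(pmf)
    return [p / total for p in pmf]


def convolve(f, g):
    """Discrete convolution of two distributions over the number of localizations."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        if fi == 0.0:
            continue
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def build_family(f1, n_max):
    """Return [f_1, f_2, ..., f_n_max], each obtained as f_n = f_(n-1) convolved with f_1."""
    family = [f1]
    for _ in range(2, n_max + 1):
        family.append(convolve(family[-1], f1))
    return family


def mean(pmf):
    return sum(x * p for x, p in enumerate(pmf))


f1 = lognormal_pmf(3.0, 0.5, 200)   # assumed single-cluster parameters
family = build_family(f1, 3)        # family[n - 1] is the distribution for n clusters
```

Because each convolution corresponds to adding one independent cluster, the mean number of localizations of the n-th distribution is n times that of the first, which provides a quick consistency check on the result.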
It is understood that a statistical parameter of the number of localizations for at least one cluster may be determined prior to step d). In a preferred embodiment, the statistical parameter is obtained for the number of localizations for two clusters, three clusters or more. Alternatively, the distribution function may be obtained from the fit parameters.
“Localization” as used herein relates to the centroid of the pixels defining a cluster.
The term “statistical parameter” relates to a quantity that indexes a family of probability distributions. In a preferred embodiment, the statistical parameter is selected from the group consisting of median, mean, percentile or combinations thereof.
In a preferred embodiment, Nmax is the number for which the objective function is minimum. However, the shape of the stoichiometry distribution obtained after the fit can also help guide the choice of Nmax, as its tail should show a smooth decay.
The “probability distribution” as used herein is a description of a random phenomenon in terms of the probabilities of events. Examples of random phenomena can include the results of an experiment or survey. A probability distribution is defined in terms of an underlying sample space, which is the set of all possible outcomes of the random phenomenon being observed. The sample space may be the set of real numbers or a higher-dimensional vector space, or it may be a list of non-numerical values. A “probability distribution function” is some function that may be used to define a particular probability distribution. As used herein, a “function” is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to one output.
Some aspects of the present disclosure relate to fitting functions. A “fitting function,” as used herein, refers to a mathematical function used to fit the distribution of the number of localizations. An example of a fitting function for use as provided herein includes, without limitation, a log-normal distribution.
The above-described embodiments of the present disclosure can be implemented in any of numerous ways. For example, the embodiments may be implemented using hardware, software or a combination thereof. In a preferred embodiment, the method of the invention is a computer-implemented method. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. It should be appreciated that any component or collection of components that perform the functions described above can be generically considered as one or more controllers that control the above-discussed functions. The one or more controllers can be implemented in numerous ways, such as with dedicated hardware, or with general purpose hardware (e.g., one or more processors) that is programmed using microcode or software to perform the functions recited above. In this respect, it should be appreciated that one implementation of the embodiments of the present disclosure comprises at least one non-transitory computer-readable storage medium (e.g., a computer memory, a floppy disk, a compact disk, a tape, etc.) encoded with a computer program (i.e., a plurality of instructions), which, when executed on a processor, performs the above-discussed functions of the embodiments of the present disclosure. The computer-readable storage medium can be transportable such that the program stored thereon can be loaded onto any computer resource to implement the aspects of the present disclosure discussed herein.
In addition, it should be appreciated that the reference to a computer program which, when executed, performs the above-discussed functions, is not limited to an application program running on a host computer. Rather, the term computer program is used herein in a generic sense to reference any type of computer code (e.g., software or microcode) that can be employed to program a processor to implement the above-discussed aspects of the present disclosure.
Step e) of the first method of the invention comprises obtaining a calibration curve from the parameters determined through the fitting procedure described in step d).
In a preferred embodiment, step e) is executed by a computer.
All the terms and embodiments described in the present invention are also applicable to this aspect of the invention.
Method for Quantifying Protein Copy Number in a Sample Imaged with Super Resolution Microscopy
In another aspect, the invention relates to a method for quantifying protein copy number in a sample imaged with super resolution microscopy which comprises obtaining a statistical parameter of the number of localizations in a sample having the protein of interest and comparing it with the calibration curve obtained for said protein of interest according to the method for obtaining a calibration curve of the invention.
In a particular aspect, the sample is imaged by immunofluorescence.
According to the method of the present invention, the values of σ and μ are obtained from the calibration curve and are used in the method for quantifying protein copy number.
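By way of illustration, the comparison with the calibration curve can be sketched as follows when the mean is chosen as the statistical parameter. The sketch assumes a log-normal ƒ1 with log-scale parameters called mu and sigma here, uses the values 3.35 and 0.85 reported in the examples of this document, and relies on the fact that the mean of an n-fold convolution is n times the mean of ƒ1. The function names and the list of measured counts are hypothetical.

```python
import math

def lognormal_mean(mu, sigma):
    """Mean of a log-normal distribution with log-scale parameters mu and sigma."""
    return math.exp(mu + sigma ** 2 / 2)

def calibration_curve(mu, sigma, n_max):
    """Mean number of localizations expected for n = 1..n_max copies.

    f_n is the n-fold convolution of f_1, so its mean is n times the mean of f_1.
    """
    m1 = lognormal_mean(mu, sigma)
    return {n: n * m1 for n in range(1, n_max + 1)}

def estimate_copy_number(localizations_per_cluster, mu, sigma):
    """Average copy number per cluster implied by the measured mean."""
    measured = sum(localizations_per_cluster) / len(localizations_per_cluster)
    return measured / lognormal_mean(mu, sigma)

# Parameter values taken from the examples; the counts are hypothetical data.
curve = calibration_curve(3.35, 0.85, 8)
hypothetical_counts = [150, 172, 160, 181, 155]
copies = estimate_copy_number(hypothetical_counts, 3.35, 0.85)
print(f"estimated copies per cluster: {copies:.2f}")
```

If the median or a percentile is used instead, the calibration values would have to be read from the numerically convolved distributions, since those parameters are not additive.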
In a preferred embodiment, the statistical parameter is selected from the group consisting of median, mean or any other statistical parameter.
Quantifying, as used herein, refers to determining protein number and stoichiometry. As will be understood by those skilled in the art, the quantification, although preferred to be, need not be correct for 100% of the samples to be detected or evaluated. The term, however, requires that a statistically significant portion of the number of proteins can be determined. Whether a number of proteins is statistically significant can be determined by a person skilled in the art using various well-known statistical evaluation tools, e.g., determination of confidence intervals, p-value determination, Student's t-test, Mann-Whitney test, etc. Details are found in Dowdy and Wearden, Statistics for Research, John Wiley & Sons, New York 1983. Preferred confidence intervals are at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, or at least 95%. The p-values are, preferably, 0.05, 0.01, 0.005 or lower.
“Sample”, as used herein refers to any sample susceptible of containing proteins, and it can be obtained by conventional methods known by those of average skill in the art, depending on the nature of the sample.
In a particular embodiment, said sample is a biopsy sample, tissue, cell or biofluid sample (plasma, serum, saliva, semen, sputum, cerebral spinal fluid (CSF), tears, mucus, sweat, milk, brain extracts and the like). Said samples can be obtained by any conventional method. In another aspect, the sample is a cell culture sample.
In a preferred embodiment, the super resolution image is obtained by a stochastic super resolution technique, preferably STORM, PALM or fPALM, and more preferably by STORM.
In another particular embodiment, the DNA origami can comprise various double helices, by way of example 12 parallel double helices, and/or the DNA origami comprises one tag at position 14 of each of the outer helices.
In a preferred embodiment, the protein of interest is functionalized with oligonucleotides complementary to the handles protruding at any position, for example from 0 to 14, particularly at positions 1, 7 and 13 of helix 0.
In a preferred embodiment, the clusters analyzed in step c) are separated by a distance shorter than 200 nm, such as between 85±7 nm and 157±17 nm.
In a preferred embodiment, the super resolution image of the protein of interest is obtained by detecting said protein with a fluorophore, a fluorescent protein, an antibody, nanobody or halo/snap tag.
All the terms and embodiments previously described are equally applicable to this aspect of the invention.
Method for Determining the Percentage of Oligomeric State of a Protein in a Sample Imaged with Super-Resolution Microscopy
In another aspect, the invention relates to a method for determining the percentage of oligomeric state of a protein in a sample imaged with super-resolution microscopy which comprises fitting the overall distribution of the number of localizations obtained in the sample to
$$f(x) = \sum_{n=1}^{N_{max}} \alpha_n f_n(x)$$
where αn represents the weight of the distribution of n-mers, with $\sum_{n=1}^{N_{max}} \alpha_n = 1$, and
$$f_n = f_{n-1} \otimes f_1$$
is obtained for said protein of interest according to the method for obtaining a calibration curve of the invention, wherein fittings are performed by optimization of an objective function.
By way of illustrative, non-limitative example, fittings are performed by a two-step numerical minimization of an objective function F, which represents the sum of the negative log-likelihood and the entropy. In the first term, p(x) corresponds to the number of occurrences for number of localizations x. In the first optimization step, the weight of the log-likelihood can be set on the basis of ⟨x⟩, with ⟨x⟩ representing the average value of the data, and the optimization is run at varying Nmax until the minimum of the objective function, Fmin, is found. By means of this procedure, the maximum number of log-likelihood functions necessary to satisfactorily fit the data is calculated. Once this number is determined, the fit is further optimized by performing a second step of optimization, where the weight of the log-likelihood is set to the inverse of its target value wL=1/Fmin.
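By way of illustration only, the estimation of the weights of the n-mer distributions can be sketched as follows. The sketch does not reproduce the two-step minimization described above: it swaps in a plain expectation-maximization update that maximizes the likelihood only and omits the entropy term. The log-normal parameters are those reported in the examples, while the truncation X_MAX, the synthetic data (70% monomers, 30% dimers) and all function names are assumptions made for the example.

```python
import math
import random
from collections import Counter

MU, SIGMA, X_MAX = 3.35, 0.85, 600  # X_MAX is an assumed truncation of the x axis

def lognormal_pmf(mu, sigma, x_max):
    """Log-normal shape sampled at x = 1..x_max, normalized to sum to 1."""
    shape = [0.0] + [math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) / x
                     for x in range(1, x_max + 1)]
    total = sum(shape)
    return [s / total for s in shape]

def convolve(a, b, x_max):
    """Discrete convolution of two distributions, truncated at x_max."""
    out = [0.0] * (x_max + 1)
    for i, ai in enumerate(a):
        if ai > 0.0:
            for j in range(x_max - i + 1):
                out[i + j] += ai * b[j]
    return out

def fit_weights(histogram, components, n_iter=300):
    """EM update of the mixture weights; the component shapes stay fixed."""
    k = len(components)
    alpha = [1.0 / k] * k
    for _ in range(n_iter):
        new_alpha = [0.0] * k
        for x, occurrences in histogram.items():
            joint = [alpha[n] * components[n][x] for n in range(k)]
            mix = sum(joint)
            if mix > 0.0:
                for n in range(k):
                    new_alpha[n] += occurrences * joint[n] / mix
        norm = sum(new_alpha)
        alpha = [a / norm for a in new_alpha]
    return alpha

def draw(n_copies):
    """Synthetic number of localizations for a cluster with n_copies proteins."""
    total = sum(max(1, round(random.lognormvariate(MU, SIGMA)))
                for _ in range(n_copies))
    return min(total, X_MAX)

random.seed(0)
data = [draw(1) for _ in range(2100)] + [draw(2) for _ in range(900)]
histogram = Counter(data)  # p(x): number of occurrences of each x

f1 = lognormal_pmf(MU, SIGMA, X_MAX)
components = [f1]
for _ in range(2):  # build f_2 and f_3 from f_1
    components.append(convolve(components[-1], f1, X_MAX))

alpha = fit_weights(histogram, components)
percentages = [100.0 * a for a in alpha]
print([round(p, 1) for p in percentages])
```

The resulting weights, expressed as percentages, correspond to the share of monomers, dimers and trimers in the synthetic sample.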
Oligomeric state as used herein relates to the formation of a macromolecular complex formed by non-covalent bonding of a few proteins. Dimers, trimers, and tetramers are, for instance, oligomers composed of two, three and four monomers, respectively.
“Objective function”, as used herein, relates to an equation to be optimized given certain constraints and with variables that need to be minimized or maximized using nonlinear programming techniques. The objective function indicates how much each variable contributes to the value to be optimized in the problem. In a preferred embodiment, the objective function is likelihood, entropy or any combination thereof.
All the terms and embodiments previously described are equally applicable to this aspect of the invention.
In another aspect, the invention relates to a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out steps d) and e) of the method for obtaining a calibration curve for quantifying protein copy number in immunofluorescence-based super resolution microscopy, to carry out the method for quantifying protein copy number in a sample imaged with super resolution microscopy and/or to carry out the method for determining the percentage of oligomeric state of a protein in a sample imaged with super-resolution microscopy.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
All the terms and embodiments previously described are equally applicable to this aspect of the invention.
In another aspect, the invention relates to a kit comprising
a) a DNA origami attachable to a support comprising handle sequences protruding from said DNA and at least one tag, optionally the DNA origami is protected from degradation
b) reagents suitable for obtaining a super resolution image of a protein of interest, and
c) a computer-readable medium comprising instructions which, when executed by a computer, cause the computer to carry out the methods of the invention.
In the context of the present invention, “kit” is understood as a product containing the different reagents necessary for carrying out the methods of the invention packed so as to allow their transport and storage. Additionally, the kits of the invention can contain instructions for the simultaneous, sequential or separate use of the different components which are in the kit. Said instructions can be in the form of printed material or in the form of an electronic support capable of storing instructions susceptible of being read or understood, such as, for example, electronic storage media (e.g. magnetic disks, tapes), or optical media (e.g. CD-ROM, DVD), or audio materials. Additionally or alternatively, the media can contain internet addresses that provide said instruction.
In a preferred embodiment, the reagents comprise at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90% or at least 100% of the total amount of reagents forming the kit.
In a preferred embodiment, if the DNA origami is not attached to a support, the DNA origami is protected from degradation. By way of illustrative, non-limitative example, the DNA origami may be covered with a polymer to protect it from degradation.
In a preferred embodiment, the reagents for obtaining a super resolution image of a protein of interest comprise
a) an antibody or nanobody specific for the protein of interest having at least one fluorophore, or
b) a primary antibody specific for the protein of interest and a secondary antibody having at least one fluorophore.
In a more preferred embodiment, said fluorophore is a photoswitchable fluorophore.
In another preferred embodiment, the DNA origami is attached to a support, more preferably to a cover slip.
In another preferred embodiment, the kit comprises additional reagents such as imaging buffer.
In another preferred embodiment, the DNA origami of the kit comprises 12 parallel DNA double helices. In another preferred embodiment, the tag of the DNA origami is localized at position 14 of each of the outer helices.
As used herein, the term “antibody” refers to immunoglobulin molecules and immunologically active portions of immunoglobulin molecules, i.e., molecules containing an antigen fixing site binding specifically (immunoreacting) with an antigen, such as a protein for example. There are 5 isotypes or main classes of immunoglobulins: immunoglobulin M (IgM), immunoglobulin D (IgD), immunoglobulin G (IgG), immunoglobulin A (IgA) and immunoglobulin E (IgE).
The antibodies that are going to be used in the present invention can be, for example, polyclonal sera, hybridoma supernatants or monoclonal antibodies, antibody fragments, Fv, Fab, Fab′ and F(ab′)2, scFv, diabodies, triabodies, tetrabodies and humanized antibodies.
The suitable conditions for the formation of the antibody:protein complex are known by the person skilled in the art. If the sample containing cells contains histone proteins, then the corresponding antibody:protein complex will be formed.
“Fluorophore”, as used herein, refers to entities that can emit light of a certain emission wavelength when exposed to a stimulus, for example, an excitation wavelength.
“Photoswitchable” as used herein, relates to an entity which can be switched between different light-emitting or non-emitting states by incident light of different wavelengths. Typically, a “switchable” entity can be identified by one of ordinary skill in the art by determining conditions under which an entity in a first state can emit light when exposed to an excitation wavelength, switching the entity from the first state to the second state, e.g., upon exposure to light of a switching wavelength, then showing that the entity, while in the second state can no longer emit light (or emits light at a reduced intensity) or emits light at a different wavelength when exposed to the excitation wavelength. Examples of switchable entities are disclosed in WO 2008/091296. As a non-limiting example of a switchable fluorophore, Cy5 can be switched between a fluorescent and a dark state in a controlled and reversible manner by light of different wavelengths, e.g., 633 nm or 657 nm red light can switch or deactivate Cy5 to a stable dark state, while 405 nm or 532 nm light can switch or activate the Cy5 back to the fluorescent state.
In some cases, the fluorophore can be reversibly switched between the two or more states, e.g., upon exposure to the proper stimuli. For example, a first stimulus (e.g., a first wavelength of light) may be used to activate the switchable fluorophore, while a second stimulus (e.g., a second wavelength of light) may be used to deactivate the switchable fluorophore, for instance, to a non-emitting state. Any suitable method may be used to activate the fluorophore. For example, in one embodiment, incident light of a suitable wavelength may be used to activate the entity to emit light, i.e., the entity is photoswitchable. Thus, the photoswitchable fluorophore can be switched between different light-emitting or non-emitting states by incident light, e.g., of different wavelengths. The light may be monochromatic (e.g., produced using a laser) or polychromatic.
In another embodiment, the entity may be activated upon stimulation by electric field and/or magnetic field. In other embodiments, the entity may be activated upon exposure to a suitable chemical environment, e.g., by adjusting the pH, or inducing a reversible chemical reaction involving the entity, etc.
Similarly, any suitable method may be used to deactivate the entity, and the methods of activating and deactivating the entity need not be the same. For instance, the entity may be deactivated upon exposure to incident light of a suitable wavelength, or the entity may be deactivated by waiting a sufficient time.
In some embodiments, the switchable entity includes a first, light-emitting portion (e.g., a fluorophore), and a second portion that activates or “switches” the first portion.
Upon exposure to light, the second fluorophore may activate the first fluorophore, causing the first fluorophore to emit light. Examples of activator fluorophores include, but are not limited to, Alexa Fluor 405 (Invitrogen), Alexa 488 (Invitrogen), Cy2 (GE Healthcare), Cy3 (GE Healthcare), Cy3.5 (GE Healthcare), or Cy5 (GE Healthcare), or other suitable dyes. Examples of light-emitting portions include, but are not limited to, Cy5, Cy5.5 (GE Healthcare), or Cy7 (GE Healthcare), Alexa Fluor 647 (Invitrogen), or other suitable dyes. These may be linked together, e.g., covalently, for example, directly, or through a linker, e.g., forming compounds such as, but not limited to, Cy5-Alexa Fluor 405, Cy5-Alexa Fluor 488, Cy5-Cy2, Cy5-Cy3, Cy5-Cy3.5, Cy5.5-Alexa Fluor 405, Cy5.5-Alexa Fluor 488, Cy5.5-Cy2, Cy5.5-Cy3, Cy5.5-Cy3.5, Cy7-Alexa Fluor 405, Cy7-Alexa Fluor 488, Cy7-Cy2, Cy7-Cy3, Cy7-Cy3.5, or Cy7-Cy5. In a more preferred embodiment the first fluorophore (activator) is Alexa 405 and the second fluorophore is Alexa 647.
Any suitable method may be used to link the first, light-emitting fluorophore and the second, activation fluorophore. In some cases, a linker is chosen such that the distance between the first and second fluorophore is sufficiently close to allow the activator fluorophore to activate the light-emitting fluorophore as desired, e.g., whenever the light-emitting fluorophore has been deactivated in some fashion. Typically, the fluorophores will be separated by distances on the order of 500 nm or less, for example, less than about 300 nm, less than about 100 nm, less than about 50 nm, less than about 20 nm, less than about 10 nm, less than about 5 nm, less than about 2 nm, less than about 1 nm, etc. Examples of linkers include, but are not limited to, carbon chains (e.g., alkanes or alkenes), polymer units, or the like.
The switchable entity may comprise a first fluorophore directly bonded to the second fluorophore, or the first and second entity may be connected via a linker or a common entity. Whether a pair of light-emitting portion and activator portion produces a suitable switchable entity can be tested by methods known to those of ordinary skill in the art. For example, light of various wavelengths can be used to stimulate the pair and emission light from the light-emitting portion can be measured to determine whether the pair makes a suitable switch.
Additional details about fluorophores can be found in WO2009/085218.
“Computer-readable medium”, as used herein, relates to a medium that may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In another aspect, the invention relates to the use of a kit of the invention for obtaining a calibration curve for quantifying protein copy number in immunofluorescence-based super resolution microscopy, for quantifying protein copy number in a sample imaged with super resolution microscopy and for determining the percentage of oligomeric state of a protein in a sample imaged with super-resolution microscopy.
All the terms and embodiments previously described are equally applicable to this aspect of the invention.
The invention will be described by way of the following examples, which are to be considered as merely illustrative and not limitative of the scope of the invention.
Imaging was performed on an inverted Nikon Eclipse Ti microscope (Nikon Instruments). The excitation module is equipped with four excitation laser lines: 405 nm (100 mW, OBIS Coherent, Calif.), 488 nm (200 mW, Coherent Sapphire, Calif.), 561 nm (500 mW MPB Communications, Canada) and 647 nm (500 mW MPB Communications, Canada). The laser beam power was regulated through AOMs (AA Opto Electonics MT80 A1,5 Vis) and different wavelengths were mixed and coupled into the microscope objective through dichroic mirrors. The focus was locked through the Perfect Focus System (Nikon) and imaging was performed on an EmCCD camera (Andor iXon X3 DU-897, Andor Technologies). Fluorescence emitted signal was spectrally filtered by a Quad Band filter (ZT405/488/561/647rpc-UF2, Chroma Technology) and selected by an emission filter (ZET405/488/561/647m-TRF, Chroma). For single molecule detection the emitted light was acquired at 25 Hz by an oil immersion objective (Nikon, CFI Apo TIRF 100×, NA 1.49, Oil) providing a corresponding pixel size of 157 nm.
12-helix bundle DNA origami chassis structures were prepared using p8064 scaffold and oligonucleotide staple sequences as previously described (Derr et al., 2012). Briefly, 100 nM scaffold (Tilibit Nanosystems), was mixed with 600 nM core staples (Life Technologies), 3.6 μM handle staples (IDT), and 9 μM TAMRA-labeled fluorophore anti-handles (IDT). Folding was performed in DNA origami folding buffer (5 mM Tris [pH 8.0], 1 mM EDTA and 16 mM MgCl2) with heating to 80° C. and cooling in single degree increments to 65° C. for 75 min, followed by cooling in single degree increments to 30° C. for 17.5 hr. Folded chassis were purified by glycerol gradient sedimentation by centrifugation (Lin, C. et al., 2013) through a 10-45% glycerol gradient in TBE buffer supplemented with 11 mM MgCl2 for 130 min at 242,704 g in a SW50.1 rotor (Beckman) at 4° C. and collected in fractions. Fractions were evaluated with 2% agarose gel electrophoresis and the fractions containing well-folded monomeric chassis were collected.
The following handle sequences were used for binding either AlexaFluor 647 complementary anti-handles or anti-handle labeled dynein motors. The sequence portion in black is complementary to the scaffold, while the underlined sequence is complementary to the anti-handle:
TCTACC (SEQ ID NO: 1)
TCTACC (SEQ ID NO: 2)
TCTACC (SEQ ID NO: 3)
TCTACC (SEQ ID NO: 4)
TCTACC (SEQ ID NO: 5)
The following sequences were used for functionalizing the chassis structures with biotin for immobilization on streptavidin functionalized surfaces
Complementary oligonucleotides A* (NH2-GGTAGAGTGGTAAGTAGTGAA, SEQ ID NO: 10) were incubated with BG-GLA-NHS (NEB) by mixing 16 μL of A* (2 mM), 32 μL of HEPES pH 8.5 (200 mM) and 8 μL of BG-NHS (20 mM) at room temperature for 30 min. Oligos were filtered through a 0.1 Ultrafree MC Durapore membrane (Millipore) and purified using MicroBioSpin6 columns (BIORAD) previously equilibrated in protein buffer (10 mM Tris pH 8, 150 mM KCl, 10% v/v glycerol).
Dynein was purified as previously published (Torreno-Pina, J. A. et al, 2014) and labeled with BG-oligos while attached to IgG sepharose beads during the purification (Qiu, W. et al. 2012). Briefly, yeast (RPY1084 [Derr et al, 2012]) was grown overnight (200 rpm, 30° C.) in YPD (2% glucose) and the cultures were poured into YPR (2% raffinose). The yeast culture was transferred into YP media (supplemented with 8 ml 200× adenine and 2% galactose) and grown for a further 24 h. Cells were pelleted twice (6,000 rpm, 6 min, 4° C.) and frozen at −80° C. Ground cells were diluted in Dynein lysis buffer (30 mM HEPES (pH 7.2), 50 mM KAcetate, 2 mM MgAcetate, 1 mM EGTA, 10% glycerol, 1 mM DTT, 0.5 mM Mg-ATP, 1 mM Pefabloc) and spun for 1 hour at 60K rpm at 4° C. The supernatant was incubated with equilibrated IgG Sepharose beads and nutated for 1-2 hours at 4° C. After nutation, beads were washed twice with 1×TEV buffer (10 mM Tris (pH 8.0), 150 mM KCl, 0.5 mM ATP, 1 mM DTT, 1 mM Pefabloc) and then incubated (20 minutes at RT) with BG-oligonucleotides (20 μM). After functionalization, beads were washed three times in TEV buffer and incubated in TEV protease (1:100 in TEV buffer) for 1 hour at 16° C. with slow rotation. Beads were removed with centrifugal filters and the protein was concentrated with Amicon 100K and frozen in liquid N2. Concentration of the purified dynein (330 nM (
A LabTek chamber (No. 1.0, 8 well) was rinsed with KOH (1M) and PBS three times. The coverglass was incubated with 100 μL of streptavidin (0.5 mg/mL in PBS) for 20′ and washed 3 times with PBS. The coverglass was subsequently incubated with 100 μL of BSA-Biotin (0.5 mg/mL in PBS) for 20′, extensively washed in PBS, and incubated with fiducial markers (Carboxyl Fluorescent Particles, Yellow, 1% w/v, Spherotech SPH-CFP-0252-2, diameter 111 nm, diluted 1:25000 in PBS). Blocking of the coverglass was performed in blocking buffer containing 10% (wt/vol) BSA (Sigma) in DAB solution (30 mM Hepes, 50 mM KAcetate, 2 mM MgAcetate, 1 mM EGTA 7.5, 10% glycerol, 1 mM DTT, 1 mM Mg-ATP, 2.5 mg/mL casein) for 10 minutes at room temperature. Biotin-labeled DNA structures were then incubated on ice for 30′ with oligo-functionalized dynein (300 nM), diluted up to 30 μM concentration in blocking buffer and incubated on the coverslip for 5 min. Structures were then washed twice and blocked in blocking buffer for 15 min at 4° C. Immunostaining was performed by incubation with primary antibody (chicken polyclonal anti GFP, Abcam 13970) diluted 1:2000 in blocking buffer for 1 h at 4° C. Samples were rinsed 3 times in blocking buffer and incubated for 1 h at 4° C. with donkey anti-chicken secondary antibodies (1:50 in blocking buffer) labeled with the photoactivatable dye pair for STORM, Alexa Fluor 405-Alexa Fluor 647. For experiments on BSC-1 cells (from ATCC, #CCL-26), cells were plated (30,000 seeding density) on an 8-well Lab-tek 1 coverglass chamber (Nunc), grown under standard conditions, fixed with Methanol-Ethanol (1:1) at 20° C. for 2′, incubated for 5′ with DNA origami (1 motor attached) and rinsed 3 times in DAB solution.
Human osteosarcoma U2OS cells (from ATCC, #HTB-96) were plated (30,000 seeding density) on 8-well LabTek chambered coverglass (Nunc) and grown under standard conditions (DMEM, high glucose, pyruvate (Invitrogen 41966052) supplemented with 10% FBS). U2OS cells were chosen since they perform well for transfection and siRNA knockdown of Nup. For GFP-tagged Nup107 and GFP-tagged Nup133 experiments, cells were transfected with the constructs (Szymborska, A. et al., 2013) (plasmid from Jan Ellenberg, EMBL, Heidelberg, pEGFP-Nup107-s 32727res, Euroscarf plasmid ref. P0729 and pmEGFP-Nup133-s31401res, Euroscarf plasmid ref. P30728) using Fugene (FUGENE HD Transfection Reagent, Roche 04709705001). Incorporation of the GFP-tagged Nup into the pore was facilitated by depletion of the endogenous protein, performed by RNA interference, transfecting the cells after 24 h with a matching siRNA (Nup107 siRNA s32727 and Nup133 siRNA s31401, Thermo Fisher, Silencer Select siRNA s32727 and Silencer Select siRNA s31401; for Nup107 and Nup133, 3 pmol of siRNA per well was used). After 70 h cells were rinsed with PFA 3%, extracted with 0.2% Triton X-100 in PBS for 2 min and fixed with PFA (3%) for 7′. Immunostaining of Nup107-green fluorescent protein (GFP) fusion protein was performed using immunofluorescence as described above. Cell lines were regularly tested for mycoplasma contamination by PCR-based standard methods (ATCC, Universal Mycoplasma Detection Kit, 30-1012K).
The imaging conditions were kept constant for all the experiments.
Imaging was performed using TIRF illumination with an excitation intensity of ~1 kW/cm2 for the 647 nm readout laser line and ~25 W/cm2 for the 405 nm laser line. 85,000 frames were acquired at a 25 Hz frame rate. For dual color imaging of DNA origami structures, the fluorescence signal from TAMRA was acquired with the 561 nm laser (intensity of ~200 W/cm2). STORM imaging buffer was used, containing GLOX solution as oxygen scavenging system (40 mg/mL catalase [Sigma], 0.5 mg/mL glucose oxidase, 10% glucose in PBS) and MEA 10 mM (Cysteamine MEA [SigmaAldrich, #30070-50G] in 360 mM Tris-HCl).
Analysis and reconstruction of super-resolution images were performed using custom software (Insight3, kindly provided by Bo Huang, University of California) by Gaussian fitting of the single-molecule images to calculate the molecular localization coordinates. Molecules are identified by a threshold and the positions x and y are extracted by fitting with a simple Gaussian function. The final image is obtained by plotting each identified molecule as a Gaussian spot with a width corresponding to the localization precision (9 nm) and is finally corrected for drift. Molecules appearing within a distance of 9 nm are merged and considered as the same molecule. Spatial clusters of localizations were identified with a distance-based clustering algorithm, by means of custom code written in MATLAB (Puchner, E. M. et al., 2013). The localizations list was first binned to 20 nm pixel size images that were filtered with a square kernel (7×7 pixels) and thresholded to obtain a binary mask. Specifically, a density map was built by 2-dimensional convolution of the localization images with a square kernel (7×7 pixels) and a constant threshold was used to digitize the maps into binary images. The low-density areas, where the density is lower than the threshold value and a value of 0 is assigned, are discarded from further analysis. Only the components of the binary image where adjacent (6-connected neighbours) non-zero pixels were found are analysed. A peak finding routine provides the number of clusters and the relative centroid coordinates from the maxima of the density map in the connected regions. Molecular localizations lying over connected regions of the mask were assigned to each cluster using a distance-based algorithm, depending on their proximity to the cluster centroids.
For each cluster, its centroid position is iteratively re-calculated and saved for further analysis until convergence of the sum of the squared distances between localizations and the associated cluster is reached. The cluster centroid positions, the number of localizations obtained per cluster and the cluster size are saved.
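By way of illustration, the first stages of this cluster analysis (binning to 20 nm pixels, filtering with a 7×7 kernel, thresholding and identification of connected regions) can be sketched as follows. This is not the MATLAB code referred to above: the sketch stops at connected regions and omits the peak finding and the iterative centroid refinement, it uses 8-connectivity as an assumption, and the threshold value, the field size and the localization coordinates are hypothetical.

```python
from collections import deque

PIXEL_NM = 20  # bin size stated in the text
KERNEL = 7     # side of the square kernel stated in the text

def density_map(locs, width, height):
    """Bin (x, y) localizations in nm into pixels, then sum over a 7x7 window."""
    counts = [[0] * width for _ in range(height)]
    for x, y in locs:
        col, row = int(x // PIXEL_NM), int(y // PIXEL_NM)
        if 0 <= row < height and 0 <= col < width:
            counts[row][col] += 1
    half = KERNEL // 2
    dens = [[0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            dens[r][c] = sum(
                counts[rr][cc]
                for rr in range(max(0, r - half), min(height, r + half + 1))
                for cc in range(max(0, c - half), min(width, c + half + 1))
            )
    return dens

def label_regions(mask):
    """Label connected regions of a binary mask (8-connectivity, an assumption)."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    current = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and not labels[r][c]:
                current += 1
                labels[r][c] = current
                queue = deque([(r, c)])
                while queue:
                    qr, qc = queue.popleft()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            nr, nc = qr + dr, qc + dc
                            if (0 <= nr < height and 0 <= nc < width
                                    and mask[nr][nc] and not labels[nr][nc]):
                                labels[nr][nc] = current
                                queue.append((nr, nc))
    return labels, current

def localizations_per_region(locs, width, height, threshold):
    """Number of localizations falling in each connected region of the mask."""
    dens = density_map(locs, width, height)
    mask = [[d >= threshold for d in row] for row in dens]
    labels, n_regions = label_regions(mask)
    counts = [0] * n_regions
    for x, y in locs:
        col, row = int(x // PIXEL_NM), int(y // PIXEL_NM)
        if 0 <= row < height and 0 <= col < width and labels[row][col]:
            counts[labels[row][col] - 1] += 1
    return counts

# Hypothetical data: two dense groups and one isolated localization (nm).
locs = [(205, 205)] * 10 + [(705, 705)] * 5 + [(950, 100)]
counts = localizations_per_region(locs, width=50, height=50, threshold=3)
print(counts)
```

The isolated localization falls in a low-density area and is discarded, while the two dense groups are reported with their number of localizations.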
For DNA origami calibration, dual-color cluster analysis first allowed the identification of the TAMRA signal (used as a reference to identify the DNA origami structures) and of the dynein clusters attached to the same DNA origami. To consider only the signal belonging to motors attached to DNA origami structures, only clusters separated by less than 200 nm between the two channels were retained for further analysis. Clusters identifying single, double and triple motors were then sorted according to the number of motors attached. An additional filter was applied to select structures with the expected handle-to-handle distance (85±7 nm and 157±17 nm). To ensure statistical significance, the inventors chose a sample size giving a power value close to 1 (the total number of DNA origami considered was N=3077, N=1153, N=250 for single, double and triple motors, respectively).
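The two distance filters described above may be illustrated by the following Python sketch. The function names are hypothetical, and treating the handle-to-handle criterion as a test on the distance between two cluster centroids is an assumption of this example.

```python
import math


def colocalized_clusters(dynein_centroids, tamra_centroids, max_dist_nm=200.0):
    """Keep dynein clusters lying closer than max_dist_nm to a TAMRA cluster."""
    return [d for d in dynein_centroids
            if any(math.dist(d, t) < max_dist_nm for t in tamra_centroids)]


def matches_expected_spacing(p, q, expected=((85.0, 7.0), (157.0, 17.0))):
    """True if the distance between two centroids (nm) falls within one of
    the expected handle-to-handle windows (centre, tolerance)."""
    d = math.dist(p, q)
    return any(abs(d - centre) <= tol for centre, tol in expected)
```

For example, with a TAMRA cluster at (150, 0), a dynein cluster at (0, 0) is retained whereas one at (1000, 0) is rejected.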
The distributions of the number of localizations per cluster obtained for DNA origami structures showing 1, 2 and 3 dyneins (corresponding to 2, 4 and 6 GFPs, respectively) were used as a calibration standard. To this end, the inventors considered that the distribution of the number of localizations for a structure composed of n GFPs can be obtained recursively as
$$f_n = f_{n-1} \otimes f_1$$
where ⊗ represents the convolution and ƒ1 is a log-normal distribution:
The distributions of localizations obtained for 1, 2 and 3 dyneins (n=2, 4, 6) were simultaneously fitted to the functions ƒ2, ƒ4 and ƒ6, yielding the parameters μ1=3.35 and μ2=0.85. The same parameters were used for all the other fittings. The log-normal distribution was chosen because, among several tested distributions, it provided the best model of the data.
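The recursive construction of the calibration functions can be illustrated numerically. The sketch below assumes that ƒ1 takes the standard log-normal form, with μ1 acting as the mean and μ2 as the standard deviation of the logarithm of the number of localizations; that reading of the two parameters is an assumption of this example. The density is sampled at integer counts and renormalized, and the names `lognormal_pmf`, `convolve` and `calibration_family`, as well as the truncation at `x_max`, are hypothetical.

```python
import math

MU1, MU2 = 3.35, 0.85  # fitted values reported in the text


def lognormal_pmf(mu, sigma, x_max=500):
    """Log-normal density sampled at counts 1..x_max and renormalized.

    Index 0 is kept (with probability 0) so that list index == count.
    """
    pmf = [0.0] + [
        math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2))
        / (x * sigma * math.sqrt(2 * math.pi))
        for x in range(1, x_max + 1)
    ]
    total = sum(pmf)
    return [p / total for p in pmf]


def convolve(a, b):
    """Full discrete convolution of two probability lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def calibration_family(n_max, mu=MU1, sigma=MU2, x_max=500):
    """Return {n: f_n} with f_n = f_(n-1) convolved with f_1."""
    f1 = lognormal_pmf(mu, sigma, x_max)
    family = {1: f1}
    for n in range(2, n_max + 1):
        family[n] = convolve(family[n - 1], f1)
    return family
```

Because each ƒn is a full convolution of normalized lists, it remains normalized and its mean is n times the mean of ƒ1, which is the property that makes the family usable as a counting standard.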
For a general distribution of number of localizations, the copy number of a given protein can thus be estimated by fitting the distributions to a linear combination of the “calibration” distributions ƒn
where αn represents the weight of the distribution of n-mers and Σn=1N
To estimate the number of motors attached to the DNA origami chassis, the fit was performed considering only dimers (linear combination of distributions ƒn with even values n=2, 4, 6, 8, . . . , 2k), given the dimeric nature of the motors, which contain two copies of GFP per motor. For NPC estimation, the fit was performed considering n monomers (linear combination of distributions ƒn with values n=1, 2, 3, 4, . . . , k).
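As a non-limiting illustration of the linear combination described above, the following Python sketch evaluates a weighted mixture of calibration distributions and selects the orders n used for dimeric or monomeric fits. The function names are hypothetical, the calibration distributions are passed in as plain lists indexed by the number of localizations, and the weighted mean of n is a derived summary used here for illustration only.

```python
def allowed_orders(k, dimeric):
    """Orders n entering the fit: even values up to 2k for dimeric motors,
    or 1..k when monomers are considered."""
    return [2 * i for i in range(1, k + 1)] if dimeric else list(range(1, k + 1))


def mixture_pmf(weights, family):
    """Weighted sum of calibration distributions.

    weights: {n: alpha_n}, assumed non-negative and summing to 1.
    family:  {n: f_n}, each f_n a list indexed by number of localizations.
    """
    length = max(len(family[n]) for n in weights)
    mix = [0.0] * length
    for n, alpha in weights.items():
        for x, p in enumerate(family[n]):
            mix[x] += alpha * p
    return mix


def mean_copy_number(weights):
    """Average copy number implied by the weights, sum of n * alpha_n."""
    return sum(n * alpha for n, alpha in weights.items())
```

With toy distributions `{2: [0, 1.0], 4: [0, 0, 1.0]}` and weights `{2: 0.25, 4: 0.75}`, the mixture assigns probability 0.25 to one localization and 0.75 to two, and the implied mean copy number is 3.5.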
Fittings are performed by a two-step numerical minimization of the objective function:
which represents the sum of the negative log-likelihood and the entropy. In the first term, p(x) corresponds to the number of occurrences of the number of localizations x. In the first optimization step, the inventors set
with x representing the average value of the data, and let the optimization run at varying Nmax until the minimum of the objective function, Fmin, is found. By means of this procedure, the inventors calculate the maximum number of log-likelihood functions necessary to satisfactorily fit the data. Once this number is determined, the inventors further refine the fit by performing a second optimization step, in which the weight of the log-likelihood is set to the inverse of its target value, wL=1/Fmin. When fitting distributions involving the linear combination of only dimeric terms (n=2, 4, 6, . . . , 2k), in the second optimization step the inventors further allow the parameters μ1 and μ2 to vary slightly, constrained to a maximum tolerance of 5%, in order to compensate for the reduced number of degrees of freedom. The errors on the estimated weights αn were calculated from the reciprocal of the diagonal elements of the Fisher information matrix and thus represent a lower bound to the standard error of the estimators.
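For illustration only, the estimation of the weights αn can be sketched with a simpler, swapped-in technique: an expectation-maximization (EM) update that minimizes the negative log-likelihood alone, with the calibration distributions held fixed. This is not the two-step procedure of the inventors; the entropy term, the weight wL, the search over Nmax and the 5% tolerance on μ1 and μ2 are all omitted, and the function names are hypothetical.

```python
import math


def neg_log_likelihood(counts, components, weights):
    """-sum_x p(x) * log(sum_n alpha_n f_n(x)), with p(x) the occurrences."""
    total = 0.0
    for x, c in counts.items():
        mix = sum(w * (f[x] if x < len(f) else 0.0)
                  for w, f in zip(weights, components))
        total -= c * math.log(mix)
    return total


def fit_weights_em(counts, components, n_iter=500):
    """Maximum-likelihood mixture weights via EM, components held fixed.

    counts:     {x: number of clusters showing x localizations}
    components: list of calibration distributions, each indexed by x
    """
    k = len(components)
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        new = [0.0] * k
        for x, c in counts.items():
            probs = [w * (f[x] if x < len(f) else 0.0)
                     for w, f in zip(weights, components)]
            s = sum(probs)
            if s <= 0.0:
                continue  # x cannot be explained by any component
            for i, p in enumerate(probs):
                new[i] += c * p / s  # responsibility-weighted counts
        norm = sum(new)
        weights = [v / norm for v in new]
    return weights
```

Each EM iteration cannot increase the negative log-likelihood, so `neg_log_likelihood` can be used to monitor convergence on a given data set.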
For Nup133 and Nup107 quantification, clustering analysis is carried out to segment single nuclear pores and the distribution of the number of localizations/NPC ring was filtered considering a minimum average cluster radius of 40 nm. The total number of nuclear pores analysed was N=1460 for Nup133 and N=855 for Nup107.
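The size filter mentioned above can be illustrated as follows. The text does not define how the average cluster radius is computed; this sketch assumes it is the mean distance of the localizations of a cluster from their centroid, and the function names are hypothetical.

```python
import math


def mean_radius(locs):
    """Mean distance (nm) of the localizations from their centroid."""
    cx = sum(x for x, _ in locs) / len(locs)
    cy = sum(y for _, y in locs) / len(locs)
    return sum(math.hypot(x - cx, y - cy) for x, y in locs) / len(locs)


def filter_by_radius(clusters, min_radius_nm=40.0):
    """Keep clusters whose mean radius is at least min_radius_nm."""
    return [c for c in clusters if mean_radius(c) >= min_radius_nm]
```

Under this assumption, a ring of localizations 50 nm from its centre passes the filter while a compact cluster of 10 nm radius is rejected.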
The DNA origami data used for calibration are obtained by 5 independent experiments and the total number of DNA origami structures imaged was N1=3077, N2=1153, N3=250 for single, double and triple motors, respectively (
Sorted data were used to quantify Nup133 (
The box plots (
The inventors performed a chi-square test to verify the matching of the data to a binomial distribution in all cases (
Performances of the method and the correlation between estimated and actual values at varying statistics and stoichiometry have been characterized calculating the Pearson correlation coefficient R (
The error bars on stoichiometry estimation correspond to the lower bound to the standard errors based on the Fisher Information Matrix (
To overcome these challenges and thus develop versatile calibration standards that can be used for quantifying protein copy-number in intracellular contexts, the inventors took advantage of DNA origami. Specifically, the inventors used a previously developed 3D DNA origami chassis composed of 12 parallel DNA double helices. This chassis serves as a skeleton for attaching additional components via the use of “handle” sequences that project outward from the structure (Derr, N. D. et al., 2012). These handles provide site- and sequence-specific attachment points for single fluorophores as well as proteins of interest and allow testing of several different labeling strategies such as antibody, nanobody and Halo/SNAP tag labeling (
Chassis were functionalized at three positions with dynein labelled through GFP immunostaining with AlexaFluor 405/AlexaFluor 647, and the number of localizations was calculated from origami images containing single, double and triple clusters corresponding to single dynein (2 GFPs), double dyneins (4 GFPs) and triple dyneins (6 GFPs) (median values, standard deviations and number of clusters analyzed are reported in the 2nd row). The sample size chosen was sufficient to ensure (for α=0.05) a statistical power value of 1, and the total number of clusters analysed was N=3077, N=1153, N=250 for single, double and triple motors, respectively.
The inventors next purified a modified dimeric Saccharomyces cerevisiae dynein motor (Reck-Peterson, S. L. et al., 2006) (Methods and
To determine whether this method could be used not only to extract average protein copy-numbers but also the percentage of each oligomeric state, the inventors further explored whether the distribution of the number of localizations per cluster could be fitted to a functional form. Indeed, the distributions of localizations for single, double or triple clusters (corresponding to 2, 4 and 6 copies of GFP, respectively) could be simultaneously fitted using only 2 free parameters (μ1 and μ2), assuming that they correspond to the convolutions of 2, 4 and 6 functions ƒ1, respectively, where ƒ1 is a log-normal distribution describing the probability distribution of the number of localizations obtained by labeling monomeric GFP with A647-conjugated antibodies (see Methods,
The inventors finally applied this calibration method to determine copy-numbers of protein complexes imaged in cells. As a first test of a biological structure, the inventors performed immunofluorescence of the nuclear pore complex (NPC) subunit Nup133 fused to GFP, expressed in U2OS cells in the presence of siRNA to knock down the endogenous copy of Nup133 (
In conclusion, the inventors developed a versatile calibration standard that can be used to quantify protein copy number from super-resolution images obtained after immunofluorescence labeling. Interestingly, the calibration curve obtained for GFP antibody labeling was mostly linear over a range of up to 10 GFPs, suggesting that the antibody labeling is efficient and not affected by crowding and steric hindrance. The use of GFP antibodies provides a particularly versatile strategy for quantifying a large number of proteins of interest using the calibration curve reported here. To do so, it is important that the same imaging and image analysis conditions be used as detailed in the Methods and Protocol. The inventors used standard imaging buffers, laser powers and acquisition settings that are typical for STORM experiments. Finally, the method the inventors developed is not limited to GFP antibodies and is also applicable to antibodies against any endogenous protein of interest. In addition, it can be used to calibrate nanobody labeling, Halo or SNAP-tag fusions, and photoactivatable and photoswitchable fluorescent proteins.
Number | Date | Country | Kind |
---|---|---|---|
17382394.9 | Jun 2017 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2018/066708 | 6/22/2018 | WO | 00 |