The present invention relates to a method of selective sequestration, separation, extraction and purification of target molecules, and associated uses, compositions and kits.
Purification of biomolecules is a key requirement for a broad range of research, diagnostic and commercial applications. Traditional approaches involve procedures that can be laborious, require expert knowledge and often entail exposing biomolecules to non-physiological conditions. Existing approaches also assume cells have to be lysed before biomolecules can be purified. In particular, the typical methods for separating nucleic acids and proteins on a preparative or analytical scale rely on lysis of the cell followed by extraction with organic solvents (e.g. phenol/chloroform) or via solid-phase extraction. Such methods are time consuming and often provide only a low yield of the target molecule to be separated. Current solvent extraction techniques can also disrupt the native structure, intermolecular interactions and post-translational modifications of the molecule to be separated. Some biomolecules are also very difficult to separate from complex mixtures and require several steps of purification.
An aim of the present invention is to provide an alternative or improved method of separation, extraction and purification of target molecules.
According to the first aspect of the invention, there is provided a method of selective separation of target molecules from a heterogeneous mixture of target molecules and non-target molecules in a solution, the method comprising:
The proteinaceous phase may be bi-phasic. For example, the self-assembling proteins of the proteinaceous phase may associate or disassociate from each other to form, re-form or disperse the proteinaceous phase depending on the solution conditions. Therefore, in one embodiment the proteinaceous phase is provided in the solution by providing the self-assembling proteins in liquid phase in the solution, and allowing or inducing phase transition by the assembly of the self-assembling proteins into a matrix to form the proteinaceous phase.
A proteinaceous phase may form a globule, herein referred to as a “proteinaceous phase globule”. A “proteinaceous phase” or “proteinaceous phase globule” may also be referred to as a “membraneless organelle”.
The invention is a novel variation of solvent extraction in which the solvent is a proteinaceous liquid phase, and it is the first demonstration of solvent extraction using liquid droplets composed of disordered proteins. An advantage of the invention lies in the ability to separate nucleic acids based on their structure (e.g. double-stranded, single-stranded or higher structure, such as hairpin nucleic acid, or protein structure) and on their length (e.g. separation of a 12mer duplex from a 24mer duplex). Another advantage is that the separation proceeds in an aqueous environment, which maintains the native structure, intermolecular interactions and post-translational modifications of the target molecule to be separated. Advantageously, the proteinaceous phase globules can destabilise one of the most stable biological structures known, the DNA double helix, by up to 6 orders of magnitude while simultaneously stabilising the structures formed by single-stranded nucleic acids such as regulatory RNAs. The reversibility of the phase transition allows controlled dissolution of the extraction solvent for downstream applications. Through directed mutagenesis, the extraction properties may be tuneable and evolvable, lending unprecedented control over the types of species extracted and the conditions under which the proteinaceous phase is formed.
Droplets of the self-assembling protein liquid phase are held together through many weak electrostatic interactions, formed by a pattern of charged and aromatic amino acids in the protein chain. The phase transition is initiated when a mono-disperse solution of the self-assembling protein is quenched below the phase boundary by adjustment of the conditions in the solution. Such conditions may comprise adjusting one or more of temperature, ionic strength, pH and protein concentration. Once formed, the proteinaceous phase (membraneless organelle) constitutes a dense, viscous liquid with a dielectric distinct from bulk water, capable of selectively absorbing (solubilising) or excluding target molecules.
Induction of the assembly of the self-assembling protein into the proteinaceous phase may comprise the induction of a phase transition of the self-assembling protein. The induction of the phase transition into the proteinaceous phase (i.e. phase transition from the self-assembling protein dispersed in a liquid phase to the assembly into a proteinaceous phase) may comprise modulation of salt concentration and/or temperature. Additionally or alternatively, the induction of the phase transition may comprise modification of the pH.
The skilled person will understand that the phase transition behaviour of any given self-assembling protein may be influenced by variation in one or more factors such as temperature, ionic strength, pH and protein concentration. Therefore, the phase transition of the self-assembling protein into the proteinaceous phase may be induced by a temperature between about 4° C. and about 50° C. The phase transition of the self-assembling protein into the proteinaceous phase may be induced by a temperature increase or decrease. The phase transition of the self-assembling protein into the proteinaceous phase may be induced by a pH increase or decrease. Additionally or alternatively, the ionic strength may be adjusted wherein the phase transition of the self-assembling protein into the proteinaceous phase may be induced by a salt (e.g. NaCl) concentration of between about 5 mM and about 5M. Additionally or alternatively, the ionic strength may be adjusted wherein the phase transition of the self-assembling protein into the proteinaceous phase may be induced by reducing a salt, such as NaCl, concentration. The ionic strength may be reduced by reducing NaCl concentration from about 300 mM to about 100 mM NaCl. Additionally or alternatively, the phase transition may be induced by a self-assembling protein concentration of between about 10 μM and about 1 mM. Description of the phase behaviour of a prototypical self-assembling protein is outlined in Nott et al., 2015, Molecular Cell 57, 936-947, which is herein incorporated by reference. Additionally or alternatively, the phase transition may be induced by pH conditions of between about pH2 and about pH10 (see
The self-assembling protein may comprise an intrinsically disordered protein, or fragment thereof, associated with membraneless organelles.
The self-assembling protein may comprise repeating 8-10 residue blocks of alternating net charge, and optionally an over-representation of FG, GF, RG, and GR motifs within the positively charged blocks. In this context, over-representation means ‘significantly more than would be expected by chance’, for example relative to an average human disordered protein sequence of the Uniprot database (www.uniprot.org/). The self-assembling protein may comprise FG and GF pairs spaced 8-11 residues apart and RG and GR pairs spaced 4 residues apart. The skilled person will be able to readily identify an appropriate self-assembling protein using the method of Nott et al., 2015, Molecular Cell 57, 936-947, which is herein incorporated by reference.
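By way of illustration only, the sequence features described above (blocks of alternating net charge and FG, GF, RG and GR motifs) could be screened with a short script such as the following. This is a minimal sketch and is not the method of Nott et al.; the function names, the block size of 10 residues and the simple charge model (K and R counted as +1, D and E as −1, all other residues as neutral) are assumptions made for the example, and the toy sequence is invented.

```python
# Hypothetical helper names; simple charge model assumed (K/R = +1, D/E = -1).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}
MOTIFS = ("FG", "GF", "RG", "GR")


def block_net_charges(seq, block=10):
    """Net charge of consecutive, non-overlapping blocks of `block` residues."""
    seq = seq.upper()
    return [
        sum(CHARGE.get(res, 0) for res in seq[i:i + block])
        for i in range(0, len(seq), block)
    ]


def count_sign_alternations(charges):
    """Number of adjacent block pairs whose net charges have opposite sign."""
    return sum(1 for a, b in zip(charges, charges[1:]) if a * b < 0)


def motif_counts(seq):
    """Count occurrences of each dipeptide motif along the whole sequence."""
    seq = seq.upper()
    return {
        m: sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == m)
        for m in MOTIFS
    }


# Toy sequence invented for the example; it is not a real protein.
toy = "RGFGRGKSGR" + "DEDSEDNEDS" + "GRFGKRGGFR" + "EEDDSSEDEN"
charges = block_net_charges(toy)
print(charges)
print(count_sign_alternations(charges))
print(motif_counts(toy))
```

A fixed block size is used here for simplicity; a sliding window, or a comparison of motif counts against a reference set of disordered sequences, may be more appropriate for assessing over-representation.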
The self-assembling protein may comprise a protein derived from nuage, nuclear bodies, nuclear speckles, the spliceosome, the nucleolus, Cajal bodies, P-bodies or stress granules. In one embodiment, the self-assembling protein comprises Ddx4 protein. The self-assembling protein may comprise the disordered N-terminus of Ddx4 protein. The disordered N-terminus may comprise residue numbers 1-236 of the Ddx4 protein, or variants and/or truncations thereof. The DEAD-box helicase of Ddx4 may be replaced by a protein or peptide. For example, the DEAD-box helicase may be substituted with a fluorescent marker protein such as GFP, RFP or YFP. In one embodiment the self-assembling protein does not comprise or consist of a fluorescent protein, such as GFP (green fluorescent protein). In another embodiment the self-assembling protein may comprise a fluorescent protein, such as GFP (green fluorescent protein), only in association (such as a fusion) with another self-assembling protein, such as Ddx4, for example where the DEAD-box helicase is substituted with a fluorescent protein.
In one embodiment the self-assembling protein does not comprise or consist of Ultrabithorax (Ubx).
In one embodiment the self-assembling protein does not directly bind to, or have affinity for, the target molecule. In one embodiment the self-assembling protein does not directly bind to, or have affinity for, the target molecule when the self-assembling protein is in its unassembled state. In another embodiment the self-assembling protein is not bound to the target molecule.
The self-assembling protein may comprise a human protein. In another embodiment, the self-assembling protein may comprise a non-human self-assembling protein. The non-human protein may be mammalian. In another embodiment, the non-human self-assembling protein may be bacterial. In another embodiment, the self-assembling protein may comprise a eukaryotic or prokaryotic protein.
The self-assembling protein may be selected from the group comprising Ddx4; Ddx3x; EWSR1; EIF4H; fragments/truncations and variants thereof; or combinations thereof.
The self-assembling protein may be selected from any one of the proteins listed in Table 1. The self-assembling protein may be a variant and/or truncation of a protein selected from any one of the proteins listed in Table 1. The proteins identified in Table 1 are human. However, equivalent non-human homologues may be provided for the self-assembling protein.
Variants of the self-assembling protein may include a protein having at least 70% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 75% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 80% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 85% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 90% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 95% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 98% sequence identity to any one protein identified in Table 1. In another embodiment, variants of the self-assembling protein may include a protein having at least 99% sequence identity to any one protein identified in Table 1. The sequence identity may be measured across the complete protein. In another embodiment, the sequence variation may be in regions of the protein outside of the repeating 8-10 residue blocks of alternating net charge, which optionally comprise an over-representation of FG, GF, RG, and GR motifs within the positively charged blocks.
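The sequence identity thresholds above can be checked with a calculation along the following lines. This is a minimal sketch that assumes the two sequences have already been aligned to equal length by a tool of the skilled person's choice; the function names and the convention of counting identical residues over all aligned columns (ignoring columns that are gaps in both sequences) are assumptions for the example, and other conventions for the denominator exist.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; identical residues
    are counted over the remaining aligned columns (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, threshold=70.0):
    """True if the pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned peptides invented for the example.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(meets_threshold("ACDEFGHIKL", "ACDEFGHIKV", threshold=95.0))
```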
Truncations of the self-assembling protein may include a protein having at least 20 amino acids of the proteins of Table 1, or variants or homologues thereof. Truncations of the self-assembling protein may include a protein having at least 30 amino acids of the proteins of Table 1, or variants or homologues thereof. Truncations of the self-assembling protein may include a protein having at least 40 amino acids of the proteins of Table 1, or variants or homologues thereof. Truncations of the self-assembling protein may include a protein having at least 50 amino acids of the proteins of Table 1, or variants or homologues thereof. Truncations of the self-assembling protein may include a protein having at least 60 amino acids of the proteins of Table 1, or variants or homologues thereof. Truncations of the self-assembling protein may include a protein having at least 80 amino acids of the proteins of Table 1, or variants or homologues thereof. Truncations of the self-assembling protein may include a protein having at least 100 amino acids of the proteins of Table 1, or variants or homologues thereof.
A variant of the self-assembling protein may alternatively comprise an elongated variant of the self-assembling protein.
In one embodiment, the self-assembling protein may comprise any one of EWSR1; LSM14A; NUCL; KHBRDS1; DDX4; DDX3X; RBM3; and EIF4H; or variants or homologues thereof; or combinations thereof. The variants/truncations mentioned above for the proteins of Table 1 may equally apply to these self-assembling proteins.
The self-assembling protein may be modified in order to control phase transition properties and/or target molecule absorption properties of the proteinaceous phase. In one embodiment, the self-assembling protein may be modified through the methylation of one or more arginine residues. The self-assembling protein may be modified through the methylation of 2 to 10 arginine residues. The self-assembling protein may be modified through the methylation of 5 to 6 arginine residues. In another embodiment, the self-assembling protein may be modified through the methylation of at least 2 arginine residues. In another embodiment, the self-assembling protein may be modified through the methylation of at least 4 arginine residues.
Advantageously, methylation significantly destabilises the proteinaceous phase globules, lowering the transition temperature, for example by as much as 25° C. The extent of the destabilisation of the droplets can be equivalent to adding 100 mM of additional salt to the solution.
In another embodiment, the self-assembling protein may be modified through mutation or deletion of one or more amino acid residues of the self-assembling protein. The skilled person will understand that an appropriate modification may be provided depending on the phase transition behaviour required. For example, Nott et al. (2015, Molecular Cell 57, 936-947) describes substitution of residues 132-166 of Ddx4, where a single aspartate residue significantly modifies the phase transition behaviour.
Similarly, mutation of phenylalanine residues to alanine or modification of phenylalanine residues through fluorination may further modify the phase behaviour. Rearrangement of charged residues may modify the phase transition behaviour. Such mutations and modifications may be made in combination. In one embodiment, one or more charged residues may be modified. Additionally, or alternatively, one or more aromatic residues may be modified. Equivalent mutations may be made to any one of the proteins described in Table 1.
A plurality of proteinaceous phase globules may be provided. The plurality of proteinaceous phase globules may be uniform in composition and/or size. In another embodiment, different proteinaceous phase species may be provided in the same solution. For example, a mixture of two or more different proteinaceous phase species may be provided. The different proteinaceous phase species may each target different target molecules for absorption and/or different non-target molecules for exclusion. The different proteinaceous phase globules may be provided by the use of a different self-assembling protein species for each proteinaceous phase species.
The proteinaceous phase may be at least 500 nm in size as determined by the largest dimension. The plurality of proteinaceous phase globules may be at least 500 nm in size as an average of the population as determined by the largest dimension.
The plurality of proteinaceous phase globules may comprise two or more globules per ml of solution. The plurality of proteinaceous phase globules may comprise three, four, five, six, seven, eight, nine, or ten or more globules per ml of solution. The plurality of proteinaceous phase globules may comprise 100 or more globules per ml of solution. The plurality of proteinaceous phase globules may comprise 1000 or more globules per ml of solution. The plurality of proteinaceous phase globules may comprise 10000 or more globules per ml of solution. In one embodiment the total volume of the proteinaceous phase (including single or multiple proteinaceous globules) in the solution may be between about 0.5 μl and about 10 ml. In one embodiment the total content of the proteinaceous phase (including single or multiple proteinaceous globules) in the solution may be between about 0.001% and about 99% (v/v) of the solution.
In one embodiment, the proteinaceous phase may selectively exclude one or more, or all non-target molecules. In one embodiment, all non-target molecules may be excluded from absorption by the proteinaceous phase. In another embodiment, several different target molecule species may be absorbed into the proteinaceous phase, with at least one non-target molecule species excluded from absorption into the proteinaceous phase.
The target molecule may comprise a biomolecule. The target biomolecule may comprise a protein, peptide or nucleic acid, or analogues thereof. In another embodiment the target molecule may comprise any one of small molecules, natural or synthetic polymers, sugar chains, such as dextran, fatty acid chains; or combinations thereof. Small molecule target molecules may comprise a molecule with a MW of <900 Da. The target molecule may comprise a small molecule with one or more aromatic moieties, for example a poly-aromatic molecule. A small molecule with one or more aromatic moieties may comprise a fluorescein dye. See
The target nucleic acid may comprise an oligonucleotide. The target molecule may be single-stranded nucleic acid. The target nucleic acid may comprise ssDNA. Additionally or alternatively, the molecule may comprise ssRNA. The target nucleic acid may comprise duplexes of less than 20 base pairs. In another embodiment, the target nucleic acid may comprise RNA in structural conformation.
The target RNA may comprise total RNA (i.e. all the RNA of a cell). The target RNA may comprise one or more, or all of the RNA molecules selected from mRNA, polyA RNA, polysomal RNA, tRNA, ribosomal RNA, lincRNA, miRNA, piRNA, siRNA, SRP RNA, tmRNA, snRNA, snoRNA, SmY RNA, scaRNA, gRNA, aRNA, crRNA, tasiRNA, rasiRNA, and 7SK RNA.
The target nucleic acid may comprise nucleic acid with a secondary structure, such as hairpin structure. The target nucleic acid may comprise DNA and/or RNA hairpin molecules. The DNA and/or RNA hairpin molecules may have a stem length of between about 6 and about 20 nucleotides. The DNA and/or RNA hairpin molecules may have a stem length of between about 6 and about 15 nucleotides, alternatively between about 6 and about 10 nucleotides. For example, a nucleic acid with a secondary structure may be enriched by absorption into the proteinaceous phase more than an unstructured nucleic acid molecule. Structured nucleic acid may be identified by the skilled person, and may be aided by the use of tools such as Unafold (Markham, N. R. & Zuker, M. (2008) UNAFold: software for nucleic acid folding and hybridization. In Keith, J. M., editor, Bioinformatics, Volume II. Structure, Function and Applications, number 453 in Methods in Molecular Biology, chapter 1, pages 3-31. Humana Press, Totowa, N.J. ISBN 978-1-60327-428-9). Unafold can be used to indicate if a specific sequence is more likely to be structured.
The target nucleic acid may be 3 or more nucleotides in length. The target nucleic acid may be 5 or more nucleotides in length. The target nucleic acid may be between about 3 and about 20,000 nucleotides in length. The target nucleic acid may be between about 5 and about 20,000 nucleotides in length. The target nucleic acid may be between about 8 and about 20,000 nucleotides in length. See
Advantageously, it is found that the proteinaceous phase strongly absorbs nuclear RNAs containing an overrepresentation of CAXT motifs. Therefore, in one embodiment, the target nucleic acid may comprise nucleic acid comprising a CAXT motif, wherein X is A, T, C or G (and in the case of RNA T is U). In another embodiment, the target nucleic acid may comprise nucleic acid comprising an overrepresentation of CAXT motifs, wherein X is A, T, C or G (and in the case of RNA T is U). An overrepresentation of motifs or residues may be understood by the skilled person as more than, such as substantially more than, the average found in a population of nucleic acid molecules which are not target nucleic acid molecules.
In one embodiment, a target molecule, such as a nucleic acid, may be tagged with a nucleic acid molecule comprising one or more CAXT motifs (wherein X is A, T, C or G (and in the case of RNA T is U)) to enhance selective absorption of the target molecule. In another embodiment, a target nucleic acid molecule may be engineered to increase the number of CAXT motifs (wherein X is A, T, C or G (and in the case of RNA T is U)) to enhance selective absorption of the target molecule. In another embodiment, the target nucleic acid molecule may be a modified form of the nucleic acid molecule relative to wild type, wherein the modified form has been engineered to increase the number of CAXT motifs (wherein X is A, T, C or G (and in the case of RNA T is U)) relative to wild type.
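As an illustration of how CAXT content might be assessed when selecting or engineering a target nucleic acid, the following minimal sketch counts CAXT motifs (with T read as U for RNA) and compares the motif density of a sequence with the mean density of a set of background sequences. The function names, the use of density per possible start position, and the choice of background set are assumptions made for the example; the toy sequences are invented.

```python
import re

# C, A, any base, then T (DNA) or U (RNA). The lookahead scans every position.
_CAXT = re.compile(r"(?=CA[ACGTU][TU])")


def count_caxt(seq):
    """Number of CAXT motifs in a DNA or RNA sequence."""
    return len(_CAXT.findall(seq.upper()))


def caxt_density(seq):
    """CAXT motifs per possible 4-nucleotide start position."""
    positions = len(seq) - 3
    return count_caxt(seq) / positions if positions > 0 else 0.0


def caxt_enrichment(seq, background):
    """Density in `seq` relative to the mean density of background sequences."""
    densities = [caxt_density(b) for b in background]
    mean_bg = sum(densities) / len(densities)
    if mean_bg == 0:
        raise ValueError("background contains no CAXT motifs")
    return caxt_density(seq) / mean_bg


# Toy sequences invented for the example.
print(count_caxt("CAATGGCAGUTTCACT"))
print(caxt_enrichment("CAATCAAT", ["CAATGGGG"]))
```

An enrichment value greater than 1 indicates more CAXT motifs per position than the chosen background; what counts as a substantial over-representation is left to the skilled person.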
The target peptide/protein may be between about 2 and about 300 amino acids in length. The target peptide/protein may comprise a multiple protein assembly, for example a virus particle or a multi-subunit protein. The target molecule, such as a peptide/protein including assemblies thereof, may have a molecular weight of between about 100 Da and about 10 MDa.
The target molecule may comprise a protein. In particular, the proteinaceous phase is capable of selectively excluding particular species of proteins. A target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 0 kJ mol−1, resulting in net absorption. In another embodiment, the target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 1 kJ mol−1 (i.e. enrichment inside the proteinaceous phase of greater than 1.5× compared to the bulk solution). In another embodiment, the target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 2 kJ mol−1 (i.e. enrichment inside the proteinaceous phase of greater than 2.3× compared to the bulk solution). In another embodiment, the target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 3 kJ mol−1 (i.e. enrichment inside the proteinaceous phase of greater than 3.4× compared to the bulk solution). In another embodiment, the target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 5 kJ mol−1 (i.e. enrichment inside the proteinaceous phase of greater than 8× compared to the bulk solution). In another embodiment, the target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 8 kJ mol−1 (i.e. enrichment inside the proteinaceous phase of greater than 28× compared to the bulk solution). In another embodiment, the target protein may have a partitioning Gibbs free energy (ΔGpart) of greater than 15 kJ mol−1 (i.e. enrichment inside the proteinaceous phase of greater than 500× compared to the bulk solution). A concentration difference of the target molecule between inside and outside the proteinaceous phase may be converted into a partitioning Gibbs free energy (ΔGpart) using the equation:
ΔGpart = RT ln(K)
where K is the ratio of the concentration of target material inside the proteinaceous phase to that outside it, R is the gas constant and T is the absolute temperature.
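The conversion between the partitioning Gibbs free energy and the inside-to-outside concentration ratio K can be sketched numerically as follows, using the sign convention of the equation above (positive ΔGpart for net absorption, negative for net exclusion). The function names are hypothetical, and the default temperature of 298.15 K is an assumption, since no temperature is stated here; the enrichment factor obtained for a given ΔGpart depends on the temperature chosen and may therefore differ slightly from the rounded factors quoted.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def partition_ratio(delta_g_kj_per_mol, temp_k=298.15):
    """K (inside/outside) from dGpart in kJ/mol, via dGpart = RT ln(K)."""
    return math.exp(delta_g_kj_per_mol * 1000.0 / (R * temp_k))


def partition_free_energy(k, temp_k=298.15):
    """dGpart in kJ/mol from K, using the sign convention of the text."""
    if k <= 0:
        raise ValueError("K must be positive")
    return R * temp_k * math.log(k) / 1000.0


# Assumed temperature of 298.15 K; values are illustrative only.
for dg in (1.0, 5.0, -5.0):
    print(dg, round(partition_ratio(dg), 2))
```

With this convention, a ratio K below 1 corresponds to exclusion, and the exclusion factor is 1/K.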
In one embodiment, the target molecule is a molecule that is undesirable in the solution and it is targeted for removal, or temporary sequestration, from the solution. Additionally or alternatively, a desirable molecule in the solution may be selectively excluded from absorption into the proteinaceous phase as a non-target molecule.
In one embodiment, the target molecule may be a chaperone for separation/isolation of another molecule (herein termed the “molecule for separation”). For example fluorescein-tagged molecules may be chaperoned by the fluorescein, and absorbed into the proteinaceous phase (see the behaviour of unlabelled versus fluorescein-labelled ubiquitin in
The molecule for separation may be linked to the target molecule chaperone. The molecule for separation may be covalently bound to the target molecule chaperone. In another embodiment the bond may be ionic. Alternatively, the molecule for separation may be entangled with the target molecule chaperone. The molecule for separation may be tagged with the target molecule chaperone.
In one embodiment, the target molecule may be a chaperone for a biomolecule to be separated. For example, the target molecule may be a chaperone for a nucleic acid molecule to be separated. In another example, the target molecule may be a chaperone for a protein or peptide to be separated. In another example, the target molecule may be a chaperone for a small molecule to be separated. See
In an embodiment wherein the molecule for separation and the target molecule chaperone are proteins or peptides, they may be linked together by expression as a fusion protein/peptide. In another embodiment, the molecule for separation and the target molecule chaperone may be covalently linked, for example using click-chemistry. In another embodiment, the molecule for separation and the target molecule chaperone may be linked by an affinity tag system such as avidin-biotin. In another embodiment, the molecule for separation and the target molecule chaperone may be linked by affinity, such as through the binding affinity of an antibody, antibody fragment, analogue or mimic thereof.
In an embodiment wherein the molecule for separation and the target molecule chaperone are nucleic acids, they may be linked together by base-pair complementary binding/hybridisation.
The molecule to be separated may be linked to the self-assembling protein. Therefore the self-assembling protein may itself act as a target molecule chaperone for the molecule to be separated. During phase transition the linked self-assembling protein may assemble into the proteinaceous phase, thereby absorbing the linked molecule for separation. In an embodiment wherein the molecule for separation and the self-assembling protein are linked together, they may be linked by expression as a fusion protein/peptide. For example, the molecule for separation may replace a DEAD-box helicase of the self-assembling protein. In another embodiment, the molecule for separation and the self-assembling protein may be covalently linked, for example using click-chemistry. In another embodiment, the molecule for separation and the self-assembling protein may be linked by an affinity tag system such as avidin-biotin. In another embodiment, the molecule for separation and the self-assembling protein may be linked by affinity, such as through the binding affinity of an antibody, antibody fragment, analogue or mimic thereof.
Advantageously, linking the self-assembling protein to the molecule to be separated can provide a tightly controlled pattern of absorption and dissolution by control of the phase transition. For example, the phase transition into a proteinaceous phase leads to substantially complete separation/absorption of the molecule to be separated into the proteinaceous phase, and this happens immediately upon the phase transition.
The heterogeneous mixture may comprise or consist of cell or tissue extract. The heterogeneous mixture may comprise or consist of biopsy material. The heterogeneous mixture may comprise or consist of whole cell extract. The heterogeneous mixture may comprise or consist of a mixture of nucleic acid species and/or sequences. The heterogeneous mixture may comprise a bodily fluid sample. In another embodiment, the heterogeneous mixture may comprise an environmental sample, such as water, air, or soil sample. The heterogeneous mixture may comprise a food or beverage sample.
The heterogeneous mixture may comprise a cell culture sample. The heterogeneous mixture may comprise pre-extracted nucleic acid. The heterogeneous mixture may consist of nucleic acid and a solute.
Where the heterogeneous mixture is a bodily fluid sample, it may be from a mammal. The mammal may be human. The heterogeneous mixture may comprise a blood or blood plasma sample. The heterogeneous mixture may be selected from any of the group comprising a sample of blood; blood plasma; mucus; urine; faeces; cerebrospinal fluid; tissue such as organ tissue, lung aspirate; or combinations thereof.
The non-target molecule may comprise any molecule in the heterogeneous mixture that is not of interest to be separated from the heterogeneous mixture. The non-target molecule may comprise a protein, peptide or nucleic acid, or analogues thereof. The non-target molecule may comprise any one of a small molecule (for example, molecules with a MW of <900 Da), natural or synthetic polymers, sugar chains, and fatty acid chains; or combinations thereof.
The non-target molecule may comprise double stranded nucleic acid molecules. For example, duplexes longer than 20 base pairs may be excluded by the proteinaceous phase. Therefore, the non-target molecule may comprise double stranded nucleic acid molecules of greater than 20 base pairs in length. In another embodiment, the non-target molecule may comprise double stranded nucleic acid molecules of greater than 19 base pairs in length. In another embodiment, the non-target molecule may comprise double stranded nucleic acid molecules of greater than 18 base pairs in length. In another embodiment, the non-target molecule may comprise double stranded nucleic acid molecules of greater than 17 base pairs in length. The double stranded nucleic acid molecule may comprise overhangs (i.e. may not be a perfect duplex). For example the double stranded nucleic acid molecule may comprise a restriction enzyme cut product, or a probe hybridised to a different length (e.g. longer) sequence.
The non-target molecule may be selected from the group comprising chromatin; HSP (heat shock protein), such as oligomers of HSP16.5; αB-crystallin; wild-type GFP (green fluorescent protein); and fragments thereof; or combinations thereof. In one embodiment, the non-target molecule comprises chromatin. In one embodiment, the non-target molecule comprises HSP. In one embodiment, the non-target molecule comprises oligomers of HSP16.5. In one embodiment, the non-target molecule comprises αB-crystallin. In one embodiment, the non-target molecule comprises wild-type GFP.
The non-target molecule may comprise a protein. In particular, the proteinaceous phase is capable of selectively excluding particular species of proteins. A non-target protein may have a partition Gibbs free energy (ΔGpart) of less than 0 kJ mol−1, representing net exclusion from the proteinaceous phase. In another embodiment, the non-target protein may have a partition Gibbs free energy (ΔGpart) of less than −1 kJ mol−1 (i.e. exclusion from the proteinaceous phase by more than a factor of 1.5× compared to the bulk solution). In another embodiment, the non-target protein may have a partition Gibbs free energy (ΔGpart) of less than −2 kJ mol−1 (i.e. exclusion from the proteinaceous phase by more than a factor of 2.2× compared to the bulk solution). In another embodiment, the non-target protein may have a partition Gibbs free energy (ΔGpart) of less than −5 kJ mol−1 (i.e. exclusion from the proteinaceous phase by more than a factor of 8× compared to the bulk solution). In another embodiment, the non-target protein may have a partition Gibbs free energy (ΔGpart) of less than −7 kJ mol−1 (i.e. exclusion from the proteinaceous phase by more than a factor of 18× compared to the bulk solution). A concentration difference of the target molecule between inside and outside the proteinaceous phase may be converted into a partitioning Gibbs free energy (ΔGpart) using the equation:
ΔGpart=RT ln(K)
where K is the ratio of target material inside to outside the proteinaceous phase.
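The correspondence between the free energy thresholds and the fold-exclusion factors quoted above can be checked with a short calculation. The following is a minimal sketch only: it uses the relation ΔGpart=RT ln(K) as stated in this paragraph, and it assumes a temperature of 298.15 K, which is not specified in the text. The function name `exclusion_factor` is hypothetical.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def exclusion_factor(dg_part_kj_mol, temperature_k=298.15):
    """Fold exclusion (outside/inside) for a given partition free energy.

    Uses dG_part = R*T*ln(K), with K = inside/outside, as written in the
    paragraph above. The default temperature is an assumption.
    """
    k = math.exp(dg_part_kj_mol * 1000.0 / (R * temperature_k))
    return 1.0 / k  # values above 1 mean the molecule is depleted inside


if __name__ == "__main__":
    for dg in (-1.0, -2.0, -5.0, -7.0):
        print(f"dG_part = {dg:+.0f} kJ/mol -> excluded ~{exclusion_factor(dg):.1f}x")
```

At the assumed temperature the computed factors fall close to the rounded values quoted in the text; small differences are expected if a different temperature was used.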
In one embodiment a molecule to be excluded may be linked to a non-target molecule. For example, the non-target molecule may act as a chaperone for a linked molecule, which selectively prevents absorption into the organelle by exclusion of the non-target molecule chaperone.
In an embodiment wherein the molecule for exclusion and the non-target molecule chaperone are proteins or peptides, they may be linked together by expression as a fusion protein/peptide. In another embodiment, the molecule for exclusion and the non-target molecule chaperone may be covalently linked, for example using click-chemistry. In another embodiment, the molecule for exclusion and the non-target molecule chaperone may be linked by an affinity tag system such as avidin-biotin. In another embodiment, the molecule for exclusion and the non-target molecule chaperone may be linked by affinity, such as through the binding affinity of an antibody, antibody fragment, analogue or mimic thereof.
In an embodiment wherein the molecule for exclusion and the non-target molecule chaperone are nucleic acids, they may be linked together by base-pair complementary binding/hybridisation.
Target and/or non-target molecules may be labelled to aid their localisation and identification. The skilled person will be familiar with appropriate labels such as dyes, fluorescent labels, microparticles, radioactive labels, probes, and the like.
According to another aspect of the invention, there is provided a method of selective separation of target molecules from a heterogeneous mixture of target molecules and non-target molecules in a solution, the method comprising:
Tagging the target molecule with a self-assembling protein may comprise linking the target molecule with a self-assembling protein as described herein. For example, the linking may comprise a covalent linkage or affinity tagging.
Following absorption of the target molecule, the methods of the invention may further comprise the step of separating/isolating the proteinaceous phase from the solution. The separation may comprise sedimentation by centrifugation (for example between 10 g and 23,000 g) of the proteinaceous phase and decanting the remaining solution away from the sediment. The sedimentation may form a larger coalesced proteinaceous phase/dense liquid phase at the bottom of the container. The proteinaceous phase may be washed one or more times with a wash or re-suspension solution. The proteinaceous phase may be resuspended in a resuspension solution to form an isolated proteinaceous phase suspension.
Alternatively or additionally, the separation may comprise filtration or capture of the proteinaceous phase, for example in a matrix, mesh or column.
Additionally or alternatively, the proteinaceous phase may be tagged with an affinity tag or magnetic tag to aid separation. The tag may comprise a GST-tag or 6His tag, or the like.
Advantageously, the affinity purification tags do not need to be cleaved from the organelle-forming protein for phase separation to be easily controllable.
In another embodiment, the proteinaceous phase may be tagged with a molecule that causes it to float to the surface in a solution, thereby facilitating isolation of the proteinaceous phase by skimming or decanting it off the surface of the solution.
The phase transition of the proteinaceous phase may be reversed in order to dissolve the proteinaceous phase, thereby releasing the target molecule. The phase transition of the proteinaceous phase may be reversed for example, by a change in temperature, and/or pH, and/or ionic strength.
In one embodiment the separation of the target molecule from the heterogeneous mixture in the solution is a temporary separation. In particular, the phase transition of the proteinaceous phase may be reversed in order to dissolve the proteinaceous phase, thereby releasing the target molecule back into the solution. The reversal of the phase transition may be induced by a change in temperature, pH and/or salt concentration. The reversal of the phase transition may be after a desired reaction has occurred in the solution.
According to another aspect of the invention, there is provided a method of Ion Torrent Sequencing of RNA, wherein RNA extraction is provided by isolating the RNA from a solution according to the method of the invention herein.
The skilled person will be familiar with conventional methods of Ion Torrent Sequencing of RNA. It is understood that the RNA extraction steps of such methods can be readily substituted with the method of RNA isolation described herein.
According to another aspect of the invention, there is provided a composition comprising the self-assembling proteins as described herein, wherein the self-assembling proteins are capable of assembling into a matrix to form a proteinaceous phase.
The self-assembling proteins of the composition may be in a reversible amorphous solid (glassy) state. The composition may consist essentially of the self-assembling proteins as described herein. In another embodiment, the self-assembling proteins of the composition may be in solution. The solution of the composition may be a carrier comprising water. The solution of the composition may be a buffer solution. The solution of the composition may not comprise cell extract. The self-assembling proteins may be capable of assembling into a matrix in solution to form a proteinaceous phase.
The self-assembling protein(s) may be isolated, for example, isolated from other cell constituents. The self-assembling protein may be a recombinant protein.
The composition may comprise a carrier. The carrier may comprise a buffer. The composition may be in the form of a solution, lyophilised powder, or as a dried amorphous solid (glass). The dried, glassy form of the proteinaceous phase may be described as an amorphous solid. A demonstration of the physical properties of the glassy-state of the proteinaceous phase is shown in
According to another aspect of the invention, there is provided a kit for selective separation and/or modification of target molecules from a heterogeneous mixture of the target molecules and non-target molecules in a solution, wherein the kit comprises:
The kit may further comprise a target molecule chaperone for tagging/linking to a molecule of interest.
The self-assembling proteins of the kit may be in a reversible amorphous solid (glassy) state. The composition may consist essentially of the self-assembling proteins as described herein. The solution for reconstitution of the self-assembling proteins may be provided separately. In another embodiment, the self-assembling proteins of the kit may be in solution. The solution of the kit may be a carrier comprising water. The solution of the composition may be a buffer solution. The solution of the kit may not comprise cell extract. The self-assembling proteins of the kit may be capable of assembling into a matrix in solution to form a proteinaceous phase.
According to another aspect of the invention, there is provided use of the self-assembling proteins described herein to selectively isolate target molecules from a heterogeneous mixture of the target molecules and non-target molecules in a solution.
The use may be for a diagnostic assay. The use may be for environmental sampling.
The self-assembling proteins may be capable of phase transition to form a proteinaceous phase.
Methods and compositions of the invention herein may be used for denaturing dsDNA, for example to open up a replication fork in DNA, or to open up regions of DNA to be hyper accessible to DNA modification enzymes (such as DNA repair enzymes).
In another embodiment, methods and compositions of the invention herein may be used for stabilising ssRNA, for example to purify regulatory RNA molecules from disordered RNA molecules, or for protecting RNA molecules from degrading enzymes.
The methods herein may be in vitro.
In one embodiment, the proteinaceous phase may be provided within a cell (or population of cells), for example to purify or isolate target molecules within a cell (or population of cells) prior to lysis. In one embodiment, the proteinaceous phase may be provided within a cell for sequestering one or more target molecules in the cell from other cell constituents. The sequestration may be temporary/reversible.
The proteinaceous phase may be provided within a cell (or population of cells) by expressing the self-assembling protein in the cell. The self-assembling protein may be endogenous, and over-expression of this protein may be induced by genetic modification of the cell (or population of cells). For example, additional copies of the gene encoding the endogenous self-assembling protein may be transfected into the cell (or population of cells). Additionally or alternatively, the endogenous gene may be provided with a stronger and/or inducible promoter. In another embodiment, the cell (or population of cells) may be genetically modified with recombinant nucleic acid encoding an exogenous self-assembling protein. In one embodiment, the self-assembling protein may be expressed, or overexpressed, together with a target molecule. The self-assembling protein and target molecule may be expressed as a fusion protein. A demonstration of the sequestration of a target protein within proteinaceous globules formed by an organelle-forming protein is shown in
Where reference is made to a protein sequence, the skilled person will understand that one or more substitutions may be tolerated, optionally two substitutions may be tolerated in the sequence, such that it maintains its function. References to sequence identity may be determined by BLAST sequence alignment (www.ncbi.nlm.nih.gov/BLAST/) using standard/default parameters. For example, the sequence may have 99% identity and still function according to the invention. In other embodiments, the sequence may have 98% identity and still function according to the invention. In another embodiment, the sequence may have 95% identity and still function according to the invention. In another embodiment, the sequence may have 90%, 85%, or 80% identity and still function according to the invention.
The skilled person will understand that optional features of one embodiment or aspect of the invention may be applicable, where appropriate, to other embodiments or aspects of the invention.
There now follows by way of example only a detailed description of the present invention with reference to the accompanying drawings, in which;
Biochemical reactions inside cells are generally considered to occur in water. Cellular compartments termed ‘membraneless organelles’ challenge this view. These bodies are readily observable in the light microscope, for example nucleoli, Cajal bodies, P-bodies and nuage1,2 (
It is not obvious why forming droplets of proteinaceous solvent would be beneficial to an organism. Membraneless organelles can act as filters: nucleoli exclude bulk chromatin in the nucleus14 (
We employed a recently developed confocal microscopy method to ascertain whether fluorescently labelled DNA or RNA oligonucleotides were absorbed or excluded from organelles in vitro6. We began by investigating how nucleic acid length affected partitioning within Ddx4N1 organelles using a series of 5′-Cy5 labelled DNA and RNA oligonucleotides derived from concatenated ACTG (DNA) or ACUG (RNA) repeats. Annealing the labelled strand with an equal amount of an unlabelled sense or antisense strand of the same length produced either single-stranded (ss) or double-stranded (ds) nucleic acids ranging in length from a 12mer to a 40mer. The nucleic acids were mixed with Ddx4N1 organelles and their relative partitioning, or solubility, was monitored using confocal fluorescence microscopy, and quantified as a partition free energy (ΔGpart,
We next sought to investigate the partitioning of structured nucleic acid hairpins, containing a mixture of double-stranded (stem) and single-stranded (loop) regions. A series of 5′-Cy5 labelled DNAs and RNAs comprising 6 to 10 base pair (bp) GC stems and 20 base polyT/U loops displayed increased partitioning over the unstructured single stranded ACTG or ACUG sequences (
In contrast to single-stranded oligonucleotides, highly rigid double-stranded DNA, RNA, and hybrid RNA/DNA duplexes of 20 base pairs or longer were predominantly excluded from Ddx4 organelles (
Moving towards a mechanistic understanding of the partitioning, we found that the partition free energy varied strikingly as a function of the predicted stability (ΔGstab) of the nucleic acid structures studied (
To complement the characterisation of organelle partitioning of oligonucleotides, we mixed a range of fluorescently tagged proteins with organelles and measured their partition free energies (table 4). Fluorescence from isolated GFP and YFP was observed in approximately equal quantities both inside and outside of the organelle droplets indicating that they are folded in the organelle interior24. Proteins conjugated to GFP, CFP or YFP showed a greater range of partitioning than we observed in the nucleotides (
A series of GFP proteins were introduced that varied in surface charge25. Both an acidic GFP (−30 surface charge, GFP−30) and a basic GFP (+15 surface charge, GFP+15) were absorbed inside the organelles more than wild-type GFP (−7 surface charge, GFPWT,
In further experiments, complex mixtures of RNA (with likely more than 10,000 unique RNA molecules ranging in length from 100 to 100,000 nucleotides, and potentially hundreds of copies of each) were fractionated into populations that were either absorbed or excluded by phase-separated droplets of Ddx4 protein (organelle phase). The experiment was performed by forming organelles in the presence of complex RNA mixtures (by rapidly reducing the ionic strength from 300 to 100 mM NaCl), incubating the samples at 4° C. for 1 hour, collecting the organelle phase (~0.5% total sample volume) at the bottom of an Eppendorf tube by bench-top centrifugation (<1 min), aspirating the supernatant phase (~99.5% total sample volume) and resuspending the organelle phase in a high ionic strength buffer (300 mM NaCl). This yielded two fractions per experiment containing RNAs that were either absorbed or excluded from the organelle phase. The RNAs in each fraction (absorbed or excluded) were subsequently subjected to deep sequencing in order to identify their sequences and quantitate their relative absorption/exclusion. The experiment was repeated with either nuclear RNA or cytoplasmic RNA as the starting material (constituting the complex mixtures of RNAs).
An analysis was performed in which the absorption or exclusion of RNAs with alternative 5′ untranslated regions (5′UTRs), differing in length by 100-10,000 nucleotides, was compared. This analysis revealed, for a given RNA with alternative 5′UTRs, whether the long isoform or the short isoform was preferentially absorbed into the organelle phase. In the majority of instances, when there was significant absorption into the organelle and for both nuclear and cytoplasmic RNA, the RNA molecule containing the longer 5′UTR was absorbed more strongly than the equivalent RNA bearing the short isoform (
An analysis was then performed to identify short sequence motifs (7 nucleotides in length) contained in long 5′UTR isoforms and absent from the short 5′UTR isoform equivalent. Overall, for cytoplasmic RNAs, the preferentially absorbed 5′UTRs contained a high G and C content. Strongly absorbed nuclear RNAs contained an overrepresentation of CAxT motifs (where x is any of A, T, C or G nucleotides).
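The motif comparison described above can be illustrated with a short sketch. This is not the analysis pipeline that was used, which is not detailed here; it simply enumerates all 7-nucleotide words in a long 5′UTR isoform, removes those that also occur in the short isoform, and flags words containing a CAxT pattern. The function names and the example sequences are hypothetical and chosen for illustration only.

```python
import re

# CAxT, where x is any of A, C, G or T (sequences written in the DNA alphabet)
CAXT = re.compile(r"CA[ACGT]T")


def kmers(seq, k=7):
    """Return the set of all length-k words found in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def motifs_unique_to_long(long_utr, short_utr, k=7):
    """Words of length k present in the long isoform but absent from the short one."""
    return kmers(long_utr, k) - kmers(short_utr, k)


def contains_caxt(word):
    """True if the word contains at least one CAxT pattern."""
    return CAXT.search(word) is not None


if __name__ == "__main__":
    long_utr = "GGCATTTACGCAGT"   # made-up sequence, not from the data set
    short_utr = "ACGCAGT"         # made-up sequence, not from the data set
    unique = motifs_unique_to_long(long_utr, short_utr)
    print(sorted(unique))
    print(sorted(w for w in unique if contains_caxt(w)))
```

A set difference is used because the question posed in the text is presence or absence of a motif, not its copy number; a counting approach would be needed to assess over-representation.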
Further analysis of the alternative 5′UTR sequences that conferred preferential absorption to the organelle phase (sequences unique to the long isoforms) revealed that both length and predicted structure in the RNA were correlated with absorption, i.e. longer and more highly structured RNAs were more highly absorbed.
Overall, we have demonstrated that the interior of the organelle has very different properties to the bulk aqueous phase of the cell. Individual proteins and oligonucleotides exhibit a complex but predictable set of tendencies to enter the organelles. Moreover, double stranded DNA can be destabilised by up to 6 orders of magnitude without the need for ATP while single stranded structures can be simultaneously stabilised in a manner that suggests the interior of the organelle favours compact conformations.
HeLa cells were cultured as previously described1. Briefly, cells were grown on 25 mm glass coverslips in growth media (high glucose DMEM containing 20 mM HEPES pH 7.4, 10% FBS and antibiotics at 37° C. and 5% CO2). Ddx4 constructs (Ddx4YFP and Ddx4YFP-FtoA) were expressed in HeLa cells from pcDNA 3.1+(Invitrogen) plasmids by transient transfection utilizing the Effectene (Qiagen) or polyethylenimine (PEI) methods. Transfections were carried out according to the manufacturer's instructions and used 0.5-1 μg plasmid DNA per coverslip.
HeLa cells expressing Ddx4YFP and Ddx4YFP-FtoA were grown on 25 mm glass coverslips, and fixed with 4% paraformaldehyde (PFA) in phosphate buffered saline (PBS), for 5 minutes at 37° C. Cells were then washed three times with PBS to remove excess PFA. Next, cells were permeabilised with 0.5% TritonX-100 (in PBS) for 10 minutes, and again washed three times with PBS. Nuclei were visualized with Hoechst or DAPI stain. Cells were washed a further two times with PBS to remove excess Hoechst/DAPI stain and imaged using an Olympus IX81 inverted microscope with a 60× (NA 1.3) silicon immersion objective. Hoechst/DAPI dye was excited with a 405 nm laser and YFP was excited with a 515 nm laser. Hoechst/DAPI and YFP fluorescence were detected at 461 and 527 nm respectively. Differential interference contrast (DIC) images were collected using illumination from the 405 nm laser.
For immunofluorescence experiments, HeLa cells expressing Ddx4YFP were grown on 25 mm diameter #1.5 glass coverslips (Warner Instruments). The fixation and permeabilisation of samples was performed as above. Cells were then blocked with goat serum (5% in PBS) for one hour at room temperature before antibody staining. Primary antibodies were diluted to between 1:10 and 1:100 (in PBS containing 5% goat serum) before use. Nucleoli were labelled using mouse B23 (Santa Cruz sc-56622) antibodies. Following incubation at 4° C. overnight, excess primary antibodies were removed by washing the cells three times with PBS (five minutes per wash).
Cells were then incubated with Cy3 goat anti-mouse secondary antibodies (diluted 1:400) for 1 hour at room temperature. Excess secondary antibodies were removed by washing with PBS, as above. Nuclei were visualized with DAPI stain. Coverslips were then mounted on microscope slides using glycerol/n-propyl gallate mounting medium, sealed with nail varnish, and imaged using an Olympus IX81 inverted microscope equipped with a 60× (NA 1.3) silicon immersion objective. DAPI, YFP and Cy3 dyes were excited with 405 nm, 515 nm and 559 nm lasers, respectively. 26 Z-slices (0.4 μm spacing, 12.5 μs pixel−1 scanning speed, 12 bit depth) were captured for each channel.
Recombinant proteins for both organelle formation and partitioning were expressed from IPTG-inducible plasmids in E. coli cells overnight at 20° C. Cell pellets were suspended in buffer (50 mM Tris pH 8.0, 500 mM NaCl, 5 mM DTT) and lysed by homogenisation. Proteins were first purified by affinity chromatography (GST-4b beads, GE Healthcare Life Sciences or Ni-NTA, Invitrogen). The GST tag was removed with TEV-protease, and the target protein further purified and buffer exchanged by size exclusion chromatography into storage buffer (20 mM Tris pH 8.0, 300 mM NaCl, 5 mM TCEP). Purified proteins were centrifugally concentrated and flash frozen in liquid nitrogen and stored at −80° C. The sequences of all proteins used in this study are summarised in table 2.
DNA and RNA oligonucleotides, including those combined with the fluorescent Cy3 and Cy5 dyes, were purchased from SIGMA and delivered as lyophilised samples. Stocks at 100 μM were made by resuspending the oligonucleotides in TE buffer (10 mM Tris pH 8.0 at RT, 1 mM EDTA) and stored at −20° C. Lower concentration working stocks were made by further dilution with TE buffer.
Prior to partitioning experiments, oligonucleotide mixtures (sense+sense or sense+antisense) were heated to >95° C. for 2 minutes, and allowed to cool to room temperature over the course of 90-120 minutes. Hairpin oligonucleotides were heated to >95° C. for 2 minutes and snap-cooled on ice before equilibrating at room temperature for 5 minutes.
Oligonucleotide stability (ΔGstab) was predicted using the calculator UNAFold2 with 1 oligo (selecting DNA or RNA as appropriate), 150 mM NaCl at 25° C. (table 3). The stabilities of the 12mer, 16mer and 20mer ACTG duplexes were independently verified by experimental measurement in-house (section S.7).
To compare with the theoretical estimates, the stability of the duplexes formed by the 12mer, 16mer and 20mer ACTG dsDNA samples was measured using a Chirascan circular dichroism spectrophotometer (Applied Photophysics), equipped with a Series 800 Temperature Controller (AlphaOmega Instruments). Oligonucleotide samples were first heated to at least 85° C. for 5 minutes, followed by cooling at 1° C./min to 20° C. Absorbance at 278 nm was monitored (1 nm bandwidth, 1° C. step size, 0.5 sec per time point, 8 repeats) as the samples were cooled. The temperature at which the samples contained 50% duplex was derived from the decrease in hyperchromicity as the oligonucleotides annealed. Annealing profiles were measured at 4 concentrations per oligonucleotide sample, from which it was possible to derive ΔGstab.
The hyperchromicity was normalised to go from 1 at low temperatures (fully duplex) to 0 at high temperatures (no duplex), and taken to be the mole fraction of duplex F=2[ds]/sstot, where [ds] and sstot are the double stranded concentration and the total concentration of all single chained nucleotides in the system. Noting that sstot=2[ds]+[ss], where [ss] is the total concentration of free single stranded DNA, the mole fraction F can be converted into a free energy:
which varies with temperature as ΔG=ΔH−TΔS. Rearranging and solving for F gives its functional variation with temperature:
$$F = 1 + \frac{e^{(\Delta H/T-\Delta S)/R}}{ss_{tot}} - \sqrt{\left(1 + \frac{e^{(\Delta H/T-\Delta S)/R}}{ss_{tot}}\right)^{2} - 1}$$
At the temperature midpoint, TM, F=0.5 and so:
A plot of 1/TM versus ln(4/sstot) enables determination of ΔH and ΔS. This method was applied to 12, 16 and 20mer duplex DNAs (
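The straight-line analysis of 1/TM against ln(4/sstot) can be sketched as follows. The midpoint expression itself is not reproduced in this text, so the sketch assumes the relation that follows from setting F=0.5 in a two-state model of two non-self-complementary strands associating, namely 1/TM = ΔS/ΔH − (R/ΔH) ln(4/sstot), with ΔH and ΔS referring to duplex formation. The function names are hypothetical, and the numbers in the usage example are synthetic rather than measured values.

```python
import math

R = 8.314462618  # gas constant, J mol^-1 K^-1


def fit_melting_midpoints(ss_tot_molar, tm_kelvin):
    """Least-squares line through (ln(4/ss_tot), 1/TM); returns (dH, dS).

    Assumed model: 1/TM = dS/dH - (R/dH) * ln(4/ss_tot),
    so slope = -R/dH and intercept = dS/dH.
    dH is in J mol^-1 and dS in J mol^-1 K^-1, both for duplex formation.
    """
    x = [math.log(4.0 / c) for c in ss_tot_molar]
    y = [1.0 / t for t in tm_kelvin]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    d_h = -R / slope
    d_s = intercept * d_h
    return d_h, d_s


def duplex_free_energy(d_h, d_s, temperature_k=298.15):
    """dG = dH - T*dS at the chosen temperature (25 C by default)."""
    return d_h - temperature_k * d_s


if __name__ == "__main__":
    # Synthetic midpoints generated from chosen dH and dS, for illustration only.
    true_dh, true_ds = -400e3, -1100.0
    concs = [1e-6, 2e-6, 5e-6, 1e-5]  # four concentrations, in mol/L
    tms = [true_dh / (true_ds - R * math.log(4.0 / c)) for c in concs]
    d_h, d_s = fit_melting_midpoints(concs, tms)
    print(d_h, d_s, duplex_free_energy(d_h, d_s))
```

The fit is written with the standard library only so that it can be run without additional packages; with four concentrations per sample, as described above, the line is over-determined and the residuals give a rough check on the two-state assumption.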
Ddx4 membraneless organelle samples were prepared as squashed drops as previously described1. Briefly, 1.35 μL of Ddx4N1 (325 μM in GF buffer; 20 mM Tris pH 8.0 at RT, 300 mM NaCl, 5 mM TCEP) was mixed 1:1 with an oligo or fluorescent protein sample on a round 22 mm siliconised coverslip (Hampton Research) and equilibrated as a hanging drop over a well solution of 20 mM Tris pH 8 at RT, 150 mM NaCl for 15 minutes (30 minutes for ternary mixtures also containing GFP). The well was sealed with Vaseline. The coverslip was removed after the equilibration period, excess Vaseline removed with a 200 μL pipette tip, and the droplet dispersed onto a microscope slide (Fisherbrand Superfrost catalogue #12-550-123).
In all cases the final concentration of Cy5- and/or Cy3-labelled oligonucleotides in the hanging droplet was 1 μM (0.5 μM for ternary mixtures also containing GFPs (
Ddx4 organelle partitioning of fluorescent nucleic acids and proteins was imaged using a Leica TCS SP5II microscope equipped with a motorised stage and HCX PL APO CS 40× NA 1.3 oil immersion objective. Illumination was provided by Argon (458, 488 and 514 nm) and Helium-Neon (543 and 633 nm) lasers. Imaging scan speed was 400 Hz with a format of 512×512 pixels at 8 bit depth. Z-stacks were taken ±10 μm from the brightest plane of the sample in 1 μm increments. Experimental parameters (laser intensity, detector sensitivity etc.) remained constant for each set of experiments and associated controls. Typical excitation and emission schemes for the fluorophores used in partitioning experiments are summarised in table 5. Image analysis was performed using Fiji3. Quantitation of partitioning was achieved by selecting circular regions of interest (ROIs) comprising 5 individual Ddx4 organelles (and 5 equally sized regions of adjacent dilute phase) for three fields of view (FOV) per sample with two independent samples per experiment. The z-stack for each ROI was cropped to 5 slices, centred on the middle of the sample (
ΔGpart=−RT ln([in]/[out])
where [in] and [out] are the total concentrations of oligonucleotides inside and outside the organelles. The reported values are the mean and standard deviation of the partitioning of 30 organelles (5 per FOV, 3 FOVs per sample, 2 samples) per fluorescent oligo or protein.
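The conversion from paired inside/outside measurements to a reported mean and standard deviation can be sketched as below. This is a minimal illustration under stated assumptions: background-corrected fluorescence intensities are taken as proportional to concentration, the temperature is assumed to be 298.15 K (it is not stated here), and the sample standard deviation is used because the text does not say which estimator was applied. The function names and the example values are hypothetical.

```python
import math
import statistics

R = 8.314462618  # gas constant, J mol^-1 K^-1


def partition_free_energy(conc_in, conc_out, temperature_k=298.15):
    """dG_part = -R*T*ln([in]/[out]), returned in kJ mol^-1.

    Negative values mean the molecule is enriched inside the organelle.
    """
    return -R * temperature_k * math.log(conc_in / conc_out) / 1000.0


def summarise_rois(roi_pairs, temperature_k=298.15):
    """Mean and sample standard deviation of dG_part over (inside, outside) pairs."""
    values = [partition_free_energy(i, o, temperature_k) for i, o in roi_pairs]
    return statistics.mean(values), statistics.stdev(values)


if __name__ == "__main__":
    # Made-up intensity pairs (inside, outside); not data from the experiments.
    rois = [(120.0, 11.0), (131.0, 12.5), (115.0, 10.2), (126.0, 11.8), (122.0, 11.1)]
    mean_dg, sd_dg = summarise_rois(rois)
    print(f"dG_part = {mean_dg:.2f} +/- {sd_dg:.2f} kJ/mol")
```

In the scheme described above, the list would hold 30 pairs per fluorescent oligonucleotide or protein (5 organelles per field of view, 3 fields of view per sample, 2 samples).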
The partitioning data of single stranded nucleic acids was analysed according to the following thermodynamic square (
Which is defined by four equilibrium coefficients
denoting the stability outside and inside the organelle (KS,o and KS,i) and the partition coefficient for folded and unfolded single stranded oligonucleotides (KP,f and KP,u). As before, due to the square geometry of the scheme:
$$K_{S,o}\,K_{P,u} = K_{S,i}\,K_{P,f}$$
The total concentration of oligonucleotide molecules is given by
$$ss_{tot} = f_{out} + f_{in} + u_{out} + u_{in}$$
If we substitute in the equilibrium constants, we can solve for uout:
Our experimental observables, namely the total concentration, the stability of the DNA outside of the organelles, and the total partition coefficient, are given by:
Where in the case of an unstructured single stranded oligonucleotide, ΔGpart=−RT ln KPu, and for a structured single stranded oligonucleotide ΔGpart=−RT ln KPobs. These can be manipulated to return to our original definitions:
Finally, we define a destabilisation energy, the difference in free energy of DNA association inside and outside the organelles (
Overall, from an experimental measurement of KPU, KP obs, ΔGstab and knowledge of the total concentration ssTot, we can calculate the destabilisation energy for secondary structure in single stranded oligonucleotides.
The analysis of double stranded oligonucleotide destabilisation is similar to that for the single strands (section S.10.1) but not identical: when a single chain denatures, we are left with one molecule, whereas when a duplex unfolds we have two. The partitioning data of double stranded nucleic acids was analysed according to the following thermodynamic square (
Which is defined by four equilibrium coefficients
denoting the stability outside and inside the organelle (KS,o and KS,i) and the partition coefficient for double and single stranded oligonucleotides (KP,d and KP,s). The equilibrium constants are related to Gibbs free energies via ΔG=−RT ln K. As the geometry of the scheme is a square, any two sides are equivalent and so:
$$K_{S,o}\,K_{P,s}^{2} = K_{S,i}\,K_{P,d}$$
The total concentration of oligonucleotide molecules is given by sstot=2dsout+2dsin+ssout+ssin. If we substitute in equilibrium constants we can express the total concentration in terms of just ssout.
Which can be solved:
Our experimental observables, namely the total concentration, the stability of the DNA outside of the organelles, and the total partition coefficient, are given by:
Where for a single stranded nucleotide ΔGpart=−RTlnKPSS, and for a double stranded, ΔGpart=−RTln KPDS. These can be manipulated to return to our original definitions:
Finally, we define a destabilisation energy, the difference in free energy of DNA association inside and outside the organelles:
Overall, from experimental measures of KPSS, KPDS, ΔGstab and sstot values, ΔGdestab for double stranded DNA can be directly calculated (
The per-residue thermodynamic quantities ΔGpart/N and ΔGdestab/N were observed to decrease as a function of length, suggesting that the contribution per residue decreases as a function of sequence length. This is to be expected where there is topological frustration in the system4. When a single strand is dissolved inside an organelle, each residue will make many weak interactions with itself and solvent. Individual residues in shorter oligonucleotides will have the freedom to orientate themselves most favourably, whereas for longer oligonucleotides this freedom will be lower if the mobility of the molecule is restricted. This scaling, in which the free energy per residue decreases as a function of length, can be considered a form of topological frustration. In general terms, a reduced description of the free energy can be given in terms of factors that depend on the volume of the molecule, and those that depend on its surface area. The former are proportional to the radius cubed, and so depend on $N$. The latter are proportional to the radius squared, and so depend on $N^{2/3}$.
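$$\Delta G = \text{volume} + \text{surface} = aN + bN^{2/3}$$
The free energy per residue, obtained by dividing through by N, is then expected to have a particular variation with length:
$$\Delta G/N = a + bN^{-1/3}$$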
This is precisely the scaling seen in both DNA destabilisation (
A series of FRET samples based on the dsDNA or ssDNA 12mer ACTG oligonucleotides and 24mer dsDNA ACTG oligonucleotides were used to investigate whether Ddx4 organelles destabilised nucleic acid duplexes. Measurements were made using confocal microscopy (section S.12.1). The differently labelled oligonucleotides used in the study are listed in table 7, the correction factors measured in control measurements for a FRET calculation are shown in table 8 and the final results are summarised in table 9. To validate these results, FRET experiments were repeated in a fluorimeter (section S.12.2) in the presence of increasing amounts of the denaturant guanidinium hydrochloride (GdmHCl).
In each FRET experiment involving 12mer ssDNA and 12mer and 24mer dsDNA samples, 3 fluorescence images were recorded per z-slice. This produced three z-stacks corresponding to the following absorption/emission schemes:
In FRET experiments containing only Cy3 and Cy5 fluorophores, the donor fluorophore (Cy3) was excited using a 514 nm laser and the acceptor fluorophore (Cy5) was excited using a 633 nm laser. Donor emission was collected between 555 and 620 nm, whilst acceptor emission was collected between 650 and 750 nm.
In FRET experiments involving 24mer ACTG dsDNA in the presence of GFPWT or GFP+15, GFPs were excited using a 488 nm laser, Cy3 was excited using a 543 nm laser and Cy5 was excited using a 633 nm laser. Emission bands for Cy3 and Cy5 were as above.
The overlap of acceptor and donor emission and excitation bands means that the observed emission values need to be corrected to obtain a FRET measurement. To account for this, equivalent emission intensities as above were recorded using samples that contained only donor (DDdonor, DAdonor) or acceptor fluorophores (DAaccept, AAaccept) with identical hardware parameters (i.e. laser intensity, detector sensitivity and emission bandwidth). Image processing and analysis was performed as described in section S.9. The volume-normalised and base-lined fluorescence intensities for equivalent ROIs in each of the DDobs, DAobs, AAobs, DDdonor, DAdonor, DAaccept and AAaccept image stacks were recorded and used for subsequent calculations. The corrected FRET, ECT, is then obtained from the following:
where the correction terms:
Measurements were performed on single stranded and double stranded 12mers, inside and outside organelles, and in dilute solution where no organelles were present. The correction factors and FRET measurements are summarised in table 8. FRET was calculated on a per-ROI (organelle or region of dilute phase) basis. The reported FRET values are the mean and standard deviation of the corrected FRET signals of 30 organelles or equivalently sized regions of dilute phase (5 per field of view (FOV), 3 FOVs per sample, 2 samples).
For the FRET images in
Fluorescence measurements were made of the FRET-pair ssDNA and dsDNA 12mer ACTG samples in the absence of organelles but in the presence of increasing concentrations of the denaturant GdmHCl using a Perkin Elmer LS-50B fluorescence spectrometer. The 12mer ACTG dsDNA FRET-pair was chemically denatured using guanidinium chloride (GdmHCl 0-6 M, pH 8.0). At each titration point the total concentration of DNA was 2 μM (i.e. sense and sense/antisense strands each at 1 μM), and the NaCl concentration was fixed at 150 mM.
For samples not containing GdmHCl, Cy3 (donor) and Cy5 (acceptor) excitation was achieved at 514 and 633 nm, respectively, each with slit widths of 7.5 nm. Spectra (500 to 800 nm) were collected with a scan rate of 240 nm min−1. For samples containing GdmHCl, Cy3 (donor) and Cy5 (acceptor) excitation was achieved at 514 and 633 nm, respectively, each with slit widths of 5 nm. Spectra (500 to 800 nm) were collected with a scan rate of 120 nm min−1.
The data were quantified using the integrative ratio A (RA) method5. The spectrum (intensity values are a function of emission frequency w) was recorded after donor excitation for samples containing donor only f(D,w)donor and a mixture of donor and acceptor f(D,w)mix, and after acceptor excitation for the mixture f(A,w)mix. The ratios were then calculated as follows:
Where max is the value of w where the donor emission is a maximum. The values of RA were found to decrease as the GdmHCl concentration was increased in the double stranded DNA samples, indicating destabilization of the double strand by the denaturant (summarised in table 9). RA data were linearly scaled to the corrected FRET values obtained from equivalent samples using confocal microscopy, with ssDNA and dsDNA (no denaturant) as lower and upper bounds, respectively (
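The linear scaling of RA data onto the corrected FRET scale can be sketched as a two-point mapping. This is an illustrative sketch under stated assumptions: the function name `scale_ra_to_fret` and all numerical values are hypothetical, and the mapping assumes that the ssDNA value fixes the lower bound and the dsDNA (no denaturant) value fixes the upper bound, as described in the text.

```python
def scale_ra_to_fret(ra, ra_ss, ra_ds, fret_ss, fret_ds):
    """Linearly map an RA value onto the corrected FRET scale.

    ra_ss / ra_ds:     RA for ssDNA and for dsDNA without denaturant.
    fret_ss / fret_ds: corrected FRET for the equivalent samples
                       measured by confocal microscopy.
    The mapping sends ra_ss to fret_ss and ra_ds to fret_ds.
    """
    if ra_ds == ra_ss:
        raise ValueError("ssDNA and dsDNA RA bounds must differ")
    fraction = (ra - ra_ss) / (ra_ds - ra_ss)
    return fret_ss + fraction * (fret_ds - fret_ss)


if __name__ == "__main__":
    # Hypothetical placeholder bounds and titration values (not measured data).
    ra_ss, ra_ds = 0.25, 0.75
    fret_ss, fret_ds = 0.10, 0.50
    for ra in (0.25, 0.50, 0.75):
        scaled = scale_ra_to_fret(ra, ra_ss, ra_ds, fret_ss, fret_ds)
        print(f"RA = {ra:.2f} -> scaled FRET = {scaled:.3f}")
```

With this mapping, an RA value measured at an intermediate denaturant concentration falls between the ssDNA and dsDNA FRET bounds in proportion to its position between the two RA bounds.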
Number | Date | Country | Kind |
---|---|---|---|
1518477.3 | Oct 2015 | GB | national |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/GB2016/053239 | 10/18/2016 | WO | 00 |