The present invention relates to a method for identifying a sequence of monomer building blocks of a biological or synthetic heteropolymer. The invention also relates to the use of a nanopore for identifying a sequence of monomer building blocks of a biological or synthetic heteropolymer. The invention further relates to a computer-implemented method, computer program code, and data processing system for identifying a sequence of monomer building blocks of a biological or synthetic heteropolymer.
In recent decades, considerable progress has been made in technologies for extracting genetic information from cells and tissues, including next-generation single-molecule nucleic acid sequencing techniques. In contrast, similar development for direct identification, discrimination, and sequencing of proteins from cellular or acellular samples has yet to occur. While DNA and RNA sequences provide some prediction of the proteins expressed in a cell or tissue, direct determination of the proteome, e.g., from tumor cells, is more relevant for elucidating biological properties. Indeed, in situations where the presence of specific proteins or protein isoforms is desired or, as the case may be, undesired, such as in vitro protein synthesis for biologicals or biosimilars, per se protein detection and identification is required.
The identification of proteins in complex mixtures currently relies on mass spectrometry of ionized molecules in the gas phase, a powerful but costly technology that requires large equipment. The present invention consists in a novel approach combining highly controlled and automated, preferably enzymatic, fragmentation, using both sequence-specific endopeptidases and exopeptidases, with a newly developed principle of “peptide spectrometry through nanopores” for purposes of label-free characterization of protein mixtures, including identification, discrimination and ultimately protein sequencing.
Nanopore size spectroscopy was first demonstrated for synthetic polymers, but has recently been shown to be applicable to peptides, enabling their highly sensitive, label-free discrimination (Piguet et al. 2018; Ouldali et al. 2020). Importantly, this technique is able to detect differences in individual amino acid residues and, unlike mass spectrometry, to distinguish between peptides of the same mass, e.g., peptides containing either of the structural isomers leucine and isoleucine (Ouldali et al. 2020), or peptides that differ only by sequence isomerism.
The current standard method for identifying proteins from mixtures involves a series of separation steps, such as liquid chromatography or (2D) gel electrophoresis, followed by tryptic digestion to peptide fragments, and mass spectrometry, e.g. electrospray ionization (ESI), or matrix-assisted laser desorption/ionization (MALDI), followed by separation according to time-of-flight (TOF), or in a quadru-(Q)/multipole field and subsequent correlation with known proteins in databases. Mass spectrometry, although a powerful technique, requires costly and bulky equipment and has significant shortcomings in terms of detection limits and dynamic sensitivity range. A more fundamental drawback is that peptides of the same mass but different composition (e.g., containing leucine or isoleucine) cannot be distinguished without derivatization. For these reasons, novel solutions are needed to identify, distinguish, and ultimately sequence proteins with single-molecule sensitivity.
In contrast to nanopore-mediated single-molecule DNA sequencing, where only 4 nucleobases of the same charge need to be distinguished, protein structure elucidation is a far more complex problem because of the 20 proteinogenic amino acids (aa). To date, this field is still in its infancy, but some progress has already been made, which is summarized below.
Single molecule detection through nanopores is based on analyzing the reduction in electrical conductivity that occurs when an analyte, e.g., a DNA strand or a peptide, diffuses or migrates into a molecularly sized water-filled channel located in an insulator, i.e., a nanopore. The principle of electrical detection of the transport of molecules through a nanopore is well known. The nanopore may be a protein channel or an artificial channel, e.g., a nanoscale aperture in a solid membrane, or a nanotube or DNA origami structure inserted into a lipid membrane. The membrane is subjected to a potential difference that induces an ionic current through the nanopore in the presence of an electrolyte solution or other ionically conductive medium (e.g., an ionic liquid). The interaction of a molecule with the channel of a nanopore, in particular the entry of the molecule into the channel, the presence of the molecule in the channel, or the passage of the molecule through the channel, thereby induces a measurable decrease in the current, provided that the conductive medium in the channel has a higher electrical conductivity than the analyte; in the opposite case, the current increases.
Biological (protein) nanopores forming such channels through insulating lipid bilayers were the first nanopores shown to be capable of detecting single molecules, and they enable current nanopore-based DNA sequencing techniques. Alternatively, nanoscopic pores can be fabricated by various drilling or etching techniques in solid-state materials such as thin SiN membranes. These solid-state nanopores are promising, although fabricating solid-state nanopores that are as identical as possible is a technical challenge. In contrast, pore-forming proteins are constructed with atomic precision and have evolved over millions of years to enable solute transport across membranes.
In both cases (biological and non-biological nanopores), the reduction in conductivity is measured as a change in ionic current caused by a constant voltage across the insulator in which the pore is the sole (or dominant) electrically conducting junction. These signals, called resistive pulses, correspond to individual analyte molecules entering the pore and interacting with the inner wall of the pore—and possibly, but not necessarily, translocating through the pore from one side of the insulator to the other.
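The resistive-pulse principle described above can be illustrated with a minimal sketch. It assumes a sampled current trace and a known open-pore current; the function name `detect_resistive_pulses`, the threshold of 0.8 and the sample values are hypothetical and are not taken from the measurements described in this document.

```python
from typing import List, Tuple


def detect_resistive_pulses(
    trace: List[float], open_current: float, threshold: float = 0.8
) -> List[Tuple[int, int, float]]:
    """Return (start index, length in samples, mean I/Io) for each blockade.

    A sample counts as blocked when I/Io falls below `threshold`.
    The threshold is an illustrative assumption, not a value from the source.
    """
    events = []
    start = None
    for i, current in enumerate(trace):
        blocked = current / open_current < threshold
        if blocked and start is None:
            start = i  # a pulse begins
        elif not blocked and start is not None:
            segment = trace[start:i]
            mean_rel = sum(segment) / len(segment) / open_current
            events.append((start, i - start, mean_rel))
            start = None  # the pulse has ended
    if start is not None:  # the trace ended while the pore was still blocked
        segment = trace[start:]
        events.append((start, len(segment), sum(segment) / len(segment) / open_current))
    return events


# Hypothetical trace in arbitrary current units with two blockades.
trace = [100.0, 100.0, 40.0, 42.0, 38.0, 100.0, 100.0, 55.0, 55.0, 100.0]
events = detect_resistive_pulses(trace, open_current=100.0)
```

Each returned event carries the quantities used later in the description: a dwell time (here in samples) and a relative residual current I/Io.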
If the analyte is a polymer (e.g., a peptide, polynucleotide, or synthetic polymer such as poly(ethylene glycol)), two regimes must be distinguished, as shown in
(Talarimoghari, M., G. Baaken, R. Hanselmann, and J. C. Behrends. 2018. Size-dependent interaction of a 3-arm star poly(ethylene glycol) with two biological nanopores. Eur. Phys. J. E. 41:6288-8. doi:10.1140/epje/i2018-11687-6).
While DNA sequencing by biological nanopores in the translocation/threading regime is well established and commercially available (see https://nanoporetech.com), peptide recognition and differentiation using nanopores is a nascent technique, with protein sequencing using nanopores a long-term goal that has yet to be achieved.
Peptides were threaded through biological protein nanopores such as the bacterial toxins aerolysin and alpha-hemolysin relatively early, but the interaction times were too short and the signal-to-noise ratio too low to distinguish between different peptides, let alone obtain sequence information. Since then, biological nanopores have been used to detect and differentiate peptides and proteins even in their native or folded state. For example, fragaceatoxin C (FraC) pores are known to distinguish between two forms of endothelin that differ in only two amino acid positions (Huang, G., A. Voet, and G. Maglia. 2019. FraC nanopores with adjustable diameter identify the mass of opposite-charge peptides with 44 dalton resolution. Nat Comms. 10:347-10. doi:10.1038/s41467-019-08761-6).
The well-documented superior sensitivity of the aerolysin pore in the trapping/collapse regime, originally shown for poly(ethylene glycol) (Baaken et al. 2015), led to renewed interest in using this pore for peptide sizing. It was shown that the length of homoarginine peptides can be readily determined with this pore with an accuracy of one amino acid (Piguet et al. 2018). Furthermore, it was determined that the substitution of a single terminal residue in an octa-arginine peptide by one of the 20 proteinogenic amino acids can be detected, so that the resulting peptides can be differentiated from one another, with sufficiently good discrimination even of peptides of the same mass (see
The references cited here are: Baaken et al, 2015 “High-Resolution Size-Discrimination of Single Nonionic Synthetic Polymers with a Highly Charged Biological Nanopore”, ACS nano, VOL. 9, NO. 6, 6443-6449. Piguet et al., 2018, “Identification of single amino acid differences in uniformly charged homopolymeric peptides with aerolysin nanopore,” Nature Communications; 9, 966. Ouldali et al, 2020, “Electrical recognition of the twenty proteinogenic amino acids using an aerolysin nanopore,” Nature Biotechnology, VOL 38, 176-181.
In US 2019/0317006 A1, it was proposed to distinguish different peptides of a mixture from each other by nanopore size spectroscopy and using an aerolysin nanopore.
It is the object of the present invention to provide a technical solution for the identification of a sequence of monomer building blocks of a biological or synthetic heteropolymer, in particular a peptide or protein.
This object is achieved according to the invention by the method according to claim 1, the use of a nanopore according to claim 12, the computer-implemented method according to claim 13, the program code stored on a data carrier according to claim 14, and the data processing system according to claim 15. Preferred embodiments of the invention are the subject of the dependent claims.
In a preferred embodiment of the method according to the invention, the fragments of the fragment mixture are obtained by successive degradation of the heteropolymer. Preferably, the heteropolymer is chain-shaped with positions 1 (chain start) to n (chain end), and the successive degradation shortens the chain stepwise, starting from one end, by one monomer building block at a time to obtain length fragments, in particular essentially all length fragments n−(n−i) of a heteropolymer consisting of n monomer units. Here, i is a counter that is incremented according to i=i+1 through i=1, 2, 3 . . . n−2, n−1, n, so that the length fragments have a total length of n−(n−1), n−(n−2) . . . to n−(n−n) monomer units. Each length fragment has the sequence of monomer units identical to that of the heteropolymer from position 1 (chain start) to position n−(n−i). Such a fragment mixture is also referred to here as a “ladder” or heteropolymer ladder, i.e. a “peptide ladder” if the heteropolymer is or comprises a peptide.
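The composition of such a ladder can be sketched in a few lines. The example below assumes the heteropolymer is written as a string of one-letter monomer codes with position 1 at index 0; the function name `ladder_fragments` and the example string are hypothetical, and which chain end counts as position 1 is a matter of convention.

```python
from typing import List


def ladder_fragments(sequence: str) -> List[str]:
    """Return all length fragments of `sequence`, from 1 to n monomer units.

    Each fragment keeps the monomers from position 1 up to position i,
    which corresponds to the length n-(n-i) = i used in the text.
    """
    n = len(sequence)
    return [sequence[:i] for i in range(1, n + 1)]


# Hypothetical five-unit heteropolymer.
ladder = ladder_fragments("ABCDE")
```

For a chain of n monomer units the ladder contains n fragment types, one per length, with the longest fragment being the intact heteropolymer.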
In this context, the monomer building blocks may belong to a set m of possible monomer building block species, e.g., in the case of eukaryotic proteins, a number n of amino acids (monomer building blocks) may form the protein (heteropolymer) (or a sequence thereof), which may be limited to the set m=21 of human proteinogenic amino acids (i.e., monomer building block species).
Instead of successive degradation, another degradation method can be used that yields the above-mentioned length fragments of the heteropolymer.
The sequence of monomer building blocks of the heteropolymer determined in step c) may be a part of the total sequence (partial sequence) of monomer building blocks of the heteropolymer, or, preferably, may be the total sequence of monomer building blocks of the heteropolymer.
Preferably, the heteropolymer is a peptide. Preferably, the fragmentation method is an Edman degradation or includes an Edman degradation. Further, the fragmentation method may be designed to provide for cleavage of the protein by endopeptidases to peptides, and in particular treatment of the peptides by exopeptidases to obtain the peptide ladder. Preferably, the method according to the invention comprises the following steps:
A characteristic residual current value denotes the measurement results of the current value measurement, which results from the interaction of a certain fragment, which is characterized by the characteristic residual current value, with the nanopore. In particular, the characteristic residual current value includes the residual current value amount attributable to the corresponding current signal. The characteristic residual current value may also be a vector-valued quantity which, in addition to the residual current value amount, includes other components whose number determines the dimension of the vector-valued quantity. Such components can be a time duration of the current signal or another quantity describing the time course of this current signal, or can be parameters describing an interpolation curve which is used to describe the current signal.
A characteristic residual current value describes in each case one fragment type, in particular fragment size, of the number n of fragment types of a fragment mixture formed from the heteropolymer. Example: a fragment mixture formed as a peptide ladder contains a total of n fragment types, starting from a peptide with n amino acids as monomer building blocks. The peptide solution containing the fragment mixture usually contains a large number of fragments of each fragment type (peptide type). Ideally, a fragment mixture obtained by 100% efficient fragmentation of a starting quantity of M molecules of the peptide to be sequenced also contains a number M of fragments of each of the n fragment types of the peptide. When “fragment” is referred to in this application, depending on the context, it may mean in particular the fragment type.
A “representative set of characteristic residual current values”, which can be derived in particular from the total number of measured residual current values, describes a plurality or multiplicity, preferably the totality, of the characteristic residual current values determined for the fragment mixture by means of the current measurement method mentioned in step b).
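One conceivable way of deriving such a representative set from the measured residual current values is to locate the maxima of their histogram. The following is a minimal sketch under stated assumptions: the function name `characteristic_values`, the bin width, the minimum count and the sample values are hypothetical, and simple local-maximum detection stands in for whatever peak-finding procedure is actually used.

```python
from collections import Counter
from typing import List


def characteristic_values(
    rel_currents: List[float], bin_width: float = 0.01, min_count: int = 3
) -> List[float]:
    """Return the bin centres of local maxima in the histogram of I/Io values.

    `bin_width` and `min_count` are illustrative assumptions.
    """
    # Assign every measured value to the nearest bin.
    counts = Counter(round(value / bin_width) for value in rel_currents)
    peaks = []
    for bin_index, count in counts.items():
        left = counts.get(bin_index - 1, 0)
        right = counts.get(bin_index + 1, 0)
        # Keep well-populated bins that stand out from their neighbours.
        if count >= min_count and count > left and count >= right:
            peaks.append(bin_index * bin_width)
    return sorted(peaks)


# Hypothetical relative residual currents from two fragment types.
measured = [0.40, 0.40, 0.401, 0.399, 0.41, 0.55, 0.551, 0.549, 0.55, 0.56]
representative_set = characteristic_values(measured)
```

With these sample values the sketch yields two maxima, near 0.40 and 0.55, i.e. one characteristic residual current value per fragment type.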
Preferably, the method according to the invention is defined as an extended method serving to determine a sequence of a protein, comprising the steps of:
The method according to the invention or the above-mentioned embodiment of the method according to the invention can advantageously be used to elucidate the, in particular complete, primary structure of a macromolecule, in particular a biological macromolecule, in particular a protein, wherein the biological macromolecule comprises various heteropolymers, in particular is formed from various heteropolymers bonded to one another:
Preferably, the method according to the invention is defined as an extended method used to determine the primary structure of a macromolecule, in particular a protein, comprising the steps of:
The method according to the invention can be designed to determine the complete sequence of the monomer building blocks from which the heteropolymer or the macromolecule is built, or one or more partial sequences thereof.
The method according to the invention can be configured to determine a part of the complete sequence of monomer building blocks of which the heteropolymer is composed. If only part of the complete sequence is determined, the method according to the invention can in particular be used to implement a determination method in which the partial sequence determined by the method according to the invention is used to determine which heteropolymer from a set T (1 to T) of previously known different heteropolymers (namely different with respect to their sequence) is present. “Previously known” means here that the nearly complete or complete sequence of monomer building blocks of each previously known heteropolymer is known. The partial sequence determined by the method according to the invention represents a “fingerprint” of the heteropolymer to be determined from the previously known set of heteropolymers, i.e. a feature which makes the heteropolymer sought uniquely identifiable with respect to the other heteropolymers of the set 1 to T. The steps of such a determination method can be described as follows:
The said determination method allows the complete sequence of a sought heteropolymer to be determined without having to elucidate the complete sequence of the sought heteropolymer by means of the method according to the invention, if the sought heteropolymer originates from a set T of previously known heteropolymers each having a previously known sequence, a partial sequence—in the manner of a fingerprint—uniquely identifying the sought heteropolymer with respect to the remaining heteropolymers of this set. In this scenario, the determination method is the more efficient way to determine the complete sequence of the sought heteropolymer, compared to the alternative of elucidating the complete sequence of the sought heteropolymer by means of the method according to the invention instead of the partial sequence of the sought heteropolymer.
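The fingerprint idea can be illustrated with a minimal sketch, assuming the previously known set is available as a mapping from names to sequences. The function name `identify_by_fingerprint`, the entry names and the two scrambled sequences are hypothetical; only SRASKYR followed by the R3 carrier appears as an example in this document.

```python
from typing import Dict, List


def identify_by_fingerprint(partial: str, known: Dict[str, str]) -> List[str]:
    """Return the names of all known heteropolymers containing `partial`.

    The partial sequence acts as a fingerprint only if exactly one name
    is returned.
    """
    return [name for name, sequence in known.items() if partial in sequence]


# Hypothetical set of previously known peptides (T = 3).
known_set = {
    "P1": "SRASKYRRRR",
    "P2": "ARSSKYRRRR",
    "P3": "KYSRASRRRR",
}
unique_hit = identify_by_fingerprint("RASK", known_set)
ambiguous_hit = identify_by_fingerprint("RRR", known_set)
```

In this example "RASK" occurs in one entry only and therefore identifies it, whereas "RRR" occurs in all three and is not a usable fingerprint. A longer or differently placed partial sequence would then have to be determined.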
Preferably, the nanopore is a biological nanopore, i.e., a pore-forming toxin or a porin.
Preferably, the nanopore is a solid-state nanopore or a hybrid of solid-state and biological and/or chemical components. A solid, in particular a substrate, may include or be formed from at least one of the following materials: SiNx, SiO2, HfO2, MoS2, CNT, graphene, nanopipettes. Biological or chemical components may, each preferably, include or consist of at least one of the following: pore-forming toxins, porins, beta-barrel proteins, alpha-helical membrane proteins, DNA origami structures. Hybrids and combinations of all of the above components are possible.
Preferably, the fragmentation of the heteropolymer is carried out by enzymes. Preferably, these are endo/exo peptidases for proteins/peptides and common restriction enzymes (nucleases) for DNA. The person skilled in the art will choose an enzyme set up for this purpose depending on which sequence he wants to cut.
Possible peptidases are mentioned, for example, in: https://www.ebi.ac.uk/merops/
Possible nucleases are mentioned, for example, in: https://wikivisually.com/wiki/List_of_restriction_enzyme_cutting_sites%3A_Bst%E2%80%93Bv#Whole_list_navigation
Preferably, fragmentation of the heteropolymer is done chemically and non-enzymatically. For proteins/peptides, the Schlack-Kumpf and Edman degradation can be used. For DNA, enzymes are usually used.
Preferably, the fragmentation of the heteropolymer takes place by physical means, e.g. by exposure to heat, cold, sound waves, electromagnetic radiation, in particular infrared, ultraviolet or X-ray radiation, microwaves or visible light. Examples are documented in https://doi.org/10.1073/pnas.0901422106 or https://doi.org/10.1007/s13361-017-1794-9 and https://doi.org/10.1002/mas.20214.
Preferably, the nanopore is selected from the group of preferred nanopore proteins consisting of aerolysin, alpha-hemolysin, MspA, CsgG, VDAC or another protein from the family of beta-barrel proteins, as well as genetically optimized variants of these pore proteins.
The pore proteins and the other measurement conditions are preferably optimized so that the interaction between the analyte (the fragment) and the pore is optimally long-lasting for the respective analyte. A preferred embodiment of the nanopore is as follows: the nanopore is preferably an aerolysin pore, in particular a variant of the aerolysin pore. For this purpose, for example, the single molecule trap of the aerolysin pore can be adapted and optimized to the analyte by single point mutation in the dimension and depth of the potential well. In particular, this is done by the aerolysin variants R220S/A/C/K/H/E/D/Q/N, R288S/A/C/K/H/E/D/Q/N, R282S/A/C/K/H/E/D/Q/N, D222S/A/C/F/R/K/H/E/Q/N, D216S/A/C/F/R/K/H/E/Q/N, D209S/A/C/F/R/K/H/E/Q/N, K238S/A/C/F/R/D/H/E/Q/N, K242S/A/C/F/R/D/H/E/Q/N, K244S/A/C/F/R/D/H/E/Q/N, K246S/A/C/F/R/D/H/E/Q/N, E237S/A/C/F/R/D/H/K/Q/N, E258S/A/C/F/R/D/H/K/Q/N, E254S/A/C/F/R/D/H/K/Q/N, E252S/A/C/F/R/D/H/K/Q/N and any combinations thereof.
The aerolysin pore in its natural form (wild type) or as a variant thereof is particularly preferred for use as a nanopore in the context of the invention. The variant may be designed to differentiate and characterize fragments of heteropolymers that differ, for example, only by positional isomerism. Using the R220S variant of the aerolysin pore, for example, differentiation of positional isomerism derived from acetylation has been performed (“Resolving isomeric posttranslational modifications using a nanopore,” Tobias Ensslen, Kumar Sarthak, Aleksei Aksimentiev, Jan C. Behrends, bioRxiv 2021.11.28.470241; doi: https://doi.org/10.1101/2021.11.28.470241).
Translocation or passage of the analyte through the pore is not necessary, although it is permitted in principle. Rather, it is particularly advantageous if the same analyte visits its binding site in the pore for as long as possible, or revisits it several times and binds there after having left the molecular trap again in the direction of the entrance opening in the meantime. Preferably, therefore, “interaction” of the fragment (analyte, molecule) with the channel of the nanopore means that the fragment enters the channel but does not pass through the channel, which ultimately results in a non-destructive multiple determination of the same molecule.
By trapping the same analyte in the pore for as long as possible or repeatedly, a particularly precise determination of the characteristic residual current values by means of temporal signal averaging and a representative determination of the parameters of the time course of the current signal (variance, noise analysis) is made possible. It is understood that an interaction of analyte and pore should not last indefinitely, otherwise the accessibility of the pore for analyte molecules is reduced. This results in an optimal interaction duration adapted to the analyte, which can be achieved in particular by variant formation of the nanopore, preferably of the aerolysin.
From the investigations underlying the present invention, it was found that carrying out the current measurement method (step b) in claim 1) in the collapse regime (also: collapsed, binding or trapping regime) is particularly advantageous. The current measurement method carried out in step b) is preferably performed such that the fragment mixture is present in an electrolyte solution comprising, in particular, dissolved salts of the form AX, A2X and AX2 etc., where substance A (e.g. selected from the alkali and alkaline earth metals Na, K, Cs, Rb, Li) provides the cation and substance X (e.g. selected from the halogens F, Cl, Br) provides the anion. The substance groups A and X may comprise further constituents in the sense of inorganic or organic derivatives of such salts (where, for example, substance A is a quaternary ammonium, imidazolium, phosphonium, pyridinium or pyrrolidinium ion such as tetramethylammonium, and substance X may be a nitrate, a sulfate, a phosphate, an amino acid such as glutamate, a carboxylic acid such as gluconate or citrate, a (bi)carbonate, or a simple hydroxide). Preferably, the electrolyte solution may also comprise mixtures of different combinations of different salts.
The total salt concentration of the electrolyte solution in which the fragment mixture is present during the performance of the current measurement method is between 0.5 M and 20 M, preferably between 2 M and 10 M and particularly preferably between 3 M and 5 M. The fragment mixture can also be present in an ionic liquid as an alternative to an electrolyte solution. Such configurations of the electrolyte have the effect of optimally setting conditions such as charge shielding and solubility of the analyte in the electrolyte solution for the collapsed/binding regime and the longest possible residence time of the analyte in the molecular trap of the pore, while at the same time achieving the highest possible signal-to-noise ratio of the current measurement.
The invention also relates to the use of a nanopore for carrying out the method of the invention for identifying a sequence of monomer building blocks of a biological or synthetic heteropolymer.
The invention also relates to a computer-implemented method for determining a sequence of monomer building blocks of a heteropolymer (heteropolymer sequence) from measurement data of a current measurement method containing information on current signals obtained upon interaction of different fragments formed from the heteropolymer with a nanopore, comprising the steps:
The invention also relates to a computer program code which is stored on a data carrier and which determines a sequence of monomer building blocks of a heteropolymer (heteropolymer sequence) from the measurement data of a current measurement method when executed by the central processor of a computer, the measurement data containing information about current signals which are determined upon the interaction of different fragments formed from the heteropolymer with a nanopore, comprising the respective steps implemented by the program code:
The invention also relates to a data processing system for determining a sequence of monomer building blocks of a heteropolymer (heteropolymer sequence) from the measurement data of a current measurement method containing information on current signals determined upon interaction of different fragments formed from the heteropolymer with a nanopore, comprising a computer with a central processor, and a program code, in particular the program code according to the invention, wherein the computer is programmed to perform the following computer-implemented steps:
The evaluation method, in which the sequence of the monomer building blocks of the heteropolymer is determined from the representative set of the characteristic current signals, preferably provides for the computer-implemented steps:
In steps A) to D), it is possible that the representative set of characteristic residual current values cannot unambiguously describe the heteropolymer because, for example, only part of the heteropolymer was fragmented or because not all characteristic residual current values could be unambiguously determined. In this case in particular, a prediction algorithm can be used to indicate from the incomplete data, in particular from an incomplete representative set of characteristic residual current values, a probability or an evaluation factor for evaluating the reliability of a primary structure of the heteropolymer determined by estimation. In this context, the prediction algorithm may have been determined by machine learning using, in particular, labeled training data. The labeled data may contain variations of incomplete representative sets of the characteristic residual current values of previously known heteropolymers. The prediction algorithm may include an artificial neural network, in particular a convolutional neural network (CNN), which may be trained by the labeled training data. The prediction algorithm may also implement unsupervised machine learning.
Further preferred embodiments of the objects according to the invention result from the following description of the embodiment examples in connection with the figures. Identical reference signs designate essentially identical components or method steps.
A: 1: Peptide design 2: Peptide-pore interaction. 3: Current trace in the presence of a mixture of 7−R+D,K,R,E,H.
B: plot of relative current vs. aa volumes. C: >95% discrimination between structural isomers 7R+L and 7R+I by high-resolution recording on MECA (according to Ouldali et al. 2020).
Based on the prior art in Ouldali et al. 2020, the question for the inventors was how to use the high sensitivity of the nanopore to peptide size or volume for actual sequence identification in heteropolymers or for protein identification and sequencing.
To solve this problem, the inventors explored an approach also called “nanopore ladder sequencing”. In this approach, peptides (or other heteropolymers) are initially generated, preferably by enzymatic, chemical or physical cleavage of proteins, and separated, preferably by known chromatographic or electrophoretic methods; alternatively, the peptides or other heteropolymers are already present in isolation. Preferably in a second step, they are subjected either to the action of exopeptidases that cleave individual N- or C-terminal amino acids from a peptide, or to chemical methods such as the Edman reaction. This yields a mixture of peptides or heteropolymers, i.e., a mixture of fragments, in which several species or characteristic fragment types are present in a representative set, preferably representing all or most of the possible fragments formed by the removal of amino acids (or monomer building blocks) in sequence, such that for a peptide (or heteropolymer) of degree of polymerization (d.p.) n, all or most species of d.p. n−(n−1), n−(n−2) . . . to n−(n−n) are present. Each of these species, when interacting with the nanopore, will give a characteristic maximum in the histogram of relative residual currents (characteristic residual current value or amount).
The measurement evidence demonstrates the ability of the invention here, for example, to correlate short, known peptide sequences with nanopore data in this manner (see
A, B: Scatter plots with event histogram obtained from the interaction of aerolysin with two peptide ladders containing a triarginine handle. Removal of aa results in a species-specific shift in residual current characteristic of a monomer building block species (here aa).
C,D: Plot of the change in peptide volume and relative residual current for the two ladders shown above. A clear correlation between the two parameters as well as sequence dependence is evident.
In particular, the method 100 may be used in a method (200) for determining the primary structure of a protein, comprising the steps of (see
The evaluation method (103 or 300), in which the sequence of the monomer building blocks of the heteropolymer is determined from the representative set of the characteristic current signals, may in particular comprise the following steps (see
Experimental Data and Embodiment
An embodiment of the invention is described below in which the complete sequence of synthetic peptides is elucidated, including in a double-blind experiment:
In the present embodiment, the method according to the invention is described as a “method for peptide sequence recognition with respect to peptide sequencing in a derivatization-free single molecule experiment using the wt-aerolysin (wt-AeL) nanopore by a bottom-up peptide ladder strategy”. In this research experiment, six peptide ladder-like sample pools were designed. Each pool consisted of the same deca-peptide but with a scrambled sequence and the respective ladder down to the polycationic tri-arginine carrier. Single molecule resistive pulse experiments (nanopore size spectroscopy) demonstrated the detection of species-dependent characteristic differences in residual current strengths for each peptide with identification of the single amino acid (aa) corresponding to each step of ladder formation, laying the foundation for peptide sequencing according to the invention. In addition, the potential of this simple approach as a benchmark technique in everyday laboratory use is described by a double-blind study in another laboratory in which two blindly selected peptides from the sample pool were identified and distinguished based on their aa sequence.
Peptide Ladder Design and Measurement
The embodiment uses the wt-AeL nanopore. A decapeptide was designed consisting of a polycationic C-terminal carrier, R3, preceded by a heterogeneous stretch of seven aa recruited from the five different aa SRAKY (e.g., SRASKYR). In a second step, the sequence of the aa portion was scrambled to obtain six different hetero-decapeptides that have the exact same mass of 1335.65 Da (
Step b) of the method according to the invention, or steps A) and B), was carried out as follows: In a typical experiment, a single wt-AeL channel was inserted into a DPhPC lipid bilayer spanning a single 50 μm aperture of the microelectrode cavity array (MECA16) used. A trans-negative bias voltage of 40 mV was used to drive an ion current (Io) through the protein channel connecting two reservoirs otherwise electrically isolated from each other by the lipid bilayer and filled with electrolyte solution (4 M KCl). Individual peptides that enter the channel defined by the protein and thereby alter the ionic current (I) are detected via the resulting resistive pulses,
In the evaluation method, the sequence of monomer building blocks (here: aa) of the heteropolymer (here: peptide) is determined from the representative set of characteristic current signals by using the differences ΔI/Io between the residual current values of adjacent maxima in the representative set of characteristic residual current values. Step D, determining the aa, is performed by assigning the residual current value differences ΔI/Io to aa of the peptide using previously known correlation data that specify which aa is represented by which current value difference ΔI/Io. In this way the sequence of aa of the peptide is determined.
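The assignment of current differences to aa can be sketched as a nearest-match lookup. The following is a simplified illustration only: the calibration values, the residual current values, and the names `CALIBRATION` and `decode_ladder` are hypothetical, and the sketch ignores any dependence of the contribution of an aa on its position in the peptide.

```python
from typing import Dict, List

# Hypothetical correlation data: difference in I/Io attributed to each aa.
# These numbers are invented for illustration and are not measured values.
CALIBRATION: Dict[str, float] = {
    "S": 0.020, "R": 0.045, "A": 0.012, "K": 0.035, "Y": 0.060,
}


def decode_ladder(maxima: List[float], calibration: Dict[str, float]) -> str:
    """Assign an aa to each step between adjacent residual-current maxima.

    `maxima` holds I/Io values ordered from the carrier to the full-length
    peptide. Assumption: longer peptides block more deeply, so I/Io falls.
    """
    residues = []
    for shorter, longer in zip(maxima, maxima[1:]):
        delta = shorter - longer
        # Pick the aa whose calibrated difference is closest to the observed one.
        best = min(calibration, key=lambda aa: abs(calibration[aa] - delta))
        residues.append(best)
    # Residues were collected from the carrier outwards (C- to N-terminal),
    # so reverse them to report the sequence in N- to C-terminal order.
    return "".join(reversed(residues))


# Hypothetical maxima for the carrier and three ladder steps.
observed = [0.600, 0.555, 0.495, 0.460]
print(decode_ladder(observed, CALIBRATION))
```

A nearest-match rule is one of several possible assignment schemes; a tolerance window around each calibrated value would allow ambiguous steps to be flagged instead of forced onto the closest aa.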
To ensure correct assignment of maxima to peptides, the ladders were measured sequentially, starting with the smallest peptide. The expectation expressed above of a monotonic relationship between peptide length and depth of the block was confirmed. On this basis, following this experimental pathway, each of the 42 peptides could be identified within all six ladders (
All recorded resistive pulses in the data sets were analyzed in terms of event duration (dwell time) and amplitude (I/Io), as well as the number of modulations. The calculated differentials, i.e. changes in these values from one maximum to the next, were then plotted together with the differentials for the volume and hydrophobicity of the peptide against the respective position in the peptide,
Double-Blind Test
To investigate the reproducibility and reliability of the results described above, a double-blind experiment was performed. Six peptide ladder samples were prepared, each consisting of aa R13 to aa R73 in equimolar amounts. An independent third party acting as a notary randomly selected two of the six ladder samples, labeled them A & B, and sent them along with an R3-homo peptide sample to an outside comparison laboratory (Abdelghani Oukhaled working group, Université Cergy Pontoise, France). In addition to the ladders, only
Using
The embodiment shows the method of the invention for peptide identification by ladder fingerprinting, which can serve as a primary platform for further development towards peptide sequencing, in particular using the highly sensitive wt-AeL nanopore. Reliable detection of hetero-peptides consisting of a C-terminal polycationic R3 carrier and up to seven N-terminal alternating heterogeneous aa was achieved . . . . By using peptide ladder-like sample pools ranging from aa R13 to aa R73, the position-sensitive contribution of a specific aa species to the overall block depth of a peptide was investigated, and based on these findings, a reading frame for sequencing as well as for fingerprinting was postulated. Using these reading frames, the robustness and reliability of this strategy were demonstrated in a double-blind study by sequencing a randomly selected peptide and identifying a second peptide by fingerprinting.
In this embodiment, peptides synthesized on demand were used. This is a model case that can be easily adapted to unknown protein or peptide samples. More comprehensive analysis of larger heteropolymers is accomplished by an initial step of cleaving the heteropolymer by fragmentation methods into further fragmentable subcomponents, which are then used to form ladders. For example, proteins can be made available in a standardized sample preparation process. Similar to standard bottom-up MS protein sequencing experiments, an endopeptidase can be used to fragment proteins into smaller peptides. Furthermore, an exopeptidase can be used to dynamically generate ladders from these peptides. Individual peptides produced by the protease could be sequentially presented to the nanopore and analyzed in a dynamic exopeptidase-coupled experiment. The method of the invention is therefore of great value for everyday laboratory applications.
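The two-stage fragmentation outlined above can be sketched at the level of sequence strings. This is a sketch under stated assumptions: the cleavage rule (cut after K or R), the direction of exopeptidase action (from the N-terminus), the input sequence, and the function names are all hypothetical choices made for illustration; the description does not prescribe a particular enzyme specificity.

```python
from typing import List


def endo_cleave(protein: str, cleave_after: str = "KR") -> List[str]:
    """Split a sequence after every residue listed in `cleave_after`.

    Assumption: a simple, hypothetical endopeptidase rule with no
    missed cleavages and no exceptions.
    """
    fragments, current = [], ""
    for residue in protein:
        current += residue
        if residue in cleave_after:
            fragments.append(current)
            current = ""
    if current:  # keep a trailing fragment that does not end at a cleavage site
        fragments.append(current)
    return fragments


def exo_ladder(peptide: str) -> List[str]:
    """Model stepwise removal of one N-terminal residue at a time (assumed direction)."""
    return [peptide[i:] for i in range(len(peptide))]


protein = "ASKSYRSAK"  # hypothetical input sequence
for fragment in endo_cleave(protein):
    print(fragment, "->", exo_ladder(fragment))
```

Each fragment yields its own ladder, which mirrors the idea of presenting the peptides produced by the protease to the nanopore one after another.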
Material and Methods
Reagents
All measurements were performed in 4 M KCl (Carl Roth GmbH, Karlsruhe, Germany) saturated with AgCl (Carl Roth GmbH, Karlsruhe, Germany) and buffered with 25 mM TRIS (Merck KGaA, Darmstadt, Germany) at pH 7.5. All solutions were prepared using Milli-Q water with a resistivity of 18.2 MΩ·cm. After equilibration, the electrolyte solutions were filtered (0.22 μm) and stored protected from light. Peptides were synthesized to specification by Intavis Peptide Services GmbH & Co. KG (Tübingen, Germany). Stock solutions (750 μM) of all peptides were prepared in 10 mM HEPES, pH 7.5, and stored at −20° C. until use. Reagents were used at a final concentration of 5 μM.
Protein and Lipid Preparation
Wild-type proaerolysin (pAeL) was prepared in-house via standard protocols from E. coli BL21 (DE3)-pLysS-competent cells using the pET22b (+) vector. pAeL was purified from cell lysates via His-tag chromatography. Stocks of pAeL were prepared at 1 μg·μL−1, frozen in liquid nitrogen, and stored at −80° C. Thawed pAeL was activated with trypsin (Promega GmbH, Walldorf, Germany) and used at a final pAeL concentration of 20 pmol L−1 (or 3 pmol L−1 AeL). The preprotein construct was chosen such that the affinity tag used for purification is cleaved from the protein during trypsin activation, yielding native protein.
All membranes were prepared from 1,2-diphytanoyl-sn-glycero-3-phosphocholine (DPhPC) in octane. DPhPC dissolved in chloroform was obtained from Avanti Polar Lipids Inc. (Alabaster, AL, USA). The lipids were aliquoted, dried under argon, and stored as a dry film at −20° C. until used at a concentration of 1 mg mL−1.
Nanopore Measurements Inventor Laboratory
All recordings were made using an Axopatch 200B (Molecular Devices, San Jose, CA, USA) in capacitive feedback mode with its 4-pole Bessel filter corner frequency set to 100 kHz at a digitization rate of 1 MHz. An 8-pole Bessel filter with a corner frequency of 50 kHz (Model 9002, Frequency Devices, Ottawa, IL, USA) was connected between the amplifier output and the input of the analog-to-digital converter. Digitization was performed using a National Instruments AD converter (PCI-6251, National Instruments, Austin, TX, USA). GePulse software (Michael Pusch, University of Genoa, Italy) was used for holding potential control and data recording. Single-molecule resistive pulses were collected under 40 mV trans-negative voltage. To eliminate as many parasitic capacitances as possible, MECA16 cavity arrays from Ionera GmbH (Freiburg, Germany) with 50 μm diameter cavities were used. Further digital filtering (25 kHz Bessel) and event detection were performed with self-written LabVIEW (National Instruments)-based software; subsequent analysis was performed with Igor Pro 8 (Wavemetrics, Lake Oswego, OR, USA).
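The event detection step can be illustrated by a simple threshold-based sketch that extracts the dwell time and the mean relative residual current I/Io of each resistive pulse. This is not the LabVIEW-based software referred to above; the threshold of 0.8, the function name `detect_events`, and the synthetic trace are assumptions chosen for illustration. Only the digitization rate of 1 MHz is taken from the description.

```python
from typing import List, Sequence, Tuple


def detect_events(
    current: Sequence[float],
    open_level: float,
    threshold: float = 0.8,          # hypothetical fraction of the open-pore current
    sample_rate_hz: float = 1_000_000,
) -> List[Tuple[float, float]]:
    """Return (dwell time in s, mean I/Io) for each excursion below the threshold.

    An event that is still in progress at the end of the trace is discarded.
    """
    events, start = [], None
    for i, value in enumerate(current):
        blocked = value < threshold * open_level
        if blocked and start is None:
            start = i                                  # event begins
        elif not blocked and start is not None:
            segment = current[start:i]                 # samples inside the event
            dwell = (i - start) / sample_rate_hz
            mean_ratio = sum(segment) / len(segment) / open_level
            events.append((dwell, mean_ratio))
            start = None                               # event ends
    return events


# Synthetic trace in arbitrary current units: two blockades of different depth.
trace = [100.0] * 5 + [40.0] * 4 + [100.0] * 5 + [55.0] * 2 + [100.0] * 3
events = detect_events(trace, open_level=100.0)
for dwell, ratio in events:
    print(f"dwell = {dwell * 1e6:.1f} us, I/Io = {ratio:.2f}")
```

A histogram of the mean I/Io values obtained in this way is the kind of distribution in which the maxima of the individual ladder peptides would be located.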
Nanopore Measurements Comparison Laboratory
All recordings were performed with an Axopatch 200B (Molecular Devices, San Jose, CA, USA) in resistive feedback mode with its 4-pole Bessel filter cutoff frequency set to 5 kHz at a digitization rate of 100 kHz. A classic vertical chamber system from Warner Instruments (Hamden, CT, USA) with apertures of 150 μm diameter was used for the measurements. Digitization was performed using the Digidata 1440A AD converter and Clampex 10 software (Molecular Devices). The analysis was performed with in-house routines implemented in Igor Pro 8.
Number | Date | Country | Kind |
---|---|---|---|
102021200425.3 | Jan 2021 | DE | national |

Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/050990 | 1/18/2022 | WO |