NANOPORE ASSEMBLIES AND USES THEREOF

Information

  • Patent Application: 20200399693
  • Publication Number: 20200399693
  • Date Filed: February 11, 2019
  • Date Published: December 24, 2020
Abstract
The disclosure provides a nanopore system assembled with non-membrane proteins for detecting analytes. Also disclosed are the methods, kits, and detection devices employing the disclosed nanopore system. The nanopore system has a wide variety of applications, including single molecule detection, DNA/RNA/peptide sequencing, sensing of chemicals, biological reagents, and polymers, and disease diagnosis.
Description
FIELD OF THE INVENTION

The present disclosure relates generally to nanopores and more specifically to systems and methods using nanopores assembled with non-membrane proteins for detecting analytes.


BACKGROUND OF THE INVENTION

Biological nanopores are protein channels embedded in a substrate, typically a lipid membrane. A wide variety of protein complexes in nature form elegant channel-like structures, some of which have been explored as nanopores, exemplified by α-hemolysin, MspA, aerolysin, FhuA, OmpF/G, CsgG, ClyA, and PA63. For example, the biological protein nanopore α-hemolysin (αHL) from Staphylococcus aureus has been used for single molecule detection.


Recently, researchers have adopted biological, solid-state, DNA origami, and hybrid nanopores in single-molecule analyses. Biological nanopores have advantages compared to their synthetic counterparts, mostly because they can be reproducibly fabricated and modified with an atomic level precision that cannot yet be matched by artificial nanopores. The existing use of biological nanopores, however, also has drawbacks. For example, they are not versatile in providing different shapes, sizes, and hydrophilic/hydrophobic properties in order to detect different analytes with high sensitivity and specificity.


Accordingly, there remains a strong need for a robust nanopore system amenable to detecting analytes with distinct properties.


SUMMARY OF THE INVENTION

This disclosure addresses this need in the art by providing a nanopore assembly for detecting an analyte. The nanopore assembly comprises a channel formed of a plurality of subunits. Each of the subunits comprises a non-membrane protein capable of forming a protein channel. In some embodiments, each of the subunits comprises a polypeptide having a polypeptide sequence at least 75% identical to a polypeptide sequence selected from the group consisting of SEQ ID NOs: 1-35. In some embodiments, the polypeptide comprises a polypeptide sequence at least 75% identical to SEQ ID NOs: 4-12.


In some embodiments, the polypeptide may include at least one residue substituted with cysteine. In some embodiments, the polypeptide is derived from a phi29 portal or tail protein. In some embodiments, the phi29 tail protein may include one or more of E595C, K321C, and K358C substitutions. In some embodiments, the phi29 tail protein may include one or more of K134I, D138L, D139L, D158L, E163V, E309V, D311V, K321V, K356A, K358A, D377A, D381V, N388L, R524I, R539A, and E595V substitutions.


In one aspect, the nanopore assembly further comprises a probe for detecting an analyte. The probe is operably linked to at least one of the subunits. The probe can be one of chemicals, carbohydrates, aptamers, nucleic acids, peptides, proteins, antibodies, and receptors. In some embodiments, the probe comprises a sequence at least 75% identical to a sequence selected from the group consisting of SEQ ID NOs: 36-79. In some embodiments, the probe is an anti-PSA antibody. The probe may be operably linked via covalent bonding to at least one of the subunits. The covalent bonding includes a disulfide linkage, an ester linkage, or a sulfhydryl linkage. In some embodiments, the probe is operably linked to a location in proximity to an entrance of the channel or a location at an interior side of the channel.


The analyte can be one of nucleic acids, amino acids, peptides, proteins, polymers, and chemical molecules. In some embodiments, the analyte is one of PSA, CEA, AFP, VCAM, MiR-155, MiR-22, MiR-7, MiR-92a, MiR-122, MiR-192, MiR-223, MiR-26a, MiR-27a, and MiR-802.


According to some embodiments of the nanopore assembly, the channel is embedded in a polymersome. In some embodiments, the channel is inserted in a membrane. The membrane may include a polymer membrane or a lipid membrane. The polymer membrane may include an alternating copolymer, a periodic copolymer, a block copolymer, a di-block copolymer, a tri-block copolymer, a terpolymer, or a combination thereof. In some embodiments, the polymer membrane comprises PMOXA-PDMS-PMOXA. In some embodiments, the nanopore assembly may further include cholesterol and/or porphyrin.


In another aspect, this disclosure also provides an apparatus for detecting an analyte. The apparatus includes the nanopore assembly described above and optionally a support for the nanopore assembly. The apparatus may further include an electrode, to which the nanopore assembly is tethered.


In another aspect, this disclosure further provides a kit that includes the nanopore assembly described above and optionally instructions for using the nanopore assembly.


In another aspect, the disclosure provides a method of detecting an analyte. The method includes: (1) contacting a sample containing an analyte with the nanopore assembly as described; (2) applying an electrical current across the channel of the nanopore assembly; (3) determining the electrical current passing through the channel at one or more time intervals; and (4) comparing the electrical current measured at one or more time intervals with a reference electrical current, wherein a change in electrical current relative to the reference electrical current indicates a presence of the analyte in the sample. The analyte may be any one of nucleic acids, amino acids, peptides, proteins, polymers, and chemical molecules. In some embodiments, the reference electrical current is measured with a sample that does not contain the analyte. In some embodiments, the nanopore assembly is placed on a support.
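Steps (3) and (4) of the method above can be illustrated with a minimal sketch. It assumes the current has already been sampled into a list of values and flags the intervals in which the measured current falls below the reference current by more than a chosen fraction. The function name `detect_blockades`, the 20% threshold, and the sample values are hypothetical and are not taken from this disclosure.

```python
from typing import List, Tuple


def detect_blockades(
    trace: List[float],
    reference_current: float,
    min_fractional_drop: float = 0.2,  # hypothetical threshold, not from the disclosure
) -> List[Tuple[int, int]]:
    """Return (start, end) sample-index pairs (end exclusive) during which the
    current magnitude is below the reference by at least min_fractional_drop."""
    threshold = abs(reference_current) * (1.0 - min_fractional_drop)
    events: List[Tuple[int, int]] = []
    start = None
    for i, current in enumerate(trace):
        blocked = abs(current) < threshold
        if blocked and start is None:
            start = i  # a blockade begins at this sample
        elif not blocked and start is not None:
            events.append((start, i))  # the blockade ended before this sample
            start = None
    if start is not None:
        events.append((start, len(trace)))  # blockade still open at end of trace
    return events


# Hypothetical open-channel reference and sampled trace (arbitrary current units).
reference = 100.0
trace = [100.0, 99.5, 60.0, 58.0, 61.0, 100.2, 99.8, 55.0, 100.1]

events = detect_blockades(trace, reference)
analyte_present = len(events) > 0  # a change relative to the reference was seen
```

In practice the reference current would be measured with a sample that does not contain the analyte, as described above, and the threshold would be chosen for the particular pore and buffer.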


The foregoing summary is not intended to define every aspect of the disclosure, and additional aspects are described in other sections, such as the following detailed description. The entire document is intended to be related as a unified disclosure, and it should be understood that all combinations of features described herein are contemplated, even if the combination of features is not found together in the same sentence, paragraph, or section of this document. Other features and advantages of the invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating specific embodiments of the disclosure, are given by way of illustration only, because various changes and modifications within the spirit and scope of the disclosure will become apparent to those skilled in the art from this detailed description.





BRIEF DESCRIPTION OF THE DRAWINGS


FIGS. 1A and 1B are schematic diagrams showing different functional layers of a typical non-membrane protein (shown as a truncated cone structure) to be used as a polymer membrane-embedded nanopore. FIG. 1A shows the three distinct domains that are important for membrane anchoring. FIG. 1B shows the two areas that are particularly important for conjugation of functional modules for single molecule sensing.



FIG. 2 shows an example of mutagenesis in the membrane anchoring layer for direct membrane insertion, in which phi29 gp9ΔLoop tail protein is shown as an example. The structure of the phi29 gp9ΔLoop tail protein channel is shown before and after a series of hydrophobic mutations made on the middle layer (boxed) to increase its capability of direct membrane insertion. The exemplary mutation sites include: K134I, D138L, D139L, D158L, E163V, E309V, D311V, K321V, K356A, K358A, D377A, D381V, N388L, R524I, R539A, and E595V. After expression and purification, the mutant proteins were spontaneously inserted into polymer membranes. Representative positively charged residues (e.g., R and K) and representative negatively charged residues (e.g., E, Q, D, and N) are indicated.



FIG. 3 shows an example of bacteriophage protein expression and purification, in which protein expression and purification of phi29 gp9 tail protein was demonstrated. Coomassie-blue stained SDS-PAGE gel shows the expression of phi29 gp9 tail protein channel with a molecular weight of 70.33 kDa.



FIG. 4 is an example Coomassie-blue stained SDS-PAGE gel showing that the majority of the gp9 tail protein channel is present in the 100 kDa column after purification and assembly, indicating that the channel is assembled from its monomer units, which are ~70 kDa.



FIG. 5 is an example of target site selection on the non-membrane protein surface, as demonstrated using the phi29 gp9 tail protein. Phi29 gp9 tail protein structure shows three possible residues for mutagenesis into cysteine for conjugating hydrophobic membrane-anchoring modules.



FIG. 6 shows an example of a non-membrane protein channel inserted into polymer membranes, as demonstrated using the phi29 gp9ΔLoop tail protein. Single channel recording data shows direct insertion of the phi29 gp9ΔLoop channel into a polymer membrane of composition PMOXA6-PDMS65-PMOXA6. Conduction buffer: 1 M NaCl, 5 mM Tris, pH 7.6. Applied voltage: 75 mV.



FIGS. 7A and 7B show the structure of bacteriophage P22 gp1 portal protein full length (FIG. 7A) and barrel-deleted mutant (FIG. 7B).



FIG. 8 shows an example of target site selection in the non-membrane proteins for membrane anchorage, as demonstrated using P22 gp1 portal protein. A representative conjugation site (e.g., an accessible cysteine residue, C283) for incorporating a membrane anchoring domain is shown on the structure of bacteriophage P22 gp1 portal protein.



FIG. 9 shows an example of non-membrane protein pores functionalized with fused peptide probes, as demonstrated using P22 gp1 portal protein fused with VCAM1. Coomassie-blue stained gel shows purified bacteriophage P22 gp1 portal protein full length and barrel deleted mutants with VCAM1 probe fused at the C-terminus.



FIG. 10 shows an example of non-membrane protein pores functionalized with fused peptide probes, as demonstrated using P22 gp1 portal protein fused with PSA. Coomassie-blue stained gel shows purified bacteriophage P22 gp1 portal protein barrel deleted mutant with PSA probe fused at the C-terminus.



FIG. 11 shows an example of non-membrane protein pores functionalized with fused peptide probes for single molecule sensing, as demonstrated using P22 gp1 portal protein fused with PSA probes for detecting PSA. Single channel recording data shows direct insertion of the P22 gp1 protein channel harboring the fused PSA probe into the polymer membrane of composition PMOXA6-PDMS35-PMOXA6. In the presence of PSA (10 ng/µL), the probe binds to the PSA, resulting in characteristic current blockage events. Conduction buffer: 1 M KCl, 5 mM Tris, pH 7.6. Applied voltage: 75 mV.



FIG. 12 shows an example of target site selection in the non-membrane proteins for membrane anchorage, as demonstrated using T4 gp20 portal protein. Representative conjugation sites (e.g., via accessible cysteine residues C217, C245, and C246) for incorporating a membrane anchoring domain are indicated on the structure of bacteriophage T4 gp20 portal protein.



FIG. 13 shows an example of labeling non-membrane protein channels with cholesterol-PEG-maleimide for membrane anchorage, as demonstrated using T4 gp20 portal protein.



FIG. 14 shows an example of non-membrane protein channels inserted in polymer membranes, as demonstrated using T4 gp20 portal protein harboring cholesterol. Stepwise direct insertion of the proteins was observed under an applied potential. Conduction buffer: 1 M KCl, 5 mM Tris, pH 7.6. Applied voltage: 100 mV. Membrane: PMOXA6-PDMS35-PMOXA6.



FIG. 15 shows an example of conjugating probes to non-membrane protein channels via click chemistry, as demonstrated using phi29 gp10 portal protein. The thiol-miR-21 probe was labeled with TCO (trans-cyclooctene) followed by conjugation of TCO-miR-21 probe to Methyltetrazine-protein. The SDS-PAGE gel verifies target miR-21 binding to miR-21 probe conjugated to proteins.



FIG. 16 shows an example of analyte (e.g., miRNA) detection using probe-functionalized non-membrane protein pores, as demonstrated using T4 gp20 portal protein. In the presence of target miRNA, current blockage events were observed, indicating detection of miRNA at the single molecule level. Conduction buffer: 1 M KCl, 5 mM Tris, pH 7.6. Applied voltage: 100 mV. Membrane: PMOXA6-PDMS35-PMOXA6.



FIG. 17A shows an example of polyoxazoline-based triblock copolymers for insertion of non-membrane proteins as nanopores, as demonstrated using PMOXA6-PDMS65-PMOXA6.



FIG. 17B shows the stability of the planar membrane over the course of 2 days. The membrane shows no signs of leakage. Membrane: PMOXA6-PDMS35-PMOXA6.



FIGS. 17C and 17D show an example of inserting non-membrane protein pores using a fusion of polymersomes with planar polymer membranes, as demonstrated using phi29 gp9 tail protein (FIG. 17C) and P22 gp1 portal proteins (FIG. 17D). Planar membrane and polymersome composition in FIG. 17C: PMOXA11-PDMS65-PMOXA11. Planar membrane and polymersome composition in FIG. 17D: PMOXA5-PDMS13-PMOXA5.



FIG. 18 shows an example of a non-membrane protein with the SpyTag/SpyCatcher system for incorporating analyte probes, as demonstrated using phi29 gp10 portal protein. Coomassie-blue stained SDS-PAGE gel shows the binding of SpyTag incorporated in phi29 gp10 portal protein and PSA-SpyCatcher protein to form a complex. The functionalized pore can then be used for sensing PSA as shown in FIG. 19.



FIG. 19 shows an example of non-membrane protein channels for analyte detection using the SpyTag/SpyCatcher system, as demonstrated using phi29 gp10 portal protein. After direct insertion, the SpyTag/SpyCatcher (PSA probe) on the phi29 gp10 portal was able to detect PSA (10 ng/µL) in the solution, as shown by current blockage events. Conduction buffer: 1 M KCl, 5 mM Tris, pH 7.6. Applied voltage: 100 mV. Membrane: PMOXA6-PDMS35-PMOXA6.





DETAILED DESCRIPTION OF THE INVENTION

This disclosure provides a robust nanopore system and method adaptable for detecting different analytes. It allows for detection of a single molecule and of a target molecule present with other contaminants. The system offers label-free, amplification-free, real-time detection. It requires a very small sample amount and can be used for high-throughput analysis. The system can be adapted to detect a variety of analytes (e.g., small molecules, polymers, polypeptides, and nucleotides) with different shapes, sizes, and hydrophilic/hydrophobic properties, with high sensitivity and specificity.


This disclosure addresses the need in the art by providing nanopores formed of non-membrane proteins, such as proteins derived from bacteriophages. To generate the disclosed nanopores, bacteriophage proteins were expressed, purified, self-assembled, and inserted in lipid or polymer membranes. Non-membrane protein channels, however, are more difficult to insert directly into a lipid bilayer or polymer membrane. Unlike membrane protein channels, non-membrane protein channels generally lack a hydrophobic layer in the middle and hydrophilic layers at both ends. This invention overcomes this limitation by employing a series of methods for inserting various protein channels in polymer membranes, either directly or through a highly efficient fusion mechanism. In addition, in some scenarios, protein engineering, such as site-directed mutagenesis, insertion and deletion of amino acids, and introduction of functional modules, is carried out to tune nanopore properties to meet different detection needs.


The disclosed nanopore system has a wide variety of applications, including but not limited to single molecule detection, DNA/RNA/peptide sequencing, sensing of chemicals, biological reagents, and polymers, and disease diagnosis. As will be further described below, the methods, kits, and detection devices employing the disclosed nanopore system are also within the scope of this disclosure.


I. Nanopore Assemblies

One aspect of the present disclosure relates to nanopores formed of non-membrane proteins. The non-membrane proteins suitable for forming nanopores may be derived from cellular DNA translocases, helicases, terminases, ATPases, and fragments thereof. The non-membrane proteins suitable for forming nanopores may also include proteins involved in DNA repair, replication, recombination, chromosome segregation, DNA/RNA transportation, membrane sorting, cellular reorganization, cell division, bacterial binary fission, and other processes. The nanopore may include a plurality of subunits, each comprising a non-membrane protein. For example, the nanopore may include 10 to 15 subunits of phage portal proteins or 5 to 10 subunits of phage tail proteins. A non-membrane protein channel forming nanopores can be derived from bacteriophage portal proteins including, but not limited to, T3, T4, T5, T7, SPP1, P22, P2, P3, Lambda, Mu, HK97, and C1. For example, a non-membrane protein channel forming nanopores can be derived from bacteriophage tail proteins including, but not limited to, phi29, C1, Neisseria meningitidis serogroup B, T4, phiX174, lambda, SPP1, T5, Mu, F4-1, P2, Serratia phage KSP90, Enterobacteria phage T7M, and Bacteriophage HK97.


Also within the scope of this disclosure are the variants and homologs with significant identity to bacteriophage proteins (e.g., bacteriophage portal or tail proteins) including, but not limited to T3, T4, T5, T7, SPP1, P22, P2, P3, Lambda, Mu, HK97, and C1. For example, such variants and homologs may have sequences with at least about 70%, about 71%, about 72%, about 73%, about 74%, about 75%, about 76%, about 77%, about 78%, about 79%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity over the sequences of the bacteriophage portal proteins described herein.


Biological pore mutagenesis can be used to optimize protein pores for assembly or use in a composition or method set forth herein. For example, protein engineering and mutagenesis techniques can be used to mutate biological pores and tailor their properties for specific applications. Accordingly, the nanopore assembly may include a channel formed of a plurality of subunits. Each of the subunits comprises a polypeptide having a polypeptide sequence having at least 75% identity to a polypeptide sequence selected from the group consisting of SEQ ID NOs: 1-35. In some embodiments, the polypeptide comprises a polypeptide sequence at least 75% identical to SEQ ID NOs: 4-12.
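The "at least 75% identity" criterion above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and counts identical residues over the aligned columns. The function name `percent_identity` and the choice of denominator are assumptions made for illustration; the passage above does not specify an alignment tool or an identity formula.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes the inputs are already aligned to equal length, with '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # an all-gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


# Ten-residue windows taken from the same region of SEQ ID NO: 3 (wild type)
# and SEQ ID NO: 7 (which carries D158L and E163V) in Table 1.
wild_type_window = "SEYDIVSVEN"
mutant_window = "SEYLIVSVVN"

identity = percent_identity(wild_type_window, mutant_window)
meets_threshold = identity >= 75.0
```

Over full-length sequences the handful of substitutions listed in this disclosure changes the identity far less than it does over this short window.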


The subunits may include one or more substitutions, deletions, or insertions to facilitate functionalization of nanopores or insertion in membranes. For example, the subunit may include at least one residue substituted with cysteine, such that a functional group can be linked to the subunit via a disulfide bond. For example, cysteine residues can be used for conjugating chemicals such as porphyrin to the nanopore. In some embodiments, the subunit may include phi29 gp9 protein having one or more of E595C, K321C, and K358C substitutions.
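The substitution notation used above (for example, K358C: lysine at position 358 replaced by cysteine) can be applied to a sequence programmatically. The following is a minimal sketch, assuming 1-based residue numbering and a fragment whose first residue number is known. The helper `apply_substitution` and its `first_residue` parameter are hypothetical names introduced for illustration.

```python
import re


def apply_substitution(fragment: str, mutation: str, first_residue: int = 1) -> str:
    """Apply a point substitution such as 'K358C' to a sequence fragment.

    first_residue is the 1-based residue number of fragment[0] in the
    full-length protein (hypothetical parameter, for working with excerpts).
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    index = position - first_residue
    if not 0 <= index < len(fragment):
        raise ValueError(f"position {position} is outside the fragment")
    if fragment[index] != wild_type:
        # Guard against numbering mistakes before changing anything.
        raise ValueError(
            f"expected {wild_type} at position {position}, found {fragment[index]}"
        )
    return fragment[:index] + replacement + fragment[index + 1:]


# Residues 353-360 of the phi29 tail protein sequence listed as SEQ ID NO: 3.
wild_type_fragment = "NNSKLKIQ"
mutant_fragment = apply_substitution(wild_type_fragment, "K358C", first_residue=353)
```

The result matches the corresponding stretch of SEQ ID NO: 14 (phi29-gp9 K358C) in Table 1. The wild-type check is a design choice that catches off-by-one numbering errors.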


In addition, the subunit may include the phi29 gp9 protein having one or more of K134I, D138L, D139L, D158L, E163V, E309V, D311V, K321V, K356A, K358A, D377A, D381V, N388L, R524I, R539A, and E595V substitutions. Substitution of charged residues (e.g., K, R, D, E) with hydrophobic residues (e.g., A, V, L) increases the hydrophobicity of the protein locally and globally.
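The local effect of such substitutions on hydrophobicity can be estimated numerically. The sketch below uses the Kyte-Doolittle hydropathy scale, which is a common published scale chosen here as an assumption; this disclosure does not name a particular scale. It compares the mean hydropathy of a ten-residue window before and after the K134I, D138L, and D139L substitutions.

```python
# Kyte-Doolittle hydropathy values (higher means more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average Kyte-Doolittle hydropathy over all residues in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


# Residues 131-140 of SEQ ID NO: 3 (wild type) and the same window in
# SEQ ID NO: 6, which carries K134I, D138L, and D139L.
wild_type_window = "EHVKLWNDDG"
mutant_window = "EHVILWNLLG"

before = mean_hydropathy(wild_type_window)
after = mean_hydropathy(mutant_window)
more_hydrophobic = after > before
```

A sliding-window version of the same calculation would show where along the sequence the hydrophobic band is being created.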


In some embodiments, the process of incorporating a protein channel in a membrane or a nanodisc can benefit from an affinity tag (e.g., polyhistidine affinity tags) on the protein and from the purification of the mixed nanopore population over an affinity column (e.g., a Ni column). Affinity tags and chemical conjugation techniques (e.g., modifications of cysteines set forth above) can be used to attach tethers to nanopores for use in a variety of methods, compositions or apparatus set forth herein. For example, the resulting tethers can be used to attract or attach a nanopore to a solid support or an electrode.









TABLE 1

Amino Acid Sequences of Channel Proteins

SEQ ID NO | Sequences | Other information





SEQ ID NO: 1
Other information: P22 portal protein channel/nanopore/WT
Sequence:
MADNENRLESILSRFDADWTASDEARREAKNDLF
FSRVSQWDDWLSQYTTLQYRGQFDVVRPVVRKL
VSEMRQNPIDVLYRPKDGARPDAADVLMGMYRT
DMRHNTAKIAVNIAVREQIEAGVGAWRLVTDYE
DQSPTSNNQVIRREPIHSACSHVIWDSNSKLMDKS
DARHCTVIHSMSQNGWEDFAEKYDLDADDIPSFQ
NPNDWVFPWLTQDTIQIAEFYEVVEKKETAFIYQ
DPVTGEPVSYFKRDIKDVIDDLADSGFIKIAERQIK
RRRVYKSIITCTAVLKDKQLIAGEHIPIVPVFGEWG
FVEDKEVYEGVVRLTKDGQRLRNMIMSFNADIV
ARTPKKKPFFWPEQIAGFEHMYDGNDDYPYYLL
NRTDENSGDLPTQPLAYYENPEVPQANAYMLEA
ATSAVKEVATLGVDTEAVNGGQVAFDTVNQLN
MRADLETYVFQDNLATAMRRDGEIYQSIVNDIYD
VPRNVTITLEDGSEKDVQLMAEVVDLATGEKQVL
NDIRGRYECYTDVGPSFQSMKQQNRAEILELLGK
TPQGTPEYQLLLLQYFTLLDGKGVEMMRDYANK
QLIQMGVKKPETPEEQQWLVEAQQAKQGQQDPA
MVQAQGVLLQGQAELAKAQNQTLSLQIDAAKVE
AQNQLNAARIAEIFNNMDLSKQSEFREFLKTVASF
QQDRSEDARANAELLLKGDEQTHKQRMDIANILQ
SQRQNQPSGSVAETPQ






SEQ ID NO: 2
Other information: P22 portal protein channel/nanopore/Barrel deletion
Sequence:
MADNENRLESILSRFDADWTASDEARREAKNDLF
FSRVSQWDDWLSQYTTLQYRGQFDVVRPVVRKL
VSEMRQNPIDVLYRPKDGARPDAADVLMGMYRT
DMRHNTAKIAVNIAVREQIEAGVGAWRLVTDYE
DQSPTSNNQVIRREPIHSACSHVIWDSNSKLMDKS
DARHCTVIHSMSQNGWEDFAEKYDLDADDIPSFQ
NPNDWVFPWLTQDTIQIAEFYEVVEKKETAFIYQ
DPVTGEPVSYFKRDIKDVIDDLADSGFIKIAERQIK
RRRVYKSIITCTAVLKDKQLIAGEHIPIVPVFGEWG
FVEDKEVYEGVVRLTKDGQRLRNMIMSFNADIV
ARTPKKKPFFWPEQIAGFEHMYDGNDDYPYYLL
NRTDENSGDLPTQPLAYYENPEVPQANAYMLEA
ATSAVKEVATLGVDTEAVNGGQVAFDTVNQLN
MRADLETYVFQDNLATAMRRDGEIYQSIVNDIYD
VPRNVTITLEDGSEKDVQLMAEVVDLATGEKQVL
NDIRGRYECYTDVGPSFQSMKQQNRAEILELLGK
TPQGTPEYQLLLLQYFTLLDGKGVEMMRDYANK
QLIQMGVKKPETPEEQQWLVEAQQAKQ






SEQ ID NO: 3
Other information: Phi29 tail protein channel/nanopore/WT
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISA
GASAAGGSALGMASSVTGMTSTAGNAVLQMQA
MQAKQADIANIPPQLTKMGGNTAFDYGNGYRGV
YVIKKQLKAEYRRSLSSFFHKYGYKINRVKKPNL
RTRKAFNYVQTKDCFISGDINNNDLQEIRTIFDNGI
TLWHTDNIGNYSVENELR






SEQ ID NO: 4
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLK
AEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNY
VQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNI
GNYSVENELR






SEQ ID NO: 5
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K358C
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLCIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLK
AEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNY
VQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNI
GNYSVENELR






SEQ ID NO: 6
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K134I, D138L, D139L
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVILW
NLLGTPTINTIDEGLSYGSEYDIVSVENHKPYDDM
MFLVIISKSIMHGTPGEEESRLNDINASLNGMPQPL
CYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLTNI
FSQKSAVNDIVNMYVTDYIGLKLDYKNGDKELK
LDKDMFEQAGIADDKHGNVDTIFVKKIPDYEALEI
DTGDKWGGFTKDQESKLMMYPYCVTEITDFKGN
HMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQD
YNADSALSGGNRLTASLDSSLINNNPNDIAILNDY
LSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLKA
EYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNYV
QTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNIG
NYSVENELR






SEQ ID NO: 7
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K134I, D138L, D139L, D158L, E163V
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVILW
NLLGTPTINTIDEGLSYGSEYLIVSVVNHKPYDDM
MFLVIISKSIMHGTPGEEESRLNDINASLNGMPQPL
CYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLTNI
FSQKSAVNDIVNMYVTDYIGLKLDYKNGDKELK
LDKDMFEQAGIADDKHGNVDTIFVKKIPDYEALEI
DTGDKWGGFTKDQESKLMMYPYCVTEITDFKGN
HMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQD
YNADSALSGGNRLTASLDSSLINNNPNDIAILNDY
LSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLKA
EYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNYV
QTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNIG
NYSVENELR






SEQ ID NO: 8
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K134I, D138L, D139L, D158L, E163V, E309V, D311V, K321V
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVILW
NLLGTPTINTIDEGLSYGSEYLIVSVVNHKPYDDM
MFLVIISKSIMHGTPGEEESRLNDINASLNGMPQPL
CYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLTNI
FSQKSAVNDIVNMYVTDYIGLKLDYKNGDKELK
LDKDMFEQAGIADDKHGNVDTIFVKKIPDYEALV
IVTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLK
AEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNY
VQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNI
GNYSVENELR






SEQ ID NO: 9
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K356A, 358A
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSALAIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLK
AEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNY
VQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNI
GNYSVENELR






SEQ ID NO: 10
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K356A, 358A, D377A, D381V, N388L
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSALAIQVRGSLGVSNKVAYSVQ
AYNAVSALSGGLRLTASLDSSLINNNPNDIAILND
YLSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLK
AEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNY
VQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNI
GNYSVENELR






SEQ ID NO: 11
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-K356A, 358A, D377A, D381V, N388L, R524I
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSALAIQVRGSLGVSNKVAYSVQ
AYNAVSALSGGLRLTASLDSSLINNNPNDIAILND
YLSAYQLTKMGGNTAFDYGNGYRGVYVIKKQLK
AEYRRSLSSFFHKYGYKINRVKKPNLRTRKAFNY
VQTKDCFISGDINNNDLQEIRTIFDNGITLWHTDNI
GNYSVENELI






SEQ ID NO: 12
Other information: Phi29 tail protein channel/nanopore/phi29-gp9Δ417-491-N-his
Sequence:
MNHKHHHHHHSSGENLYFQGHMGSMAYVPLSG
TNVRILADVPFSNDYKNTRWFTSSSNQYNWFNSK
SRVYEMSKVTFMGFRENKPYVSVSLPIDKLYSAS
YIMFQNADYGNKWFYAFVTELEFKNSAVTYVHF
EIDVLQTWMFDIKFQESFIVREHVKLWNDDGTPTI
NTIDEGLSYGSEYDIVSVENHKPYDDMMFLVIISK
SIMHGTPGEEESRLNDINASLNGMPQPLCYYIHPF
YKDGKVPKTYIGDNNANLSPIVNMLTNIFSQKSA
VNDIVNMYVTDYIGLKLDYKNGDKELKLDKDMF
EQAGIADDKHGNVDTIFVKKIPDYEALEIDTGDK
WGGFTKDQESKLMMYPYCVTEITDFKGNHMNLK
TEYINNSKLKIQVRGSLGVSNKVAYSVQDYNADS
ALSGGNRLTASLDSSLINNNPNDIAILNDYLSAYQ
LTKMGGNTAFDYGNGYRGVYVIKKQLKAEYRRS
LSSFFHKYGYKINRVKKPNLRTRKAFNYVQTKDC
FISGDINNNDLQEIRTIFDNGITLWHTDNIGNYSVE
NELR






SEQ ID NO: 13
Other information: Phi29 tail protein channel/nanopore/phi29-gp9 K321C
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTCDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISA
GASAAGGSALGMASSVTGMTSTAGNAVLQMQA
MQAKQADIANIPPQLTKMGGNTAFDYGNGYRGV
YVIKKQLKAEYRRSLSSFFHKYGYKINRVKKPNL
RTRKAFNYVQTKDCFISGDINNNDLQEIRTIFDNGI
TLWHTDNIGNYSVENELR






SEQ ID NO: 14
Other information: Phi29 tail protein channel/nanopore/phi29-gp9 K358C
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLCIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISA
GASAAGGSALGMASSVTGMTSTAGNAVLQMQA
MQAKQADIANIPPQLTKMGGNTAFDYGNGYRGV
YVIKKQLKAEYRRSLSSFFHKYGYKINRVKKPNL
RTRKAFNYVQTKDCFISGDINNNDLQEIRTIFDNGI
TLWHTDNIGNYSVENELR






SEQ ID NO: 15
Other information: Phi29 tail protein channel/nanopore/phi29-gp9 E595C
Sequence:
MAYVPLSGTNVRILADVPFSNDYKNTRWFTSSSN
QYNWFNSKSRVYEMSKVTFMGFRENKPYVSVSL
PIDKLYSASYIMFQNADYGNKWFYAFVTELEFKN
SAVTYVHFEIDVLQTWMFDIKFQESFIVREHVKL
WNDDGTPTINTIDEGLSYGSEYDIVSVENHKPYDD
MMFLVIISKSIMHGTPGEEESRLNDINASLNGMPQ
PLCYYIHPFYKDGKVPKTYIGDNNANLSPIVNMLT
NIFSQKSAVNDIVNMYVTDYIGLKLDYKNGDKEL
KLDKDMFEQAGIADDKHGNVDTIFVKKIPDYEAL
EIDTGDKWGGFTKDQESKLMMYPYCVTEITDFKG
NHMNLKTEYINNSKLKIQVRGSLGVSNKVAYSVQ
DYNADSALSGGNRLTASLDSSLINNNPNDIAILND
YLSAYLQGNKNSLENQKSSILFNGIMGMIGGGISA
GASAAGGSALGMASSVTGMTSTAGNAVLQMQA
MQAKQADIANIPPQLTKMGGNTAFDYGNGYRGV
YVIKKQLKAEYRRSLSSFFHKYGYKINRVKKPNL
RTRKAFNYVQTKDCFISGDINNNDLQEIRTIFDNGI
TLWHTDNIGNYSVCNELR






SEQ ID NO: 16
Other information: Bacteriophage C1 tail protein channel/nanopore/WT
Sequence:
MTLSKIKLFYNTPFNNMQNTLHFNSNEERDAYFN
SKFDVHEFTSTFNYRNMKGVLRVTIDLVSDRSCF
EQLMGVNYCQVQYIQSNRVEYLFVTDIQQLNDK
VCELSLVPDVVMTYTQGNVLNTLNNVNVIRQHY
TQTEYEQNLEQIRSNNDVLATSTMRVHAIKSELFT
QLEYILTIGANLRKSFGTAEKPKFPSSSGSTHDGIY
NPYDMYWFNDYESLKEVMDYLTGYPWIQQSIKN
VTIIPSGFIKQESLNDHEPVNGGDLSVRKLGKQGV
SNQKDFNAISLDYQSLMFTLGLNPINDKHLLRPNI
VTAELTDYAGNRLPIDLSLIETNLEFDSFVTMGAK
NEIKVYVKNYNARGNNVGQYIDNALTINNFDTIG
FSVDSGELGKANSAYSRELSNSRQMSSRINTVLD
NDASVKDRLFNAISLSGGLSIKSALSGFNNEYEHY
RDQKAQFKQMDALPNAITEGHVGYAPLFKQDKF
GVHLRLGRISQDELNNVKKYYNMFGYECNDYST
KLSDITSMSICNWVQFKGIWTLPNVDTGHMNMLR
ALFEAGVRLWHKESDMINNTVVNNVIIKSLEHHH
HHH






SEQ ID NO: 17
MQNNSYGYAVSVRVGGKEHRHWERYDIDSDFLI

Neisseria




PADSFDFVIGRLGPEAAIPDLSGESCEVVIDGQIVM

meningitidis




TGIIGSQRHGKSKGSRELSLSGRDLAGFLVDCSAP
serogroup B tail



QLNVKGMTVLDAAKKLAAPWPQIKAVVLKAEN
protein



NPALGKIDIEPGETVWQALTHIANSVGLHPWLEPD
channel/nanopore/



GTLVVGGADYSSPPVATLCWSRTDSRCNIERMDI
WT



EWDTDNRFSEVTFLAQSHGRSGDSAKHDLKWVY




KDPTMTLHRPKTVVVSDADNLAALQKQAKKQLA




DWRLEGFTLTITVGGHKTRDGVLWQPGLRVHVID




DEHGIDAVFFLMGRRFMLSRMDGTQTELRLKED




GIWTPDAYPKKAEAARKRKGKRKGVSHKGKKG




GKKQAETAVFE






SEQ ID NO: 18
MFVDDVTRAFESGDFARPNLFQVEISYLGQNFTF
Bacteriophage T4



QCKATALPAGIVEKIPVGFMNRKINVAGDRTFDD
tail protein



WTVTVMNDEAHDARQKFVDWQSIAAGQGNEITG
channel/nanopore/



GKPAEYKKSAIVRQYARDAKTVTKEIEIKGLWPT
WT



NVGELQLDWDSNNEIQTFEVTLALDYWE






SEQ ID NO: 19
MVDAGFENQKELTKMQLDNQKEIAEMQNETQKE
Bacteriophage



IAGIQSATSRQNTKDQVYAQNEMLAYQQKESTAR
phiX174 tail



VASIMENTNLSKQQQVSEIMRQMLTQAQTAGQY
protein



FTNDQIKEMTRKVSAEVDLVHQQTQNQRYGSSHI
channel/nanopore/



GATAKD
WT





SEQ ID NO: 20
MPVPNPTMPVKGAGTTLWVYKGSGDPYANPLSD

Escherichia phage




VDWSRLAKVKDLTPGELTAESYDDSYLDDEDAD
lambda



WTATGQGQKSAGDTSFTLAWMPGEQGQQALLA
(Bacteriophage



WFNEGDTRAYKIRFPNGTVDVFRGWVSSIGKAVT
lambda) tail



AKEVITRTVKVTNVGRPSMAEDRSTVTAATGMT
protein



VTPASTSVVKGQSTTLTVAFQPEGVTDKSFRAVS
channel/nanopore/



ADKTKATVSVSGMTITVNGVAAGKVNIPVVSGN
WT



GEFAAVAEITVTAS






SEQ ID NO: 21
NIYDILDKVFTMMYDGQDLTDYFLVQEVRGRSV
Bacteriophage



YSIEMGKRTIAGVDGGVITTESLPARELEVDAIVF
SPP1 tail protein



GDGTETDLRRRIEYLNFLLHRDTDVPITFSDEPSRT
channel/nanopore/



YYGRYEFATEGDEKGGFHKVTLNFYCQDPLKYG
WT



PEVTTDVTTASTPVKNTGLAVTNPTIRCVFSTSAT




EYEMQLLDGSTVVKFLKVVYGFNTGDTLVIDCHE




RSVTLNGQDIMPALLIQSDWIQLKPQVNTYLKAT




QPSTIVFTEKFL






SEQ ID NO: 22
MSLQLLRNTRIFVSTVKTGHNKTNTQEILVQDDIS

Escherichia phage




WGQDSNSTDITVNEAGPRPTRGSKRFNDSLNAAE
T5 tail protein



WSFSTYILPYKDKNTSKQIVPDYMLWHALSSGRA
channel/nanopore/



INLEGTTGAHNNATNFMVNFKDNSYHELAMLHIY
WT



ILTDKTWSYIDSCQINQAEVNVDIEDIGRVTWSGN




GNQLIPLDEQPFDPDQIGIDDETYMTIQGSYIKNKL




TILKIKDMDTNKSYDIPITGGTFTINNNITYLTPNV




MSRVTIPIGSFTGAFELTGSLTAYLNDKSLGSMEL




YKDLIKTLKVVNRFEIALVLGGEYDDERPAAILVA




KQAHVNIPTIETDDVLGTSVEFKAIPSDLDAGDEG




YLGFSSKYTRTTINNLIVNGDGATDAVTAITVKSA




GNVTTLNRSATLQMSVEVTPSSARNKEVTWAITA




GDAATINATGLLRADASKTGAVTVEATAKDGSG




VKGTKVITVTAGG






SEQ ID NO: 23
MAGNQRQGVAFIRVNGMELESMEGASFTPSGITR

Escherichia phage




EEVTGSRVYGWKGKPRAAKVECKIPGGGPIGLDE
Mu tail protein



IIDWENITVEFQADTGETWMLANAWQADEPKND
channel/nanopore/



GGEISLVLMAKQSKRIA
WT





SEQ ID NO: 24
MPVPNPTMPVKGAGTTLWVYKGSGDPYANPLSD

Escherichia phage




VDWSRLAKVKDLTPGELTAESYDDSYLDDEDAD
lambda tail



WTATGQGQKSAGDTSFTLAWMPGEQGQQALLA
protein



WFNEGDTRAYKIRFPNGTVDVFRGWVSSIGKAVT
channel/nanopore/



AKEVITRTVKVTNVGRPSMAEDRSTVTAATGMT
WT



VTPASTSVVKGQSTTLTVAFQPEGVTDKSFRAVS




ADKTKATVSVSGMTITVNGVAAGKVNIPVVSGN




GEFAAVAEITVTAS






SEQ ID NO: 25
MKLDYNSREIFFGNEALIVADMTKGSNGKPEFTN

Lactococcus




HKIVTGLVSVGSMEDQAETNSYPADDVPDHGVK
phage F4-1 tail



KGATLLQGEMVFIQTDQALKEDMLGQQRTENGL
protein



GWSPTGNWKTKCVQYLIKGRKRDKVTGEFVDGY
channel/nanopore/



RVVVYPHLTPTAEATKESETDSVDGVDPIQWTLA
WT



VQATESDIYSNGGKKVPAIEYEIWGEQAKDFAKK




MESGLFIMQPDTVLAGAITLVAPVIPNVTTATKGN




NDGTIVVPDTLKDSKGGTVKVTSVIKDAHGKVAT




NGQLAPGVYIVTFSADGYEDVTAGVSVTDHS






SEQ ID NO: 26
MAMPRKLKLMNVFLNGYSYQGVAKSVTLPKLTR

Escherichia phage




KLENYRGAGMNGSAPVDLGLDDDALSMEWSLG
P2 tail protein



GFPDSVIWELYAATGVDAVPIRFAGSYQRDDTGE
channel/nanopore/



TVAVEVVMRGRQKEIDTGEGKQGEDTESKISVVC
WT



TYFRLTMDGKELVEIDTINMIEKVNGVDRLEQHR




RNIGL






SEQ ID NO: 27
MATVNEFRGAMSRGGGVQRQHRWRVTISFPSFA

Serratia phage




ASADQTRDVCLLAVTTNTPTGQLGEILVPWGGRE
KSP90 tail protein



LPFPGDRRFEALPITFINVVNNGPYNSMEVWQQYI
channel/nanopore/



NGSESNRASANPDEYFRDVVLELLDANDNVTKT
WT



WTLQGAWPQNLGQLELDMSAMDSYTQFTCDLR




YFQAVSDRSR






SEQ ID NO: 28
MRSYEMNIETAEELSAVNDILASIGEPPVSTLEGD

Enterobacteria




ANADVANARRVLNKINRQIQSRGWTFNIEEGVTL
phage T7M tail



LPDAFSGMIPFSSDYLSVMATSGQTQYVNRGGYL
protein



YDRSAKTDRFPSGVQVNLIRLREFDEMPECFRNYI
channel/nanopore/



VTKASRQFNNRFFGAPEVDGVLQEEEQEAWSACF
WT



EYELDYGNYNMLDGDAFTSGLLNR






SEQ ID NO: 29
MAIDVLDVISLSLFKQQIEFEEDDRDELITLYAQA
Bacteriophage



AFDYCMRWCDEPAWKVAADIPAAVKGAVLLVF
HK97 tail protein



ADMFEHRTAQSEVQLYENAAAERMMFIHRNWR
channel/nanopore/



GKAESEEGS
WT





SEQ ID NO: 30
MARKRSNTYRSINEIQRQKRNRWFIHYLNYLQSL
Bacteriophage



AYQLFEWENLPPTINPSFLEKSIHQFGYVGFYKDP
Phi29 portal



VISYIACNGALSGQRDVYNQATVFRAASPVYQKE
protein



FKLYNYRDMKEEDMGVVIYNNDMAFPTTPTLEL
channel/nanopore/



FAAELAELKEIISVNQNAQKTPVLIRANDNNQLSL
WT



KQVYNQYEGNAPVIFAHEALDSDSIEVFKTDAPY




VVDKLNAQKNAVWNEMMTFLGIKNANLEKKER




MVTDEVSSNDEQIESSGTVFLKSREEACEKINELY




GLNVKVKFRYDIVEQMRRELQQIENVSRGTSDGE




TNE






SEQ ID NO: 31
TYRSINEIQRQKRNRWFIHYLNYLQSLAYQLFEW
Bacteriophage



ENLPPTINPSFLEKSIHQFGYVGFYKDPVISYIACN
Phi29 portal



GALSGQRDVYNQATVFRAASPVYQKEFKLYNYR
protein



DMKEEDMGVVIYNNDMAFPTTPTLELFAAELAEL
channel/nanopore/



KEIISVNQNAQKTPVLIRANDNNQLSLKQVYNQY
phi29 gp10 Δ1-7



EGNAPVIFAHEALDSDSIEVFKTDAPYVVDKLNA




QKNAVWNEMMTFLGIKNANLEKKERMVTDEVSS




NDEQIESSGTVFLKSREEACEKINELYGLNVKVKF




RYDIVEQMRRELQQIENVSRGTSDGETNEAHIVM




VDAYKPTK






SEQ ID NO: 32
TYLSINVIQLQKRNRWFIHYLNYLQSLAYQLFEW
Bacteriophage



ENLPPTINPSFLEKSIHQFGYVGFYKDPVISYIACN
Phi29 portal



GALSGQRDVYNQATVFRAASPVYQKEFKLYNYR
protein



DMKEEDMGVVIYNNDMAFPTTPTLELFAAELAEL
channel/nanopore/



KEIISVNQNAQKTPVLIRANDNNQLSLKQVYNQY
phi29 gp10 Δ1-7



EGNAPVIFAHEALDSDSIEVFKTDAPYVVDKLNA
R10L, E14V,



QKNAVWNEMMTFLGIKNANLEKKERMVTDEVSS
R17L



NDEQIESSGTVFLKSREEACEKINELYGLNVKVKF




RYDIVEQMRRELQQIENVSRGTSDGETNEAHIVM




VDAYKPTK






SEQ ID NO: 33
ILTYLSINVIQLQKRNRWFIHYLNYLQSLAYQLFE
Bacteriophage



WENLPPTINPSFLEKSIHQFGYVGFYKDPVISYIAC
Phi29 portal



NGALSGQRDVYNQATVFRAASPVYQKEFKLYNY
protein



RDMKEEDMGVVIYNNDMAFPTTPTLELFAAELAE
channel/nanopore/



LKEIISVNQNAQKTPVLIRANDNNQLSLKQVYNQ
phi29 gp10 Δ1-7



YEGNAPVIFAHEALDSDSIEVFKTDAPYVVDKLN
N-ter



AQKNAVWNEMMTFLGIKNANLEKKERMVTDEV
IL, R10L, E14V,



SSNDEQIESSGTVFLKSREEACEKINELYGLNVKV
R17L



KFRYDIVEQMRRELQQIENVSRGTSDGETNEAHIV




MVDAYKPTK






SEQ ID NO: 34
ILVAILTYLSINVIQLQKRNRWFIHYLNYLQSLAY
Bacteriophage



QLFEWENLPPTINPSFLEKSIHQFGYVGFYKDPVIS
Phi29 portal



YIACNGALSGQRDVYNQATVFRAASPVYQKEFKL
protein



YNYRDMKEEDMGVVIYNNDMAFPTTPTLELFAA
channel/nanopore/



ELAELKEIISVNQNAQKTPVLIRANDNNQLSLKQV
phi29 gp10 Δ1-7



YNQYEGNAPVIFAHEALDSDSIEVFKTDAPYVVD
N-ter



KLNAQKNAVWNEMMTFLGIKNANLEKKERMVT
ILVAIL, R10L,



DEVSSNDEQIESSGTVFLKSREEACEKINELYGLN
E14V, R17L



VKVKFRYDIVEQMRRELQQIENVSRGTSDGETNE




AHIVMVDAYKPTK






SEQ ID NO: 35
MARKRSNTYRSINEIQRQKRNRWFIHYLNYLQSL
Bacteriophage



AYQLFEWENLPPTINPSFLEKSIHQFGYVGFYKDP
Phi29 portal



VISYIACNGCLSGQRDVYNQATVFRAASPVYQKE
protein



FKLYNYRDMKEEDMGVVIYNNDMAFPTTPTLCL
channel/nanopore/



FAAELAELKEIISVNQNAQKTPVLIRANDNNCLSL
phi29 gp10



KQVYNQYEGNAPVIFAHEALDSDSIEVFKTDAPY
A79C, E135C,



VVDKLNAQKNAVWNEMMTFLGIKNANLEKKER
Q168C



MVTDEVSSNDEQIESSGTVFLKSREEACEKINELY




GLNVKVKFRYDIVEQMRRELQQIENVSRGTSDGE




TNE
















TABLE 2





Sequences of Linkers and Probes

















SEQ ID NO: 36
AHIVMVDAYKPTK
SpyTag/




Conjugation Linker





SEQ ID NO: 37
DYDIPTTENLYFQGAMVDTLSGLSSEQGQSGDMT
SpyCatcher/



IEEDSATHIKFSKRDEDGKELAGATMELRDSSGKT
Conjugation Linker



ISTWISDGQVKDFYLYPGKYTFVETAAPDGYEVA




TAITFTVNEQGQVTVNGKATKGDAHI






SEQ ID NO: 38
MALTQPSSVSANPGETVKITCSGSSGSYGWYQQK
Anti-PSA single



SPDSAPVTVIYQSNQRPSDIPSRFSGSKSGSTGTLTI
chain



TGVQAEDEAVYYCGGWGSSVGMFGAGTTLTVL
antibody/probe



GQSSRSSGGGGSSGGGGSAVTLDESGGGLQTPGG




ALSLVCKASGFTFSSYAMGWVRQAPGKGLEWVA




GISDDGDSYISYATAVKGRATISRDNGQSTVRLQL




NNLRAEDTATYYCARSHCSGCRNAALIDAWGHG




TEVIVSSMSYY






SEQ ID NO: 39
VHSPNKK
Anti-Endothelial




vascular adhesion




molecule/probe





SEQ ID NO: 40
EVHLQQSLAELVRSGASVKLSCTASGFNIKHYYM
CEA probe



HWVKQRPEQGLEWIGWINPENVDTEYAPKFQGK




ATMTADTSSNTAYLQLSSLTSEDTAVYYCNHYRY




AVGGALDYWGQGTTVTVSSGGGGSGGGGSGGG




GSDIELTQSPAIMSASPGEKVTMTCSASSSVSYIH




WYQQKSGTSPKRWVYDTSKLASGVPARFSGSGS




GTSYSLTISTMEAEVAATYYCQQWNNNPYTFGG




GTKLEIK






SEQ ID NO: 41
VKLQESGPGLVAPSQSLSMSCTVSGFSLSSYGVH
CEA probe



WVRQPPGKGLEWLGVIWAGGTTNYNSALMSRLS




ISKDNSKSQVLLKMNSLQTDDTAMYYCATTTMIT




LMDYWGQGTTVTVSS






SEQ ID NO: 42
DVQLNQAKSSLSASLGDRVTISCRASQDISNYLN
CEA probe



WYQQKPDGTVKLLIYYTSRLHSGVPPRFSGSGSG




TDYSLTISNLEQEDIATYFCQQGNTVPWTFGGGTK




LEI






SEQ ID NO: 43
MARSGL
ErbB2 probe





SEQ ID NO: 44
MARAKE
ErbB2 probe





SEQ ID NO: 45
MSRTMS
ErbB2 probe





SEQ ID NO: 46
KCCYSL
ErbB2 probe





SEQ ID NO: 47
WIFPWIQL
GRP78 probe





SEQ ID NO: 48
WDLAWMFRLPVG
GRP78 probe





SEQ ID NO: 49
CTVALPGGYVRVC
GRP78 probe





SEQ ID NO: 50
IPLVVPLGGSCK
Hepsin probe





SEQ ID NO: 51
KTLLPTP
Plectin-1 probe





SEQ ID NO: 52
CVAYCIEHHCWTC
PSA probe





SEQ ID NO: 53
CVFAHNYDYLVC
PSA probe





SEQ ID NO: 54
CVFTSNYAFC
PSA probe





SEQ ID NO: 55
SGRSA
uPA probe





SEQ ID NO: 56
WGFP
uPA probe





SEQ ID NO: 57
XFXXYLW
uPA probe





SEQ ID NO: 58
AEPMPHSLNFSQYLWYT
uPA probe





SEQ ID NO: 59
FSRYLWS
uPA probe





SEQ ID NO: 60
IELLQAR
E-selectin probe





SEQ ID NO: 61
DITWDQLWDLMK
E-selectin probe





SEQ ID NO: 62
AYTKCSRQWRTCMTTH
Galectin-3 probe





SEQ ID NO: 63
PQNSKIPGPTFLDPH
Galectin-3 probe





SEQ ID NO: 64
SMEPALPDWWWKMFK
Galectin-3 probe





SEQ ID NO: 65
ANTPCGPYTHDCP
Galectin-3 probe





SEQ ID NO: 66
FQHPSFI
Alpha-fetoprotein




(AFP) probe





SEQ ID NO: 67
CVPELGHEC
HSP90 probe





SEQ ID NO: 68
AACCCCUAUCACGAUUAGCAUUAA
MiR-155 probe





SEQ ID NO: 69
AGUCAACAUCAGUCUGAUAAGCUA
MiR-21 probe





SEQ ID NO: 70
ACAGUUCUUCAACUGGCAGCUU
MiR-22 probe





SEQ ID NO: 71
AACAACAAAAUCACUAGUCUUCCA
MiR-7 probe





SEQ ID NO: 72
ACAGGCCGGGACAAGUGCAAUA
MiR-92a probe





SEQ ID NO: 73
CAAACACCAUUGUCACACUCCA
MiR-122 probe





SEQ ID NO: 74
GGCUGUCAAUUCAUAGGUCAG
MiR-192 probe





SEQ ID NO: 75
UGGGGUAUUUGACAAACUGACA
MiR-223 probe





SEQ ID NO: 76
AGCCUAUCCUGGAUUACUUGAA
MiR-26a probe





SEQ ID NO: 77
GCGGAACUUAGCCACUGUGAA
MiR-27a probe





SEQ ID NO: 78
ACAAGGAUGAAUCUUUGUUACUG
MiR-802 probe





SEQ ID NO: 79
ATGGCGTTAACCCAACCTAGCAGCGTTAGCGCG
Anti-PSA single



AATCCTGGCGAAACCGTGAAAATTACCTGCAGC
chain antibody-



GGCAGCAGCGGTAGCTATGGCTGGTATCAGCA
SPYCATCHER



GAAAAGCCCGGATTCAGCGCCTGTGACCGTGAT




TTATCAGAGCAACCAGCGCCCGAGCGATATTCC




TAGCCGCTTTAGCGGCAGCAAAAGCGGTAGCA




CCGGCACCTTAACCATTACCGGTGTGCAGGCGG




AAGATGAAGCGGTGTATTATTGCGGCGGTTGGG




GTTCAAGCGTTGGCATGTTTGGTGCGGGTACCA




CCTTAACCGTGTTAGGTCAGAGCAGCCGTTCAA




GCGGTGGCGGTGGTAGCAGCGGTGGTGGTGGT




AGCGCAGTTACCCTGGATGAAAGCGGTGGCGG




CTTACAAACTCCTGGTGGTGCGCTGAGCTTAGT




TTGTAAAGCGAGCGGCTTTACCTTTAGCAGCTA




TGCGATGGGTTGGGTGCGTCAGGCGCCTGGTAA




AGGCTTAGAATGGGTGGCGGGCATTAGCGATG




ATGGCGATAGCTATATTAGCTATGCGACCGCGG




TTAAAGGTCGTGCGACCATTAGCCGTGATAACG




GCCAGAGCACCGTTCGTCTGCAGCTGAATAACC




TGCGCGCGGAAGATACCGCGACCTATTATTGCG




CGCGCAGCCATTGTAGCGGTTGTCGTAACGCGG




CGCTGATTGATGCATGGGGCCATGGCACCGAAG




TGATTGTGAGCAGCATGTCGTACTACCATCACC




ATCACCATCACGATTACGACATCCCAACGACCG




AAAACCTGTATTTTCAGGGCGCCATGGTTGATA




CCTTATCAGGTTTATCAAGTGAGCAAGGTCAGT




CCGGTGATATGACAATTGAAGAAGATAGTGCTA




CCCATATTAAATTCTCAAAACGTGATGAGGACG




GCAAAGAGTTAGCTGGTGCAACTATGGAGTTGC




GTGATTCATCTGGTAAAACTATTAGTACATGGA




TTTCAGATGGACAAGTGAAAGATTTCTACCTGT




ATCCAGGAAAATATACATTTGTCGAAACCGCAG




CACCAGACGGTTATGAGGTAGCAACTGCTATTA




CCTTTACAGTTAATGAGCAAGGTCAGGTTACTG




TAAATGGCAAAGCAACTAAAGGTGACGCTCAT




ATTTAA
















TABLE 3





Nucleic Acid Sequences

















SEQ ID NO: 80
ATGAATCATAAACATCATCATCATCATCACAGC
P22 portal protein



AGCGGCGAAAACCTGTATTTTCAGGGCCATATG
channel Barrel



GGATCCGCCGACAATGAAAACAGGCTGGAGAG
deletion-anti-PSA



CATCCTGTCGCGCTTTGATGCGGACTGGACAGC
single chain



CAGTGATGAAGCCAGACGAGAGGCAAAGAATG
antibody (channel



ATCTCTTCTTCTCCCGCGTATCTCAGTGGGATGA
with probe)



CTGGCTATCACAATACACAACCCTGCAGTATCG




CGGGCAGTTCGATGTTGTACGTCCAGTGGTGCG




CAAGCTCGTTTCTGAGATGCGTCAGAACCCTAT




TGATGTTCTGTATCGTCCAAAGGACGGAGCAAG




ACCTGATGCCGCTGATGTGCTTATGGGTATGTA




TCGCACAGACATGCGGCATAACACGGCTAAAA




TCGCGGTTAACATCGCTGTTCGTGAGCAGATTG




AAGCTGGAGTTGGTGCGTGGCGTCTGGTCACTG




ACTACGAAGACCAAAGTCCGACGAGCAACAAT




CAGGTTATCCGTCGAGAGCCTATCCATAGTGCC




TGCTCCCATGTTATCTGGGACAGCAACAGCAAA




CTGATGGATAAGTCTGACGCCCGTCACTGCACA




GTTATCCACTCAATGAGCCAGAATGGTTGGGAG




GATTTCGCAGAAAAATACGACCTCGATGCGGAT




GATATTCCATCATTCCAGAACCCCAACGATTGG




GTATTTCCATGGCTGACGCAGGACACAATTCAG




ATCGCTGAGTTTTACGAAGTGGTCGAGAAGAAA




GAGACGGCGTTTATCTACCAAGACCCGGTTACG




GGTGAGCCGGTAAGCTACTTTAAGCGCGATATT




AAAGACGTCATCGATGACCTGGCTGATAGTGGA




TTTATCAAAATTGCAGAGCGCCAGATTAAGCGT




CGCCGGGTATACAAATCGATTATCACCTGCACT




GCTGTACTCAAAGACAAGCAGCTCATTGCTGGC




GAGCATATCCCCATTGTTCCGGTGTTCGGAGAG




TGGGGCTTCGTTGAAGATAAAGAAGTGTATGAG




GGTGTCGTCCGCCTGACAAAAGACGGCCAGCGT




CTGCGCAACATGATTATGTCGTTCAACGCCGAC




ATCGTGGCCCGCACTCCGAAGAAGAAGCCGTTC




TTCTGGCCTGAGCAGATTGCAGGCTTTGAGCAT




ATGTACGACGGTAACGACGATTACCCATACTAC




CTGCTCAATCGCACTGACGAAAATAGTGGAGAC




CTTCCGACTCAGCCGCTGGCATATTATGAAAAC




CCGGAAGTGCCGCAAGCCAACGCCTACATGCTG




GAAGCAGCAACCAGCGCAGTAAAAGAGGTTGC




CACTCTCGGAGTTGATACAGAAGCGGTAAATGG




CGGACAGGTTGCGTTTGATACCGTCAATCAACT




GAATATGAGGGCTGACCTTGAGACATACGTGTT




TCAGGATAATCTGGCTACCGCCATGCGCCGTGA




CGGAGAGATTTACCAGTCGATAGTTAATGACAT




CTACGATGTTCCTCGCAACGTTACGATTACCCTT




GAGGATGGCAGCGAGAAAGATGTTCAGCTAAT




GGCTGAGGTTGTTGACCTTGCTACTGGAGAAAA




GCAGGTACTAAACGATATCAGGGGGCGCTATG




AGTGCTACACGGATGTTGGACCATCATTCCAGT




CCATGAAGCAGCAAAACCGCGCAGAAATTCTT




GAGTTGCTCGGCAAGACGCCACAGGGAACGCC




AGAATATCAACTGCTGTTGCTTCAGTACTTCAC




CCTGCTTGATGGTAAAGGTGTTGAGATGATGCG




TGACTATGCCAACAAGCAGCTTATTCAGATGGG




CGTTAAGAAGCCAGAAACGCCCGAAGAGCAGC




AATGGTTAGTAGAGGCGCAACAAGCCAAACAA




GGTGGAGGAGGAGGAGGAAAGCTTGCGTTAAC




CCAACCTAGCAGCGTTAGCGCGAATCCTGGCGA




AACCGTGAAAATTACCTGCAGCGGCAGCAGCG




GTAGCTATGGCTGGTATCAGCAGAAAAGCCCG




GATTCAGCGCCTGTGACCGTGATTTATCAGAGC




AACCAGCGCCCGAGCGATATTCCTAGCCGCTTT




AGCGGCAGCAAAAGCGGTAGCACCGGCACCTT




AACCATTACCGGTGTGCAGGCGGAAGATGAAG




CGGTGTATTATTGCGGCGGTTGGGGTTCAAGCG




TTGGCATGTTTGGTGCGGGTACCACCTTAACCG




TGTTAGGTCAGAGCAGCCGTTCAAGCGGTGGCG




GTGGTAGCAGCGGTGGTGGTGGTAGCGCAGTTA




CCCTGGATGAAAGCGGTGGCGGCTTACAAACTC




CTGGTGGTGCGCTGAGCTTAGTTTGTAAAGCGA




GCGGCTTTACCTTTAGCAGCTATGCGATGGGTT




GGGTGCGTCAGGCGCCTGGTAAAGGCTTAGAAT




GGGTGGCGGGCATTAGCGATGATGGCGATAGC




TATATTAGCTATGCGACCGCGGTTAAAGGTCGT




GCGACCATTAGCCGTGATAACGGCCAGAGCAC




CGTTCGTCTGCAGCTGAATAACCTGCGCGCGGA




AGATACCGCGACCTATTATTGCGCGCGCAGCCA




TTGTAGCGGTTGTCGTAACGCGGCGCTGATTGA




TGCATGGGGCCATGGCACCGAAGTGATTGTGAG




CAGCTAA






SEQ ID NO: 81
ATGAATCATAAACATCATCATCATCATCACAGC
P22 portal protein



AGCGGCGAAAACCTGTATTTTCAGGGCCATATG
channel Barrel



GGATCCGCCGACAATGAAAACAGGCTGGAGAG
deletion-anti-



CATCCTGTCGCGCTTTGATGCGGACTGGACAGC
VCAM-1



CAGTGATGAAGCCAGACGAGAGGCAAAGAATG




ATCTCTTCTTCTCCCGCGTATCTCAGTGGGATGA




CTGGCTATCACAATACACAACCCTGCAGTATCG




CGGGCAGTTCGATGTTGTACGTCCAGTGGTGCG




CAAGCTCGTTTCTGAGATGCGTCAGAACCCTAT




TGATGTTCTGTATCGTCCAAAGGACGGAGCAAG




ACCTGATGCCGCTGATGTGCTTATGGGTATGTA




TCGCACAGACATGCGGCATAACACGGCTAAAA




TCGCGGTTAACATCGCTGTTCGTGAGCAGATTG




AAGCTGGAGTTGGTGCGTGGCGTCTGGTCACTG




ACTACGAAGACCAAAGTCCGACGAGCAACAAT




CAGGTTATCCGTCGAGAGCCTATCCATAGTGCC




TGCTCCCATGTTATCTGGGACAGCAACAGCAAA




CTGATGGATAAGTCTGACGCCCGTCACTGCACA




GTTATCCACTCAATGAGCCAGAATGGTTGGGAG




GATTTCGCAGAAAAATACGACCTCGATGCGGAT




GATATTCCATCATTCCAGAACCCCAACGATTGG




GTATTTCCATGGCTGACGCAGGACACAATTCAG




ATCGCTGAGTTTTACGAAGTGGTCGAGAAGAAA




GAGACGGCGTTTATCTACCAAGACCCGGTTACG




GGTGAGCCGGTAAGCTACTTTAAGCGCGATATT




AAAGACGTCATCGATGACCTGGCTGATAGTGGA




TTTATCAAAATTGCAGAGCGCCAGATTAAGCGT




CGCCGGGTATACAAATCGATTATCACCTGCACT




GCTGTACTCAAAGACAAGCAGCTCATTGCTGGC




GAGCATATCCCCATTGTTCCGGTGTTCGGAGAG




TGGGGCTTCGTTGAAGATAAAGAAGTGTATGAG




GGTGTCGTCCGCCTGACAAAAGACGGCCAGCGT




CTGCGCAACATGATTATGTCGTTCAACGCCGAC




ATCGTGGCCCGCACTCCGAAGAAGAAGCCGTTC




TTCTGGCCTGAGCAGATTGCAGGCTTTGAGCAT




ATGTACGACGGTAACGACGATTACCCATACTAC




CTGCTCAATCGCACTGACGAAAATAGTGGAGAC




CTTCCGACTCAGCCGCTGGCATATTATGAAAAC




CCGGAAGTGCCGCAAGCCAACGCCTACATGCTG




GAAGCAGCAACCAGCGCAGTAAAAGAGGTTGC




CACTCTCGGAGTTGATACAGAAGCGGTAAATGG




CGGACAGGTTGCGTTTGATACCGTCAATCAACT




GAATATGAGGGCTGACCTTGAGACATACGTGTT




TCAGGATAATCTGGCTACCGCCATGCGCCGTGA




CGGAGAGATTTACCAGTCGATAGTTAATGACAT




CTACGATGTTCCTCGCAACGTTACGATTACCCTT




GAGGATGGCAGCGAGAAAGATGTTCAGCTAAT




GGCTGAGGTTGTTGACCTTGCTACTGGAGAAAA




GCAGGTACTAAACGATATCAGGGGGCGCTATG




AGTGCTACACGGATGTTGGACCATCATTCCAGT




CCATGAAGCAGCAAAACCGCGCAGAAATTCTT




GAGTTGCTCGGCAAGACGCCACAGGGAACGCC




AGAATATCAACTGCTGTTGCTTCAGTACTTCAC




CCTGCTTGATGGTAAAGGTGTTGAGATGATGCG




TGACTATGCCAACAAGCAGCTTATTCAGATGGG




CGTTAAGAAGCCAGAAACGCCCGAAGAGCAGC




AATGGTTAGTAGAGGCGCAACAAGCCAAACAA




GGTGGTGGCGTACATTCGCCTAACAAGAAGTAA






SEQ ID NO: 82
ATGAATCATAAACATCATCATCATCATCACAGC
P22 portal protein



AGCGGCGAAAACCTGTATTTTCAGGGCCATATG
channel Full



GGATCCGCCGACAATGAAAACAGGCTGGAGAG
length-anti-



CATCCTGTCGCGCTTTGATGCGGACTGGACAGC
VCAM-1



CAGTGATGAAGCCAGACGAGAGGCAAAGAATG




ATCTCTTCTTCTCCCGCGTATCTCAGTGGGATGA




CTGGCTATCACAATACACAACCCTGCAGTATCG




CGGGCAGTTCGATGTTGTACGTCCAGTGGTGCG




CAAGCTCGTTTCTGAGATGCGTCAGAACCCTAT




TGATGTTCTGTATCGTCCAAAGGACGGAGCAAG




ACCTGATGCCGCTGATGTGCTTATGGGTATGTA




TCGCACAGACATGCGGCATAACACGGCTAAAA




TCGCGGTTAACATCGCTGTTCGTGAGCAGATTG




AAGCTGGAGTTGGTGCGTGGCGTCTGGTCACTG




ACTACGAAGACCAAAGTCCGACGAGCAACAAT




CAGGTTATCCGTCGAGAGCCTATCCATAGTGCC




TGCTCCCATGTTATCTGGGACAGCAACAGCAAA




CTGATGGATAAGTCTGACGCCCGTCACTGCACA




GTTATCCACTCAATGAGCCAGAATGGTTGGGAG




GATTTCGCAGAAAAATACGACCTCGATGCGGAT




GATATTCCATCATTCCAGAACCCCAACGATTGG




GTATTTCCATGGCTGACGCAGGACACAATTCAG




ATCGCTGAGTTTTACGAAGTGGTCGAGAAGAAA




GAGACGGCGTTTATCTACCAAGACCCGGTTACG




GGTGAGCCGGTAAGCTACTTTAAGCGCGATATT




AAAGACGTCATCGATGACCTGGCTGATAGTGGA




TTTATCAAAATTGCAGAGCGCCAGATTAAGCGT




CGCCGGGTATACAAATCGATTATCACCTGCACT




GCTGTACTCAAAGACAAGCAGCTCATTGCTGGC




GAGCATATCCCCATTGTTCCGGTGTTCGGAGAG




TGGGGCTTCGTTGAAGATAAAGAAGTGTATGAG




GGTGTCGTCCGCCTGACAAAAGACGGCCAGCGT




CTGCGCAACATGATTATGTCGTTCAACGCCGAC




ATCGTGGCCCGCACTCCGAAGAAGAAGCCGTTC




TTCTGGCCTGAGCAGATTGCAGGCTTTGAGCAT




ATGTACGACGGTAACGACGATTACCCATACTAC




CTGCTCAATCGCACTGACGAAAATAGTGGAGAC




CTTCCGACTCAGCCGCTGGCATATTATGAAAAC




CCGGAAGTGCCGCAAGCCAACGCCTACATGCTG




GAAGCAGCAACCAGCGCAGTAAAAGAGGTTGC




CACTCTCGGAGTTGATACAGAAGCGGTAAATGG




CGGACAGGTTGCGTTTGATACCGTCAATCAACT




GAATATGAGGGCTGACCTTGAGACATACGTGTT




TCAGGATAATCTGGCTACCGCCATGCGCCGTGA




CGGAGAGATTTACCAGTCGATAGTTAATGACAT




CTACGATGTTCCTCGCAACGTTACGATTACCCTT




GAGGATGGCAGCGAGAAAGATGTTCAGCTAAT




GGCTGAGGTTGTTGACCTTGCTACTGGAGAAAA




GCAGGTACTAAACGATATCAGGGGGCGCTATG




AGTGCTACACGGATGTTGGACCATCATTCCAGT




CCATGAAGCAGCAAAACCGCGCAGAAATTCTT




GAGTTGCTCGGCAAGACGCCACAGGGAACGCC




AGAATATCAACTGCTGTTGCTTCAGTACTTCAC




CCTGCTTGATGGTAAAGGTGTTGAGATGATGCG




TGACTATGCCAACAAGCAGCTTATTCAGATGGG




CGTTAAGAAGCCAGAAACGCCCGAAGAGCAGC




AATGGTTAGTAGAGGCGCAACAAGCCAAACAA




GGTCAACAAGACCCGGCAATGGTTCAGGCTCA




GGGCGTACTCCTGCAGGGGCAGGCTGAACTGG




CTAAAGCTCAGAACCAGACGCTGTCCCTGCAAA




TCGATGCAGCTAAAGTCGAAGCGCAGAACCAG




CTTAACGCTGCCAGAATCGCAGAAATCTTCAAC




AACATGGACCTCAGTAAACAATCTGAGTTTAGA




GAGTTCCTTAAAACCGTTGCTTCATTCCAGCAG




GACCGCAGCGAAGACGCTCGCGCAAATGCTGA




GTTACTCCTTAAAGGCGATGAACAGACGCACAA




GCAGCGAATGGACATTGCCAACATCCTGCAATC




GCAGAGACAAAATCAACCTTCCGGCAGTGTAG




CCGAGACACCTCAAGGTGGCGTACATTCGCCTA




ACAAGAAGTAA






SEQ ID NO: 83
ATGAATCATAAACATCATCATCATCATCACAGC
phi29 portal



AGCGGCGAAAACCTGTATTTTCAGGGCCATATG
protein channel



ATTCTGACATACCTGTCTATCAATGTGATACAG
Δ1-7 N-ter IL,



CTTCAAAAACGGAATAGATGGTTTATTCACTAT
R10L, E14V,



CTGAACTACCTTCAATCTCTAGCCTATCAGCTAT
R17L, C-Spytag



TTGAGTGGGAGAACCTACCGCCTACGATAAACC




CTAGTTTCTTAGAAAAGTCTATTCATCAATTCG




GGTACGTGGGGTTCTATAAAGACCCTGTCATCA




GTTATATCGCTTGTAATGGCGCTCTATCGGGTC




AGAGAGACGTTTACAACCAAGCTACAGTTTTTA




GAGCCGCATCTCCTGTGTATCAAAAAGAATTCA




AGCTATACAACTATAGAGATATGAAGGAAGAA




GATATGGGTGTTGTTATCTACAACAATGACATG




GCTTTCCCTACCACGCCAACGCTAGAATTGTTT




GCGGCTGAATTGGCTGAATTAAAAGAAATCATA




TCGGTCAACCAAAACGCTCAAAAGACACCCGTC




TTAATTAGAGCAAATGACAATAACCAACTGAGC




TTAAAACAAGTGTATAACCAGTATGAAGGTAAT




GCCCCTGTTATCTTCGCTCACGAAGCTCTCGAC




AGTGACTCTATAGAAGTGTTTAAGACTGATGCT




CCCTATGTGGTGGACAAGTTAAACGCTCAGAAA




AATGCAGTATGGAATGAGATGATGACTTTCCTT




GGCATTAAGAACGCTAACCTAGAGAAGAAAGA




GCGCATGGTTACGGATGAAGTTTCCAGTAACGA




TGAACAGATCGAGTCTAGCGGCACTGTATTTTT




GAAGTCGAGGGAAGAAGCATGTGAGAAGATTA




ATGAGCTATATGGTCTCAATGTTAAAGTTAAAT




TCAGATATGACATCGTGGAACAAATGAGACGT




GAGCTACAGCAAATAGAAAATGTTTCACGTGG




AACATCGGACGGTGAAACAAATGAGGCGCATA




TTGTGATGGTGGATGCGTATAAACCGACCAAAT




AACTCGAG






SEQ ID NO: 84
ATGATTCTGACATACCTGTCTATCAATGTGATA
phi29 portal



CAGCTTCAAAAACGGAATAGATGGTTTATTCAC
protein channel



TATCTGAACTACCTTCAATCTCTAGCCTATCAGC
Δ1-7 N-ter IL,



TATTTGAGTGGGAGAACCTACCGCCTACGATAA
R10L, E14V,



ACCCTAGTTTCTTAGAAAAGTCTATTCATCAATT
R17L, C-his



CGGGTACGTGGGGTTCTATAAAGACCCTGTCAT




CAGTTATATCGCTTGTAATGGCGCTCTATCGGG




TCAGAGAGACGTTTACAACCAAGCTACAGTTTT




TAGAGCCGCATCTCCTGTGTATCAAAAAGAATT




CAAGCTATACAACTATAGAGATATGAAGGAAG




AAGATATGGGTGTTGTTATCTACAACAATGACA




TGGCTTTCCCTACCACGCCAACGCTAGAATTGT




TTGCGGCTGAATTGGCTGAATTAAAAGAAATCA




TATCGGTCAACCAAAACGCTCAAAAGACACCC




GTCTTAATTAGAGCAAATGACAATAACCAACTG




AGCTTAAAACAAGTGTATAACCAGTATGAAGGT




AATGCCCCTGTTATCTTCGCTCACGAAGCTCTC




GACAGTGACTCTATAGAAGTGTTTAAGACTGAT




GCTCCCTATGTGGTGGACAAGTTAAACGCTCAG




AAAAATGCAGTATGGAATGAGATGATGACTTTC




CTTGGCATTAAGAACGCTAACCTAGAGAAGAA




AGAGCGCATGGTTACGGATGAAGTTTCCAGTAA




CGATGAACAGATCGAGTCTAGCGGCACTGTATT




TTTGAAGTCGAGGGAAGAAGCATGTGAGAAGA




TTAATGAGCTATATGGTCTCAATGTTAAAGTTA




AATTCAGATATGACATCGTGGAACAAATGAGA




CGTGAGCTACAGCAAATAGAAAATGTTTCACGT




GGAACATCGGACGGTGAAACAAATGAGCTCGA




GCACCACCACCACCACCACTGA






SEQ ID NO: 85
ATGGCACGTAAACGCAGTAACACATACCGATCT
phi29 portal



ATCAATGAGATACAGCGTCAAAAACGGAATAG
protein channel-



ATGGTTTATTCACTATCTGAACTACCTTCAATCT
Q168C



CTAGCCTATCAGCTATTTGAGTGGGAGAACCTA




CCGCCTACGATAAACCCTAGTTTCTTAGAAAAG




TCTATTCATCAATTCGGGTACGTGGGGTTCTAT




AAAGACCCTGTCATCAGTTATATCGCTTGTAAT




GGCGCTCTATCGGGTCAGAGAGACGTTTACAAC




CAAGCTACAGTTTTTAGAGCCGCATCTCCTGTG




TATCAAAAAGAATTCAAGCTATACAACTATAGA




GATATGAAGGAAGAAGATATGGGTGTTGTTATC




TACAACAATGACATGGCTTTCCCTACCACGCCA




ACGCTAGAATTGTTTGCGGCTGAATTGGCTGAA




TTAAAAGAAATCATATCGGTCAACCAAAACGCT




CAAAAGACACCCGTCTTAATTAGAGCCAATGAC




AATAACTGCCTGAGCTTAAAACAAGTGTATAAC




CAGTATGAAGGTAATGCCCCTGTTATCTTCGCT




CACGAAGCTCTCGACAGTGACTCTATAGAAGTG




TTTAAGACTGATGCTCCCTATGTGGTGGACAAG




TTAAACGCTCAGAAAAATGCAGTATGGAATGA




GATGATGACTTTCCTTGGCATTAAGAACGCTAA




CCTAGAGAAGAAAGAGCGCATGGTTACGGATG




AAGTTTCCAGTAACGATGAACAGATCGAGTCTA




GCGGCACTGTATTTTTGAAGTCGAGGGAAGAAG




CATGTGAGAAGATTAATGAGCTATATGGTCTCA




ATGTTAAAGTTAAATTCAGATATGACATCGTGG




AACAAATGAGACGTGAGCTACAGCAAATAGAA




AATGTTTCACGTGGAACATCGGACGGTGAAACA




AATGAG






SEQ ID NO: 86
ATGGCACGTAAACGCAGTAACACATACCGATCT
phi29 portal



ATCAATGAGATACAGCGTCAAAAACGGAATAG
protein channel-



ATGGTTTATTCACTATCTGAACTACCTTCAATCT
E135C



CTAGCCTATCAGCTATTTGAGTGGGAGAACCTA




CCGCCTACGATAAACCCTAGTTTCTTAGAAAAG




TCTATTCATCAATTCGGGTACGTGGGGTTCTAT




AAAGACCCTGTCATCAGTTATATCGCTTGTAAT




GGCGCTCTATCGGGTCAGAGAGACGTTTACAAC




CAAGCTACAGTTTTTAGAGCCGCATCTCCTGTG




TATCAAAAAGAATTCAAGCTATACAACTATAGA




GATATGAAGGAAGAAGATATGGGTGTTGTTATC




TACAACAATGACATGGCTTTCCCTACCACGCCA




ACGCTATGCTTGTTTGCGGCTGAATTGGCTGAA




TTAAAAGAAATCATATCGGTCAACCAAAACGCT




CAAAAGACACCCGTCTTAATTAGAGCCAATGAC




AATAACCAACTGAGCTTAAAACAAGTGTATAAC




CAGTATGAAGGTAATGCCCCTGTTATCTTCGCT




CACGAAGCTCTCGACAGTGACTCTATAGAAGTG




TTTAAGACTGATGCTCCCTATGTGGTGGACAAG




TTAAACGCTCAGAAAAATGCAGTATGGAATGA




GATGATGACTTTCCTTGGCATTAAGAACGCTAA




CCTAGAGAAGAAAGAGCGCATGGTTACGGATG




AAGTTTCCAGTAACGATGAACAGATCGAGTCTA




GCGGCACTGTATTTTTGAAGTCGAGGGAAGAAG




CATGTGAGAAGATTAATGAGCTATATGGTCTCA




ATGTTAAAGTTAAATTCAGATATGACATCGTGG




AACAAATGAGACGTGAGCTACAGCAAATAGAA




AATGTTTCACGTGGAACATCGGACGGTGAAACA




AATGAG






SEQ ID NO: 87
ATGGCACGTAAACGCAGTAACACATACCGATCT
phi29 portal



ATCAATGAGATACAGCGTCAAAAACGGAATAG
protein channel-



ATGGTTTATTCACTATCTGAACTACCTTCAATCT
A79C



CTAGCCTATCAGCTATTTGAGTGGGAGAACCTA




CCGCCTACGATAAACCCTAGTTTCTTAGAAAAG




TCTATTCATCAATTCGGGTACGTGGGGTTCTAT




AAAGACCCTGTCATCAGTTATATCGCTTGTAAT




GGCTGTCTATCGGGTCAGAGAGACGTTTACAAC




CAAGCTACAGTTTTTAGAGCCGCATCTCCTGTG




TATCAAAAAGAATTCAAGCTATACAACTATAGA




GATATGAAGGAAGAAGATATGGGTGTTGTTATC




TACAACAATGACATGGCTTTCCCTACCACGCCA




ACGCTAGAATTGTTTGCGGCTGAATTGGCTGAA




TTAAAAGAAATCATATCGGTCAACCAAAACGCT




CAAAAGACACCCGTCTTAATTAGAGCCAATGAC




AATAACCAACTGAGCTTAAAACAAGTGTATAAC




CAGTATGAAGGTAATGCCCCTGTTATCTTCGCT




CACGAAGCTCTCGACAGTGACTCTATAGAAGTG




TTTAAGACTGATGCTCCCTATGTGGTGGACAAG




TTAAACGCTCAGAAAAATGCAGTATGGAATGA




GATGATGACTTTCCTTGGCATTAAGAACGCTAA




CCTAGAGAAGAAAGAGCGCATGGTTACGGATG




AAGTTTCCAGTAACGATGAACAGATCGAGTCTA




GCGGCACTGTATTTTTGAAGTCGAGGGAAGAAG




CATGTGAGAAGATTAATGAGCTATATGGTCTCA




ATGTTAAAGTTAAATTCAGATATGACATCGTGG




AACAAATGAGACGTGAGCTACAGCAAATAGAA




AATGTTTCACGTGGAACATCGGACGGTGAAACA




AATGAG






SEQ ID NO: 88
ATGGCATATGTACCATTATCAGGAACGAACGTC
phi29-tail protein



AGGATTTTAGCTGACGTTCCTTTCTCTAATGATT
channel Δ417-



ATAAAAACACGAGATGGTTCACATCTTCAAGTA
491-K358C



ATCAGTATAACTGGTTTAACAGCAAATCACGTG




TGTATGAAATGAGTAAAGTAACATTCATGGGGT




TTAGAGAAAATAAACCATATGTTTCGGTTAGTC




TTCCCATAGATAAGCTTTACAGTGCGTCATATA




TTATGTTTCAAAATGCAGACTACGGTAACAAGT




GGTTTTATGCATTTGTAACCGAGTTAGAATTTA




AAAATAGTGCTGTTACCTACGTTCACTTTGAAA




TTGATGTTCTCCAAACATGGATGTTCGATATTA




AATTTCAAGAATCATTCATTGTGAGGGAGCACG




TTAAATTATGGAATGACGACGGGACACCGACTA




TCAACACAATTGATGAGGGTCTCAGCTACGGAA




GTGAATACGACATAGTTTCTGTAGAAAACCATA




AACCATACGACGACATGATGTTTCTCGTGATTA




TTTCCAAAAGCATTATGCATGGGACGCCGGGAG




AAGAGGAAAGCAGGCTAAATGACATAAACGCA




AGCCTGAACGGCATGCCGCAACCTCTCTGCTAC




TATATTCACCCATTCTACAAAGATGGTAAAGTT




CCTAAAACGTATATCGGAGATAACAACGCTAAC




TTGTCTCCTATTGTCAATATGCTCACCAATATCT




TTTCACAGAAGAGCGCTGTTAACGATATTGTCA




ATATGTATGTGACTGATTATATTGGTTTGAAGC




TTGACTATAAAAATGGTGATAAAGAATTGAAGC




TCGATAAAGACATGTTTGAACAGGCGGGTATAG




CTGACGATAAACACGGTAACGTTGACACCATCT




TTGTGAAGAAAATACCTGATTATGAAGCCCTAG




AAATAGACACAGGTGATAAATGGGGTGGCTTC




ACAAAAGACCAAGAAAGCAAACTGATGATGTA




CCCTTACTGCGTTACGGAAATAACTGACTTTAA




AGGCAACCATATGAATCTGAAAACCGAGTACA




TCAATAACAGTAAACTATGTATACAGGTTAGGG




GTTCACTAGGGGTCAGTAACAAGGTTGCCTACA




GTGTTCAGGATTATAACGCAGATAGCGCATTGA




GTGGCGGCAATAGATTGACTGCGTCTCTAGATT




CATCCTTAATCAACAACAACCCAAATGACATAG




CAATACTAAATGACTATCTATCTGCTTATCAGTT




AACGAAAATGGGCGGCAACACAGCGTTTGATT




ACGGGAATGGGTACAGAGGTGTGTACGTCATC




AAAAAGCAATTGAAGGCTGAATACAGACGAAG




TCTATCAAGTTTCTTCCATAAATACGGATACAA




GATTAACAGGGTAAAGAAACCAAATTTAAGAA




CACGAAAAGCATTTAACTATGTTCAGACAAAAG




ACTGTTTCATTTCAGGGGACATCAATAACAATG




ACTTACAGGAAATAAGAACAATTTTCGATAATG




GTATTACTCTTTGGCATACTGACAACATCGGAA




ATTACAGCGTCGAGAATGAATTGAGGTGA






SEQ ID NO: 89
ATGGCATATGTACCATTATCAGGAACGAACGTC
phi29-tail protein



AGGATTTTAGCTGACGTTCCTTTCTCTAATGATT
channel Δ417-



ATAAAAACACGAGATGGTTCACATCTTCAAGTA
491-



ATCAGTATAACTGGTTTAACAGCAAATCACGTG
K134I, D138L,



TGTATGAAATGAGTAAAGTAACATTCATGGGGT
D139L



TTAGAGAAAATAAACCATATGTTTCGGTTAGTC




TTCCCATAGATAAGCTTTACAGTGCGTCATATA




TTATGTTTCAAAATGCAGACTACGGTAACAAGT




GGTTTTATGCATTTGTAACCGAGTTAGAATTTA




AAAATAGTGCTGTTACCTACGTTCACTTTGAAA




TTGATGTTCTCCAAACATGGATGTTCGATATTA




AATTTCAAGAATCATTCATTGTGAGGGAGCACG




TTATTTTATGGAATCTGCTGGGGACACCGACTA




TCAACACAATTGATGAGGGTCTCAGCTACGGAA




GTGAATACGACATAGTTTCTGTAGAAAACCATA




AACCATACGACGACATGATGTTTCTCGTGATTA




TTTCCAAAAGCATTATGCATGGGACGCCGGGAG




AAGAGGAAAGCAGGCTAAATGACATAAACGCA




AGCCTGAACGGCATGCCGCAACCTCTCTGCTAC




TATATTCACCCATTCTACAAAGATGGTAAAGTT




CCTAAAACGTATATCGGAGATAACAACGCTAAC




TTGTCTCCTATTGTCAATATGCTCACCAATATCT




TTTCACAGAAGAGCGCTGTTAACGATATTGTCA




ATATGTATGTGACTGATTATATTGGTTTGAAGC




TTGACTATAAAAATGGTGATAAAGAATTGAAGC




TCGATAAAGACATGTTTGAACAGGCGGGTATAG




CTGACGATAAACACGGTAACGTTGACACCATCT




TTGTGAAGAAAATACCTGATTATGAAGCCCTAG




AAATAGACACAGGTGATAAATGGGGTGGCTTC




ACAAAAGACCAAGAAAGCAAACTGATGATGTA




CCCTTACTGCGTTACGGAAATAACTGACTTTAA




AGGCAACCATATGAATCTGAAAACCGAGTACA




TCAATAACAGTAAACTAAAGATACAGGTTAGG




GGTTCACTAGGGGTCAGTAACAAGGTTGCCTAC




AGTGTTCAGGATTATAACGCAGATAGCGCATTG




AGTGGCGGCAATAGATTGACTGCGTCTCTAGAT




TCATCCTTAATCAACAACAACCCAAATGACATA




GCAATACTAAATGACTATCTATCTGCTTATCAG




TTAACGAAAATGGGCGGCAACACAGCGTTTGAT




TACGGGAATGGGTACAGAGGTGTGTACGTCATC




AAAAAGCAATTGAAGGCTGAATACAGACGAAG




TCTATCAAGTTTCTTCCATAAATACGGATACAA




GATTAACAGGGTAAAGAAACCAAATTTAAGAA




CACGAAAAGCATTTAACTATGTTCAGACAAAAG




ACTGTTTCATTTCAGGGGACATCAATAACAATG




ACTTACAGGAAATAAGAACAATTTTCGATAATG




GTATTACTCTTTGGCATACTGACAACATCGGAA




ATTACAGCGTCGAGAATGAATTGAGGTGA






SEQ ID NO: 90
ATGGCATATGTACCATTATCAGGAACGAACGTC
phi29-tail protein



AGGATTTTAGCTGACGTTCCTTTCTCTAATGATT
channel Δ417-



ATAAAAACACGAGATGGTTCACATCTTCAAGTA
491-



ATCAGTATAACTGGTTTAACAGCAAATCACGTG
K134I, D138L,



TGTATGAAATGAGTAAAGTAACATTCATGGGGT
D139L, D158L,



TTAGAGAAAATAAACCATATGTTTCGGTTAGTC
E163V



TTCCCATAGATAAGCTTTACAGTGCGTCATATA




TTATGTTTCAAAATGCAGACTACGGTAACAAGT




GGTTTTATGCATTTGTAACCGAGTTAGAATTTA




AAAATAGTGCTGTTACCTACGTTCACTTTGAAA




TTGATGTTCTCCAAACATGGATGTTCGATATTA




AATTTCAAGAATCATTCATTGTGAGGGAGCACG




TTATTTTATGGAATCTGCTGGGGACACCGACTA




TCAACACAATTGATGAGGGTCTCAGCTACGGAA




GTGAATACCTGATAGTTTCTGTAGTTAACCATA




AACCATACGACGACATGATGTTTCTCGTGATTA




TTTCCAAAAGCATTATGCATGGGACGCCGGGAG




AAGAGGAAAGCAGGCTAAATGACATAAACGCA




AGCCTGAACGGCATGCCGCAACCTCTCTGCTAC




TATATTCACCCATTCTACAAAGATGGTAAAGTT




CCTAAAACGTATATCGGAGATAACAACGCTAAC




TTGTCTCCTATTGTCAATATGCTCACCAATATCT




TTTCACAGAAGAGCGCTGTTAACGATATTGTCA




ATATGTATGTGACTGATTATATTGGTTTGAAGC




TTGACTATAAAAATGGTGATAAAGAATTGAAGC




TCGATAAAGACATGTTTGAACAGGCGGGTATAG




CTGACGATAAACACGGTAACGTTGACACCATCT




TTGTGAAGAAAATACCTGATTATGAAGCCCTAG




AAATAGACACAGGTGATAAATGGGGTGGCTTC




ACAAAAGACCAAGAAAGCAAACTGATGATGTA




CCCTTACTGCGTTACGGAAATAACTGACTTTAA




AGGCAACCATATGAATCTGAAAACCGAGTACA




TCAATAACAGTAAACTAAAGATACAGGTTAGG




GGTTCACTAGGGGTCAGTAACAAGGTTGCCTAC




AGTGTTCAGGATTATAACGCAGATAGCGCATTG




AGTGGCGGCAATAGATTGACTGCGTCTCTAGAT




TCATCCTTAATCAACAACAACCCAAATGACATA




GCAATACTAAATGACTATCTATCTGCTTATCAG




TTAACGAAAATGGGCGGCAACACAGCGTTTGAT




TACGGGAATGGGTACAGAGGTGTGTACGTCATC




AAAAAGCAATTGAAGGCTGAATACAGACGAAG




TCTATCAAGTTTCTTCCATAAATACGGATACAA




GATTAACAGGGTAAAGAAACCAAATTTAAGAA




CACGAAAAGCATTTAACTATGTTCAGACAAAAG




ACTGTTTCATTTCAGGGGACATCAATAACAATG




ACTTACAGGAAATAAGAACAATTTTCGATAATG




GTATTACTCTTTGGCATACTGACAACATCGGAA




ATTACAGCGTCGAGAATGAATTGAGGTGA






SEQ ID NO: 91
ATGGCATATGTACCATTATCAGGAACGAACGTC
phi29-tail protein



AGGATTTTAGCTGACGTTCCTTTCTCTAATGATT
channel Δ417-491



ATAAAAACACGAGATGGTTCACATCTTCAAGTA
K134I, D138L,



ATCAGTATAACTGGTTTAACAGCAAATCACGTG
D139L, D158L,



TGTATGAAATGAGTAAAGTAACATTCATGGGGT
E163V, E309V, D311V,



TTAGAGAAAATAAACCATATGTTTCGGTTAGTC
K321V



TTCCCATAGATAAGCTTTACAGTGCGTCATATA




TTATGTTTCAAAATGCAGACTACGGTAACAAGT




GGTTTTATGCATTTGTAACCGAGTTAGAATTTA




AAAATAGTGCTGTTACCTACGTTCACTTTGAAA




TTGATGTTCTCCAAACATGGATGTTCGATATTA




AATTTCAAGAATCATTCATTGTGAGGGAGCACG




TTATTTTATGGAATCTGCTGGGGACACCGACTA




TCAACACAATTGATGAGGGTCTCAGCTACGGAA




GTGAATACCTGATAGTTTCTGTAGTTAACCATA




AACCATACGACGACATGATGTTTCTCGTGATTA




TTTCCAAAAGCATTATGCATGGGACGCCGGGAG




AAGAGGAAAGCAGGCTAAATGACATAAACGCA




AGCCTGAACGGCATGCCGCAACCTCTCTGCTAC




TATATTCACCCATTCTACAAAGATGGTAAAGTT




CCTAAAACGTATATCGGAGATAACAACGCTAAC




TTGTCTCCTATTGTCAATATGCTCACCAATATCT




TTTCACAGAAGAGCGCTGTTAACGATATTGTCA




ATATGTATGTGACTGATTATATTGGTTTGAAGC




TTGACTATAAAAATGGTGATAAAGAATTGAAGC




TCGATAAAGACATGTTTGAACAGGCGGGTATAG




CTGACGATAAACACGGTAACGTTGACACCATCT




TTGTGAAGAAAATACCTGATTATGAAGCCCTAG




TTATAGTTACAGGTGATAAATGGGGTGGCTTCA




CAAAAGACCAAGAAAGCAAACTGATGATGTAC




CCTTACTGCGTTACGGAAATAACTGACTTTAAA




GGCAACCATATGAATCTGAAAACCGAGTACATC




AATAACAGTAAACTAAAGATACAGGTTAGGGG




TTCACTAGGGGTCAGTAACAAGGTTGCCTACAG




TGTTCAGGATTATAACGCAGATAGCGCATTGAG




TGGCGGCAATAGATTGACTGCGTCTCTAGATTC




ATCCTTAATCAACAACAACCCAAATGACATAGC




AATACTAAATGACTATCTATCTGCTTATCAGTT




AACGAAAATGGGCGGCAACACAGCGTTTGATT




ACGGGAATGGGTACAGAGGTGTGTACGTCATC




AAAAAGCAATTGAAGGCTGAATACAGACGAAG




TCTATCAAGTTTCTTCCATAAATACGGATACAA




GATTAACAGGGTAAAGAAACCAAATTTAAGAA




CACGAAAAGCATTTAACTATGTTCAGACAAAAG




ACTGTTTCATTTCAGGGGACATCAATAACAATG




ACTTACAGGAAATAAGAACAATTTTCGATAATG




GTATTACTCTTTGGCATACTGACAACATCGGAA




ATTACAGCGTCGAGAATGAATTGAGGTGA






SEQ ID NO: 92
ATGAATCATAAACATCATCATCATCATCACAGC
phi29-tail protein



AGCGGCGAAAACCTGTATTTTCAGGGCCATATG
channel Δ417-



GGATCCATGGCATATGTACCATTATCAGGAACG
491-N-his



AACGTCAGGATTTTAGCTGACGTTCCTTTCTCTA




ATGATTATAAAAACACGAGATGGTTCACATCTT




CAAGTAATCAGTATAACTGGTTTAACAGCAAAT




CACGTGTGTATGAAATGAGTAAAGTAACATTCA




TGGGGTTTAGAGAAAATAAACCATATGTTTCGG




TTAGTCTTCCCATAGATAAGCTTTACAGTGCGT




CATATATTATGTTTCAAAATGCAGACTACGGTA




ACAAGTGGTTTTATGCATTTGTAACCGAGTTAG




AATTTAAAAATAGTGCTGTTACCTACGTTCACT




TTGAAATTGATGTTCTCCAAACATGGATGTTCG




ATATTAAATTTCAAGAATCATTCATTGTGAGGG




AGCACGTTAAATTATGGAATGACGACGGGACA




CCGACTATCAACACAATTGATGAGGGTCTCAGC




TACGGAAGTGAATACGACATAGTTTCTGTAGAA




AACCATAAACCATACGACGACATGATGTTTCTC




GTGATTATTTCCAAAAGCATTATGCATGGGACG




CCGGGAGAAGAGGAAAGCAGGCTAAATGACAT




AAACGCAAGCCTGAACGGCATGCCGCAACCTCT




CTGCTACTATATTCACCCATTCTACAAAGATGG




TAAAGTTCCTAAAACGTATATCGGAGATAACAA




CGCTAACTTGTCTCCTATTGTCAATATGCTCACC




AATATCTTTTCACAGAAGAGCGCTGTTAACGAT




ATTGTCAATATGTATGTGACTGATTATATTGGTT




TGAAGCTTGACTATAAAAATGGTGATAAAGAAT




TGAAGCTCGATAAAGACATGTTTGAACAGGCG




GGTATAGCTGACGATAAACACGGTAACGTTGAC




ACCATCTTTGTGAAGAAAATACCTGATTATGAA




GCCCTAGAAATAGACACAGGTGATAAATGGGG




TGGCTTCACAAAAGACCAAGAAAGCAAACTGA




TGATGTACCCTTACTGCGTTACGGAAATAACTG




ACTTTAAAGGCAACCATATGAATCTGAAAACCG




AGTACATCAATAACAGTAAACTAAAGATACAG




GTTAGGGGTTCACTAGGGGTCAGTAACAAGGTT




GCCTACAGTGTTCAGGATTATAACGCAGATAGC




GCATTGAGTGGCGGCAATAGATTGACTGCGTCT




CTAGATTCATCCTTAATCAACAACAACCCAAAT




GACATAGCAATACTAAATGACTATCTATCTGCT




TATCAGTTAACGAAAATGGGCGGCAACACAGC




GTTTGATTACGGGAATGGGTACAGAGGTGTGTA




CGTCATCAAAAAGCAATTGAAGGCTGAATACA




GACGAAGTCTATCAAGTTTCTTCCATAAATACG




GATACAAGATTAACAGGGTAAAGAAACCAAAT




TTAAGAACACGAAAAGCATTTAACTATGTTCAG




ACAAAAGACTGTTTCATTTCAGGGGACATCAAT




AACAATGACTTACAGGAAATAAGAACAATTTTC




GATAATGGTATTACTCTTTGGCATACTGACAAC




ATCGGAAATTACAGCGTCGAGAATGAATTGAG




GTGA






SEQ ID NO: 93
ATGGCATATGTACCATTATCAGGAACGAACGTC
phi29 tail protein



AGGATTTTAGCTGACGTTCCTTTCTCTAATGATT
channel WT



ATAAAAACACGAGATGGTTCACATCTTCAAGTA




ATCAGTATAACTGGTTTAACAGCAAATCACGTG




TGTATGAAATGAGTAAAGTAACATTCATGGGGT




TTAGAGAAAATAAACCATATGTTTCGGTTAGTC




TTCCCATAGATAAGCTTTACAGTGCGTCATATA




TTATGTTTCAAAATGCAGACTACGGTAACAAGT




GGTTTTATGCATTTGTAACCGAGTTAGAATTTA




AAAATAGTGCTGTTACCTACGTTCACTTTGAAA




TTGATGTTCTCCAAACATGGATGTTCGATATTA




AATTTCAAGAATCATTCATTGTGAGGGAGCACG




TTAAATTATGGAATGACGACGGGACACCGACTA




TCAACACAATTGATGAGGGTCTCAGCTACGGAA




GTGAATACGACATAGTTTCTGTAGAAAACCATA




AACCATACGACGACATGATGTTTCTCGTGATTA




TTTCCAAAAGCATTATGCATGGGACGCCGGGAG




AAGAGGAAAGCAGGCTAAATGACATAAACGCA




AGCCTGAACGGCATGCCGCAACCTCTCTGCTAC




TATATTCACCCATTCTACAAAGATGGTAAAGTT




CCTAAAACGTATATCGGAGATAACAACGCTAAC




TTGTCTCCTATTGTCAATATGCTCACCAATATCT




TTTCACAGAAGAGCGCTGTTAACGATATTGTCA




ATATGTATGTGACTGATTATATTGGTTTGAAGC




TTGACTATAAAAATGGTGATAAAGAATTGAAGC




TCGATAAAGACATGTTTGAACAGGCGGGTATAG




CTGACGATAAACACGGTAACGTTGACACCATCT




TTGTGAAGAAAATACCTGATTATGAAGCCCTAG




AAATAGACACAGGTGATAAATGGGGTGGCTTC




ACAAAAGACCAAGAAAGCAAACTGATGATGTA




CCCTTACTGCGTTACGGAAATAACTGACTTTAA




AGGCAACCATATGAATCTGAAAACCGAGTACA




TCAATAACAGTAAACTAAAGATACAGGTTAGG




GGTTCACTAGGGGTCAGTAACAAGGTTGCCTAC




AGTGTTCAGGATTATAACGCAGATAGCGCATTG




AGTGGCGGCAATAGATTGACTGCGTCTCTAGAT




TCATCCTTAATCAACAACAACCCAAATGACATA




GCAATACTAAATGACTATCTATCTGCTTATTTAC




AGGGCAACAAAAATTCACTAGAGAACCAAAAA




TCGTCTATCCTTTTTAATGGCATTATGGGTATGA




TCGGCGGAGGTATATCAGCGGGAGCAAGTGCG




GCAGGAGGTTCAGCCCTAGGGATGGCTTCATCA




GTTACAGGGATGACAAGCACTGCGGGTAATGCT




GTTCTACAGATGCAAGCGATGCAAGCCAAGCA




AGCCGATATAGCAAACATTCCGCCGCAGTTAAC




GAAAATGGGCGGCAACACAGCGTTTGATTACG




GGAATGGGTACAGAGGTGTGTACGTCATCAAA




AAGCAATTGAAGGCTGAATACAGACGAAGTCT




ATCAAGTTTCTTCCATAAATACGGATACAAGAT




TAACAGGGTAAAGAAACCAAATTTAAGAACAC




GAAAAGCATTTAACTATGTTCAGACAAAAGACT




GTTTCATTTCAGGGGACATCAATAACAATGACT




TACAGGAAATAAGAACAATTTTCGATAATGGTA




TTACTCTTTGGCATACTGACAACATCGGAAATT




ACAGCGTCGAGAATGAATTGAGGTGA









A polypeptide “variant,” as the term is used herein, is a polypeptide that typically differs from a polypeptide specifically disclosed herein in one or more substitutions, deletions, additions and/or insertions. Such variants can be naturally occurring or can be synthetically generated, for example, by modifying one or more of the above polypeptide sequences of the disclosure and evaluating one or more biological activities of the polypeptide as described herein and/or using any of a number of techniques well known in the art.


For example, certain amino acids can be substituted for other amino acids in a protein structure without appreciable loss of its ability to bind other polypeptides or cells. Since it is the binding capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid sequence substitutions can be made in a protein sequence, and, accordingly, its underlying DNA coding sequence, whereby a protein with like properties is obtained. It is thus contemplated that various changes can be made in the peptide sequences of the disclosed compositions, or corresponding DNA sequences that encode said peptides without appreciable loss of their biological utility or activity.


Variant sequences include those wherein conservative substitutions have been introduced by modification of polynucleotides encoding polypeptides of this disclosure. Amino acids can be classified according to physical properties and contribution to secondary and tertiary protein structure. Such conservative modifications include amino acid substitutions, additions, and deletions. Conservative amino acid substitutions are ones in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine, tryptophan), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
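The side-chain families listed above can be expressed as a small lookup, sketched here in Python for illustration only. The function names and the rule "two different residues that share at least one listed family count as a conservative substitution" are assumptions made for this sketch, not a definition taken from the disclosure.

```python
# Side-chain families as listed in the paragraph above (one-letter codes).
FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYCW"),
    "nonpolar": set("AVLIPFM"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the sorted names of the families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name for name, members in FAMILIES.items() if a in members and b in members
    )


def is_conservative(a, b):
    """Illustrative rule (assumption): different residues sharing a family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))


if __name__ == "__main__":
    for old, new in [("K", "R"), ("V", "I"), ("D", "L"), ("E", "V")]:
        print(old, new, is_conservative(old, new), shared_families(old, new))
```

Under this rule, a lysine-to-arginine change is conservative, whereas aspartate-to-leucine and glutamate-to-valine changes (the kind of substitution noted for some of the channel mutants in the sequence table) are not.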


“Sequence identity” or “homology” refers to the percentage of residues in the polynucleotide or polypeptide sequence variant that are identical to the non-variant sequence after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent homology. In particular embodiments, polynucleotide and polypeptide variants have at least about 70%, at least about 75%, at least about 80%, at least about 90%, at least about 95%, at least about 98%, or at least about 99% polynucleotide or polypeptide homology with a polynucleotide or polypeptide described herein.


Polypeptide variant sequences may share 70% or more (e.g., 80%, 85%, 90%, 95%, 97%, 98%, 99% or more) sequence identity with the sequences recited in this disclosure. Polypeptide variants may also include polypeptide fragments comprising various lengths of contiguous stretches of amino acid sequences disclosed herein. Polypeptide variant sequences include at least about 5, 10, 15, 20, 30, 40, 50, 75, 100, 150, or more contiguous amino acids of one or more of the sequences disclosed herein as well as all intermediate lengths therebetween.
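As a minimal sketch of how a percent identity of this kind might be computed, the Python function below takes two sequences that have already been aligned (with "-" marking gaps) and reports the percentage of variant residues that match the reference. The function names, the gap character, and the choice of denominator (residues present in the variant) are assumptions for illustration; alignment programs differ in how they count gapped positions.

```python
def percent_identity(aligned_reference, aligned_variant, gap="-"):
    """Percent of variant residues identical to the aligned reference residue.

    Both inputs must already be aligned and therefore of equal length.
    Positions where the variant has a gap are skipped (assumed convention).
    """
    if len(aligned_reference) != len(aligned_variant):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for ref_res, var_res in zip(aligned_reference, aligned_variant):
        if var_res == gap:
            continue  # no variant residue at this column
        compared += 1
        if ref_res == var_res:
            matches += 1
    if compared == 0:
        raise ValueError("variant contains no residues")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_reference, aligned_variant, threshold=70.0):
    """True if the variant reaches the given percent identity (e.g., 70%)."""
    return percent_identity(aligned_reference, aligned_variant) >= threshold


if __name__ == "__main__":
    # One mismatch in ten aligned residues gives 90% identity.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```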


II. Nanopore Assemblies Functionalized With Probes

In one aspect, the nanopore assembly further comprises a probe for detecting an analyte. The probe is operably linked to at least one of the subunits. Such a probe is conjugated to the nanopore assembly to allow for selective binding of one or more analytes. A probe may be conjugated to the nanopore assembly at various stoichiometries (molar ratios between the probe and the nanopore assembly). In one example, the probe may be conjugated to each of the subunits. In another example, only one probe is conjugated to the nanopore assembly. Also contemplated is a nanopore assembly including two or more different types of probes. Such functionalized nanopore assemblies can be used to detect analytes, such as nucleic acids, amino acids, peptides, proteins, polymers, and chemical molecules. In some embodiments, the analyte is one of PSA, CEA, AFP, VCAM, MiR-155, MiR-22, MiR-7, MiR-92a, MiR-122, MiR-192, MiR-223, MiR-26a, MiR-27a, MiR-802 or a fragment thereof.


The terms “specific binding,” “selective binding,” “selectively binds,” and “specifically binds” mean that a probe on the nanopore assembly binds to an analyte in the sample more strongly than to other contaminants present in the sample. For example, the probe may bind to an analyte with an equilibrium dissociation constant (Kd) of less than approximately 10⁻⁶ M, such as less than approximately 10⁻⁷ M, 10⁻⁸ M, 10⁻⁹ M or 10⁻¹⁰ M or even lower when determined by a binding assay, e.g., ELISA, equilibrium dialysis or surface plasmon resonance (SPR) technology in a BIACORE® 2000 surface plasmon resonance instrument.
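To give a feel for what these Kd values mean, the following Python sketch computes the equilibrium fraction of probe sites occupied under a simple 1:1 binding model. The model, the assumption that the analyte is in large excess over the probe, and the function names are illustrative assumptions and are not part of the disclosure.

```python
def fraction_bound(analyte_molar, kd_molar):
    """Equilibrium fraction of probe sites occupied for simple 1:1 binding.

    Assumes the free analyte concentration is approximately equal to the
    total analyte concentration (analyte in large excess over probe).
    """
    if analyte_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return analyte_molar / (analyte_molar + kd_molar)


def meets_kd_cutoff(kd_molar, cutoff_molar=1e-6):
    """True if the measured Kd is below the chosen cutoff (default 10^-6 M)."""
    return kd_molar < cutoff_molar


if __name__ == "__main__":
    # At an analyte concentration equal to Kd, half of the sites are occupied.
    print(fraction_bound(1e-6, 1e-6))
    # A tighter binder (Kd = 10^-9 M) is almost fully occupied at 10^-6 M analyte.
    print(fraction_bound(1e-6, 1e-9))
```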


The probe can be one of chemicals, carbohydrates, aptamers, nucleic acids, peptides, proteins, antibodies, and receptors. In some embodiments, the probe may include a sequence at least 75% identical to a sequence of SEQ ID NOs: 36-79. In some embodiments, the probe is an anti-PSA antibody. The probe is operably linked via covalent bonding to at least one of the subunits. The probe can be linked to a subunit via one or more functional sites on the subunit. Such functional sites can be introduced by mutagenesis, for example, substitution with cysteine or other non-natural amino acids. The probe can also be linked to the channel via well-established chemical methods, including but not limited to ester linkage, click-chemistry, and sulfhydryl linkage. In some embodiments, the probe is operably linked to a location in proximity to an entrance of the channel or to a location at an interior side of the channel.


III. Nanopore Assemblies With Membranes

In another aspect, this disclosure provides a nanopore assembly, in which the channel is inserted in a membrane. The polymer membrane can be of various compositions depending on the applications either in stand-alone form or in a microfluidic device. The polymer-membrane embedded channels display robust electrophysiological properties, a pre-requisite feature for single-molecule nanopore-based analysis. The membrane may include a polymer membrane (e.g., planar polymer membrane) or a lipid membrane. The polymer membrane can be either symmetric or asymmetric in nature. For example, the polymer membrane is an alternating copolymer (e.g., A-B-A-B- . . . ) or a periodic copolymer (e.g., AA-BB-AA-BB . . . ). In another example, the polymer membrane can be a block copolymer comprising two or more homopolymer subunits linked by covalent bonds. Alternatively, the polymer membrane can be a diblock or triblock copolymer (e.g., PMOXA-PDMS-PMOXA). The polymer membrane can also be a terpolymer consisting of three distinct monomers.


In some embodiments, the polymer membrane comprises an alternating copolymer, a periodic copolymer, a block copolymer, a di-block copolymer, a tri-block copolymer, a terpolymer, or a combination thereof. In some embodiments, the polymer membrane comprises PMOXA-PDMS-PMOXA.


In some embodiments, the channel is embedded in a polymersome of various sizes. In some embodiments, the polymersomes are fused with a planar polymer membrane to insert the channel. Polymersomes are composed of hydrophilic-hydrophobic block copolymer, arranged in a bilayer vesicular system having a central aqueous core. They have a hydrophilic inner core and lipophilic bilayer. They differ from nanoparticles in that they contain a hydrophilic core rather than a lipophilic core, as in the case of nanoparticles. Although they have a bilayer structure, they offer more stability than liposomes due to the presence of a thick and rigid bilayer. They contain a hydrophilic core that provides a protein-affable environment.


In some embodiments, the channels are inserted into polymer membranes in the presence of detergents of various compositions (e.g., DDM, DOC, Tween, SDS, and Brij based detergents). In some embodiments, the protein channels are reconstituted into polymersomes in the presence of varying amounts of glycerol, CsCl, and/or sucrose for high-efficiency fusion.


In some embodiments, the protein channel is reengineered by mutagenesis (e.g., containing one to several substitutions) to facilitate insertion in (planar) polymer membranes. In addition, the protein channel is reengineered to introduce functional sites for site-specific labeling with chemicals or biopolymers for the purpose of inserting in polymer membranes. Such functional sites include cysteine residues for conjugating chemicals, such as porphyrin, or biological molecules, such as cholesterol or any hydrophobic lipid modules or nucleic acids of various lengths. The functional sites/groups may also include non-natural amino acids and linkage created by well-established chemical methods, including but not limited to ester, click chemistry, and sulfhydryl.


In some embodiments, the location of the conjugation is in the membrane anchoring layer of the channel, such as residues 70-80, 110-140, 155-245, 300-340, 410-435 of T4 gp 20 portal protein; residues 10-45, 250-300 of P22 gp1 portal protein; and residues 130-170, 300-325, 350-390, 530-595 of phi29 gp9 tail protein. Accordingly, to increase or decrease the hydrophobicity of the belt region for the purpose of membrane insertion, reengineering may include mutagenesis (e.g., substitution, insertion, deletion) of any one of the residues 70-80, 110-140, 155-245, 300-340, 410-435 of T4 gp 20 portal protein; residues 450-500, 350-380 of P22 gp1 portal protein; and residues 130-170, 300-325, 350-390, 530-595 of phi29 gp9 tail protein.


In some embodiments, the location of the conjugation is in the cis- and trans-hydrophilic layers of the channel, such as residues 465-515, 285-305 of T4 gp 20 portal protein; residues 450-500, 350-380 of P22 gp1 portal protein; and residues 20-50, and 250-300 of phi29 gp9 tail protein. Accordingly, to increase or decrease the hydrophobicity of the belt region for the purpose of membrane insertion, reengineering may include mutagenesis (e.g., substitution, insertion, deletion) of any of the residues 465-515, 285-305 of T4 gp 20 portal protein; residues 450-500, 350-380 of P22 gp1 portal protein; and residues 20-50, and 250-300 of phi29 gp9 tail protein.


IV. Devices and Kits Comprising Nanopore Assemblies

In another aspect, this disclosure also provides a kit and a detection apparatus for detecting an analyte. The kit may include the nanopore assembly as described, optionally a buffer, and optionally instructions for using the nanopore assembly.


The apparatus may include the nanopore assembly as described and a support for the nanopore assembly. The detection apparatus can further include electrodes embedded in the solid support. The electrodes can be used to monitor assembly of protein nanopores into membranes. The electrodes can also be used for data collection during analyte detection steps. Electrodes used for monitoring and detection need not be embedded in the support and can be provided, for example, in a separate application-specific integrated circuit (ASIC) chip.


As used herein, the term “support” refers to a rigid substrate that is insoluble in an aqueous liquid and incapable of passing a liquid absent an aperture. Exemplary solid supports include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, cyclic olefins, polyimides etc.), nylon, ceramics, resins, Zeonor, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, optical fiber bundles, and polymers. Particularly useful supports comprise modified silicon such as SiN membranes on a Si substrate. For some embodiments, supports are located within a flow cell apparatus or other vessels.


In some embodiments, nanopores are fabricated on substrates such as chips, disks, blocks, plates and the like. Such substrates can be made from a variety of materials including but not limited to silicon, glass, ceramic, germanium, polymers (e.g., polystyrene), and/or gallium arsenide. The substrates may or may not be etched, e.g., chips can be semiconductor chips.


In particular embodiments, a detection apparatus can include a reservoir in contact with an array of nanopores. The reservoir can contain electrodes located to apply a current through the apertures formed by the protein nanopores.


In some embodiments, the detection apparatus of the present disclosure may include (a) an electrode; (b) a nanopore tethered to the electrode; and (c) a membrane surrounding the nanopore. Multiplex embodiments are also provided. For example, a detection apparatus can include (a) a plurality of electrodes; (b) a plurality of nanopores, each of the nanopores tethered to an electrode in the plurality of electrodes; and (c) a membrane surrounding each of the nanopores.


A nanopore can be tethered to an electrode (e.g., via a dielectric pad) using covalent moieties or non-covalent binding moieties. An example of a covalent attachment is when a nucleic acid tether is covalently attached to the nanopore and covalently attached to the dielectric pad. Other tethers can be similarly used for covalent attachment, including, for example, non-nucleic acid tethers such as polyethylene glycol or other synthetic polymers. An example of a non-covalent attachment is when a nanopore has an attached affinity moiety, such as a polyhistidine tag, Strep-tag or other amino acid encoded affinity moiety. Affinity moieties can bind non-covalently to ligands on a dielectric pad such as nickel or other divalent cations that bind polyhistidine, or biotin (or analogs thereof) that bind to Strep-tag. In some embodiments, such amino acid affinity moieties need not be used.


As described herein, the nanopores (whether hybrid nanopores or tethered nanopores) can be coupled with a detection circuit, including, for example, a patch clamp circuit, a tunneling electrode circuit, or a transverse conductance measurement circuit (such as a graphene nanoribbon, or a graphene nanogap), to record the electrical signals in methods of the present disclosure. In addition, the pore can also be coupled with an optical sensor that detects labels, for example, a fluorescent moiety or a Raman signal generating moiety, on the polynucleotides.


A detection apparatus of the present disclosure can be used to detect any of a variety of analytes including, but not limited to, ions, nucleic acids, nucleotides, polypeptides, biologically active small molecules, lipids, sugars or the like. Accordingly, one or more of these analytes can be present in or passed through the aperture of a protein nanopore in an apparatus set forth herein.


Other detection techniques that can be applied to an apparatus set forth herein include, but are not limited to, detecting events, such as the motion of a molecule or a portion of that molecule, particularly where the molecule is DNA or an enzyme that binds DNA, such as a polymerase. For example, Olsen et al., JACS 135: 7855-7860 (2013), which is incorporated herein by reference, discloses bioconjugating single molecules of the Klenow fragment (KF) of DNA polymerase I into electronic nanocircuits so as to allow electrical recordings of enzymatic function and dynamic variability with the resolution of individual nucleotide incorporation events. Or, for example, Hurt et al., JACS 131: 3772-3778 (2009), which is incorporated herein by reference, discloses measuring the dwell time for complexes of DNA with the KF atop a nanopore in an applied electric field. Or, for example, Kim et al., Sens. Actuators B Chem. 177: 1075-1082 (2012), which is incorporated herein by reference, discloses using a current-measuring sensor in experiments involving DNA captured in an α-hemolysin nanopore. Or, for example, Garalde et al., J. Biol. Chem. 286: 14480-14492 (2011), which is incorporated herein by reference, discloses distinguishing KF-DNA complexes on the basis of their properties when captured in an electric field atop an α-hemolysin pore. Other references that disclose measurements involving α-hemolysin include the following, all to Howorka et al., which are incorporated herein by reference: PNAS 98: 12996-13001 (2001); Biophysical Journal 83: 3202-3210 (2002); and Nature Biotechnology 19: 636-639 (2001).


In some embodiments, the invention relates to channel proteins assembled into a lipid bilayer membrane. The presence of an analyte is monitored by the ionic current that passes through the pore at a fixed applied potential, with an interruption of current indicating interactions of the analyte with the channel protein. In some embodiments, a stabilized sensor chip contains a single protein nanopore. The protein nanopore sensor chip can be applied to measurements at the single-molecule level, i.e., stochastic sensing. By monitoring the ionic current that passes through the pore at a fixed applied potential, various analytes can be distinguished on the basis of the amplitude and duration of individual current-blocking events.
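The amplitude and duration of individual current-blocking events can be extracted from a recorded trace with a simple threshold, as in the Python sketch below. This is an illustrative sketch only: the function name, the 80% threshold, the assumption of a positive open-pore current in picoamperes, and the fixed sampling interval are all hypothetical choices, not parameters from the disclosure.

```python
def find_blockade_events(current_pa, open_pore_pa, sample_interval_s,
                         threshold_fraction=0.8):
    """Find runs of samples below threshold_fraction * open_pore_pa.

    Returns a list of (start_index, duration_s, blockade_amplitude_pa), where
    the amplitude is the open-pore current minus the mean current in the event.
    Assumes a positive open-pore current (hypothetical convention).
    """
    trace = list(current_pa)
    threshold = threshold_fraction * open_pore_pa
    events = []
    start = None
    # A trailing open-pore sample acts as a sentinel that closes any event
    # still in progress at the end of the trace.
    for i, value in enumerate(trace + [open_pore_pa]):
        blocked = value < threshold
        if blocked and start is None:
            start = i
        elif not blocked and start is not None:
            samples = trace[start:i]
            amplitude = open_pore_pa - sum(samples) / len(samples)
            events.append((start, (i - start) * sample_interval_s, amplitude))
            start = None
    return events


if __name__ == "__main__":
    trace = [100, 100, 40, 40, 40, 100, 100, 60, 60, 100]
    for event in find_blockade_events(trace, open_pore_pa=100,
                                      sample_interval_s=1e-3):
        print(event)
```

Analytes could then be grouped by plotting event amplitude against event duration; the grouping itself is outside the scope of this sketch.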


V. Methods For Detecting Analytes Using Nanopore Assemblies

In another aspect, the disclosure provides a method of detecting/sensing an analyte. The method includes: (1) contacting a sample containing an analyte with the nanopore assembly as described; (2) applying an electrical current across the channel of the nanopore assembly; (3) determining the electrical current passing through the channel at one or more time intervals; and (4) comparing the electrical current measured at one or more time intervals with a reference electrical current, wherein a change in electrical current relative to the reference electrical current indicates a presence of the analyte in the sample. In some embodiments, the reference electrical current is measured with a sample that does not contain the analyte. In some embodiments, the nanopore assembly is placed on a support.
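Steps (3) and (4) of the method above amount to comparing currents measured with the sample against a reference current. The Python sketch below shows one way such a comparison might be made; the function names and the 5% minimum relative change are hypothetical values chosen for illustration, and a real assay would set its decision criterion from the noise characteristics of the recording.

```python
from statistics import mean


def relative_current_change(measured_pa, reference_pa):
    """Relative change of the mean measured current versus the mean reference."""
    reference_mean = mean(reference_pa)
    if reference_mean == 0:
        raise ValueError("reference current must be non-zero")
    return (mean(measured_pa) - reference_mean) / reference_mean


def analyte_indicated(measured_pa, reference_pa, min_relative_change=0.05):
    """True if the current differs from the reference by at least the cutoff.

    min_relative_change is a hypothetical decision threshold (5% here).
    """
    change = relative_current_change(measured_pa, reference_pa)
    return abs(change) >= min_relative_change


if __name__ == "__main__":
    reference = [100.0, 101.0, 99.0]   # sample without analyte
    with_analyte = [80.0, 82.0, 78.0]  # currents at three time intervals
    print(relative_current_change(with_analyte, reference))
    print(analyte_indicated(with_analyte, reference))
```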


In some embodiments, the method may include measuring the ion current and/or a change in current signature induced by translocation of each unit of the analyte or polymer through the channel. In some embodiments, the method may include measuring the ion current and/or a change in current signature induced by transient or permanent binding of each unit of the analyte or polymer to probes coupled on the channel.


The analyte that can be detected by the methods of the present disclosure may be any one of nucleic acids, amino acids, peptides, proteins, polymers, and chemical molecules. A nucleic acid detected in the methods can be single stranded, double stranded, or contain both single stranded and double stranded sequence. The nucleic acid molecules can originate in a double-stranded form (e.g., dsDNA) and can optionally be converted to a single-stranded form. The nucleic acid molecules can also originate in a single stranded form (e.g., ssDNA, ssRNA), and the ssDNA can optionally be converted into a double-stranded form.


VI. Definitions

To aid in understanding the detailed description of the compositions and methods according to the disclosure, a few express definitions are provided to facilitate an unambiguous disclosure of the various aspects of the disclosure.


Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.


It is noted here that, as used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. The terms “including,” “comprising,” “containing,” or “having” and variations thereof are meant to encompass the items listed thereafter and equivalents thereof as well as additional subject matter unless otherwise noted.


As used herein, the term “each,” when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection. Exceptions can occur if explicit disclosure or context clearly dictates otherwise.


The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.


As used herein, the terms “nanopore” and “channel” are used to refer to structures having a nanoscale passageway through which ionic current can flow. The inner diameter of the nanopore may vary considerably depending on the intended use of the device. Typically, the channel or nanopore will have an inner diameter of at least about 0.5 nm, usually at least about 1 nm and more usually at least about 1.5 nm, where the diameter may be as great as 50 nm or larger, but in many embodiments will not exceed about 10 nm, and usually will not exceed about 2 nm.


As used herein, the term “analyte” refers to a substance or chemical constituent that is undergoing analysis or sought to be detected. It is not intended that the present invention be limited to a particular analyte. Representative analytes include ions, saccharides, proteins, nucleic acids, and nucleic acid sequences.


As used herein the term “membrane” refers to a sheet or other barrier that prevents passage of electrical current or fluids. The membrane is typically flexible or compressible in contrast to solid supports set forth herein. The membrane can be made from lipid material, for example, to form a lipid bilayer, or the membrane can be made from non-lipid material. The membrane can be in the form of a copolymer membrane, for example, formed by diblock polymers or triblock polymers, or in the form of a monolayer, for example, formed by a bolalipid. See, for example, Rakhmatullina et al., Langmuir: the ACS Journal of Surfaces and Colloids 24: 6254-6261 (2008), which is incorporated herein by reference.


As used herein, the term “lipid membrane” means a film made primarily of compounds comprising saturated or unsaturated, branched or unbranched, aromatic or non-aromatic, hydrocarbon groups. The film may be composed of multiple lipids. Examples of lipids include, but are not limited to, fatty acids, mono-, di-, and triglycerides, glycerophospholipids, sphingolipids, steroids, lipoproteins, and glycolipids.


“Nucleic acid” or “nucleic acid sequence” or “nucleic acid molecule” refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. The term nucleic acid is used interchangeably with gene, complementary DNA (cDNA), messenger RNA (mRNA), oligonucleotide, and polynucleotide. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2′-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs). The terms encompass molecules formed from any of the known base analogs of DNA and RNA such as, but not limited to 4-acetylcytosine, 8-hydroxy-N6-methyladenine, aziridinyl-cytosine, pseudoisocytosine, 5-(carboxyhydroxylmethyl) uracil, 5-fluorouracil, 5-bromouracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxy-methylaminomethyluracil, dihydrouracil, inosine, N6-iso-pentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylinosine, 2,2-dimethyl-guanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-methyladenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyamino-methyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarbonyl-methyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, oxybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, N-uracil-5-oxyacetic acid methylester, and 2,6-diaminopurine.


Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions, in some aspects, are achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19: 5081, 1991; Ohtsuka et al., J. Biol. Chem. 260: 2605-8, 1985; Rossolini et al., Mol. Cell. Probes 8: 91-8, 1994).
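The third-position substitution described above can be illustrated with a short Python sketch. The function name is hypothetical, and the use of the IUPAC symbol "N" to stand for a mixed-base position (or "I" for deoxyinosine) is an assumption made for this example. Note that a fully mixed third position is not synonymous for every codon, so the sketch shows only the mechanics of the substitution.

```python
def degenerate_third_positions(coding_seq, codon_indices=None, symbol="N"):
    """Replace the third base of selected (or all) codons with a symbol.

    coding_seq    -- in-frame coding sequence whose length is a multiple of 3
    codon_indices -- zero-based codon positions to modify; None means all
    symbol        -- stand-in for the mixed base ("N") or deoxyinosine ("I")
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    selected = range(len(codons)) if codon_indices is None else codon_indices
    for index in selected:
        codons[index] = codons[index][:2] + symbol
    return "".join(codons)


if __name__ == "__main__":
    print(degenerate_third_positions("ATGGCATAT"))        # all codons
    print(degenerate_third_positions("ATGGCATAT", [1]))   # second codon only
```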


“Polypeptide” is used in its conventional meaning, i.e., as a sequence of amino acids. The polypeptides are not limited to a specific length of the product. Peptides, polypeptides, and proteins are included within the definition of polypeptide, and such terms can be used interchangeably herein unless specifically indicated otherwise. This term also includes post-expression modifications of the polypeptide, for example, glycosylations, acetylations, phosphorylations and the like, as well as other modifications known in the art, both naturally occurring and non-naturally occurring. A polypeptide can be an entire protein or a subsequence thereof.


The terms “identical” or percent “identity” as known in the art refers to a relationship between the sequences of two or more polypeptide molecules or two or more nucleic acid molecules, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between nucleic acid molecules or polypeptides, as the case may be, as determined by the match between strings of two or more nucleotide or two or more amino acid sequences. “Identity” measures the percent of identical matches between the smaller of two or more sequences with gap alignments (if any) addressed by a particular mathematical model or computer program (i.e., “algorithms”). “Substantial identity” refers to sequences with at least about 70%, about 71%, about 72%, about 73%, about 74%, about 75%, about 76%, about 77%, about 78%, about 79%, about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99% sequence identity over a specified sequence. In some aspects, the identity exists over a region that is at least about 50-100 amino acids or nucleotides in length. In other aspects, the identity exists over a region that is at least about 100-200 amino acids or nucleotides in length. In other aspects, the identity exists over a region that is at least about 200-500 amino acids or nucleotides in length. In certain aspects, percent sequence identity is determined using a computer program selected from the group consisting of GAP, BLASTP, BLASTN, FASTA, BLASTA, BLASTX, BestFit, and the Smith-Waterman algorithm.


The term “similarity” is a related concept but, in contrast to “identity,” refers to a measure of similarity which includes both identical matches and conservative substitution matches. If two polypeptide sequences have, for example, 10/20 identical amino acids, and the remainder are all non-conservative substitutions, then the percent identity and similarity would both be 50%. If, in the same example, there are five more positions where there are conservative substitutions, then the percent identity remains 50%, but the percent similarity would be 75% (15/20). Therefore, in cases where there are conservative substitutions, the degree of percent similarity between two polypeptides will be higher than the percent identity between those two polypeptides.
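The arithmetic in the preceding example can be reproduced with a short sketch. The function name and its count-based inputs are illustrative assumptions; in practice the counts would come from an alignment program such as those listed above.

```python
def identity_and_similarity(identical, conservative, aligned_length):
    """Return (percent identity, percent similarity) for one alignment.

    identical      -- number of positions with identical residues
    conservative   -- number of positions with conservative substitutions
    aligned_length -- number of aligned positions compared
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if identical < 0 or conservative < 0:
        raise ValueError("counts must be non-negative")
    if identical + conservative > aligned_length:
        raise ValueError("counts exceed the aligned length")
    identity = 100.0 * identical / aligned_length
    # Similarity counts identical matches plus conservative substitutions.
    similarity = 100.0 * (identical + conservative) / aligned_length
    return identity, similarity


# The two cases described in the text: 10/20 identical positions,
# first with no conservative substitutions, then with five.
no_conservative = identity_and_similarity(10, 0, 20)
five_conservative = identity_and_similarity(10, 5, 20)
```

With no conservative substitutions the two percentages coincide; each conservative position raises similarity only.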


It also is specifically understood that any numerical value recited herein includes all values from the lower value to the upper value, i.e., all possible combinations of numerical values between the lowest value and the highest value enumerated are to be considered to be expressly stated in this application. For example, if a concentration range is stated as about 1% to 50%, it is intended that values such as 2% to 40%, 10% to 30%, or 1% to 3%, etc., are expressly enumerated in this specification. The values listed above are only examples of what is specifically intended.


Ranges, in various aspects, are expressed herein as from “about” or “approximately” one particular value and/or to “about” or “approximately” another particular value. When values are expressed as approximations, by use of the antecedent “about,” it will be understood that some amount of variation is included in the range.


The term “operably linked” refers to a functional linkage between a nucleic acid expression control sequence (such as a promoter, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence directs transcription of the nucleic acid corresponding to the second sequence.


As used herein, the term “purified” or “substantially purified” means that the desired protein is enriched by at least 20%, more preferably by at least 50%, even more preferably by at least 75%, and most preferably by at least 90%, or even 95%.


Each publication, patent application, patent, and other reference cited herein is incorporated by reference in its entirety to the extent that it is not inconsistent with the present disclosure.


Publications disclosed herein are provided solely for their disclosure prior to the filing date of the present invention. Nothing herein is to be construed as an admission that the present invention is not entitled to antedate such publication by virtue of prior invention. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.


Recitation of ranges of values herein are merely intended to serve as a shorthand method for referring individually to each separate value falling within the range and each endpoint unless otherwise indicated herein, and each separate value and endpoint is incorporated into the specification as if it were individually recited herein.


All methods described herein are performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. In regard to any of the methods provided, the steps of the method may occur simultaneously or sequentially. When the steps of the method occur sequentially, the steps may occur in any order, unless noted otherwise.


In cases in which a method comprises a combination of steps, each and every combination or sub-combination of the steps is encompassed within the scope of the disclosure, unless otherwise noted herein.


The section headings as used herein are for organizational purposes only and are not to be construed as limiting the subject matter described.


It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.


VII. EXAMPLES
Example 1

This example describes the materials and methods to be used in the subsequent examples.


Material and Methods
Non-Membrane Protein Channel Reconstruction

A non-membrane protein channel is difficult to insert directly into a lipid bilayer or polymer membrane because, unlike membrane protein channels, it usually lacks a hydrophobic layer in the middle and hydrophilic layers at both ends, and hence requires extensive engineering. To be used for biosensing or sequencing, non-membrane protein channels also need to be engineered and functionalized with different probes. Although non-membrane protein channels may originate from various sources with different sequences, shapes, structures, or properties, the strategies and methods employed to reengineer them and enable their use as nanopores share some common features. As shown in FIGS. 1A and 1B, three distinct domains are important for membrane anchoring, and two areas are particularly important for conjugation of functional modules for single molecule sensing. Reengineering these distinct domains to change their hydrophobic or hydrophilic properties, or to introduce functionalized conjugation sites or probes, involves a series of molecular cloning steps. Typically, a common restriction enzyme cloning method is employed for various engineering purposes. However, other cloning methods, such as Gateway® Recombination Cloning, TOPO® Cloning, Isothermal Assembly Reaction, or Type IIS Assembly could also be employed.


Expression and Purification of Non-Membrane Proteins

Although non-membrane protein channels may originate from various sources with different sequences, shapes, structures, or properties, the strategies and methods employed to express and purify them share some common features.


First, the reengineered non-membrane protein channel gene was cloned into an expression vector with or without a tag at the terminus. The vector was then transformed into a suitable protein expression host, e.g., an E. coli system. After the protein channel was expressed in the host, the host was lysed and a series of steps was taken to remove the host debris. Finally, the non-membrane proteins could be purified by one or a combination of the following methods: affinity chromatography, ion exchange chromatography, size exclusion chromatography, or other commonly used purification methods.


As an example, the non-membrane protein channel gene and variant genes were cloned into a PET23a vector with a His-tag at the N-terminus. After transformation into BL21 (DE3) cells, one BL21 (DE3) colony was inoculated in 5 mL fresh LB medium in the presence of antibiotics and cultured at 37° C. in a shaker for several hours until the OD reached 0.8. The flask was then kept at 4° C. to cool down. Then 0.5 mM IPTG was added into the flask for induction. The bacteria were cultured at 16° C. in a shaker overnight. The bacteria were harvested at 8000 rpm for 10 mins, the supernatant was discarded, and the pellet was then suspended in lysis buffer. The bacterial suspension was sonicated until it became transparent and no longer sticky. After sonication, the solution was centrifuged at 12000 rpm for 30 mins, the supernatant discarded and the pellet collected. 10 ml of 8 M urea was added to the pellet, which was shaken at low speed on an oscillator until it was completely dissolved in the urea. The solution was centrifuged at 12000 rpm for 10 mins. The supernatant was added to 100 ml protein renaturation buffer and stirred overnight. After refolding, the solution was centrifuged at 12000 rpm for 30 mins, the pellet discarded and the supernatant collected. The supernatant was passed through a 0.45 μm syringe filter to remove the denatured protein. The supernatant was added to a dialysis bag, and the dialysis fluid was replaced three times. The fluid in the dialysis bag was collected and centrifuged at 12000 rpm for 10 minutes, the pellet discarded and the supernatant collected. Nickel beads were equilibrated with lysis buffer, and the supernatant was added to the beads. The beads were then washed with washing buffer. The protein was eluted using elution buffer for 7-10 column volumes. The eluate was collected and concentrated to 5 mL. The eluate was centrifuged at 12000 rpm for 10 mins, and the supernatant was then drawn up and injected with a syringe into an AKTA FPLC. Before injection, the sample loop was washed with 10 mL lysis buffer. The protein was collected after passing through a size exclusion column. An SDS-PAGE gel was run to check the protein sample after SEC, and the protein was stored at −80° C.


Conjugation

To functionalize non-membrane protein channels with different probes or to enhance their hydrophobicity or hydrophilicity, various conjugation methods could be employed. Although non-membrane protein channels may originate from various sources with different sequences, shapes, structures, or properties, the strategies and methods employed for conjugation share some common features. To conjugate a hydrophobic moiety to a non-membrane protein channel, the reaction could be carried out on cysteine residues of the protein channel. The protein channel contains one or multiple cysteines per subunit, which are located in the middle layer of the channel and accessible to the environment. The protein solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris, 15% glycerol with pH 6.8. The solution was degassed, and 100-fold excess TCEP was added to the solution. After 20 min incubation at room temperature, 4 μL of 4 mM cholesterol-PEG-maleimide was added drop-wise to the protein solution, and the reaction mixture was incubated in the dark at room temperature for 2 hours. Excess cholesterol-PEG-maleimide was removed by a NanoSep 100K spin column. The labeling of the protein was checked with 12% SDS-PAGE.


To conjugate probes to a non-membrane protein channel, click chemistry, such as Tetrazine-Alkene ligation or Azide-Alkyne click chemistry, is employed. For Tetrazine-Alkene ligation, the protein solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris, 15% glycerol with pH 6.8. The solution was degassed, and 100-fold excess TCEP was added to the solution. After 20 min incubation, 1.6 μL of 10 mM Methyltetrazine-PEG4-maleimide was added drop-wise to the protein solution, and the mixture was incubated in the dark at room temperature for 1 hour. Excess Methyltetrazine-PEG4-maleimide was removed by desalting using a desalt spin column. A 50 μL probe solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris with pH 6.8. The solution was degassed, and 100-fold excess TCEP was added to the solution. After 20 min incubation, 6 μL of 25 mM TCO-PEG3-maleimide was added drop-wise to the probe solution, and the mixture was incubated at room temperature for 2 hours. Excess TCO-PEG3-maleimide was removed using a desalting column. The labeling of the probe was verified by 20% urea-PAGE gel.


To complete the Tetrazine-Alkene ligation, the TCO-modified probe was mixed with Methyltetrazine-labeled protein at different molar ratios to achieve optimal protein labeling efficiency. The mixture was incubated at room temperature for 1 hour. The conjugation was verified with 12% SDS-PAGE. For Azide-Alkyne ligation, 40 μL of a 20 μM protein solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris, 15% glycerol with pH 6.8. The solution was degassed, and 100-fold excess TCEP was added to the solution. After 20 min incubation, 1.6 μL of 10 mM Sulfo DBCO-PEG4-maleimide was added drop-wise to the protein solution, and the mixture was incubated in the dark at room temperature for 1 hour. Excess maleimide reagent was removed by desalting using a desalt spin column. The DBCO-modified protein was mixed with Azide-modified probe at equal molar concentration, and the reaction mixture was incubated at room temperature for 2 hours. The conjugation was verified with 12% SDS-PAGE gel.


Nanopore Experiments Setup and Data Recording

Although non-membrane protein channels may originate from various sources with different sequences, shapes, structures, or properties, all non-membrane protein channels or variants could be applied on a similar nanopore setup or device. Typically, the setup comprises a sensor chip or array which has one or multiple apertures. The sensor chip is capable of supporting lipid or polymer membrane formation, which separates the chamber into cis-(top) and trans-(bottom) compartments. Both compartments were filled with conducting buffer. An input electrode was embedded into one of the compartments, and a ground electrode was embedded into the other compartment. Furthermore, the setup may also be combined with a fluidic system to enable sample flow from a container to the sensor device. To insert a non-membrane protein channel into a lipid or polymer membrane, the protein channel suspended in its respective storage buffer was diluted 50-100 fold in the conducting buffer (typically 1 M KCl or 1 M NaCl, 5 mM HEPES or Tris, pH 7.6) and added to the top compartment. Under an applied potential (constant holding voltage or ramping voltage), with or without detergent, direct insertion of the protein channels into planar membranes can be observed. When no analyte is present, the current is stable and clean. When the analyte is present, its interaction with the probe results in a current change, which was recorded.
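As a rough illustration of how the recorded current changes could be turned into discrete events, the following sketch applies a simple threshold to a sampled current trace. The threshold rule, the function name `detect_events`, the default threshold fraction, and the sampling interval are all assumptions made for illustration; the disclosure does not specify the detection algorithm, and the sketch assumes positive current values.

```python
def detect_events(trace, baseline, threshold_fraction=0.2, sample_interval_s=1e-4):
    """Find blockage events in a sampled current trace (hypothetical helper).

    A sample counts as blocked when it falls below
    baseline * (1 - threshold_fraction). Each run of consecutive blocked
    samples is reported as (start_index, dwell_time_s, mean_blocked_current).
    """
    cutoff = baseline * (1.0 - threshold_fraction)
    events = []
    start = None
    for i, current in enumerate(trace):
        blocked = current < cutoff
        if blocked and start is None:
            start = i  # event begins
        elif not blocked and start is not None:
            segment = trace[start:i]  # event ended at sample i - 1
            dwell = len(segment) * sample_interval_s
            events.append((start, dwell, sum(segment) / len(segment)))
            start = None
    # An event still open at the end of the trace is dropped because its
    # duration cannot be determined.
    return events


# Toy trace in arbitrary current units: two blockages on a baseline of 100.
example_trace = [100, 100, 40, 40, 40, 100, 100, 30, 100]
example_events = detect_events(example_trace, baseline=100)
```

A production analysis would typically estimate the baseline from the trace itself and filter noise before thresholding; those steps are omitted here.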


Data Analysis

Although non-membrane protein channels may originate from various sources with different sequences, shapes, structures, or properties, the strategies and methods employed for data analysis share common features. Typically, ~40,000 or more current blockage events (either translocation or single molecule binding events) are analyzed to ensure that the results are statistically significant. A custom MATLAB- or Python-based algorithm was developed for fast quantitative processing of events. Generally, two parameters are used: (1) current blockage fraction, represented as [(Current_unblocked − Current_after analyte block)/Current_unblocked]; and (2) dwell time: τ_off (duration of an event) and τ_on (time between consecutive events). From τ_on and τ_off, k_on (association rate constant) and k_off (dissociation rate constant) can be obtained, and finally K_d (equilibrium dissociation constant). A calibration curve was constructed showing the capture rate as a function of analyte concentration. Upon introducing an unknown concentration of the analyte to the nanopore, the mean capture rate can be calculated, and the analyte concentration can be determined from the calibration curve. Analysis of clinical samples requires further tuning of the analysis algorithm to clearly discriminate between ‘contaminative’ or non-specific signals and true analyte-induced events. To standardize the analyte detection system, an ‘endogenous’ normalizer with a ‘spiked-in’ control was generally employed. The platform data was then cross-validated using standard assays commonly used in the diagnostic field, such as immunoassay and qRT-PCR. Finally, statistical sample size/power analyses were based on two-sample t-tests for two-group comparisons and two-way analysis of variance (ANOVA) for a combination of two factors.
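The quantities named above can be sketched in a few lines of Python. The blockage fraction follows the bracketed expression in the text. The kinetic relations used here (association rate constant = 1/(mean τon × analyte concentration), dissociation rate constant = 1/mean τoff, and Kd as their ratio) are the relations commonly assumed for simple bimolecular binding in single-channel recordings; they are an assumption of this sketch, not a statement of the algorithm actually used. The straight-line calibration and all function names are likewise illustrative.

```python
from statistics import mean


def blockage_fraction(i_unblocked, i_blocked):
    """(I_unblocked - I_blocked) / I_unblocked, as defined in the text."""
    if i_unblocked == 0:
        raise ValueError("unblocked current must be non-zero")
    return (i_unblocked - i_blocked) / i_unblocked


def rate_constants(tau_on_s, tau_off_s, analyte_molar):
    """Return (k_on, k_off, K_d) assuming simple bimolecular kinetics."""
    k_on = 1.0 / (mean(tau_on_s) * analyte_molar)  # M^-1 s^-1
    k_off = 1.0 / mean(tau_off_s)                  # s^-1
    return k_on, k_off, k_off / k_on               # K_d in M


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a calibration curve."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def concentration_from_rate(capture_rate, slope, intercept):
    """Invert a linear calibration: concentration = (rate - b) / m."""
    return (capture_rate - intercept) / slope


# Made-up numbers for illustration only.
k_on, k_off, k_d = rate_constants([0.5, 1.5], [0.01, 0.03], analyte_molar=1e-6)
slope, intercept = fit_line([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
unknown = concentration_from_rate(5.0, slope, intercept)
```

A linear calibration is only one possible model; if capture rate saturates at high analyte concentration, a different fit would be needed.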


Example 2
Expression and Purification of the phi29 gp-9 Tail Protein

The purified phi29 gp-9 tail protein and its variants in SDS PAGE are shown in FIGS. 3-4. The expression and purification steps are as follows. The phi29 gp-9 tail protein gene and variant genes were cloned into vector pBDHT with a His-tag at the N-terminus. After transformation into BL21 (DE3), one BL21 colony was inoculated in 5 mL fresh LB medium containing antibiotic and cultured at 37° C. in a shaker (220 rpm) for several hours until the OD reached 0.8. The flask was then kept at 4° C. to cool down. Then 0.5 mM IPTG was added into the flask for induction. The bacteria were cultured at 16° C. in a shaker (180 rpm) overnight. The bacteria were harvested at 8000 rpm for 10 mins, the supernatant was discarded, and the pellet was then resuspended in lysis buffer (50 mM Tris pH8.0, 500 mM NaCl). The bacterial suspension was sonicated until it became transparent and no longer sticky. After sonication, the solution was centrifuged at 12000 rpm for 30 mins, the supernatant discarded and the pellet collected. 10 ml of 8 M urea was added to the pellet, which was shaken at low speed on an oscillator until it was completely dissolved in the urea. The solution was centrifuged at 12000 rpm for 10 mins. The supernatant was added to 100 ml protein renaturation buffer (15% glycerol, 500 mM NaCl, 50 mM Tris, 2 M L-Arginine, pH8.0) and stirred overnight. After refolding, the solution was centrifuged at 12000 rpm for 30 mins, the pellet discarded and the supernatant collected. The supernatant was passed through a 0.45 μm syringe filter to remove the denatured protein. The supernatant was added to a dialysis bag, and the dialysis fluid (50 mM NaCl, 5 mM Tris) was replaced three times. The fluid in the dialysis bag was collected and centrifuged at 12000 rpm for 10 minutes, the pellet discarded and the supernatant collected. Nickel beads were equilibrated with lysis buffer, and the supernatant was added to the beads. The beads were washed with washing buffer (50 mM Tris pH8.0, 500 mM NaCl, 25 mM imidazole) for 50 column volumes. The protein was eluted using elution buffer (50 mM Tris pH8.0, 500 mM NaCl, 500 mM imidazole) for 7-10 column volumes. The eluate was collected and concentrated to 5 mL. The eluate was centrifuged at 12000 rpm for 10 mins, and the supernatant was then drawn up and injected with a syringe into an AKTA FPLC. Before injection, the sample loop was washed with 10 mL lysis buffer. The protein was collected after passing through a size exclusion column. An SDS-PAGE gel was run to check the protein sample after SEC, and the protein was stored at −80° C.
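For readers preparing the buffers listed in this example, the sketch below converts the stated millimolar concentrations into masses for a chosen batch volume. It assumes Tris is weighed as Tris base (121.14 g/mol) and uses standard molar masses for NaCl (58.44 g/mol) and imidazole (68.08 g/mol); the batch volume of 1 L and the helper names are arbitrary choices, and pH adjustment is not modeled.

```python
# Molar masses in g/mol. Tris is assumed to be weighed as Tris base.
MOLAR_MASS = {"Tris": 121.14, "NaCl": 58.44, "imidazole": 68.08}

# Concentrations in mM, taken from the buffers named in the example.
BUFFERS = {
    "lysis": {"Tris": 50, "NaCl": 500},
    "washing": {"Tris": 50, "NaCl": 500, "imidazole": 25},
    "elution": {"Tris": 50, "NaCl": 500, "imidazole": 500},
}


def grams_needed(conc_mM, volume_L, molar_mass):
    """Mass of solute for a target concentration: (mol/L) * L * (g/mol)."""
    return conc_mM / 1000.0 * volume_L * molar_mass


def recipe(buffer_name, volume_L):
    """Return {component: grams} for one buffer at the given batch volume."""
    return {
        component: grams_needed(conc_mM, volume_L, MOLAR_MASS[component])
        for component, conc_mM in BUFFERS[buffer_name].items()
    }


# Example: one liter of each buffer.
one_liter_recipes = {name: recipe(name, 1.0) for name in BUFFERS}
```

Scaling to another batch size only requires changing `volume_L`, since mass is linear in volume.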


Example 3
Phage Tail Proteins as Nanopores

Many bacteriophages contain a long contractile or non-contractile, or short non-contractile tail. The tail plays a critical role in the process of host cell recognition, membrane penetration, and viral genome ejection. The tail proteins are derived from (including but not limited to) phi29, T4, T3, T5, T7, SPP1, P22, P2, P3, Lambda, Mu, HK97 and C1.


These protein channels of the invention are ideal for biosensing and sequencing of biological molecules, such as disease-related biomarkers and polynucleotide and polypeptide sequences. With improved membrane capability, the modified protein channels could insert into a lipid or polymer membrane efficiently and serve as nanopores for biosensing and sequencing. By conjugation with various probes, the modified protein channels gain the capacity to detect specific disease-related biomarkers with high sensitivity and specificity. The pore of the invention may be present in a homologous or heterologous pore.


Representative Example: Gp-9 Tail Protein From phi29

The crystal structure of the phi29 bacteriophage tail (gp9) shows that six gp9 subunits form a hexameric, cylinder-like tube structure. Inside the structure, the distal end is blocked by six flexible hydrophobic loops before DNA ejection is triggered. In order to deliver the genomic dsDNA into the host cell cytoplasm, the phi29 tail needs to penetrate the cell membrane. The length of the tube is about 12.5 nm. The tube has an inner diameter of approximately 4 nm and an outer diameter of approximately 9 nm. The wall of the tube is composed largely of β-sheets and is about 2.5 nm thick.


A clone of full-length gp9 and a series of mutants were constructed (FIGS. 2-6), such as gp9Δloop[417-491], in which a disordered region (residues 417-491) was deleted. According to the crystal structure, the gp9Δ417-491 structure is also a cylindrical tube-like homo-hexamer. Study of the structure of the tail protein shows that amino acids 130-170, 300-325, 350-390, and 530-595 are on the surface of the middle of the channel, where they can interact with the hydrophobic layer of the membrane. Thus, attachment of a hydrophobic group to these sites or mutation of these amino acids to hydrophobic amino acids, including glycine (Gly), alanine (Ala), valine (Val), leucine (Leu), isoleucine (Ile), proline (Pro), phenylalanine (Phe), methionine (Met), and tryptophan (Trp), could change the channel's capability of membrane insertion. In particular, mutations at one or any combination of the following sites are critical: K134I, D138L, D139L, D158L, E163V, E309V, D311V, K321V, K356A, K358A, D377A, D381V, N388L, R524I, R539A, E595V. To conjugate a hydrophobic group to the middle part of the channel, mutation of the following sites to cysteine is important: E595C, K321C, and K358C. In addition, amino acids 250-300 and 20-50 lie on the surface of the upper and lower parts of the channel, where they can interact with the hydrophilic environment.
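The substitution notation used above (wild-type residue, position, replacement residue) is straightforward to handle programmatically. The sketch below parses the listed labels, checks each replacement against the hydrophobic amino acids named in the text, and looks up which of the listed surface ranges a position falls in. The helper names are hypothetical, and positions that fall outside all listed ranges are simply reported as None.

```python
import re

# One-letter codes for the hydrophobic residues named in the text:
# Gly, Ala, Val, Leu, Ile, Pro, Phe, Met, Trp.
HYDROPHOBIC = set("GAVLIPFMW")

# Middle-channel surface ranges listed in the text (inclusive).
MIDDLE_REGIONS = [(130, 170), (300, 325), (350, 390), (530, 595)]

MUTATIONS = (
    "K134I, D138L, D139L, D158L, E163V, E309V, D311V, K321V, "
    "K356A, K358A, D377A, D381V, N388L, R524I, R539A, E595V"
)


def parse_mutation(label):
    """Split a label such as 'K134I' into ('K', 134, 'I')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def region_of(position):
    """Return the listed range containing position, or None if outside all."""
    for low, high in MIDDLE_REGIONS:
        if low <= position <= high:
            return (low, high)
    return None


parsed = [parse_mutation(label) for label in MUTATIONS.split(",")]
all_hydrophobic = all(new in HYDROPHOBIC for _, _, new in parsed)
regions = {f"{old}{pos}{new}": region_of(pos) for old, pos, new in parsed}
```

The same parser would accept the cysteine substitutions (E595C, K321C, K358C), although cysteine is not in the hydrophobic set used here.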


Viral Portal Proteins as Nanopores

Portal proteins exist not only in bacteriophages phi29, T3, T4, T5, T7, SPP1, and P22, but also in other viral systems such as Adeno and Herpes viruses. The portal channel, termed the connector, is a pore-like protein structure with a central channel that acts as a pathway for genomic DNA to enter the viral capsid during packaging and exit during infections. Although structural studies indicate significant differences in sequence homology and sizes among different viral connector proteins, they all are topologically similar with a truncated cone shape. The stoichiometry of the connectors derived from overexpressed proteins often varies depending on the expression conditions. These protein channels of the invention can be ideal for biosensing and sequencing applications if they can be directly inserted into robust polymer membranes. Since the structures of the connector from different viral portal proteins show similar characteristics, the principle outlined above also applies to all of them by extension.


Representative Example: P22 Portal Channel

P22 is a tailed bacteriophage which assembles empty precursor capsids that are subsequently packaged with viral DNA by a powerful packaging motor. The P22 portal protein forms a channel-like structure for bidirectional passage of viral DNA. The Podoviridae family of short-tailed dsDNA bacteriophages includes members of the P22-like subgroup, such as Sf6, CUS-3, epsilon34, and APSE-1. The portal barrel is highly dynamic and susceptible to proteolysis in solution. The P22 portal protein is composed of 12 identical subunits, arranged symmetrically around the central channel. The overall height is ~30 nm with a funnel-shaped core of diameter ~17 nm, which is connected to an ~20 nm long α-helical tube. The average internal diameter of the channel varies from 3.5 nm to 7.5 nm throughout the structure.


A series of mutants was constructed (FIGS. 7-11), which include the following defining features: (1) The portal core (res. 1-602) is topologically similar to other viral protein channels, but the presence of the helical barrel tube is unique to P22. The connection between the portal core and helical barrel can be easily cleaved by chymotrypsin in solution, which indicates that the two domains are intrinsically flexible. The invention includes removal of the barrel residues 603-725 and replacement with any peptide or nucleic acid sequences as a separate recognition domain; (2) The invention includes alteration (deletion, truncation, mutation) of the internal flexible loop residues 464-492 to change the electrophysiological properties and/or detection capabilities; (3) The invention includes the use of EDTA (60 mM or higher) to assemble the dodecamer complex: chelating divalent cations nonspecifically trapped at the monomer-monomer interface is necessary for correct assembly of dodecameric rings; (4) The invention includes changing the overall electronegative property of the channel interiors by altering the five rings of residues Glu70 (which is clustered together with Glu423), Glu414, Glu406, Glu393, and Glu396; (5) The invention includes altering any of the hydrophobic amino acids forming a belt underneath the wing domain (Phe24, Ile25, Leu28, Phe60, Phe128, Pro129, and Pro132) to change the hydrophobicity of the surface; (6) The invention includes adding several amino acids (any natural or unnatural amino acids) at the terminal ends with the goal of using these amino acids as anchoring points (such as cysteine or lysine or arginine) for added functionalities or for altering the electrophysiological properties (hydrophilic or hydrophobic tag) for membrane insertion and channel stability; (7) The invention includes mutagenesis of the hydrophobic and hydrophilic layers: middle part, amino acids 250-300 and 10-45; upper part, amino acids 450-500; lower part, amino acids 350-380; mutation of Arg476 or the C-terminus to Cys for conjugation; and mutation of the middle-part residues Thr240, Val244, and Arg273 to Cys to conjugate cholesterol.


Representative Example: T4 Portal Channel

The T4 portal mostly exists as a 12-mer (some 11-mer or 13-mer depending on protein expression conditions) ring that is 14 nm long and 7 nm wide, with an interior channel 3 nm in diameter. Since the channel assembles from 12 subunits, altering residue(s) in one monomer will propagate the effect through the entire channel, with the mutation present in the same plane of the molecule. The invention includes (FIGS. 12-14, and 16): (1) changing the overall electronegative property of the channel interiors by altering the rings of charged residues; (2) altering the two basic residues, R338 and K342, at the inner channel entrance (with any amino acids) to change the hydrophilicity; (3) adding several amino acids (any natural or unnatural amino acids) at the terminal ends with the goal of using these amino acids as anchoring points (such as cysteine or lysine or arginine) for added functionalities or for altering the electrophysiological properties (hydrophilic or hydrophobic tag) for membrane insertion and channel stability; and (4) alteration (deletion, truncation, mutation) of the internal flexible loop residues 374-398 to change the electrophysiological properties and/or detection capabilities.


Example 4
Compositions of Membranes for Nanopore Housing

Planar bilayer lipid membranes (BLMs) or polymer membranes were generated in (a) a BCH-1A horizontal BLM cell (Eastern Scientific) or (b) a home-made custom chamber. A Teflon partition with a 100 or 200 μm aperture was placed in the apparatus to separate the BLM cell into cis-(top) and trans-(bottom) compartments. The home-made chamber has pre-drilled 100 or 200 μm apertures separating the cis- and trans-compartments.


Lipid membrane: A planar lipid bilayer of varying composition was formed by pre-painting the aperture with lipids in hexane (concentration: 0.5 mg/ml) followed by painting with lipids in n-decane (concentration: 20-30 mg/ml). Examples of lipid composition include: (i) zwitterionic lipids such as 100% DPhPC or DOPC or POPC; (ii) 0-50% anionic lipids such as DPhPG/DOPG/POPG or DPhPS/DOPS/POPS mixed proportionately (final ratio adds up to 100%) with composition (i); (iii) 0-25% cholesterol mixed with composition (i) and (ii) proportionately. The exact lipid composition depends on the properties of the protein. Typical lipid membrane compositions used include: 100% DPhPC; 100% DPhPC; 30% DPhPS; 70% DPhPC: 28% DPhPG: 2% cholesterol.
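Because each mixture must add up to 100%, a small check is useful when planning a lipid stock. The sketch below validates a composition and splits a total lipid mass among its components. It assumes the percentages are by weight, which the text does not state, and the 25 mg total (for example, 1 ml of a 25 mg/ml n-decane stock) is an arbitrary value within the stated 20-30 mg/ml range; the function name is hypothetical.

```python
def component_masses(composition_pct, total_mg):
    """Split total_mg among components given percentages that sum to 100.

    composition_pct -- mapping of component name to percentage
                       (assumed here to be weight percent)
    total_mg        -- total lipid mass to distribute, in mg
    """
    total_pct = sum(composition_pct.values())
    if abs(total_pct - 100.0) > 1e-9:
        raise ValueError(f"composition adds up to {total_pct}%, not 100%")
    return {name: total_mg * pct / 100.0 for name, pct in composition_pct.items()}


# The three-component mixture named in the text, for 25 mg of total lipid.
mixture = {"DPhPC": 70, "DPhPG": 28, "cholesterol": 2}
masses_mg = component_masses(mixture, total_mg=25.0)
```

If the percentages are instead intended as mole percent, each share would need to be weighted by the molar mass of the component.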


Polymer membrane: A planar polymer membrane of varying composition was formed by manual painting using membranes suspended in organic solvents such as Decane or Silicone oil (Polyphenyl-methylsiloxane based or Polydimethylsiloxane based with a viscosity of 20 mPa·s). The membrane composition is Polyoxazoline based triblock copolymers (FIG. 17) such as:


(i) PEOXA-PEO-PEOXA [Poly(2-ethyl oxazoline)-b-poly(ethylene oxide)-b-poly(2-ethyl oxazoline)];


(ii) PMOXA-PDMS-PMOXA [Poly(2-methyl oxazoline)-b-poly(dimethylsiloxane)-b-poly(2-methyl oxazoline)]; (with ethyl-benzyl or propyl or propyl-ethoxy link between blocks)


(iii) PMOXA-PB-PMOXA [Poly(2-methyl oxazoline)-b-poly(1,4-butadiene)-b-poly(2-methyl oxazoline)];


(iv) PMOXA-PE-PMOXA [Poly(2-methyl oxazoline)-b-poly(ethylene)-b-poly(2-methyl oxazoline)];


(v) PMOXA-PEO-PMOXA [Poly(2-methyl oxazoline)-b-poly(ethylene oxide)-b-poly(2-methyl oxazoline)]


Typical membranes used include PMOXA6-PDMS35-PMOXA6; PMOXA6-PDMS65-PMOXA6; PMOXA11-PDMS65-PMOXA11; PMOXA5-PDMS13-PMOXA5; PMOXA3-PDMS38-PMOXA3 (FIG. 17). The block lengths (denoted by subscripts X and Y in PMOXAX-PDMSY-PMOXAX), which determine the lengths of the hydrophilic and hydrophobic blocks, are tunable depending on the following factors: (1) a certain membrane thickness is needed for stable insertion of the protein pore; (2) the membranes need to retain very low permeability; (3) the membrane has to be mechanically and chemically stable for an extended period of time under different solution conditions, including extreme pH (1-12) and high/low salt environments.
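The polymer names above encode the block lengths directly, so they can be parsed to compare candidates. The sketch below extracts the repeat-unit counts and computes the fraction of repeat units that sit in the hydrophilic PMOXA blocks. That fraction is only a crude proxy introduced here for illustration (it ignores linkers, end groups, and the different sizes of the repeat units), and the function names are hypothetical.

```python
import re


def parse_triblock(name):
    """Parse e.g. 'PMOXA6-PDMS35-PMOXA6' into [('PMOXA', 6), ('PDMS', 35), ('PMOXA', 6)]."""
    blocks = []
    for part in name.split("-"):
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", part)
        if match is None:
            raise ValueError(f"cannot parse block {part!r} in {name!r}")
        blocks.append((match.group(1), int(match.group(2))))
    return blocks


def hydrophilic_unit_fraction(name, hydrophilic_block="PMOXA"):
    """Fraction of all repeat units that belong to the hydrophilic blocks."""
    blocks = parse_triblock(name)
    total_units = sum(count for _, count in blocks)
    hydrophilic_units = sum(count for label, count in blocks if label == hydrophilic_block)
    return hydrophilic_units / total_units


TYPICAL_MEMBRANES = [
    "PMOXA6-PDMS35-PMOXA6",
    "PMOXA6-PDMS65-PMOXA6",
    "PMOXA11-PDMS65-PMOXA11",
    "PMOXA5-PDMS13-PMOXA5",
    "PMOXA3-PDMS38-PMOXA3",
]
fractions = {name: hydrophilic_unit_fraction(name) for name in TYPICAL_MEMBRANES}
```

Sorting the candidates by this fraction gives a quick, approximate ranking from most to least hydrophobic.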


Example 5
Insertion of Protein Channels in the Membrane

Typically, the protein channels (tail proteins or portal channels) suspended in their respective storage buffers are diluted 50-100 fold in the conducting buffer (typically 1 M KCl or 1 M NaCl, 5 mM HEPES or Tris, pH 7.6) and added to the top compartment of the BLM cell. Under an applied potential (constant holding voltage or ramping voltage), direct insertion of the protein channels into planar membranes can be observed (FIGS. 6, 11, 14, 16, and 19).


If necessary, the membrane compositions as described can also be used to generate vesicular polymersome structures with varying polydispersity to reconstitute the protein channels. For high insertion efficiency, varying amounts of glycerol, CsCl, and/or sucrose can be encapsulated within the polymersomes. The resulting proteo-polymersomes can then fuse with a planar bilayer of the same composition in a salt- and voltage-dependent manner.


Example 6
Conjugation of Hydrophobic Moieties to Non-Membrane Protein Channels Via Reactions With Cysteine Residues

Here the T4 gp20 portal protein is used as an example to demonstrate how to conjugate a hydrophobic moiety to the non-membrane channel via reactions with cysteine residues. The protein channel contains three cysteines per subunit, which are located in the middle layer of the channel and accessible to the environment (FIG. 13). 40 μL of a 20 μM protein solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris, 15% glycerol with pH 6.8. The solution was degassed, and 100-fold excess TCEP was added to the solution. After 20 min incubation at room temperature, 4 μL of 4 mM cholesterol-PEG-maleimide was added drop-wise to the protein solution, and the reaction mixture was incubated in the dark at room temperature for 2 hours. Excess cholesterol-PEG-maleimide was removed by a NanoSep 100K spin column. The labeling of the protein was checked with 12% SDS-PAGE (FIG. 13).
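The volumes and concentrations in this example imply a specific molar excess of maleimide reagent over protein, which the following sketch works out. It assumes that the 20 μM protein concentration refers to subunits (the text does not say whether it refers to subunits or assembled channels); the function names are illustrative.

```python
def nanomoles(volume_uL, conc_uM):
    """Amount of substance in nmol: uL * uM gives pmol, so divide by 1000."""
    return volume_uL * conc_uM / 1000.0


def molar_excess(reagent_nmol, target_nmol):
    """Ratio of reagent to target on a molar basis."""
    if target_nmol <= 0:
        raise ValueError("target amount must be positive")
    return reagent_nmol / target_nmol


# Values from the example: 40 uL of 20 uM protein and 4 uL of 4 mM reagent.
protein_nmol = nanomoles(40, 20)        # protein subunits (assumed)
reagent_nmol = nanomoles(4, 4000)       # 4 mM expressed as 4000 uM
cysteines_per_subunit = 3               # stated in the text

excess_per_subunit = molar_excess(reagent_nmol, protein_nmol)
excess_per_cysteine = molar_excess(reagent_nmol, protein_nmol * cysteines_per_subunit)
```

Under these assumptions the reagent is in roughly 20-fold molar excess over subunits, or about 6.7-fold over individual cysteines.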


Example 7
Insertion of Non-Membrane Protein Channel Carrying Hydrophobic Moiety Into Lipid or Polymer Membrane

Planar polymer membranes were generated. Under an applied voltage (constant holding voltage or ramping voltage), direct insertion of the non-membrane protein channel carrying a hydrophobic moiety was observed after adding the protein channel to the cis chamber (FIG. 14). Mutation to hydrophobic amino acids, or conjugation of a hydrophobic group to the middle layer of the phi29 gp-9 tail protein, can significantly enhance the insertion process and the stability of the inserted channel.


Example 8
Conjugate Probes to Non-Membrane Protein Channels Via Click Reaction

phi29 gp10 portal protein was used as a representative example.


Strategy 1: via tetrazine-alkene ligation: A 40 μL aliquot of 20 μM protein solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris, and 15% glycerol at pH 6.8. The solution was degassed, and a 100-fold excess of TCEP was added. After 20 min of incubation, 1.6 μL of 10 mM Methyltetrazine-PEG4-maleimide was added drop-wise to the protein solution, and the mixture was incubated in the dark at room temperature for 1 hour. Excess Methyltetrazine-PEG4-maleimide was removed using a desalting spin column. A 50 μL aliquot of 30 μM thiol-modified miRNA probe solution was prepared in a buffer containing 0.5 M NaCl and 50 mM Tris at pH 6.8. The solution was degassed, and a 100-fold excess of TCEP was added. After 20 min of incubation, 6 μL of 25 mM TCO-PEG3-maleimide was added drop-wise to the oligo solution, and the mixture was incubated at room temperature for 2 hours. Excess TCO-PEG3-maleimide was removed using a desalting column. The labeling of the miRNA probe was verified on a 20% urea-PAGE gel. To conjugate the miRNA probe to the protein, the TCO-modified miRNA probe was mixed with the Methyltetrazine-labeled protein at different molar ratios to achieve optimal protein labeling efficiency. The mixture was incubated at room temperature for 1 hour. The conjugation was verified by 12% SDS-PAGE.


Strategy 2: via azide-alkyne click chemistry: A 40 μL aliquot of 20 μM protein solution was prepared in a buffer containing 0.5 M NaCl, 50 mM Tris, and 15% glycerol at pH 6.8. The solution was degassed, and a 100-fold excess of TCEP was added. After 20 min of incubation, 1.6 μL of 10 mM Sulfo DBCO-PEG4-maleimide was added drop-wise to the protein solution, and the mixture was incubated in the dark at room temperature for 1 hour. Excess maleimide reagent was removed using a desalting spin column. The DBCO-modified protein was mixed with the azide-modified miRNA probe at an equal molar concentration, and the reaction mixture was incubated at room temperature for 2 hours. The conjugation was verified by 12% SDS-PAGE.


Example 9
Verification of the DNA Probe Conjugated to Proteins

The protein-miRNA probe conjugate was incubated with the target miRNA oligo at an equal molar concentration at room temperature for 30 min. The binding of the protein-miRNA probe with the target miRNA was verified with 12% SDS-PAGE (FIG. 15).


Example 10
PSA Detection Using Engineered Non-Membrane Protein Channels

To conjugate a protein probe to the nanopore channel, various methods have been tried and tested, including reaction of a cysteine residue with Methyltetrazine-PEG4-maleimide followed by reaction with TCO-labeled probes, and the SpyCatcher/SpyTag protein conjugation system. Conjugation of a single chain antibody against PSA to the phi29 gp10 portal protein channel via SpyCatcher/SpyTag was demonstrated as follows. The phi29 gp10 protein channel with a C-terminal SpyTag peptide was constructed and purified. A single chain antibody against PSA with a C-terminal SpyCatcher was constructed and purified. The assembled protein channel-PSA single chain antibody complex was verified by gel (FIG. 18). The purified complex was then inserted into a polymer membrane to test its capacity to bind PSA antigen. The procedure to set up electrophysiological experiments was described previously. After insertion, a series of different concentrations of PSA antigen was added to the chamber, and a unique binding event was observed (FIG. 19).


Example 11
MicroRNA Detection Using Engineered Non-Membrane Protein Channels

To conjugate a nucleic acid probe to the nanopore channel, a variety of different methods have been tried and tested as outlined in Example 6. It was demonstrated that miR-21 probes can be conjugated to the T4 gp20 portal protein channel. The purified conjugated complex was inserted into a polymer membrane to test its capacity to bind the corresponding microRNA. The procedure to set up electrophysiological experiments was described previously. After insertion, a series of different concentrations of microRNA was added to the chamber, and a unique binding event was observed (FIG. 16).
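Binding events of the kind observed in Examples 10 and 11 appear as changes in the channel current relative to a reference current recorded without analyte. The following Python sketch shows one simple way such a comparison could be automated. The threshold rule (a multiple of the baseline noise), the minimum event duration, the function name, and all parameter values are assumptions made for illustration and are not the analysis procedure used in this disclosure.

```python
from statistics import mean, stdev


def detect_events(trace, baseline, threshold_sigma=5.0, min_samples=3):
    """Return (start, end) index pairs where the current departs from the reference.

    trace     -- current samples recorded with the sample present
    baseline  -- current samples recorded without analyte (the reference)
    A sample counts as deviating when it differs from the mean reference
    current by more than threshold_sigma times the reference noise.
    Runs shorter than min_samples are discarded as noise spikes.
    """
    reference = mean(baseline)
    limit = threshold_sigma * stdev(baseline)
    events = []
    start = None
    for i, current in enumerate(trace):
        deviates = abs(current - reference) > limit
        if deviates and start is None:
            start = i                      # a candidate event begins
        elif not deviates and start is not None:
            if i - start >= min_samples:   # keep only sufficiently long events
                events.append((start, i))
            start = None
    # Close an event that is still open at the end of the trace.
    if start is not None and len(trace) - start >= min_samples:
        events.append((start, len(trace)))
    return events


# Hypothetical current values (arbitrary units), for illustration only.
reference_trace = [100.0, 101.0, 99.0, 100.5, 99.5, 100.0, 100.2, 99.8]
sample_trace = [100.0, 99.7, 100.3, 70.0, 71.0, 69.5, 70.2, 100.1, 99.9, 68.0, 100.0]
print(detect_events(sample_trace, reference_trace))
```

In practice, the threshold and minimum duration would need to be tuned to the noise level and sampling rate of the recording.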


Other objects, features, and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the examples, while indicating specific embodiments of the invention, are given by way of illustration only. Additionally, it is contemplated that changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

Claims
  • 1. A nanopore assembly for detecting an analyte, the nanopore assembly comprising a channel formed of a plurality of subunits, wherein each of the subunits comprises a non-membrane protein capable of forming a protein channel.
  • 2. The nanopore assembly of claim 1, wherein each of the subunits comprises a polypeptide having a polypeptide sequence at least 75% identical to a polypeptide sequence selected from the group consisting of SEQ ID NOs: 1-35, preferably SEQ ID NOs: 4-12.
  • 3. The nanopore assembly of claim 2, wherein the polypeptide comprises at least one residue substituted with cysteine.
  • 4. (canceled)
  • 5. The nanopore assembly of claim 1, further comprising a probe for detecting an analyte, the probe being operably linked to at least one of the subunits.
  • 6. The nanopore assembly of claim 5, wherein the probe is selected from the group consisting of chemicals, carbohydrates, aptamers, nucleic acids, peptide, protein, antibodies, and receptors.
  • 7. The nanopore assembly of claim 5, wherein the probe comprises a sequence at least 75% identical to a sequence selected from the group consisting of SEQ ID NOs: 36-79.
  • 8. The nanopore assembly of claim 5, wherein the probe is an anti-PSA antibody.
  • 9. The nanopore assembly of claim 5, wherein the analyte is selected from the group consisting of nucleic acids, amino acids, peptides, proteins, polymers, and chemical molecules.
  • 10. The nanopore assembly of claim 5, wherein the analyte is selected from the group consisting of PSA, CEA, AFP, VCAM, MiR-155, MiR-22, MiR-7, MiR-92a, MiR-122, MiR-192, MiR-223, MiR-26a, MiR-27a, MiR-802, and PSA or a fragment thereof.
  • 11. (canceled)
  • 12. The nanopore assembly of claim 5, wherein the probe is operably linked via covalent bonding to at least one of the subunits, and wherein the covalent bonding optionally comprises a disulfide linkage, an ester linkage, or a sulfhydryl linkage.
  • 13. (canceled)
  • 14. The nanopore assembly of claim 5, wherein the probe is operably linked to a location in proximity to an entrance of the channel or a location at an interior side of the channel.
  • 15. (canceled)
  • 16. The nanopore assembly of claim 1, wherein the channel is embedded in a polymersome.
  • 17. The nanopore assembly of claim 1, further comprising a membrane, wherein the channel is inserted in the membrane, and optionally at least one of cholesterol and porphyrin.
  • 18. The nanopore assembly of claim 17, wherein the membrane comprises a polymer membrane or a lipid membrane.
  • 19. The nanopore assembly of claim 18, wherein the polymer membrane comprises an alternating copolymer, a periodic copolymer, a block copolymer, a di-block copolymer, a tri-block copolymer, a terpolymer, or a combination thereof.
  • 20. The nanopore assembly of claim 19, wherein the polymer membrane comprises PMOXA-PDMS-PMOXA.
  • 21. (canceled)
  • 22. An apparatus for detecting an analyte, comprising the nanopore assembly of claim 1 and optionally a support for the nanopore assembly.
  • 23. A kit comprising the nanopore assembly of claim 1 and optionally instructions for using the nanopore assembly.
  • 24. A method of detecting an analyte, comprising: (a) contacting a sample containing an analyte with the nanopore assembly of claim 1 optionally placed on a support; (b) applying an electrical current across the channel of the nanopore assembly; (c) determining the electrical current passing through the channel at one or more time intervals; and (d) comparing the electrical current measured at one or more time intervals with a reference electrical current, wherein a change in electrical current relative to the reference electrical current indicates a presence of the analyte in the sample.
  • 25. The method of claim 24, wherein the reference electrical current is measured with a sample that does not contain the analyte.
  • 26. (canceled)
CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application No. 62/629,604, filed Feb. 12, 2018, the disclosure of which is incorporated herein by reference in its entirety.

PCT Information
Filing Document: PCT/US19/17432
Filing Date: 2/11/2019
Country: WO
Kind: 00
Provisional Applications (1)
Number: 62629604
Date: Feb 2018
Country: US