A computer readable form of the Sequence Listing is filed with this application by electronic submission and is incorporated into this application by reference in its entirety. The Sequence Listing is contained in the file created on Aug. 22, 2021 having the file name “20-776-WO-SeqList_ST25.txt” and is 318 kb in size.
Genetically programmable materials that spontaneously assemble into ordered structures following mixture of two or more components are far more controllable than materials constitutively forming from one component; they offer temporal control over the assembly process, thereby enabling rigorous characterization and opening up a wide variety of applications. Most previously known ordered protein 2D materials primarily involve single protein components. A de-novo interface design between rigid domains that is stabilized by extensive noncovalent interactions would provide more control over atomic structure and a robust starting point for further structural and functional modulation.
In a first aspect, the disclosure provides two-dimensional protein structures, comprising a first polypeptide and a second polypeptide, wherein
In one embodiment, the disclosure provides two-dimensional protein structures, comprising a first polypeptide and a second polypeptide, wherein
In one embodiment, the first interface region and the second interface region comprise alpha-helical domains. In another embodiment, the interface comprises an interface between an alpha-helical domain of the first polypeptide and an alpha-helical domain of the second polypeptide. In a further embodiment, each of the first polypeptide and the second polypeptide comprises a plurality (2, 3, 4, 5, 6, 7, or more) of alpha-helical domains separated by loop domains. In one embodiment, the interface comprises (a) a region of the first polypeptide within 25 amino acids from the first polypeptide C-terminus, and (b) a region of the second polypeptide within 25 amino acids from the second polypeptide N-terminus. In one embodiment, the first polypeptide comprises a secondary structure as shown below, wherein positions in parentheses are optional and may be present or absent:
and
In other embodiments, the first polypeptide and the second polypeptide comprise polypeptides of other aspects of the disclosure.
In a second aspect, the disclosure provides polypeptides comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:1, wherein the polypeptide includes a mutation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or all 15 positions selected from the group consisting of T210, A213, Q215, Q216, Q217, Q219, K220, K222, A223, E224, F225, A226, Q227, Q229, and K230 relative to SEQ ID NO:1, wherein residues in parentheses are optional and may be present or absent. In another embodiment, the polypeptides comprise an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the amino acid sequence selected from the group consisting of SEQ ID NOS:2-31, wherein residues in parentheses may be present or absent. In a further embodiment, the polypeptides may comprise one or more additional functional peptide domains. In another embodiment, the disclosure provides homo-oligomers of the polypeptide of this aspect, including but not limited to cyclic homo-oligomers.
In a third aspect, the disclosure provides polypeptides comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:100, wherein the polypeptide includes a mutation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or all 14 positions selected from the group consisting of M1, N5, E8, K9, Q12, E13, H14, K16, I17, V18, Q19, A20, E22, and I23 relative to SEQ ID NO:100. In one embodiment, the polypeptides comprise an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the amino acid sequence selected from the group consisting of SEQ ID NOS:50-82, wherein residues in parentheses are optional and may be present or absent. In another embodiment, the polypeptides may further comprise one or more additional functional peptide domains. In another embodiment, the disclosure provides homo-oligomers of the polypeptide of this aspect, including but not limited to cyclic homo-oligomers.
In other aspects, the disclosure provides nucleic acids encoding the polypeptides of any embodiment or combination of embodiments of the disclosure, expression vectors comprising the nucleic acid of the disclosure operatively linked to a suitable promoter or other control sequence, and host cells comprising the polypeptide, nucleic acid, expression vector, and/or 2D protein material of any embodiment or combination of embodiments of the disclosure.
In one aspect, the disclosure provides 2D protein materials comprising:
In another aspect, the disclosure provides uses of or methods for using the polypeptides, fusion proteins, homo-polymers, 2D protein materials, nucleic acids, recombinant expression vectors, and host cells of any of the preceding claims for any suitable purpose, including but not limited to those described herein.
In a further aspect, the disclosure provides computational methods to generate de-novo binary 2D non-covalent co-assemblies by designing rigid asymmetric interfaces between two distinct protein dihedral building-blocks, optionally as further defined herein.
Figure J (a-e). Design strategy and characterization of in vivo assembly. (a) Design strategy. Left: The two possible orientations of a D3 building block and 3 of the 6 possible orientations of a D2 building block compatible with p6m symmetry; symmetry axes are shown. Middle: symmetry operator arrangement in p6m; the lattice spacing degree of freedom is indicated by the dashed line (d); the corresponding building block axis is shown as a dashed line on the left. Right: Example of one of the 12 possible p6m array configurations resulting from placement of the dihedral building blocks in the p6m lattice, with lattice spacing parameter d indicated. (b) Generating a hetero-interface through sequence design at the contact between the two homo-oligomers. The left panel view direction is in-plane along the sliding axes, and the right panel is rotated 90° to view perpendicular to the plane. (c) Model of GFP genetically fused to A (AGFP). (d) Negative stain TEM images of 2D arrays formed in E. coli coexpressing A+B (top left panel) and AGFP+B (bottom left panel). The corresponding averaged images are shown in the right panel superimposed with the design model (GFP omitted). (e) Confocal microscopy images of cells coexpressing AGFP+B (left panel) or expressing only AGFP (right panel); the difference in GFP signal homogeneity suggests the arrays form within the cells. Scale bars: (d) 100 nm, (e) 5 μm.
All references cited are herein incorporated by reference in their entirety. Within this application, unless otherwise stated, the techniques utilized may be found in any of several well-known references such as: Molecular Cloning: A Laboratory Manual (Sambrook, et al., 1989, Cold Spring Harbor Laboratory Press), Gene Expression Technology (Methods in Enzymology, Vol. 185, edited by D. Goeddel, 1991, Academic Press, San Diego, CA), “Guide to Protein Purification” in Methods in Enzymology (M. P. Deutscher, ed., (1990) Academic Press, Inc.); PCR Protocols: A Guide to Methods and Applications (Innis, et al., 1990, Academic Press, San Diego, Calif.), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney, 1987, Liss, Inc., New York, NY), Gene Transfer and Expression Protocols (pp. 109-128, ed. E. J. Murray, The Humana Press Inc., Clifton, N.J.), and the Ambion 1998 Catalog (Ambion, Austin, TX).
As used herein, the singular forms “a”, “an” and “the” include plural referents unless the context clearly dictates otherwise.
As used herein, the amino acid residues are abbreviated as follows: alanine (Ala; A), asparagine (Asn; N), aspartic acid (Asp; D), arginine (Arg; R), cysteine (Cys; C), glutamic acid (Glu; E), glutamine (Gln; Q), glycine (Gly; G), histidine (His; H), isoleucine (Ile; I), leucine (Leu; L), lysine (Lys; K), methionine (Met; M), phenylalanine (Phe; F), proline (Pro; P), serine (Ser; S), threonine (Thr; T), tryptophan (Trp; W), tyrosine (Tyr; Y), and valine (Val; V).
In all embodiments of polypeptides disclosed herein, any N-terminal methionine residues are optional (i.e.: the N-terminal methionine residue may be present or may be absent).
All embodiments of any aspect of the disclosure can be used in combination, unless the context clearly dictates otherwise.
Unless the context clearly requires otherwise, throughout the description and the claims, the words ‘comprise’, ‘comprising’, and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”. Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words “herein,” “above,” and “below” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application.
In a first aspect, the disclosure provides polypeptides comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:1, wherein the polypeptide includes a mutation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or all 15 positions selected from the group consisting of T210, A213, Q215, Q216, Q217, Q219, K220, K222, A223, E224, F225, A226, Q227, Q229, and K230 relative to SEQ ID NO:1, wherein residues in parentheses are optional and may be present or absent.
As described in the examples below, the polypeptides of this aspect are “first polypeptides” (also referred to as “A” components herein), capable of homo-oligomerization and interaction via a rigid interface with “second polypeptides” (or “B” components) as defined below to produce the two-dimensional materials disclosed herein.
The polypeptide of SEQ ID NO:1 is not capable of such co-assembly; mutations at one or more of positions T210, A213, Q215, Q216, Q217, Q219, K220, K222, A223, E224, F225, A226, Q227, Q229, K230 result in such co-assembly properties.
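As an illustration only, position labels of this kind can be handled programmatically. The following Python sketch parses labels such as T210 and reports which of the listed positions differ between a reference and a candidate sequence. It assumes that the two sequences are the same length and share one gap-free numbering. The reference built here is a synthetic placeholder (filler glycines with the named residues at the named positions) and is not SEQ ID NO:1; all function names are hypothetical.

```python
import re

# Interface positions recited for the first polypeptide (1-based numbering).
FIRST_INTERFACE = ("T210 A213 Q215 Q216 Q217 Q219 K220 K222 A223 "
                   "E224 F225 A226 Q227 Q229 K230").split()


def parse_label(label):
    """Split a label such as 'T210' into ('T', 210)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized position label: {label!r}")
    return match.group(1), int(match.group(2))


def mutated_positions(reference, candidate, labels):
    """Return the labels whose position differs between the two sequences.

    Assumes equal-length, gap-free sequences that share one numbering.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be the same length")
    changed = []
    for label in labels:
        residue, pos = parse_label(label)
        if reference[pos - 1] != residue:
            raise ValueError(f"reference does not have {residue} at {pos}")
        if candidate[pos - 1] != residue:
            changed.append(label)
    return changed


def placeholder_reference(labels, length):
    """Build a synthetic stand-in: 'G' everywhere except the labeled sites."""
    seq = ["G"] * length
    for label in labels:
        residue, pos = parse_label(label)
        seq[pos - 1] = residue
    return "".join(seq)


# Synthetic placeholder of hypothetical length 230, NOT SEQ ID NO:1.
reference = placeholder_reference(FIRST_INTERFACE, 230)
# Hypothetical candidate: position 210 changed to L, position 225 changed to W.
candidate = reference[:209] + "L" + reference[210:224] + "W" + reference[225:]
print(mutated_positions(reference, candidate, FIRST_INTERFACE))
```

The length of the returned list gives the number of listed positions that carry a mutation, which is the quantity recited as "1, 2, 3, ... or all 15 positions".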
In some embodiments described herein, the optional residues are present in the polypeptides and considered in determining percent identity relative to SEQ ID NO:1; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to SEQ ID NO:1.
In one embodiment, mutations in the polypeptide relative to SEQ ID NO:1 comprise:
Each of these embodiments is present in a specific polypeptide disclosed herein capable of acting as a first polypeptide in the 2D materials disclosed herein.
In one embodiment, mutations in the polypeptide comprise mutations at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or all 17 residues selected from residues 10, 65, 72, 73, 74, 77, 81, 85, 89, 90, 96, 100, 119, 152, 157, 167, and 197 relative to SEQ ID NO:1.
As disclosed in the examples, mutations at one or more of these positions lead to increased stability of the polypeptides. In a further embodiment, mutations in the polypeptide comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, or all 17 mutations selected from the group consisting of 10A, 65Q, 72P, 73E, 74Q or 74H, 77K, 81C, 85F, 89P, 90E, 96Y, 100R, 119Q, 152A, 157M or 157F, 167D, and 197G relative to SEQ ID NO:1.
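Substitution labels such as 10A name a 1-based position followed by the replacement residue. A minimal Python sketch of applying a chosen subset of the listed substitutions is given below for illustration. The poly-serine input is a hypothetical placeholder rather than SEQ ID NO:1, and the function name and the validation against the listed set are assumptions of this sketch.

```python
import re

# Stabilizing substitutions recited for the first polypeptide; positions 74
# and 157 each list two alternative residues.
STABILIZING = {
    10: "A", 65: "Q", 72: "P", 73: "E", 74: "QH", 77: "K", 81: "C",
    85: "F", 89: "P", 90: "E", 96: "Y", 100: "R", 119: "Q", 152: "A",
    157: "MF", 167: "D", 197: "G",
}


def apply_substitutions(sequence, labels, allowed=STABILIZING):
    """Apply labels such as '10A' (1-based position, new residue)."""
    residues = list(sequence)
    for label in labels:
        match = re.fullmatch(r"(\d+)([A-Z])", label)
        if match is None:
            raise ValueError(f"unrecognized substitution label: {label!r}")
        pos, new = int(match.group(1)), match.group(2)
        if new not in allowed.get(pos, ""):
            raise ValueError(f"{label} is not one of the listed substitutions")
        if pos > len(residues):
            raise ValueError(f"position {pos} is beyond the sequence end")
        residues[pos - 1] = new
    return "".join(residues)


placeholder = "S" * 200  # hypothetical stand-in, NOT SEQ ID NO:1
variant = apply_substitutions(placeholder, ["10A", "74H", "157F"])
print(variant[9], variant[73], variant[156])
```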
In another embodiment, the polypeptide comprises an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence selected from the group consisting of SEQ ID NOS:2-31, wherein residues in parentheses may be present or absent.
QEDANLSPEQVAKAFSCAFGFPITPEDAPALYKLLRNMIEDAGDLATRSAKDHYQRIRPFAFYGVS
GRVVGSAVVATLHTNPEFQAQLIKAKIEFKQHQK
In all polypeptide embodiments for all aspects of the disclosure, in one embodiment the optional residues are present in the polypeptides and considered in determining percent identity relative to the reference sequence; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to the reference sequence.
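The two ways of treating optional residues can be sketched in Python as follows. This is an illustration under stated assumptions, not a definition of percent identity for this disclosure: the sequences are taken to be already aligned (with "-" marking any gap), the alignment length is used as the denominator (one of several conventions in use), and the short sequences are toy examples assembled for illustration rather than any SEQ ID NO. The function names are hypothetical.

```python
import re


def with_optional(sequence):
    """Keep parenthesized residues: '(MG)SLIT' -> 'MGSLIT'."""
    return sequence.replace("(", "").replace(")", "")


def without_optional(sequence):
    """Drop parenthesized residues: '(MG)SLIT' -> 'SLIT'."""
    return re.sub(r"\([A-Z]*\)", "", sequence)


def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Assumes both strings come from one alignment and have equal length.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and equal length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Toy sequences for illustration only; the candidate has one substitution.
reference = "(MG)SLITLVELEW(LEHHHHHH)"
candidate = "(MG)SLITLVELAW(LEHHHHHH)"

# Optional residues present and counted.
print(percent_identity(with_optional(reference), with_optional(candidate)))
# Optional residues absent and not counted.
print(percent_identity(without_optional(reference), without_optional(candidate)))
```

Because the optional residues are identical in both toy sequences, including them raises the computed identity; this is why the two treatments are recited as separate embodiments.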
In one embodiment, underlined residues of the polypeptide are conserved relative to the reference amino acid sequence. In these embodiments, the underlined residues comprise the region of the polypeptide involved in forming a rigid interface of 2D protein materials when homo-oligomers of these “first” polypeptides co-assemble with homo-oligomers of the “second” polypeptides, embodiments of which are described below.
The polypeptides of this first aspect may comprise one or more additional functional peptide domains. Any functional domain may be added as deemed appropriate for an intended use. Exemplary embodiments of such fusion proteins include, but are not limited to, polypeptides having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of a sequence below, wherein residues in parentheses are optional and may be present or absent.
In one embodiment the optional residues are present in the polypeptides and considered in determining percent identity relative to the reference sequence; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to the reference sequence.
As discussed in detail herein, the polypeptides of this first aspect self-assemble into homo-oligomers comprising a first interface region that can interact with a second interface region of homo-oligomers of the polypeptides of the second aspect of the disclosure. Thus, in another embodiment, the disclosure provides homo-oligomers of the polypeptide of any embodiment or combination of embodiments of the first aspect of the disclosure.
In one embodiment, the homo-oligomer is a cyclic homo-oligomer. In a further embodiment, the cyclic homo-oligomer comprises a homo-oligomer of a polypeptide comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:31, wherein residues in parentheses are optional. In one embodiment the optional residues are present in the polypeptides and considered in determining percent identity relative to the reference sequence; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to the reference sequence.
In a second aspect, the disclosure provides polypeptides comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:100, wherein the polypeptide includes a mutation at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, or all 14 positions selected from the group consisting of M1, N5, E8, K9, Q12, E13, H14, K16, I17, V18, Q19, A20, E22, and I23 relative to SEQ ID NO:100.
As described herein, the polypeptides of this aspect are “second polypeptides” (also referred to as “B” components herein), capable of homo-oligomerization and interaction via a rigid interface with “first polypeptides” (or “A” components), which are defined above. The polypeptide of SEQ ID NO:100 is not capable of such co-assembly; mutations at one or more of positions M1, N5, E8, K9, Q12, E13, H14, K16, I17, V18, Q19, A20, E22, and I23 result in such co-assembly properties.
In one embodiment, mutations in the polypeptide relative to SEQ ID NO:100 comprise:
Each of these embodiments is present in a specific polypeptide disclosed herein capable of acting as a second polypeptide in the 2D materials disclosed herein.
In a further embodiment, mutations in the polypeptide comprise mutations at 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or all 15 residues selected from residues 37, 38, 41, 98, 101, 111, 134, 137, 141, 150, 153, 158, 187, 189, and 190 relative to SEQ ID NO:100.
As disclosed in the attached appendices, mutations at one or more of these positions lead to increased stability of the polypeptides. In one embodiment, mutations in the polypeptide comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or all 15 mutations selected from the group consisting of 37R, 38A, 41N, 98Y, 101A, 111G, 134R, 137G, 141I, 150K, 153D, 158C, 187A, 189E, and 190L relative to SEQ ID NO:100.
In a further embodiment, the polypeptide comprises an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence selected from the group consisting of SEQ ID NOS:50-82, wherein residues in parentheses may be present or absent. In one embodiment the optional residues are present in the polypeptides and considered in determining percent identity relative to the reference sequence; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to the reference sequence.
(MG)SLITLVELEWLEHQLIVQLSERLKGQTAKVGELLCECLKKGGKILICGNGGSAADAQHFAAELS
(MG)SLITLVELEWLEHQLIVQLSERLKGQIAKVGELLCECLKNGGKILICGNGGSAADAQHFAAELS
(MG)SLITLVELEWLEHQLIVQLSERLKGQIAKVGELLCECLKNGGKILICGNGGSAADAQHFAAELS
RELNMLCIGLSGKGGGKMNDLCDHNLVVPSDDTARIQEMHILIIHTLCQIIDEAF(LEHHHHHH)
(MG)SLITLVELEWLEHQLIVQLSERLKGQIAKVGELLCRALKNGGKILICGNGGSAADAQHFAAELS
RELGMLCIGLSGKGGGKMNDLCDHCLVVPSDDTARIQEMHILIIHTLCQIIDEAF(LEHHHHHH)
(MG)SLITLVELEWLEHQLIVQLSERLKGQIAKVGELLCRALKNGGKILICGNGGSAADAQHFAAELS
RELGMLCIGLSGKGGGKMNDLCDMCLVVPSDDTARIQEMHILIIHTLCQIIDEAF(ELHHHHHH)
In one embodiment, underlined residues of the polypeptide are conserved relative to the reference amino acid sequence. In these embodiments, the underlined residues comprise the region of the polypeptide involved in forming a rigid interface of 2D protein materials when homo-oligomers of these “second” polypeptides co-assemble with homo-oligomers of the “first” polypeptides, embodiments of which are described above.
The polypeptides of this second aspect may comprise one or more additional functional peptide domains. Any functional domain may be added as deemed appropriate for an intended use. Exemplary embodiments of such fusion proteins include, but are not limited to, polypeptides having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence selected from the group consisting of SEQ ID NOS:83-99 and 106, wherein residues in parentheses are optional and may be present or absent.
B(c) Fused to mCherry
In one embodiment the optional residues are present in the polypeptides and considered in determining percent identity relative to the reference sequence; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to the reference sequence.
As discussed in detail herein, the polypeptides of this second aspect self-assemble into homo-oligomers comprising a second interface region that can interact with a first interface region of homo-oligomers of the polypeptides of the first aspect of the disclosure. Thus, in another embodiment, the disclosure provides homo-oligomers of the polypeptide of any embodiment or combination of embodiments of the second aspect of the disclosure.
In one embodiment, the homo-oligomer is a cyclic homo-oligomer. In a further embodiment, the cyclic homo-oligomer comprises a homo-oligomer of a polypeptide comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:101 (Di13B2L4B4 cyclic B compo.), wherein residues in parentheses are optional. In one embodiment the optional residues are present in the polypeptides and considered in determining percent identity relative to the reference sequence; in other embodiments, the optional residues are not present and are not considered in determining percent identity relative to the reference sequence.
In a third aspect, the present disclosure provides nucleic acids, including isolated nucleic acids, encoding the polypeptides of any embodiment or combination of embodiments of the present disclosure that can be genetically encoded. The isolated nucleic acid sequence may comprise RNA or DNA. Such isolated nucleic acid sequences may comprise additional sequences useful for promoting expression and/or purification of the encoded protein, including but not limited to polyA sequences, modified Kozak sequences, and sequences encoding epitope tags, export signals, secretory signals, nuclear localization signals, and plasma membrane localization signals. It will be apparent to those of skill in the art, based on the teachings herein, what nucleic acid sequences will encode the polypeptides of the invention.
In a fourth aspect, the present disclosure provides expression vectors comprising the nucleic acid of any aspect of the invention operatively linked to a suitable control sequence. “Expression vector” includes vectors that operatively link a nucleic acid coding region or gene to any control sequences capable of effecting expression of the gene product. “Control sequences” operably linked to the nucleic acid sequences of the invention are nucleic acid sequences capable of effecting the expression of the nucleic acid molecules. The control sequences need not be contiguous with the nucleic acid sequences, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter sequence and the nucleic acid sequences and the promoter sequence can still be considered “operably linked” to the coding sequence. Other such control sequences include, but are not limited to, polyadenylation signals, termination signals, and ribosome binding sites. Such expression vectors include but are not limited to, plasmid and viral-based expression vectors. The control sequence used to drive expression of the disclosed nucleic acid sequences in a mammalian system may be constitutive (driven by any of a variety of promoters, including but not limited to, CMV, SV40, RSV, actin, EF) or inducible (driven by any of a number of inducible promoters including, but not limited to, tetracycline, ecdysone, steroid-responsive). The expression vector must be replicable in the host organisms either as an episome or by integration into host chromosomal DNA. In various embodiments, the expression vector may comprise a plasmid, viral-based vector (including but not limited to a retroviral vector or oncolytic virus), or any other suitable expression vector. In some embodiments, the expression vector can be administered in the methods of the disclosure to express the polypeptides in vivo for therapeutic benefit. 
In non-limiting embodiments, the expression vectors can be used to transfect or transduce cell therapeutic targets (including but not limited to CAR-T cells or tumor cells) to effect the therapeutic methods disclosed herein.
In a fifth aspect, the present disclosure provides host cells that comprise the expression vectors and/or nucleic acids disclosed herein, wherein the host cells can be either prokaryotic or eukaryotic. The cells can be transiently or stably engineered to incorporate the expression vector of the invention, using techniques including but not limited to bacterial transformations, calcium phosphate co-precipitation, electroporation, or liposome mediated-, DEAE dextran mediated-, polycationic mediated-, or viral mediated transfection. (See, for example, Molecular Cloning: A Laboratory Manual (Sambrook, et al., 1989, Cold Spring Harbor Laboratory Press), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney, 1987, Liss, Inc., New York, NY)). A method of producing a polypeptide according to the invention is an additional part of the invention. The method comprises the steps of (a) culturing a host according to this aspect of the invention under conditions conducive to the expression of the polypeptide, and (b) optionally, recovering the expressed polypeptide. The expressed polypeptide can be recovered from the cell free extract, but preferably it is recovered from the culture medium.
In a sixth aspect, the disclosure provides two-dimensional protein structures, comprising a first polypeptide and a second polypeptide, wherein
As described herein, the inventors disclose a computational method to generate de-novo binary 2D non-covalent co-assemblies by designing rigid asymmetric interfaces between two distinct protein dihedral building-blocks. The designed array components are soluble at mM concentrations, but when combined at nM concentrations, rapidly assemble into nearly-crystalline micrometer-scale p6m arrays nearly identical to the computational design model in vitro and in cells without the need of a two-dimensional support. Because the material is designed from the ground up, the components can be readily functionalized, and their symmetry reconfigured, enabling formation of ligand arrays with distinguishable surfaces to drive extensive receptor clustering, downstream protein recruitment, and signaling. The 2D protein materials can impose order onto fundamentally disordered substrates like cell membranes. In sharp contrast to previously characterized cell surface receptor binding assemblies such as antibodies and nanocages, which are rapidly endocytosed, large arrays of the present 2D protein materials assembled at the cell surface suppress endocytosis in a tunable manner, providing potential therapeutic benefits for extending receptor engagement and immune evasion.
Specific exemplary embodiments of the polypeptides and homo-oligomers are provided herein in the first and second aspects of the disclosure. The examples provide detailed rules for generating other such 2D protein arrays starting from a variety of different initial polypeptides.
The first and second homo-oligomers do not independently interact to form larger structures and are stable in solution. Co-assembly into a two-dimensional protein structure only occurs when the first homo-oligomer and the second homo-oligomer interact via the rigid interface. As used herein, “rigid” means that the peptide region that takes part in the interface is a structurally well-defined secondary structure (i.e., known down to a certain defined x-Angstrom resolution). This is very different from interfaces based on peptide fusions, where a flexible linker connects the building block component and the peptide, so that the position of the peptide is not well defined and can only be estimated.
The homo-oligomers in this embodiment have “pseudo-dihedral symmetry” in that the homo-oligomer array forming interface regions have a dihedral symmetry, but the entire homo-oligomer is not required to be dihedral.
In one embodiment of this sixth aspect, the first interface region and the second interface region comprise alpha-helical domains. In this embodiment, each monomer (i.e.: first polypeptide and second polypeptide) may provide a single alpha helix to the rigid interface between the two homo-oligomers, but each first homo-oligomer provides two alpha helices and each second homo-oligomer provides two alpha helices to the rigid interface.
In a seventh aspect, the disclosure also provides two-dimensional protein structures, comprising a first polypeptide and a second polypeptide, wherein
This aspect is particularly preferred to form arrays on soft substrates, including but not limited to cells. As used herein, “cyclic pseudo-dihedral symmetry” means a cyclic homo-oligomer in which a subset of the polypeptide residues display a dihedral point symmetry. The polypeptides may be any that can be part of a pair of distinct or identical proteins that independently form dihedral or pseudo-dihedral homo-oligomers and contact each other such that one in-plane symmetry or pseudo-symmetry axis of each coincides, with each contributing 3 or more interacting residues that belong to a rigid secondary structure, either a helix or a beta sheet.
In one embodiment of this seventh aspect, the interface comprises an interface between an alpha-helical domain of the first polypeptide and an alpha-helical domain of the second polypeptide. In one cyclic pseudo-dihedral embodiment, each monomer (i.e.: first polypeptide and second polypeptide) may provide two alpha helices connected by either a loop domain (case of component B) or by numerous secondary structures (case of the A component) to the rigid interface.
In one embodiment of the sixth or seventh aspect of the disclosure, each of the first polypeptide and the second polypeptide may comprise a plurality (2, 3, 4, 5, 6, 7, or more) of alpha-helical domains separated by loop domains.
In another embodiment of the sixth or seventh aspect of the disclosure, the interface comprises (a) a region of the first polypeptide within 25 amino acids from the first polypeptide C-terminus, and (b) a region of the second polypeptide within 25 amino acids from the second polypeptide N-terminus.
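For illustration, the terminal-window condition can be checked with a short Python sketch. It reads "within 25 amino acids from" a terminus as the first or last 25 residues of the chain, which is an interpretive assumption of this sketch, and the chain lengths used in the example are hypothetical values rather than the lengths of SEQ ID NO:1 or SEQ ID NO:100. The position lists are those recited for the first and second polypeptides.

```python
def within_terminus(positions, length, terminus, window=25):
    """Check that every 1-based position lies in the terminal window.

    terminus is 'N' (first `window` residues) or 'C' (last `window` residues).
    """
    if any(p < 1 or p > length for p in positions):
        raise ValueError("position outside the sequence")
    if terminus == "N":
        return all(p <= window for p in positions)
    if terminus == "C":
        return all(p > length - window for p in positions)
    raise ValueError("terminus must be 'N' or 'C'")


# Interface positions recited for the first (A) and second (B) polypeptides.
A_POSITIONS = [210, 213, 215, 216, 217, 219, 220, 222, 223,
               224, 225, 226, 227, 229, 230]
B_POSITIONS = [1, 5, 8, 9, 12, 13, 14, 16, 17, 18, 19, 20, 22, 23]

A_LENGTH = 232  # hypothetical chain length, chosen for illustration only
B_LENGTH = 190  # hypothetical chain length, chosen for illustration only

print(within_terminus(A_POSITIONS, A_LENGTH, "C"))
print(within_terminus(B_POSITIONS, B_LENGTH, "N"))
```

For the N-terminal case the result does not depend on the assumed chain length; for the C-terminal case it does, so the actual sequence length should be substituted before drawing any conclusion.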
In another embodiment of the sixth or seventh aspect of the disclosure
In this embodiment, the polypeptide length is variable, since amino acid insertions may be incorporated into the loop regions. Such insertions may be of any length and amino acid composition as deemed appropriate for an intended purpose. In this embodiment, the first polypeptide is at least 216 amino acids in length and has at least 9 helical domains and loop domains arranged as shown above, and the second polypeptide is at least 183 amino acids in length (i.e.: up to 5 terminal N- and/or C-terminal residues may be removed) and has at least 8 helical domains and at least 9 loop domains with 5 of the loop domains including beta sheet structures as shown above.
In various other embodiments of the sixth or seventh aspects of the disclosure, the first polypeptide comprises a polypeptide of any embodiment or combination of embodiments of the first aspect of the disclosure, and/or the second polypeptide comprises a polypeptide of any embodiment or combination of embodiments of the second aspect of the disclosure.
In a further embodiment, the first polypeptide comprises an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:31, wherein residues in parentheses are optional and may be present or absent. In another embodiment, the second polypeptide comprises an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to the amino acid sequence of SEQ ID NO:101, wherein residues in parentheses are optional and may be present or absent.
The disclosure also provides two-dimensional protein materials comprising
In some embodiments, the first homo-oligomer and the second homo-oligomers according to this and other aspects and embodiments disclosed herein may be first and second homo-oligomers according to any embodiment or combination of embodiments disclosed herein. In various embodiments, the first and second homo-oligomers comprise a pair of homo-oligomers comprising an amino acid sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the amino acid sequence selected from the group consisting of the following, wherein optional residues (including any N-terminal methionine residues) may be present or absent:
In another aspect, the disclosure provides uses of and methods for using the polypeptides, fusion proteins, homo-polymers, 2D protein materials, nucleic acids, recombinant expression vectors, and host cells of any of the preceding claims for any suitable purpose, including but not limited to those described herein. As described herein, the polypeptides, fusion proteins, and homo-polymers may be used, for example, to generate the binary 2D non-covalent co-assemblies that interact at rigid asymmetric interfaces between two distinct protein dihedral building-blocks. The designed array components are soluble at mM concentrations, but when combined at nM concentrations, rapidly assemble into nearly-crystalline micrometer-scale p6m arrays nearly identical to the computational design model in vitro and in cells without the need of a two-dimensional support. The components can be readily functionalized, and their symmetry reconfigured, enabling formation of ligand arrays with distinguishable surfaces to drive extensive receptor clustering, downstream protein recruitment, and signaling. The 2D protein materials can impose order onto fundamentally disordered substrates like cell membranes. In sharp contrast to previously characterized cell surface receptor binding assemblies such as antibodies and nanocages, which are rapidly endocytosed, large arrays of the present 2D protein materials assembled at the cell surface suppress endocytosis in a tunable manner, providing potential therapeutic benefits for extending receptor engagement and immune evasion.
Proteins that assemble into ordered two-dimensional arrays are generally constituted from just one protein component. For modulating assembly dynamics and incorporating more complex functionality, materials composed of two components would have considerable advantages. Here we describe a computational method to generate de-novo binary 2D non-covalent co-assemblies by designing rigid asymmetric interfaces between two distinct protein dihedral building-blocks. The designed array components are soluble at mM concentrations, but when combined at nM concentrations, rapidly assemble into nearly-crystalline micrometer-scale p6m arrays nearly identical to the computational design model in vitro, by TEM and SAXS, and in cells without the need for a two-dimensional support. Because the material is designed from the ground up, the components can be readily functionalized, and their symmetry reconfigured, enabling formation of ligand arrays with distinguishable surfaces to drive extensive receptor clustering, downstream protein recruitment, and signaling. Using AFM on supported bilayers and quantitative microscopy on living cells, we show that arrays assembled on membranes have component stoichiometry and structure similar to arrays formed in vitro, and thus that our material can impose order onto fundamentally disordered substrates like cell membranes. We find further that in sharp contrast to previously characterized cell surface receptor binding assemblies such as antibodies and nanocages, which are rapidly endocytosed, large arrays assembled at the cell surface suppress endocytosis in a tunable manner, with potential therapeutic relevance for extending receptor engagement and immune evasion. Our work paves the way towards synthetic cell biology, where a new generation of multi-protein macroscale materials is designed to modulate cell responses and reshape synthetic and living systems.
Most previously known ordered protein 2D materials primarily involve single protein components. A de-novo interface design between rigid domains that is stabilized by extensive noncovalent interactions would provide more control over atomic structure and a robust starting point for further structural and functional modulation.
We set out to design two-component 2D arrays by engineering de-novo heterotypic (asymmetric) interfaces between dihedral protein homooligomeric building-blocks (BBs). There are 17 distinct plane symmetry groups that define 2D repetitive patterns, but a broader set of unique geometries are available using 3D objects; 33 distinct planar geometries can be generated by combining two objects. The BBs can be either cyclic or dihedral homooligomers oriented in space such that their highest order rotation symmetry (Cx: x∈{2,3,4,6}) is perpendicular to the plane. We chose a subset of the 17 plane symmetry groups (p3m1, p4m, p6m) that can be generated by introducing a single additional interface between BBs with dihedral symmetry. We chose to use objects with dihedral rather than cyclic symmetry for their additional in-plane 2-fold rotation axes (
We selected forty-five of the lowest energy designs (2 p3m1, 10 p4m, and 33 p6m) with high shape complementarity and few buried polar groups not making hydrogen bonds (see
Protein sequence of A and B components and of the native protein models (1d2t→A, 1tk9→B). To simplify purification we added a 6×His tag to each component, and NcoI/XhoI sites were appended as part of the cloning process. Design mutations are indicated in bold.
To determine whether co-assembly occurs in vivo or after lysis, we genetically fused superfolder green fluorescent protein (GFP) to the N-terminus of component A (AGFP) (
A notable advantage of two-component materials is that if the components are soluble in isolation, co-assembly can in principle be initiated by mixing. This is important for unbounded (i.e. not finite in size) crystalline materials, which typically undergo phase separation as they crystallize, complicating the ability to work with them in solution. A measure of binary system quality is the ratio between the maximum concentration at which either component remains individually soluble and the minimal concentration at which they co-assemble when mixed; the higher this ratio, the easier it is to prepare, functionalize, and store the components in ambient conditions. To evaluate this ratio, which we refer to as SACA (Self-Assembly to Co-Assembly), we separately expressed and purified the A and B components. We found the A component to be quite soluble with the expected molecular weight by SEC-MALS, but component B precipitated overnight. To improve solubility of the A and B components we stabilized both using evolution-guided design. We found that both components could then be stored at concentrations exceeding 2 mM at room temperature for an extended duration (see methods and Table 2,
KMNKLCDHNLVVPSDDTARIQEMHILIIHTLCQIIDEAFLEHHHHHH
KMNDLCDHNLVVPSDDTARIQEMHILIIHTLCQIIDEAFLEHHHHHH
KMNDLCDHCLVVPSDDTARIQEMHILIIHTLCQIIDEAFLEHHHHHH
KMNDLCDHCLVVPSDDTARIQEMHILIIHTLCQIIDEAFELHHHHHH
To improve protein stability, and potentially expression levels at the same time, we used the PROSS server. Because at that time the protocol did not include symmetric design, we optimized only the monomeric interactions by excluding from design all residues in proximity to either the intra- or inter-homooligomer interfaces (the former are the interfaces forming the homooligomer, and the latter are the array-forming interfaces). Sequences of the designed component B and 4 stabilized versions are shown. Mutations that were introduced by the stabilization protocol are indicated in bold. The protocol allows different degrees of sequence manipulation, i.e., different numbers of introduced stabilizing mutations. The higher the number of mutations, the greater the expected stabilization, but also the higher the risk of damaging the overall protein. While the original B component design aggregated within a day at room temperature, versions B2 to B4 were all highly stable at room temperature and could be stored at over 2 mM for periods of months. Following the stabilization process we predominantly used the B2 version.
Upon mixing the two purified proteins in vitro at equimolar concentrations, even larger and more regular hexagonal arrays were formed compared to in vivo assembly in bacteria (
We investigated the kinetics and mechanism of in vitro assembly by mixing the two components and then monitoring growth in solution by light scattering, and on a substrate by AFM (
We next investigated if preformed arrays could cluster transmembrane receptors on living cells (
Taking advantage of the two-component nature of the material, we sought to speed up assembly kinetics and homogeneity of clustering by first saturating membrane receptors with one component, then triggering assembly on cells with the second (
Array formation on cells using this method was fast (steady state reached in ≈20 s) and colocalizing mScarlet™ patches appeared synchronously with GFP-positive patches, indicating that receptor clustering was fast as well (
To evaluate how many molecules were clustered per array, we adapted our previously described microscope calibration nanocages to two colors (see methods and
We explored tuning the final size of the array by tuning the density of receptors at the cell surface. We used a doxycycline-inducible promoter to control the expression of the synthetic membrane protein and thus its density at the cell surface (
We then investigated whether the lattice order (
Following ligand-induced oligomerization, numerous receptors, such as the Epidermal Growth Factor Receptor (EGFR), are internalized by endocytosis and degraded in lysosomes as a means to downregulate signalling. It is therefore not a surprise that EGFR oligomerisation agents, such as combinations of antibodies recognizing different epitopes or bivalent heterotypic nanobodies induce rapid EGFR endocytosis and degradation in lysosomes. This rapid endocytosis is not specific to small oligomers, as large 3D oligomers, such as our 60-mer nanocages30 functionalized with EGFR binders, are also rapidly internalised and routed to lysosomes (
Our studies of the interactions of the designed protein material with mammalian cells provide new insights into the cell biology of membrane dynamics and trafficking. We observe a strong dependence of endocytosis on array size and on the geometry of receptor binding domain presentation: arrays roughly the size of clathrin coated pits almost completely shut down endocytosis, while smaller arrays, and nanoparticles displaying large numbers of receptor binding domains, are readily endocytosed (
The long range almost-crystalline order, tight control over the timing of assembly, and the ability to generate complexity by modulating the array components differentiate the designed two dimensional protein material described here from naturally occurring and previously designed protein 2D lattices. Applied to biology, this new material provides an unprecedented way to rapidly and quantitatively cluster transmembrane proteins, effectively enabling modulating signalling pathways from the outside. In particular, the stepwise assembly approach described here offers a fine level of control to cluster receptors compared to pre-assembled materials or aggregates: not only is receptor density in the clusters fixed at the structural level, but also the fluorescence intensity of the array component can be directly converted into the absolute size of receptor clusters and the number of receptors being clustered, which is useful if the receptors are endogenous cell proteins not fluorescently tagged. We anticipate that these properties, combined with the synchrony of receptor clustering should greatly facilitate the detailed investigation of the molecular sequence of events downstream of receptor clustering. Applied to structural biology, the ability to impose a predetermined order onto transmembrane proteins may help structure determination of those challenging targets using averaging techniques. We furthermore envision multiple ways for these two component bio-polymers to integrate into designed and living materials. For example, as two-component bioinks, adhesive bio-printed scaffolds could remove the need for harmful temperature/UV-curing techniques; conversely, embedding cells secreting designed scaffolds building-blocks could continuously regenerate their extracellular structure or induce its remodelling in response to programmable cues. 
We expect the methodology developed here, combined with the rapid developments in de novo design of protein building-blocks and quantitative microscopy techniques, will open the door to a future of programmable biomaterials for synthetic and living systems.
Crystal structures of 628 D2, 261 D3, 63 D4, and 13 D6 dihedral homooligomers with resolution better than 2.5 Å were selected from the Protein Data Bank (PDB) to be used as building blocks (BBs). Combinatorial pairs of BBs were selected such that they afford the two rotation centers required in a selected subset of plane symmetries (p3m1 [C3-C3], p4m [C4-C4, C4-C2], p6m [C6-C2, C6-C3, C3-C2]). The highest-order rotation symmetry axis of each BB was aligned perpendicular to the plane and an additional 2-fold symmetry axis was aligned with the plane symmetry reflection axis. Preserving these constraints allows positioning the D2, D3, D4, and D6 BBs in 6, 2, 2, and 2 unique conformations, respectively, and results in a total of ~2.6M unique docking trajectories. In a first iteration, Symmetric Rosetta™ Design25 was applied to construct the BB dihedral homooligomers, position them in the correct configuration in space, and slide them into contact along the plane symmetry group reflection axes. Docking trajectories were discarded if clashes between BBs were detected, if a fraction greater than 20% of contact positions (residues belonging to one BB within 10 Å of their partner BB residues) did not belong to a rigid secondary structure (helix/beta sheet), or if the surface area buried by the formation of the contact was lower than 400 Å². These initial filtering parameters narrowed the number of potential design trajectories to approximately 1% of the original number of trajectories. In a second iteration, the selected docks (BB pair contact orientations) were regenerated by Symmetric Rosetta™ Design, slid into contact, and retracted in steps of 0.05 Å to a maximum distance of 1.5 Å. For each position, layer sequence design calculations, implemented by a Rosetta™ script,26 were made to generate low-energy interfaces with buried hydrophobic contacts that are surrounded by hydrophilic contacts.
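The first-iteration filters and the second-iteration retraction schedule described above can be sketched as follows. This is a minimal illustration only, not the Rosetta™ protocol itself: the class `DockingTrajectory`, its field names, and both function names are hypothetical, and including the in-contact position (0 Å) in the retraction schedule is an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the text above.
MAX_NONRIGID_FRACTION = 0.20  # discard if more than 20% of contact positions are non-rigid
MIN_BURIED_AREA = 400.0       # discard if buried surface area is below 400 Å^2
RETRACT_STEP = 0.05           # Å
MAX_RETRACT = 1.5             # Å


@dataclass
class DockingTrajectory:
    """Hypothetical summary of one docked BB pair."""
    has_clash: bool
    contact_positions: int        # residues within 10 Å of the partner BB
    rigid_contact_positions: int  # contact residues in a helix or beta sheet
    buried_area: float            # Å^2 buried on contact formation


def passes_initial_filters(t: DockingTrajectory) -> bool:
    """Apply the three discard criteria of the first docking iteration."""
    if t.has_clash:
        return False
    nonrigid = t.contact_positions - t.rigid_contact_positions
    if nonrigid > MAX_NONRIGID_FRACTION * t.contact_positions:
        return False
    return t.buried_area >= MIN_BURIED_AREA


def retraction_offsets() -> list:
    """Offsets (Å) sampled when retracting from contact; 0 Å is included by assumption."""
    n_steps = round(MAX_RETRACT / RETRACT_STEP)
    return [round(i * RETRACT_STEP, 2) for i in range(n_steps + 1)]
```

Under these assumptions each surviving dock is sampled at 31 offsets, from 0 Å to 1.5 Å, before sequence design.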
Designed substitutions not substantially contributing to the interface were reverted to their original identities. Resulting designs were filtered based on shape complementarity (SC), interface surface area (SASA), buried unsatisfied hydrogen bonds (UHB), binding energy (ddG), and number of hydrophobic residues at the interface core. A negative design approach that includes an asymmetric docking was used to identify potential alternative interacting surfaces. Designs that exhibited a non-ideal energy funnel were discarded as well. The forty-five best-scoring designs (p3m1: 2, p4m: 10, and p6m: 33) were selected for experiments. Protein monomeric stabilization was done to the D2 and D3 homooligomers of design #13 using the PROSS server (see
Pyrosetta™35 and RosettaRemodel™36 were used to model and generate linkers to render the D2 and D3 working homooligomers into C2 and C3 (cyclic pseudo-dihedral) homooligomers (see
Genes encoding for the 45 pairs were initially codon optimized using DNAWorks™ v3.2.437 followed by RNA ddG minimization of the 50 first nucleotides of each gene using mRNAOptimiser™38 and Nupack3.2.2 programs (
The transmembrane nanobody construct (
For the GBP-mScarlet and GBP-EGFR-Darpin fusions, we modified a pGEX vector to express a protein of interest fused to GBP downstream of the Glutathione S-transferase (GST) purification tag followed by TEV and 3C cleavage sequences. We then cloned mScarlet and a published Darpin against EGFR49 (clone E01) into this vector, yielding constructs that express GST-3C-TEV-GBP-mScarlet and GST-3C-TEV-GBP-EGFR-Darpin fusions, respectively.
Unless stated otherwise, all steps were performed at 4° C. Protein concentration was determined either by absorbance at 280 nm (NanoDrop™ 8000 Spectrophotometer, Fisher Scientific), or by densitometry on a Coomassie-stained SDS-PAGE gel against a BSA ladder.
For initial screening of the 45 designs for A and B, bicistronic plasmids were transformed into BL21 Star (DE3) E. coli cells (Invitrogen) and cultures were grown in LB media. Protein expression was induced with 1 mM isopropyl β-d-1-thiogalactopyranoside (IPTG) for 3 hours at 37° C. or 15 hours at 22° C., followed by cell lysis in Tris-buffer (TBS; 25 mM Tris, 300 mM NaCl, 1 mM dithiothreitol (DTT), 1 mM phenylmethylsulfonyl fluoride (PMSF), and lysozyme (0.1 mg/ml)) using sonication (Fisher Scientific) at 20 W for 5 min total ‘on’ time, using cycles of 10 s on, 10 s off. Soluble and insoluble fractions were separated by centrifugation at 20,000×g for 30 minutes and protein expression was screened by running both fractions on SDS-PAGE (Bio-Rad) (see
SpyTag-SpyCatcher™ conjugation was done by mixing a tagged protein and the complementary tagged array component at a 1.3:1 molar ratio, followed by overnight incubation (~10 hours) at 4° C. and Superose™ 6 10/300 GL SEC column purification to obtain only fully conjugated homooligomers. Sub-loaded conjugation was done at a tag:array protein molar ratio of 0.17:1 and used as is. Biotinylation of AV-tagged components was performed with BirA as described in [42] and followed by Superose™ 6 10/300 GL SEC column purification. In-vitro array assembly was induced by mixing both array components at equimolar concentration.
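As a worked illustration of the molar ratios above, the volume of tagged-protein stock needed for a given amount of array component can be computed as sketched below. The function name, the stock concentrations, and the volumes in the example are hypothetical, and the sketch assumes that both concentrations are expressed per monomer (one tag site per chain).

```python
def partner_volume_ul(component_conc_um: float,
                      component_vol_ul: float,
                      partner_conc_um: float,
                      molar_ratio: float) -> float:
    """Volume (µL) of partner stock giving `molar_ratio` partner:component.

    Concentrations are in µM; because only the ratio of moles matters,
    the units cancel as long as both stocks use the same unit.
    """
    component_amount = component_conc_um * component_vol_ul  # proportional to moles
    return molar_ratio * component_amount / partner_conc_um


# Hypothetical stocks: 50 µL of 100 µM array component, 200 µM tagged protein.
full_loading_ul = partner_volume_ul(100.0, 50.0, 200.0, 1.3)   # 1.3:1 conjugation
sub_loading_ul = partner_volume_ul(100.0, 50.0, 200.0, 0.17)   # 0.17:1 sub-loaded conjugation
```

With these illustrative stocks the 1.3:1 reaction uses 32.5 µL of tagged protein and the sub-loaded reaction 4.25 µL.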
GFP-tagged 60-mer nanocages were expressed and purified as described previously.30 GBP-mScarlet was expressed in E. coli BL21 Rosetta™ 2 (Stratagene) by induction with 1 mM IPTG in 2×YT medium at 20° C. overnight. Bacteria were lysed with a microfluidizer at 20 kPsi in lysis buffer (20 mM Hepes, 150 mM KCl, 1% TritonX100, 5% Glycerol, 5 mM MgCl2, pH 7.6) enriched with protease inhibitors (Roche Mini), 1 mg/ml lysozyme (Sigma), and 10 μg/ml DNAse I (Roche). After clarification (20,000 rpm, Beckman JA 25.5, 30 min, 4° C.), lysate was incubated with Glutathione S-sepharose 4B resin (GE Healthcare) for 2 h at 4° C., washed extensively with (20 mM Hepes, 150 mM KCl, 5% glycerol, pH 7.6), and eluted in (20 mM Hepes, 150 mM KCl, 5% glycerol, 10 mM reduced glutathione, pH 7.6). Eluted protein was then cleaved by adding 1:50 (vol:vol) of 2 mg/mL (His)6-TEV protease and 1 mM/0.5 mM final DTT/EDTA overnight at 4° C. The buffer of the cleaved protein was then exchanged for (20 mM Hepes, 150 mM KCl, 5% Glycerol, pH 7.6) using a ZebaSpin™ column (Pierce), and free GST was removed by incubation with Glutathione S-sepharose 4B resin. Tag-free GBP-mScarlet was then ultracentrifuged at 100,000×g for 5 min at 4° C. to remove aggregates. GBP-mScarlet was then incubated with GFP-60mer nanocages,30 followed by size exclusion chromatography (see Microscope calibration), which further removed the TEV protease from the final mScarlet-GBP/GFP-60mer.
GBP-EGFR-Darpin was expressed similarly to GBP-mScarlet, except that lysis was performed using sonication and lysate clarification was performed at 16,000 rpm in a Beckman JA 25.5 rotor for 30 min at 4° C. After TEV cleavage, the buffer was exchanged for (20 mM Hepes, 150 mM KCl, 5% Glycerol, pH 7.6) by dialysis, and free GST and TEV protease were removed by sequential incubation with Glutathione S-Sepharose™ 4B resin and Ni-NTA resin. Tag-free GBP-EGFR-Darpin was then flash frozen in liquid N2 and kept at −80° C.
Delta-like ligand 4 (DLL4) was prepared from a fragment of the human Delta ectodomain (1-405) with a C-terminal GS-SpyTag-6×His sequence. The protein was purified by immobilized metal affinity chromatography from culture medium from transiently transfected Expi293F cells (Thermo Fisher), then further purified to homogeneity by size exclusion chromatography on a Superdex™ 200 column in 50 mM Tris, pH 8.0, 150 mM NaCl, and 5% glycerol, and flash frozen before storage at −80° C. DLL4 was conjugated to the SpyCatcher tagged A homooligomers (ASC) at 1.5:1 molar ratio of DLL4 to ASC. The ASC-ST-DLL4 conjugate was purified by size exclusion chromatography on a Superose™ 6 column. The ASC-ST-DLL4-JF646 conjugate was produced by coupling of 1.5 μM ASC-ST-DLL4 to excess Janelia Fluor 646 SE (Tocris) overnight at 4° C. in 25 mM HEPES, pH 7.5, 150 mM NaCl. The labeled ASC-ST-DLL4 was then purified by desalting on a P-30 column (Bio-Rad). The final molar ratio of JF646 to ASC-ST-DLL4 was 5:1.
For initial screening of coexpressed designs, insoluble fractions were centrifuged at 12,000 g for 15 min and resuspended in Tris-buffer (TBS; 25 mM Tris, 300 mM NaCl) twice prior to grid preparation. Samples were applied to glow-discharged EM grids with continuous carbon, after which grids were washed with distilled, deionized water, and stained with 2% uranyl formate. EM grids were screened using an FEI Morgagni 100 kV transmission electron microscope equipped with a Gatan Orius™ CCD camera. For the working design, EM grids were initially screened using the Morgagni. Micrographs of well-stained EM grids were then obtained with an FEI Tecnai™ G2 Spirit transmission electron microscope (equipped with a LaB6 filament and Gatan UltraScan™ 4 k×4 k CCD camera) operating at 120 kV and a magnified pixel size of 1.6 Å. Data collection was performed via the Leginon™ software package.50 Single-particle style image processing (including CTF estimation, particle picking, particle extraction, and two-dimensional alignment and averaging) was accomplished using the Relion™ software package.51
Array formation kinetics were determined by turbidity due to light scattering, monitored by absorption at 330 nm wavelength, using an Agilent Technologies (Santa Clara, Calif.) Cary 8454 UV-Vis spectrophotometer. A control sample containing a single component at 20 μM was measured for 3 hours. Kinetic measurements were initiated immediately after mixing both components in equimolar concentrations between 1 μM and 2 μM.
Far-ultraviolet Circular Dichroism (CD) measurements were carried out with an AVIV spectrometer, model 420. Wavelength scans were measured from 260 to 195 nm at temperatures between 25 and 95° C. Temperature melts monitored absorption signal at 220 nm in steps of 2° C./min and 30 s of equilibration time. For wavelength scans and temperature melts a protein solution in PBS buffer (pH 7.4) of concentration 0.2-0.4 mg/ml was used in a 1 mm path-length cuvette.
Small angle X-ray scattering data were collected at the SIBYLS beamline at the Advanced Light Source in Berkeley, California.52 Components A and B were measured independently and as a mixture in 25 mM Tris, 150 mM NaCl and 5% glycerol. Imidazole was added to the mixture in a stepwise fashion after A and B were mixed 1:1. These solutions were prepared 24 hours prior to collection. Before collection, samples were placed in a 96-well plate. Each sample was presented to the X-ray beam using an automated robotics platform. The 10.2 keV monochromatic X-rays at a flux of 10¹² photons per second struck the sample with a 1×0.3 mm rectangular profile that converged at the detector to a 100 μm×100 μm spot. The detector to sample distance was 2 m and the beam was nearly centered on the detector. Each sample was exposed for a total of 10 seconds. The Pilatus 2M detector framed the 10 second exposure in 300 ms frames for a total of 33 frames. No radiation damage was observed during exposures.
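The acquisition numbers above can be cross-checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the photon wavelength uses the standard relation λ = hc/E, and the momentum transfer uses the usual SAXS definition q = 4π·sin(θ)/λ, where 2θ is the scattering angle; neither formula is stated in the text.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def wavelength_angstrom(energy_kev: float) -> float:
    """X-ray wavelength (Å) for a given photon energy (keV)."""
    return HC_KEV_ANGSTROM / energy_kev


def complete_frames(exposure_ms: int, frame_ms: int) -> int:
    """Number of whole frames that fit in the total exposure."""
    return exposure_ms // frame_ms


def q_inverse_angstrom(radius_mm: float, distance_mm: float, wavelength_a: float) -> float:
    """Momentum transfer q (Å^-1) at a given radial distance on the detector."""
    two_theta = math.atan2(radius_mm, distance_mm)
    return 4.0 * math.pi * math.sin(two_theta / 2.0) / wavelength_a


wavelength = wavelength_angstrom(10.2)  # about 1.2155 Å
frames = complete_frames(10_000, 300)   # 33 whole frames in a 10 s exposure
```

The frame count reproduces the 33 frames quoted above, and `q_inverse_angstrom` can be used with the 2 m (2000 mm) sample-to-detector distance to relate detector position to q.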
Components A and B were independently collected at 4 concentrations (40, 80, 120, 160 μM). No concentration dependence was observed, so the 160 μM, highest signal, SAXS measurement was fully analyzed using the Scatter program developed by Rambo et al. at SIBYLS and the Diamond Light Source. SAXS profiles were calculated using FoXS53 and compared to the measured data with excellent agreement (χ²<1) for hexameric A and tetrameric B (
The mixture of components A and B was measured at 4 concentrations as well (0.5, 2, 5, and 10 μM). The scattering profiles all had peaks (
The measured SAXS profile was also matched by calculations of the SAXS from atomic models (
We utilized the trend in the ratio of the diffraction to scattering from the models to estimate the size of the sheets observed in solution. All calculations and the experimental SAXS profiles were scaled by the underlying scattering. The higher the angle, the smaller the contribution of the diffraction, so the highest angle experimental signal with sufficient signal to noise (0.1<q<0.15 Å⁻¹) was used for scaling all profiles relative to one another. Once scaled, the ASU was divided through all scattering curves, where the ASU is as defined above. By dividing through, the exponential decay of the scattering profile was removed, yielding a set of peaks that oscillate about a constant background, which was further normalized so as to oscillate about a value of one (
Time resolved SAXS measurements were obtained for mixtures at 10 μM at several time points ranging from 30 sec to 15 min. Each measurement was collected from a separate well to avoid accumulated damage to the samples. SAXS profiles were scaled (including the overnight SAXS profile to which a fit was obtained) and the ASU was divided. The min to max peaks distance was calculated and scaled for all profiles to agree with the values obtained for the common sample (the overnight sample the fit was obtained for in
Flp-In NIH/3T3 cells (Invitrogen, R76107) were cultured in DMEM (Gibco, 31966021) supplemented with 10% Donor Bovine Serum (Gibco, 16030074) and Pen/Strep 100 units/ml at 37° C. with 5% CO2. Cells were transfected with Lipofectamine 2000 (Invitrogen, 11668). Stable transfectants obtained according to the manufacturer's instructions by homologous recombination at the FRT were selected using 100 μg/mL Hygromycin B Gold™ (Invivogen, 31282-04-9). HeLa cells were cultured in DMEM supplemented with 10% Fetal Bovine Serum and Penicillin-streptomycin 100 units/ml at 37° C. with 5% CO2.
Human Umbilical Vein Endothelial Cells (HUVECs) (Lonza, Germany) were grown on 0.1% gelatin-coated 35 mm cell culture dishes in EGM2 media (20% Fetal Bovine Serum, 1% penicillin-streptomycin, 1% Glutamax (Gibco, catalog #35050061), 1% ECGS (endothelial cell growth factors), 1 mM sodium pyruvate, 7.5 mM HEPES, 0.08 mg/mL heparin, 0.01% amphotericin B, a mixture of 1×RPMI 1640 with and without glucose to reach 5.6 mM glucose in final volume). HUVECs were expanded until passage 4 and cryopreserved.
ECGS was extracted from 25 mature whole bovine pituitary glands from Pel-Freez Biologicals (catalog #57133-2). Pituitary glands were homogenized with 187.5 mL ice cold 150 mM NaCl and the pH was adjusted to pH 4.5 with HCl. The solution was stirred in a cold room for 2 hours and centrifuged at 4000 RPM at 4° C. for 1 hour. The supernatant was collected and adjusted to pH 7.6; 0.5 g/100 mL streptomycin sulfate (Sigma #S9137) was added, and the solution was stirred in the cold room overnight and centrifuged at 4000 RPM at 4° C. for 1 hour. The supernatant was filtered using a 0.45 to 0.2-micrometer filter.
The HUVEC cells were expanded until P8, followed by 16 hrs of starvation with DMEM low glucose media prior to protein scaffold treatment. The cells were then treated with the desired concentrations of protein scaffolds in DMEM low glucose media for 30 min or 60 min. Cells were cultured at 37° C., 5% CO2, 20% O2.
Glycerol stocks of E. coli strain BL21(DE3) carrying the single cistronic AGFP and the bicistronic AGFP+B were used to grow overnight cultures in LB medium+KAN at 37° C. To avoid GFP signal saturation, only leaky expression was used, by allowing the culture to remain at 37° C. for another 24 hours before being spotted onto a 1% agarose-LB-KAN pad. Agarose pads were imaged using the Leica SP8X confocal system to obtain bright and dark field images.
All live imaging of NIH-3T3 cells (
Array growth and dynamics at molecular resolution were characterized by mixing both components at equimolar concentration (7 μM) and immediately injecting the solution into the fluid cell on freshly cleaved mica. All in-situ AFM images were collected using silicon probes (HYDRA6V-100NG, k=0.292 N m−1, AppNano™) in ScanAsyst Mode with a Nanoscope™ 8 (Bruker). To minimize damage to the structural integrity of the arrays during AFM imaging, the applied force was minimized by limiting the Peak Force™ Setpoint to 120 pN or less.34 The loading force can be roughly calculated from the cantilever spring constant, deflection sensitivity and Peak Force Setpoint.
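The rough force estimate mentioned above can be sketched as the product of the three quantities named in the text. This is an illustration under stated assumptions: the function names and the 40 nm/V deflection sensitivity are hypothetical (the sensitivity is calibrated per probe and is not given here), and the sketch assumes the setpoint is expressed in volts; only the spring constant k=0.292 N/m and the 120 pN limit come from the text.

```python
def loading_force_pn(spring_constant_n_per_m: float,
                     deflection_sensitivity_nm_per_v: float,
                     setpoint_v: float) -> float:
    """Loading force (pN) = k (N/m) x sensitivity (nm/V) x setpoint (V).

    N/m times nm gives nN, so the result is multiplied by 1000 to report pN.
    """
    force_nn = spring_constant_n_per_m * deflection_sensitivity_nm_per_v * setpoint_v
    return force_nn * 1000.0


def max_setpoint_v(spring_constant_n_per_m: float,
                   deflection_sensitivity_nm_per_v: float,
                   force_limit_pn: float) -> float:
    """Largest setpoint (V) that keeps the loading force at or below the limit."""
    return (force_limit_pn / 1000.0) / (spring_constant_n_per_m * deflection_sensitivity_nm_per_v)


K_PROBE = 0.292              # N/m, from the text
SENSITIVITY_EXAMPLE = 40.0   # nm/V, hypothetical calibration value

setpoint_limit = max_setpoint_v(K_PROBE, SENSITIVITY_EXAMPLE, 120.0)
```

For the hypothetical sensitivity used here, the 120 pN limit corresponds to a setpoint of roughly 10 mV; a different calibration would scale this value inversely.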
Arrays were assembled on supported bilayers (
Correlative AFM/SIM imaging was performed by combining a Bioscope Resolve™ system (Bruker, Santa Barbara, Calif., USA) with a home-made SIM system.59 The fields of view of the two microscopes were aligned so that the AFM probe was positioned in the middle of the field of view of the SIM microscope. A brightfield image of the “shadow” of the AFM cantilever was used to precisely align the AFM probe with the SIM lens. To acquire structured illumination microscopy images, a ×60/1.2 NA water immersion lens (UPLSAPO 60×W, Olympus) focused the structured illumination pattern onto the sample, and the same lens was also used to capture the fluorescence emission light before imaging onto an sCMOS camera (C11440, Hamamatsu). The wavelengths used for excitation were 488 nm (iBEAM-SMART™-488, Toptica) for the protein arrays and 561 nm (OBIS 561, Coherent) for the lipid bilayers. Images were acquired using custom SIM software described previously.59
AFM images were acquired in Fast Tapping imaging mode using Fastscan™-D probes (Bruker), with a nominal spring constant of 0.25 N/m and a resonant frequency of 110 kHz. Images were recorded at scan speeds ranging between 2 and 10 Hz and tip-sample interaction forces between 100 and 200 pN. Large scale images (20×20 μm) were used to register the AFM with the SIM fields of view and small (500×500 nm) scans were performed in order to resolve the structure of the arrays. Raw AFM images were first order fitted with reference to the lipid bilayer. Amplitude images were inverted and a lowpass filter was applied to remove excess noise. For the high magnification scans, amplitude images are presented as movement of the arrays on the lipid bilayer does not affect the resolution of these images to the same extent as that of topography images. Amplitude data is helpful in visualising features and the shape of the sample, however note that the z-scale in amplitude images indicates the amplitude error and thus is not representative of the height of the sample.
Cells were lysed directly on the plate with lysis buffer containing 20 mM Tris-HCl pH 7.5, 150 mM NaCl, 15% Glycerol, 1% Triton X-100, 1 M β-Glycerophosphate, 0.5 M NaF, 0.1 M Sodium Pyrophosphate, Orthovanadate, PMSF and 2% SDS. 25 U of Benzonase® Nuclease (EMD Chemicals, Gibbstown, N.J.) and 100× phosphatase inhibitor cocktail 2 were included, and 4× Laemmli sample buffer (900 μl of sample buffer and 100 μl β-Mercaptoethanol) was added to the lysate, which was then heated (95° C., 5 mins). 30 μl of protein sample was run on SDS-PAGE (Protean TGX precast gradient gel, 4%-20%, Bio-Rad) and transferred to a nitrocellulose membrane (Bio-Rad) by semi-dry transfer (Bio-Rad). Membranes were blocked for 3 h with 5% BSA (P-AKT) or 1 h with 5% milk (β-Actin), corresponding to the primary antibodies, and incubated in the primary antibodies overnight at 4° C. The antibodies used for western blot were P-AKT(S473) (Cell Signaling 9271, 1:2000) and β-Actin (Cell Signaling 13E5, 1:1000). The membrane incubated with P-AKT was then blocked with 5% milk prior to secondary antibody incubation. The membranes were then incubated with secondary antibodies (anti-rabbit IgG HRP conjugate, Bio-Rad) for 2 hrs and detected using the Immobilon-luminol reagent assay (EMD Millipore).
For
Alternatively, for
Alternatively, to label cell membranes of fixed NIH/3T3 cells expressing GBP-TM-mScarlet (
To evaluate the endocytic block affecting clustered EGF receptors (
Alternatively, for
To quantitatively measure the internalization of GFP-positive arrays as a function of their size (
To measure the density of active GBP-TM-mScarlet at the surface of cells as a function of the expression level of this construct (
TIRF imaging of array assembled onto cells (
For fast imaging of array formation (
Imaging of immunofluorescence experiments depicted in
To calibrate the TIRF and Spinning disk setup described above in terms of estimated number of GFP and mScarlet molecules, we mixed our previously published GFP-60mer nanocages30 with an excess of a purified GBP-mScarlet fusion (see
We then acquired z-stacks of diluted nanocages in the same buffer as the cells' imaging medium, which revealed discrete particles fluorescing on both the GFP and mScarlet channels (see
Mathematically, conversions into number of molecules, and their associated error, were performed by building on the elegant work of Picco and colleagues as follows.61 IGFP is the integrated intensity of the arrays in the GFP channel (n measurements) and I60GFP is the integrated intensity of the reference 60mer in the same channel (n′ measurements). As the distributions of dim signals are skewed, the estimated average value for IGFP, noted , is computed as the median of the distribution. The estimate for the reference 60mer, , is similarly computed from I60GFP. The respective errors associated with these measurements, and , are estimated with the Median Absolute Deviation (MAD) corrected for asymptotically normal consistency on the natural logarithm transform of the raw fluorescence values IGFP and I60GFP.
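The robust statistics described above can be illustrated with a minimal sketch. It is not the code used for the measurements: the function names are hypothetical, the constant 1.4826 is the standard scale factor that makes the MAD consistent with the standard deviation of a normal distribution, and the conversion to a copy number (60 times the ratio of the two medians) is an assumption based on the reference particle carrying 60 GFP copies.

```python
import math
import statistics

# Standard factor making the MAD a consistent estimator of the standard
# deviation for normally distributed data.
MAD_TO_SIGMA = 1.4826


def robust_summary(intensities):
    """Return (median intensity, scaled MAD of the log-transformed intensities)."""
    if any(value <= 0 for value in intensities):
        raise ValueError("intensities must be positive to take the logarithm")
    median_intensity = statistics.median(intensities)
    logs = [math.log(value) for value in intensities]
    log_median = statistics.median(logs)
    mad = statistics.median([abs(x - log_median) for x in logs])
    return median_intensity, MAD_TO_SIGMA * mad


def estimate_copy_number(array_intensities, reference_intensities,
                         copies_per_reference=60):
    """ASSUMPTION: copy number = copies_per_reference * median ratio."""
    array_median, _ = robust_summary(array_intensities)
    reference_median, _ = robust_summary(reference_intensities)
    return copies_per_reference * array_median / reference_median
```

Because the spread is computed on log-transformed intensities, it behaves as a relative rather than an absolute error, which suits skewed distributions of dim signals.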
The estimate of the number of GFP molecules per array was computed as
The uncertainty over this number of molecules, δn GFP, was computed by error propagation as
Similarly, the number of molecules in the mScarlet channel, nmScarlet was estimated from ImScarlet, the integrated intensity of the arrays in the mScarlet channel (n measurements) and the intensity of the reference 60mer in the same channel, I60mScarlet (n′ measurements).
The estimate of number of mScarlet molecules per array was computed as
The uncertainty over this number of molecules, δn mScarlet, was computed by error propagation as
We then estimated the GFP/mScarlet ratio on cells in terms of molecules,
Its associated error,
is computed as:
To compare the lattice order between arrays made on cells and preformed arrays (
Its associated error, is computed as:
We verified that the mScarlet/GFP fluorescence ratio varies as expected from the structure, and is thus a good proxy of bulk order (
$(\mathrm{mScarlet}/\mathrm{GFP})_{\text{with GBP-mScarlet}} = \tfrac{3}{2} \times (\mathrm{mScarlet}/\mathrm{GFP})_{\text{without GBP-mScarlet}}$
To estimate the A/B ratio on cells (
with R0=56.75. The GFP intensity IGFP is corrected by a factor
to account for FRET in order to evaluate as above.
As dihedral components have twice as many fluorophores per unit cell as cyclic ones, the mean A/B ratio is computed as follows:
Its associated error, is computed as:
Unless stated otherwise, measurements are given in mean±SEM. No randomization methods were used in this study. No blind experiments were conducted in this study. Statistical analyses were performed using GraphPad Prism 8 or SigmaStat 3.5 with an alpha of 0.05. Normality of variables was verified with Kolmogorov-Smirnov tests. Homoscedasticity of variables was always verified when conducting parametric tests. Post-hoc tests are indicated in their respective figure legends.
Unless stated otherwise, images were processed using Fiji62/ImageJ 1.52d, Imaris, OMERO63 and MATLAB 2017b (Mathworks) using custom code available on request. Figures were assembled in Adobe Illustrator 2019 and movies were edited using Adobe Premiere Pro CS6.
Spatial drift during acquisition was corrected using a custom GPU-accelerated registration code based on cross correlation between successive frames. Drift was measured on one channel and applied to all the channels in multichannel acquisitions.
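The idea of registration by cross-correlation can be sketched as follows. This is a minimal CPU illustration in plain Python, not the custom GPU-accelerated code referred to above: the function names, the brute-force search over integer shifts, and the `max_shift` parameter are all assumptions made for the example.

```python
def _centered(image):
    """Subtract the mean so that uniform background does not dominate the score."""
    pixels = [value for row in image for value in row]
    mean = sum(pixels) / len(pixels)
    return [[value - mean for value in row] for row in image]


def estimate_drift(reference, frame, max_shift):
    """Return the integer (dy, dx) maximising the cross-correlation, such that
    frame[y + dy][x + dx] best matches reference[y][x]."""
    ref, cur = _centered(reference), _centered(frame)
    height, width = len(ref), len(ref[0])
    best_shift, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = 0.0
            # Only pixels that overlap after shifting contribute.
            for y in range(max(0, -dy), min(height, height - dy)):
                for x in range(max(0, -dx), min(width, width - dx)):
                    score += ref[y][x] * cur[y + dy][x + dx]
            if score > best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift


def correct_frame(frame, shift, fill=0.0):
    """Undo the measured drift; pixels shifted in from outside are set to fill."""
    dy, dx = shift
    height, width = len(frame), len(frame[0])
    return [
        [frame[y + dy][x + dx]
         if 0 <= y + dy < height and 0 <= x + dx < width else fill
         for x in range(width)]
        for y in range(height)
    ]
```

In a multichannel acquisition, the shift returned by `estimate_drift` for one channel would be passed to `correct_frame` for every channel of the same time point, mirroring the procedure described above.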
For live quantification of mScarlet recruitment by preformed AGFP+B arrays (
For 3D reconstruction (
For analysis of FRAP data of GBP-TM-mScarlet clustered by preformed AGFP+B arrays (
with I(t), the mean intensity at time point t; Iprebleach the intensity before bleaching (averaged over six time points). As a control that binding of AGFP alone (that is, not in an array) does not affect fluorescence recovery of GBP-TM-mScarlet (meaning that the array does not recover because all the GBP-TM-mScarlet is trapped by the AGFP+B array), we performed FRAP experiments of GBP-TM-mScarlet in cells incubated with AGFP alone. As expected, we found that it recovers (
For live quantification of array assembly and growth on cells (
For Mean Square Displacement (MSD) analysis (
For automated quantification of the colocalization between GFP-positive arrays and LAMP1 staining (
This measurement was then averaged for all z-planes of a given cell, and this average percentage of colocalization per cell was averaged between different cells and compared between conditions. Quantitatively similar values of the percentage of colocalization were obtained if the analysis was performed in 3D (using our previously described method)66 rather than in 2D then averaged across the cell, or conversely, if the percentage of colocalization per z-plane was summed rather than averaged, indicating that the data are not biased by some z-planes having fewer GFP-positive spots than others (data not shown).
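The two-level averaging described above (z-planes within a cell, then cells within a condition) can be sketched as follows. The function name and the nested-list input layout are hypothetical, and the per-plane colocalization percentages are taken as given rather than computed here.

```python
import math
import statistics


def summarize_colocalization(per_cell_plane_percentages):
    """Average per-plane percentages within each cell, then across cells.

    Input: one list of per-z-plane percentages per cell (hypothetical layout).
    Returns (mean across cells, SEM across cells, list of per-cell means).
    """
    per_cell = [statistics.mean(planes) for planes in per_cell_plane_percentages]
    mean_across_cells = statistics.mean(per_cell)
    # SEM needs at least two cells; report NaN otherwise.
    if len(per_cell) > 1:
        sem = statistics.stdev(per_cell) / math.sqrt(len(per_cell))
    else:
        sem = float("nan")
    return mean_across_cells, sem, per_cell
```

Averaging within each cell first gives every cell the same weight regardless of how many z-planes it spans.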
For automated quantification of the colocalization between GFP-positive nanocages and LAMP1 staining (
Indeed, as 60-mers are internalized, they accumulate in lysosomes, which thus display more signal than isolated 60-mers. Using a particle-based calculation would thus not be accurate.
For automated quantification of the fraction of GFP-positive arrays associated with WGA-positive plasma membranes (
This measurement was then averaged for all z-planes of a given cell, and this average percentage of colocalization per cell was averaged between different cells and compared between conditions.
This application claims priority to U.S. Provisional Patent Application Ser. No. 63/069,901 filed Aug. 25, 2020, incorporated by reference herein in its entirety.
This invention was made with government support under Grant Nos. P01 GM081619 and R01 GM083867 and R01 GM097372 and U01 HL099993 and U01 HL099997, awarded by the National Institutes of Health. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2021/047136 | 8/23/2021 | WO | |

Number | Date | Country |
---|---|---|
63069901 | Aug 2020 | US |