The present invention relates to a method for manufacturing a modified polypeptide from a first polypeptide, said modified polypeptide exhibiting altered binding properties to a target molecule and/or having a different amino acid sequence compared to a first polypeptide.
Most biological processes involve permanent and non-permanent interactions between different proteins. The development of modulators of protein-protein interactions possesses significant potential for the discovery of novel protein-affinity reagents, which can be used for a variety of diagnostic and therapeutic purposes as well as for purification, imaging and reagent purposes. Reagents that disrupt protein-protein interactions must contend with a protein surface that is comparatively large, poorly defined and solvent exposed, and the scope of their targets shall not be limited by the immune response.
The challenges faced in developing molecules interfering with protein-protein interactions are the lack of small molecule starting points, the apparent nondescript nature of the target area on the surface of the molecule, the difficulty of distinguishing artifactual binding from real associations and the insufficiency of chemical libraries to obtain high-affinity binders by screening.
It is an object of the present invention to provide a method in which binding partners of a target molecule are identified and in which binding properties of the binding partners are modified in a process which is performed in an environment close to nature.
Thus, the present invention relates to a method for manufacturing a modified polypeptide from a first polypeptide, said modified polypeptide exhibiting altered binding properties to a target molecule and/or having a different amino acid sequence compared to a first polypeptide comprising the steps of:
a) providing a first cell comprising a nucleic acid molecule encoding for a first fusion polypeptide, said first fusion polypeptide comprising at least one first polypeptide and a transcriptional activation domain, and comprising optionally a nucleic acid molecule encoding for a second fusion polypeptide, said second fusion polypeptide comprising the target molecule or a polypeptide domain binding the target molecule and a DNA binding domain, whereby the cell further comprises a reporter gene encoding a reporter polypeptide operably linked to an upstream transcriptional regulatory sequence comprising a DNA binding site as target for the at least one first polypeptide or optionally a DNA binding site for the DNA binding domain of the second fusion polypeptide,
b) cultivating the cells of step a),
c) identifying at least one cell expressing the reporter polypeptide,
d) isolating at least one nucleic acid molecule encoding for at least one first polypeptide of the at least one cell identified in step c),
e) modifying the at least one nucleic acid molecule of step d) by introducing at least one mutation thus obtaining at least one modified nucleic acid molecule encoding for at least one modified polypeptide,
f) introducing the at least one modified nucleic acid molecule of step e) into at least one second cell comprising optionally a nucleic acid molecule encoding for a second fusion polypeptide, said second fusion polypeptide comprising the target molecule or a polypeptide domain binding the target molecule and a DNA binding domain, and
g) repeating steps a) to f) at least two additional times (so that steps a) to f) are performed at least three times on the initial polypeptide) until a nucleic acid molecule encoding for a modified polypeptide is obtained and isolated in step d) exhibiting predetermined altered binding properties to the target molecule compared to the at least one first polypeptide and/or having a different amino acid sequence compared to the first polypeptide wherein in the repeating steps a) to d) the first polypeptide is exchanged with the modified polypeptide of step e).
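The cycle of steps a) to g) can be sketched as a simple loop. The following is a minimal toy simulation, not the claimed wet-lab procedure: the names `mutate`, `toy_reporter` and `evolve`, the library size, and the scoring of a sequence against a fixed motif are all hypothetical stand-ins for mutagenesis and reporter read-out. Keeping the parent sequence in each new pool is likewise an assumption of the sketch, made so that the best reporter value never decreases.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq, rng, n_mutations=1):
    """Step e): introduce at least one substitution (toy stand-in for mutagenesis)."""
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_mutations):
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def toy_reporter(seq, target_motif):
    """Hypothetical reporter read-out: number of positions matching a fixed motif."""
    return sum(a == b for a, b in zip(seq, target_motif))


def evolve(first_polypeptide, reporter, rounds=3, library_size=200, seed=0):
    """Run steps a) to f) `rounds` times and return the final isolate and its history."""
    if rounds < 3:
        raise ValueError("steps a) to f) are performed at least three times")
    rng = random.Random(seed)
    pool = [first_polypeptide]          # step a): cells carrying the first polypeptide
    history = []
    for _ in range(rounds):
        best = max(pool, key=reporter)  # steps b) to d): cultivate, identify, isolate
        history.append((best, reporter(best)))
        # steps e) and f): mutagenise the isolate and introduce it into second cells;
        # the parent is kept in the pool (assumption of this sketch)
        pool = [mutate(best, rng) for _ in range(library_size)] + [best]
    final = max(pool, key=reporter)     # final isolation in step d)
    history.append((final, reporter(final)))
    return final, history
```

In use, the reporter callable would be replaced by whatever measured read-out is available per clone; the loop structure itself does not depend on how that value is obtained.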
The present method adapts the yeast two-hybrid (Y2H) method to the field of in vitro evolution. It was rather surprising that the Y2H system is usable at all in such a system due to the low transformation rate of eukaryotic cells, such as yeast, which is less than a tenth or a hundredth of that of usual in vitro evolution systems using e.g. bacteria or phages. Moreover, Y2H was not regarded as a sophisticated system of the kind normally applied in in vitro evolution. Quite contrary to usual in vitro evolution methods, Y2H is a rather basic method.
The method of the present invention allows the identification and manufacture of modified polypeptides derived from a first polypeptide, which show altered binding properties to a target molecule (e.g. protein, polypeptide, peptide, nucleic acid, carbohydrate) compared to the first polypeptide or which show comparable or substantially identical binding properties but have a different amino acid sequence. The latter features of a modified polypeptide are of particular interest when, e.g. in the course of an immune therapy, a polypeptide is required which shows a binding affinity and/or specificity which is similar to a naturally occurring polypeptide and should be recognized as “foreign” by the immune system of a mammal.
A main feature of the method of the present invention is the use of iterative steps leading from a first polypeptide to a modified polypeptide exhibiting different properties compared to the first polypeptide. In the course of said method, steps a) to f) are repeated at least two additional times (so that steps a) to f) are performed at least three times), preferably at least three, four, five, ten, 20, 30 etc. times. In the course of the method (after step f)) the at least one first polypeptide of step a) is the modified polypeptide obtained in step e). Consequently, due to the use of repeating steps a polypeptide is subjected to an in vivo/in vitro evolution, wherein the binding properties of the first/modified polypeptide are determined in vivo and the mutagenesis (“evolution”) is performed in vitro. Of course, the terms “in vivo” or “in vivo selection” in the present method refer to the steps performed in living cells, such as single cell organisms, like yeast or bacteria, or in cells or cell cultures or tissue of complex organisms, such as human or animal cells or cell lines. The “synthetic in vivo mutagenesis” according to the present invention is therefore also possible and covered in cell culture of e.g. human cells. According to the present invention, these repeating steps wherein the polypeptide is subjected to mutagenesis and selection procedures, wherein the binding properties of the modified polypeptide are determined, are performed in a living cell (in an intracellular environment).
The method of the present invention allows affinity binders to be selected iteratively and subsequently produced in an intracellular environment. It is particularly advantageous that the manufacturing of a modified polypeptide showing an altered binding property to a target molecule occurs intracellularly. Outside the cell, every protein-protein interaction requires certain interaction conditions such as pH, salt concentration, temperature, working and binding conditions, detergents etc., which have to be maintained in the course of the determination of the protein-protein interaction and which do not reflect the naturally occurring conditions. The maintenance of these defined conditions in vitro is laborious and requires high accuracy. With the method of the present invention and with the provision of an in vivo method to manufacture modified polypeptides having altered binding properties to a target molecule compared to an unmodified polypeptide these drawbacks can be overcome. With the method of the present invention the binding conditions are reproducibly predetermined by the cell itself. Another main advantage of the present invention is that the polypeptide-target interaction occurs in a reducing intracellular environment. In contrast thereto, conventional in vitro methods like phage, ribosome, bacterial and yeast display technologies occur in an oxidising environment (in vitro). Since many therapeutic substances act intracellularly when administered to an individual or animal, it is advantageous that these substances, which may be the modified polypeptides of the present invention, are identified and obtained by a method in an intracellular and thus reducing environment. The binding properties of polypeptides to a target molecule may vary between an oxidising and a reducing environment. The end result of the selection is a bank of binding peptides, rather than single molecular entities. Each target and binder combination is trapped within a single cell.
The method of the present invention overcomes various drawbacks of biochemical methods known in the art which are regularly used to manufacture modified polypeptides exhibiting an altered binding affinity to a target molecule compared to the unmodified versions of said polypeptides. Usually the target molecule as well as the polypeptides to be tested and manufactured have to be recombinantly produced and purified. The recombinant production and in particular the isolation of polypeptides is usually very laborious (comprising the steps of cloning of the polypeptides, expression and isolation of the polypeptides and testing the binding behaviour of the polypeptides to the target molecule) and expensive and therefore not suited for a fast and routine identification of modified polypeptides showing altered binding properties to a target molecule compared to unmodified polypeptides. By using the method of the present invention it is possible to determine in vivo whether a modified polypeptide exhibits altered binding properties to a target molecule compared to an unmodified polypeptide. Therefore, there is no need to isolate the modified polypeptides from the cells in order to determine the binding properties of the modified polypeptides to a target molecule. This also allows a high number of modified polypeptides to be screened.
The method of the present invention makes it possible to detect in vivo whether a modified polypeptide binds to the target molecule or not and, if the expression rate of the reporter polypeptide is quantified, to determine the binding strength.
In contrast thereto in vitro systems require the isolation of a modified polypeptide prior to determining its binding properties to a target molecule. Thus, the detection of the expression of the reporter polypeptide in vivo according to the present invention (“directed in vivo mutagenesis” or “synthetic in vivo mutagenesis”) measures the result of the selection process during the mutagenesis and selection procedure, and can be used by a skilled artisan to determine whether further mutagenesis and selection steps need to be performed.
A further advantage of the present method is that the modified polypeptide is expressed and bound to the target molecule in a naturally defined and constant environment, so that the binding conditions are highly reproducible and there is no need to additionally optimize the binding conditions in an in vitro test assay.
The modified polypeptides manufactured with the present method preferably exhibit a stronger binding (higher affinity) and/or higher specificity to a target molecule compared to the unmodified polypeptides, whereby these properties may preferably be determined by determining the reporter activity (e.g. LacZ, HIS3, ADE2, URA3, GusA etc.; such methods are well known in the art e.g. Serebriiskii I G et al. Biotechniques. 2000 29:278-9, 282-4, 286-8 and Estojak J. et al. (1995) Mol. Cell. Biol. 15:5820-5829). Of course, it is also possible to determine the reporter activity outside the cells (in vitro) by appropriate biochemical and genetic methods. This may be achieved by lysing the cells and by determining the activity of the reporter polypeptide in e.g. microtiter plates or by an e.g. ELISA assay. Alternatively, the reporter activity may preferably be determined by quantitative methods involving e.g. β-galactosidase (LacZ) or qualitative methods involving antibiotic resistance (e.g. HIS3) or antibiotic sensitivity (e.g. URA3). The latter method may preferably be used to manufacture polypeptides which show a reduced binding to the target molecule.
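As an illustration of a quantitative LacZ read-out, the sketch below computes β-galactosidase activity in Miller units and ranks clones by it. The text above does not prescribe a particular assay; the Miller-unit convention, the function names `miller_units` and `rank_clones`, and the shape of the `readings` dictionary are assumptions made for this example only.

```python
def miller_units(od420, od550, od600, minutes, volume_ml):
    """Beta-galactosidase activity in Miller units.

    Miller units = 1000 * (OD420 - 1.75 * OD550) / (t * v * OD600),
    with t the reaction time in minutes and v the culture volume assayed in ml.
    """
    if minutes <= 0 or volume_ml <= 0 or od600 <= 0:
        raise ValueError("minutes, volume_ml and od600 must be positive")
    return 1000.0 * (od420 - 1.75 * od550) / (minutes * volume_ml * od600)


def rank_clones(readings):
    """Return clone names ordered from highest to lowest reporter activity.

    `readings` maps a (hypothetical) clone name to the keyword arguments
    expected by `miller_units`.
    """
    return sorted(readings, key=lambda name: miller_units(**readings[name]), reverse=True)
```

Such a ranking treats reporter activity as a proxy for binding strength, which is the premise of the quantified read-out described above.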
The binding properties may be optimized by performing several cycles comprising method steps a) to f) until at least one modified polypeptide is identified showing predetermined properties (e.g. altered binding properties to the target molecule compared to the at least one first polypeptide and/or having a different amino acid sequence compared to the first polypeptide). In practice the iterative steps of the method of the present invention will be preferably applied until the activity of the reporter polypeptide reaches a certain level. For instance, the steps may be repeated until the reporter activity recorded on a chart in each step reaches a plateau.
According to the present invention the iterative steps of the method will be repeated until the modified polypeptide(s) obtained exhibit “predetermined” properties (e.g. binding affinity). This means that the iterative steps of the method will be repeated until the modified polypeptide(s) obtained exhibit “preferred (or: desired)” properties (e.g. an optimised binding affinity or an alteration of solubility with constant binding affinity). This can be done by observing the evaluation of the polypeptide during the repeats and repeating until no variation is visible anymore (this, of course, depends on the detection limits of the measurement means) or until an initially predetermined value has been achieved. On the other hand, the predetermined property can also be a qualitative property relative to the initial polypeptide (e.g. at least 20% higher (or lower) binding affinity or at least 20% lower (or higher) molecular weight (or pI-value, hydrophilicity, etc.) under preservation of binding affinity).
These properties may vary, among others, from polypeptide to polypeptide and depend on the use of the modified polypeptides obtained by the method of the present invention. The desired properties may be predetermined by the skilled artisan prior to the application of the present method.
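The two stopping rules described above (no further visible variation, or a predetermined change relative to the initial polypeptide) can be sketched as follows. This is only an illustration: the function names, the 5% tolerance, the two-round window and the use of reporter activity as the monitored value are assumptions, while the 20% figure is taken from the example in the text.

```python
def reached_plateau(activities, tolerance=0.05, window=2):
    """True if each of the last `window` round-to-round changes is within `tolerance`."""
    if len(activities) < window + 1:
        return False
    recent = activities[-(window + 1):]
    return all(abs(curr - prev) <= tolerance * abs(prev)
               for prev, curr in zip(recent, recent[1:]))


def reached_relative_target(initial, current, min_relative_gain=0.20):
    """True if the current value is at least `min_relative_gain` above the initial one."""
    return current >= initial * (1.0 + min_relative_gain)


def should_stop(activities, min_rounds=3, min_relative_gain=None):
    """Decide whether to stop iterating.

    `activities[0]` is the value measured for the first polypeptide; every
    further entry is the best value after one round of steps a) to f).
    """
    rounds_done = len(activities) - 1
    if rounds_done < min_rounds:
        return False  # the cycle is performed at least three times
    if min_relative_gain is not None and reached_relative_target(
            activities[0], activities[-1], min_relative_gain):
        return True
    return reached_plateau(activities)
```

The tolerance plays the role of the detection limit mentioned above: changes smaller than it are treated as no visible variation.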
In the prior art, some reports disclose that Y2H has been used to change polypeptides, however, Y2H has never been suggested as a tool for performing in vitro evolution: Williams et al. (NAR 33 (2005), 4475-4484) performed a classic Y2H screen (according to Fields and Song) with HOXA13 and identified Smad5 as interacting protein from a limb bud cDNA library. A fragment encoding amino acids 175-465 from Smad5 was isolated. This Smad5 coding sequence was reintroduced in a Y2H bait vector and retested for confirmation of the interaction with a fragment of HOXA13 (amino acids 150-360), HOXD13 (amino acids 1-312), HOXA11 (amino acids 1-281) and HOXA9 (amino acids 1-245). Then, to identify the Smad5 domains that interact with HOXA13, deletion constructs of Smad5 were tested for interaction. Smad5 clones (amino acids 1-198, 146-198, 1-265) failed to interact with HOXA13, whereas Smad5 clones (amino acids 146-465, 202-465, 265-465) could interact.
In summary, Williams et al., a) performed a classic Y2H screen with the well known lexA-Y2H system in a cDNA library, b) identified Smad5 as interaction partner, c) re-introduced the interactor into the Y2H system to confirm the interaction, d) used specific deletion mutants of Smad5 to map the HOXA13 interacting domain.
Accordingly, Williams et al. identified a new interaction and clarified the part of the molecule that is important for interaction. In the prior art, Y2H was used for analysis but not for engineering new molecules with altered molecular properties. For example, there was no intention at all of Williams et al. to develop new molecules that have higher affinity to HOXA13.
According to the present invention repeated use of mutagenesis/selection/monitoring of interaction strength, which is called “in vitro evolution” (de novo in vitro engineering of strong binders), enables engineering of new molecules by using the classic Y2H system. After the combination of several rounds of mutagenesis and selection and monitoring, the polypeptides thus generated are changed in their amino acid composition and have increased affinity to a target. The generated molecules are not naturally occurring ones. Williams et al. neither used repeated mutagenesis and selection, nor did they investigate, analyse or produce novel molecules with increased affinities and altered amino acid composition. Performing the “in vitro evolution” according to the present invention with the Y2H system may encompass the screening with Y2H and specific bait, mutagenesis, retesting with the Y2H system, measurement of binding affinity (=monitoring of protein strength indicators/reporters), and iteration of these steps until no further change in the molecular property is detectable. The end result of this method is an increased affinity of the molecule compared to the original molecule and identification and production of a new peptidic molecule, not naturally occurring, and a strong binder of the target molecule.
Several authors observed a Y2H interaction between Bem1 and Cdc42 in former studies. For example Yamaguchi et al. (JBC 282 (2007): 29-38) examined various portions of Bem1 to identify the part that is responsible for Cdc42 binding. To identify the interacting domain of Bem1, the authors cloned a series of Bem1 fragments into a Y2H prey vector. The Bem1 fragments (amino acids 1-551, 1-140, 1-256, 140-551, 283-551, 140-256) were tested for interaction in the Y2H system. The Bem1 fragment 140-256, which contains an SH3 domain, could specifically interact with Cdc42. Since this portion of Bem1 also interacts with Ste20, the authors decided to narrow down the region in order to identify amino acid residues responsible for Cdc42 binding. By using a dual-bait Y2H system (a special type of the Y2H system that enables simultaneous screening of 2 different baits, here Ste20 and Cdc42 for Bem1 fragment 140-256) they searched for mutants which fail to interact with Cdc42 but still interact with Ste20. Yamaguchi et al. performed an error-prone PCR and subsequently a dual-bait Y2H screening. Using this strategy they isolated 3 different mutants that interact with Ste20 but fail to interact with Cdc42.
Yamaguchi et al. therefore took a published classic Y2H result, the protein interaction between Bem1 and Cdc42, as the basis for a detailed study; identified a small portion of Bem1 that interacts with Cdc42 by testing six deletion mutants of Bem1 in a classic Y2H system; mutagenized this fragment by error-prone PCR; screened for loss-of-function mutants (unable to bind to Cdc42) in a dual-bait Y2H system; and identified three mutants that failed to interact with Cdc42.
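The selection logic of such a dual-bait screen amounts to a filter over two reporter read-outs per mutant. The sketch below is purely illustrative; the function name, the data layout and the mutant labels used with it are hypothetical and do not reproduce data from the cited study.

```python
def select_separation_of_function(mutants, lost_bait, retained_bait):
    """Return mutants whose reporter is off for `lost_bait` but on for `retained_bait`.

    `mutants` maps a mutant label to a dict of bait name -> bool, where True
    means the reporter for that bait was activated (interaction detected).
    """
    return sorted(
        name for name, calls in mutants.items()
        if not calls[lost_bait] and calls[retained_bait]
    )
```

Note that this is a loss-of-function filter applied once, in contrast to the repeated gain-of-function selection of the present method.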
Accordingly, Yamaguchi et al. used an already known Y2H interaction and clearly identified the part of the molecule that is important for interaction. Then they mutagenized and identified amino acid residues essential for binding to Cdc42. There was no intention at all to produce new molecules with higher affinity to Cdc42. I.e. they analysed and did not engineer new molecules with enhanced molecular properties with respect to binding, they did not try to enhance the performance or affinity of molecules.
In the method according to the present invention, after several rounds of mutagenesis and selection and monitoring (at least three rounds), the polypeptides thus generated are changed in their amino acid composition and increased in affinity to a target. The repetition of mutagenesis and selection generates new molecules, and a “gain of function” (here: the property to bind stronger to the target) enables the cells to survive (here: the cell harbouring the strong binder can grow under strong selection pressure) and therefore can enter a new round of mutagenesis and selection. “Loss of function” mutants are eliminated because they are outperformed during “in vitro evolution”. The molecules generated with the procedure according to the present invention are not naturally-occurring ones, their amino acid composition is changed and binding strength is increased relative to their “ancestors” existing in nature.
Yamaguchi et al. neither continuously repeated mutagenesis, selection and monitoring nor produced novel molecules with increased affinities; as evidenced above, they did not perform “in vitro evolution” with the Y2H system.
In Williams et al., Yamaguchi et al. and others (e.g. Drees et al. JCB 154 (2001): 549-571; Allen et al. TIBS 20 (1995): 511-516; Pajunen et al. NAR 35 (2007): 3-5; WO 00/66722; EP 1 405 911 A1 or WO 02/31165 A1), an “extended classic Y2H analysis” is used which comprises: taking a bait protein, looking for interactors in a gene library and identifying an interacting protein; then the Y2H analysis is extended by mutagenesis of either the bait or the prey molecule (one of the interacting proteins) to map the domains responsible for binding. This results in the identification of interacting portions (domains) of the interactor and a search for mutants unable to interact, together aiming at a molecular analysis of the interacting process.
In contrast thereto, the present invention adds an in vitro evolution step to the classic Y2H analysis and the extended Y2H analysis which includes the monitoring of interaction strength after mutagenesis and selection, repetition of the steps as long as an increase in binding is detectable to obtain molecules with increased affinities or desired properties.
Thus, whereas classic Y2H analysis delivers a pair of candidate interacting proteins and extended Y2H analysis delivers a protein-fragment or domain responsible for binding to a desired target, the in vitro evolution according to the present invention yields novel molecules with increased affinity, an engineered property, to a desired target molecule, as a result of repeated rounds of mutagenesis+selection+monitoring.
Further advantages of the method of the present invention are:
The nucleic acid molecule encoding for a first fusion polypeptide and optionally the nucleic acid molecule encoding for a second fusion polypeptide are introduced in the first cell provided in step a) by using preferably a vector (e.g. bacterial, yeast or viral vector) through transfection or transformation or by directly transforming the cell with the linear nucleic acid molecules. In both cases and in particular in the latter case the nucleic acid molecule may be provided with a region which allows a (preferably stable) homologous recombination of said nucleic acid molecule into the genome of the cell.
If the modified polypeptide of the present invention is or should be capable to bind to a specific binding site on a nucleic acid molecule it is sufficient that said first cell comprises only a reporter gene encoding a reporter polypeptide operably linked to an upstream transcriptional regulatory sequence to which the first and modified polypeptide are capable to or should bind. If such a binding event occurs the transcriptional activation domain of the first fusion polypeptide induces the transcription and consequently the biosynthesis of the reporter polypeptide. Of course the upstream transcriptional regulatory sequence should comprise the nucleotide sequence to which the first and/or modified polypeptide is capable to bind to. Nevertheless it is of course also possible to provide a cell comprising next to a nucleic acid molecule encoding for a first fusion polypeptide a nucleic acid molecule encoding for a second fusion polypeptide as defined above. In this case, however, both the first and the second fusion polypeptide may bind to the same nucleic acid molecule on which the binding site for the first fusion polypeptide can be found.
The reporter gene encoding a reporter polypeptide operably linked to an upstream transcriptional regulatory sequence may be present in said first cell on an extrachromosomal vector or integrated on the chromosome.
The first cell may, however, also comprise a nucleic acid molecule encoding for a second fusion polypeptide, which comprises the target molecule (e.g. enzyme, receptor) or a polypeptide domain binding to the target molecule (e.g. carbohydrate structure, nucleic acid molecule, polypeptide, organic compound) and a DNA binding domain, which is capable to bind to the upstream transcriptional regulatory sequence of the reporter gene. The presence of said second fusion polypeptide within the first cell is particularly desired when the first or the modified polypeptide is or should be capable to bind directly to the target molecule or when other molecules (e.g. chemical compounds, carbohydrates, polypeptides, nucleic acid molecules) are the target molecule and the second fusion polypeptide comprises a domain capable to bind also to the target molecule directly or via further molecules. Similar concepts are known from the yeast hybrid system (e.g. one-, two and three hybrid system, see e.g. Hollingsworth R et al. (2004) DDT:Targets 3:97-103, in particular
Using the terminology of the yeast two hybrid system (see e.g. Hollingsworth R et al. (2004) DDT:Targets 3:97-103) the first fusion polypeptide may be denominated as prey fusion protein/polypeptide and the second fusion polypeptide as bait fusion protein/polypeptide. Consequently, first fusion polypeptide may be used interchangeably with prey fusion protein/polypeptide and second fusion polypeptide with bait fusion protein/polypeptide.
The bait fusion protein includes a fusion between a polypeptide moiety of interest (e.g., a protein of interest or a polypeptide from a polypeptide library), and a DNA-binding domain which specifically binds a DNA binding site which occurs upstream of an appropriate reporter gene. The nucleotide sequence which encodes the polypeptide moiety of interest is cloned in-frame to a nucleotide sequence encoding the DNA-binding domain.
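The requirement that the two coding sequences be joined in-frame can be checked mechanically. The following sketch assumes, for illustration, that the DNA-binding-domain sequence starts with ATG and carries no stop codon, that the sequence of interest ends with a stop codon, and that an optional linker lies between them; the function names and these conventions are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(dna):
    """Split a coding sequence into consecutive triplets (trailing bases are ignored)."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(dbd_cds, insert_cds, linker=""):
    """Join DNA-binding domain, optional linker and insert, verifying the reading frame."""
    parts = {"dbd_cds": dbd_cds.upper(), "linker": linker.upper(),
             "insert_cds": insert_cds.upper()}
    for name, seq in parts.items():
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of three (frame shift)")
    fusion = parts["dbd_cds"] + parts["linker"] + parts["insert_cds"]
    triplets = split_codons(fusion)
    if not triplets or triplets[0] != "ATG":
        raise ValueError("fusion does not start with ATG")
    if any(codon in STOP_CODONS for codon in triplets[:-1]):
        raise ValueError("premature stop codon inside the fusion")
    if triplets[-1] not in STOP_CODONS:
        raise ValueError("fusion does not end with a stop codon")
    return fusion
```

A check of this kind only confirms that the open reading frame is continuous; it says nothing about whether the encoded fusion folds or binds its DNA site.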
Any polypeptide that binds a defined DNA sequence can be used as a DNA-binding domain. The DNA-binding domain can be derived from a naturally occurring DNA-binding protein, e.g., a prokaryotic or eukaryotic DNA-binding protein. Alternatively, the DNA-binding domain can be a polypeptide derived from a protein artificially engineered to interact with specific DNA sequences. Examples of DNA-binding domains from naturally occurring eukaryotic DNA-binding proteins include p53, Jun, Fos, GCN4 or GAL4. The DNA-binding domain of the bait fusion protein can also be generated from viral proteins, such as the papillomavirus E2 protein. In another example, the DNA-binding domain is derived from a prokaryote, e.g. the E. coli LexA repressor can be used, or the DNA-binding domain can be from a bacteriophage, e.g., a lambda cI protein. Exemplary prokaryotic DNA-binding domains include DNA-binding portions of the P22 Arc repressor, MetJ, CENP-B, Rap1, XylS/Ada/AraC, Bir5 and DtxR.
The DNA-binding protein also can be a non-naturally occurring DNA-binding domain and can be generated by combinatorial mutagenic techniques. Methods for generating novel DNA-binding proteins which can selectively bind to a specific DNA sequence are known in the art (e.g. U.S. Pat. No. 5,198,346).
The basic requirements of the bait fusion protein include the ability to specifically bind a defined nucleotide sequence (i.e. a DNA binding site) upstream of the appropriate reporter gene. The bait fusion protein should cause little or no transcriptional activation of the reporter gene in the absence of an interacting prey fusion protein. It is also desirable that the bait not interfere with the ability of the DNA-binding domain to bind to its DNA binding site.
As appropriate, the DNA-binding domain used in the bait fusion protein can include oligomerization motifs. It is known in the art that certain transcriptional regulators dimerize. Dimerization promotes cooperative binding of the transcriptional regulators to their cognate DNA binding sites. For example, where the bait protein includes a LexA DNA-binding domain, it can further include a LexA dimerization domain; this optional domain facilitates efficient LexA dimer formation. Because LexA binds its DNA binding site as a dimer, inclusion of this domain in the bait protein also optimizes the efficiency of binding. Other exemplary motifs include the tetramerization domain of p53 and the tetramerization domain of BCR-ABL.
The nucleotide sequences encoding for the bait and prey fusion proteins are inserted into a vector such that the desired bait fusion protein can be produced in a host cell. Suitable recombinant expression vectors are known in the art. Preferably the recombinant expression vectors may include one or more regulatory sequences operably linked to the fusion nucleic acid sequence to be expressed. The term “regulatory sequence” includes promoters, enhancers and other expression control elements (e.g., polyadenylation signals) etc. Optionally, the vector can also include a selectable marker, the expression of which in the host cell permits selection of cells containing the marker gene from cells that do not contain the marker gene. Selectable markers are known in the art, e.g. neomycin, zeocin or blasticidin.
The vectors encoding for the bait and prey fusion proteins are preferably integrated into a chromosome of a cell.
It may also be preferred to introduce an unstructured polypeptide linker region between the DNA-binding domain of the fusion protein and the bait polypeptide sequence. The linker can facilitate, e.g., enhanced flexibility of the fusion protein allowing the DNA-binding domain to freely interact with the DNA binding site.
The prey fusion protein includes a transcriptional activation domain and a candidate interactor polypeptide sequence which is to be tested for its ability to form an intermolecular association with the bait polypeptide. As discussed above, protein-protein contact between the bait and prey fusion proteins (via the interaction of the bait and prey polypeptide portions of these proteins) links the DNA-binding domain of the bait fusion protein with the activation domain of the prey fusion protein, generating a protein complex capable of directly activating expression of the reporter gene.
Any of a number of activation domains can be used in the prey fusion protein. The activation domain can be a naturally occurring activation domain, e.g., an activation domain that is derived from a eukaryotic or prokaryotic source. Exemplary activation domains include GAL4, VP16, CR2, B112, or B117. The activation domain can also be derived from a virus, e.g., VP16 activation domain is derived from herpesvirus.
DNA sequences which encode the prey and the transcriptional activation domain, e.g., a VP16 activation domain, can also include other sequences such as a nuclear localization sequence (e.g., those derived from GAL4 or MATα2 genes). The nuclear localization sequence optimizes the efficiency with which prey proteins reach the nuclear-localized reporter gene construct.
The prey polypeptide can be any polypeptide, e.g., the prey polypeptide can be derived from all or a portion of a known protein or a mutant thereof, all or a portion of an unknown protein (e.g., encoded by a gene cloned from a cDNA library or an ORFeome), or a random polypeptide sequence.
To isolate DNA sequences encoding novel interacting proteins, members of a DNA expression library (e.g., a cDNA or synthetic DNA library) can be fused in-frame to the transcriptional activation domain to generate a variegated library of prey fusion proteins.
In an exemplary embodiment, a cDNA library may be constructed from an mRNA population and inserted into an expression vector. It is also noted that prey polypeptides need not be naturally occurring full-length proteins. In certain embodiments, prey proteins can be encoded by synthetic DNA sequences.
DNA sequences which encode the prey protein and the activation domain, e.g., the nucleic acid sequence which encodes the VP16 activation domain, are inserted into a vector such that the desired prey fusion protein is produced in a host mammalian cell. The vector can be any expression vector as described above. Where it is preferable to recover the prey sequence using a bacterial host cell, as described above, the prey DNA sequences are inserted into a vector which contains an appropriate origin of replication. An appropriate origin of replication is one which allows the vector to be maintained episomally and indefinitely without damaging the mammalian host cell or integrating the DNA sequence into the genomic DNA of the mammalian host cell. Since the vector is maintained episomally, it can be easily introduced into and recovered from a bacterial host cell. An example of such a suitable origin of replication is the Epstein-Barr virus replication origin sequence (oriP). In a preferred embodiment, a vector containing an oriP is transfected into a mammalian cell which expresses Epstein-Barr virus nuclear antigen-1 (EBNA-1). A vector containing an oriP can replicate stably in a mammalian cell that expresses EBNA-1 (Aiyar et al., EMBO Journal, 17:12:6394-6403).
Expression of the reporter gene indicates an interaction between the prey and bait polypeptides, and permits the identification of mammalian cells in which an interaction has occurred. The reporter gene sequence will include a reporter gene operably linked to a DNA binding site to which the DNA-binding domain of the bait fusion protein binds.
In a preferred embodiment of the invention, the reporter gene encodes a fluorescent molecule, e.g., a green fluorescent protein (GFP) or a blue fluorescent protein (BFP). The advantage of using a reporter gene that encodes a fluorescent protein is that a single individual fluorescent positive cell can be identified quickly. For example, using GFP as the reporter gene product, green fluorescence can be detected as early as 16 hours after transfection. Positive (fluorescent) cells can be identified using a fluorescence microscope, e.g., using an inverted phase-contrast microscope equipped with an epifluorescence light source and a fluorescein isothiocyanate filter set. Using this method, positive cells can be identified without damage to the cells, e.g., positive, green fluorescent cells can be easily isolated by conventional cell cloning methods, such as using small plastic cylinders to isolate cells, or collecting positive cells directly using a conventional micropipette. Alternatively, fluorescence-activated cell sorter (FACS) can be used to isolate positive cells. However, isolating positive cells by FACS is less preferable, since this approach will mix up the positive clones, and hence, may cause cloning bias. The total DNA from a positive clone can be prepared by standard procedures and the sequence which encodes the prey protein amplified using PCR and sequenced by standard procedures. A preferred fluorescent polypeptide is derived from a GFP.
Of course, in the method of the present invention any suitable reporter gene can be used. Examples include chloramphenicol acetyl transferase (CAT; Alton and Vapnek (1979), Nature 282:864-869), and other enzyme detection systems, such as beta-galactosidase; firefly luciferase (deWet et al. (1987), Mol. Cell. Biol. 7:725-737); bacterial luciferase (Engebrecht and Silverman (1984), PNAS 81:4154-4158; Baldwin et al. (1984), Biochemistry 23:3663-3667); phycobiliproteins (especially phycoerythrin); alkaline phosphatase (Toh et al. (1989) Eur. J. Biochem. 182:231-238, Hall et al. (1983) J. Mol. Appl. Gen. 2:101), secreted alkaline phosphatase (Cullen and Malim (1992) Methods in Enzymol. 216:362-368) or fluorescent proteins (e.g., GFP). Other examples of suitable reporter genes include those which encode proteins conferring drug/antibiotic resistance to the host cell.
The amount of transcription from the reporter gene may be measured using any suitable method. Various suitable methods are known in the art. For example, specific RNA expression may be detected using Northern blots, or specific protein product may be identified by a characteristic stain or an intrinsic activity.
In preferred embodiments, the protein encoded by the reporter is detected by an intrinsic activity associated with that protein. For instance, the reporter gene may encode a gene product that, by enzymatic activity, gives rise to a detection signal based on fluorescence, colour, or luminescence.
The term “polypeptide”, as used herein, refers to a proteinaceous molecule comprising at least 5 (preferably at least 7, more preferably at least 10 etc.) amino acid residues. The link between one amino acid residue and the next is an amide bond. The term “polypeptide” according to the present invention includes also peptides and proteins, terms used commonly in the art.
According to a preferred embodiment of the present invention the nucleic acid molecule encoding for the first fusion polypeptide and/or the nucleic acid molecule encoding for the second fusion polypeptide are comprised on at least one vector.
The nucleic acid molecules encoding for the first and the second fusion polypeptides may be present on a single vector, on more than one vector (preferably 2) or one or both nucleic acid molecules are integrated into the chromosomal/genomic DNA. Suitable vectors are known to the person skilled in the art and may be chosen depending on the host cell used and whether an integration into the genome of the host cell is desired.
In order to determine the properties of the first and the modified polypeptides the expression rate of the reporter polypeptide of the at least one cell identified in step c) is quantified. This quantification may occur directly (e.g. fluorescence) or indirectly (e.g. cell growth (cell density)). The quantification method depends on the reporter polypeptide used.
According to a preferred embodiment of the present invention the target polypeptide is selected from the group of receptors, structural proteins, transport proteins and enzymes, preferably cell surface receptors, transcription factors, and enzymes such as kinases, proteinases, phosphatases, other hydrolases, or translocases.
The bait portion of the bait fusion protein may be chosen from any protein of interest and includes proteins and/or polypeptides of unknown, known, or suspected diagnostic, therapeutic, or pharmacological importance. For example, the protein of interest can be a protein suspected of being an inhibitor or an activator of a cellular process (e.g. receptor signalling, apoptosis, cell proliferation, cell differentiation or import or export of toxins and nutrients). Examples of bait proteins include receptors, such as hormone receptors, neurotransmitter receptors, metabotropic receptors and ionotropic receptors, oncoproteins such as myc, Ras, Src and Fos, tumor-suppressor proteins such as p53, p21, p16 and Rb (Knudsen et al., Oncogene, 1999, 18:5239-45), proteins involved in cell-cycle regulation such as kinases and phosphatases, or proteins involved in signal transduction, like T-cell signalling, e.g. Zap-70 or SAM-68. Usually the full length of the protein of interest is used as the bait protein. In cases where the protein of interest is of a large size, e.g. has a molecular weight of over 20 kDa, it may be more convenient to use a portion of the protein.
According to a preferred embodiment of the present invention the nucleotide sequences encoding the at least one first polypeptide are obtained from peptide and/or polypeptide libraries and/or from cDNA, genomic DNA or Expressed Sequence Tags (ESTs) of at least one organism and/or derived from the ORFeome of at least one organism and/or from artificial genomic or artificial nucleic acid libraries. In this context “organisms” include also viruses.
The prey moiety (i.e. at least one first polypeptide) of the prey fusion protein may be any polypeptide comprising at least 5 amino acid residues. The polypeptide can be derived from several sources whereby it is especially preferred to use polypeptide/peptide libraries or polypeptides/peptides from an ORFeome.
The polypeptide/peptide library-based method of the present invention starts with consensus-sequence peptides derived from binding partners in a defined protein-protein interaction starting preferably from naturally selected binders occurring in a complete ORFeome in the format of an ORFomer library (in which an ORFomer library is a library containing peptides and polypeptides derived from an ORFeome). The polypeptides are transformed through mutation into optimized binders of the target molecules in a selection and production procedure in an intracellular environment.
“ORFeome” or “ORFome” is the totality of open reading frames (ORFs) in an organism or virus. ORFs code for polypeptides and proteins and can be determined by sequence analysis of the nucleotide sequences of the organism or virus. This analysis may be facilitated by computer programs such as GenScan. However, a proteome that includes the ORFeome may also be characterized empirically from experimental data generated by techniques such as mass spectrometry. An advantage of performing the selection in a two-hybrid system starting from an ORFeome and using ORFomers is that all binary combinations of binders and targets are obtained in single cells. So far, peptide- or protein-affinity reagents have been obtained without considering the complete range of binding proteins from a complete ORFeome, whether homologous (from the same source as the target protein) or heterologous (from a different source than the target protein), that is, without taking into consideration naturally selected sources of affinity binders.
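As an illustration of how ORFs can be determined by sequence analysis, the following is a minimal sketch only. It is a naive ATG-to-stop scan of the three forward reading frames and is not the GenScan algorithm; the function name `find_orfs`, the `min_codons` threshold and the toy sequence are assumptions introduced for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=5):
    """Return (start, end, frame) tuples for ATG..stop ORFs on the forward strand.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons from ATG up to, but excluding, the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in the current reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Hypothetical toy sequence: two flanking bases, ATG, five GCT codons, a stop.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
orfs = find_orfs(toy)
```

A complete analysis would also scan the reverse complement and handle alternative start codons; both are omitted here for brevity.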
A cDNA library according to the present invention refers to a complete, or nearly complete, set of DNA copies of all the mRNAs contained within a cell or organism. cDNA is usually obtained by employing reverse transcriptase, which produces a DNA copy of each mRNA strand. These reverse-transcribed mRNAs are referred to as cDNA and are collectively known as the library.
cDNA libraries may be prepared from total or enriched poly(A)+ single-stranded mRNA that is converted into a double-stranded DNA copy of the message using reverse transcriptase. The cDNA fragments can be inserted into an appropriate plasmid, phage or cosmid vector for maintenance and cloning. The population of recombinant vectors will represent the entire set of expressed genes in the cell from which the RNA was isolated. One of the main advantages of using a cDNA library is that the introns are spliced out and the mRNA sequence can be used as a template to create cDNA to collect the preferred genes. Unlike standard cDNA libraries, in full-length cDNA libraries most of the cDNAs represent complete original mRNA molecules. These are produced by implementation of normal methods for cDNA library preparation. The main technologies are known as “cap-trapper” and “oligo-capping”.
The organism from which the cDNA and the ORFeome are derived is preferably a microorganism (including viruses), plant or animal, preferably a mammal.
According to a preferred embodiment of the present invention the at least one mutation of the nucleotide sequences of step e) is a point mutation, a deletion, an insertion, a DNA translocation, DNA shuffling, a DNA rearrangement or DNA multimerisation.
Due to the introduction of mutations into the nucleotide sequences encoding for the polypeptides or peptides binding to a target polypeptide (binders) and identified with the method according to the present invention, it is possible to change the properties of these binders in a way to create molecules exhibiting e.g. stronger binding affinity to the target polypeptide and thus having an increased inhibitory effect compared to the wild-type molecules. The introduction of mutations into the nucleotide sequences may be done with molecular biological methods known in the art.
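One of the mutation types named above, the point mutation, can be sketched computationally as follows. This is an illustrative enumeration only and does not describe any particular laboratory mutagenesis protocol; the function name `single_point_mutants` and the example sequence are assumptions made for this sketch.

```python
BASES = "ACGT"


def single_point_mutants(dna):
    """Return every sequence that differs from `dna` by exactly one base substitution."""
    dna = dna.upper()
    mutants = []
    for pos, original in enumerate(dna):
        for base in BASES:
            if base != original:
                # Replace the base at `pos` and keep the rest unchanged.
                mutants.append(dna[:pos] + base + dna[pos + 1:])
    return mutants


# Hypothetical six-base example: each position has three possible substitutions.
parent = "ATGGCT"
mutants = single_point_mutants(parent)
```

For a sequence of length n this yields 3n variants, which gives a feeling for how quickly a mutant library grows even before deletions, insertions or shuffling are considered.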
It is in particular preferred to mutate the identified polypeptide by truncating (deleting amino acid residues from the N- and/or C-terminus) or fragmenting its amino acid sequence. This allows the identification of, e.g., small or even the smallest polypeptides still binding to the target polypeptides. Short polypeptides are often used as inhibitors of target polypeptides because they are able to bind to the targets without showing other biological effects or being recognized as foreign by the immune system when administered to a mammal.
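The truncation strategy can be pictured as enumerating every fragment that remains after deleting residues from the N- and/or C-terminus. The sketch below is illustrative only: the function name `truncation_fragments` is an assumption, the default minimum length of 5 follows the definition of “polypeptide” given in this description, and the example peptide is the IPEP-21SA sequence (SEQ ID NO: 2).

```python
def truncation_fragments(peptide, min_len=5):
    """List (n_deleted, c_deleted, fragment) for all terminal truncations.

    A fragment is kept only if at least `min_len` residues remain.
    The untruncated peptide is included as the (0, 0, peptide) entry.
    """
    n = len(peptide)
    fragments = []
    for n_cut in range(n - min_len + 1):
        for c_cut in range(n - min_len - n_cut + 1):
            fragments.append((n_cut, c_cut, peptide[n_cut:n - c_cut]))
    return fragments


IPEP_21SA = "SYSLGSSFGSGAGSSSFSRTS"  # SEQ ID NO: 2
fragments = truncation_fragments(IPEP_21SA)
# Shortest candidates first, as one would test them when looking for a minimal binder.
shortest_first = sorted(fragments, key=lambda item: len(item[2]))
```

Each candidate would still have to be tested for binding in the two-hybrid system; the enumeration only defines the search space.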
According to another preferred embodiment of the present invention the vector is a prokaryotic hybrid vector, preferably a bacterial hybrid vector, or a eukaryotic hybrid vector, preferably a yeast, insect or mammalian hybrid vector and/or the cell is a prokaryotic cell, preferably a bacterial cell, or a eukaryotic cell, preferably a yeast, insect or mammalian cell.
Depending on the two hybrid systems used, the vectors and the host cells are chosen appropriately.
The present invention is further illustrated by the following figures and examples, however, without being restricted thereto.
a) Bovine serum albumin (BSA) (500 ng/μl). This protein served as negative control 1; b) Affinity-purified, GST-tagged Frag2-CDS recombinantly produced in E. coli. The amino acid sequence of Frag2-CDS is N-YSLGSSFGSGAGSSSFSRTSSSRAVVVKKIETRDGKLVSESSDVLPK-C (SEQ ID NO: 1); c) Human aortic elastin protein. This protein served as positive control 1 because elastin protein is known to bind to S. aureus; d) Human fibrinogen protein. This protein served as positive control 2 because fibrinogen protein is known to bind to S. aureus; e) Affinity-purified, GST-tagged S. aureus protein recombinantly produced in E. coli, methionine sulfoxide reductase msrB, NCBI protein GI:54041496. This protein served as negative control 2; f) Affinity-purified, GST-tagged S. aureus protein recombinantly produced in E. coli, NADPH-dependent 7-cyano-7-deazaguanine reductase, NCBI protein GI:81781951. This protein served as negative control 3; g) Synthetic IPEP-21SA. The sequence of IPEP-21SA is N-SYSLGSSFGSGAGSSSFSRTS (SEQ ID NO: 2); h) Synthetic control peptide. The sequence of the control peptide is N-EQRGELAIKDANAKLSELEAAL (SEQ ID NO: 3).
a) Comparison of growth strength on 50 mM 3-AT containing highly selective yeast media.
b) Comparison of beta-galactosidase expression by X-Gal beta-galactosidase overlay assays. Both measurements are indicative of the relative protein-protein interaction strength in the Y2H system. The negative control shows no growth on 3-AT-containing medium and no blue color development (no beta-galactosidase expression). The stronger binders produce at least 50% and 100% more beta-galactosidase than the original binder, respectively. Equal numbers of cells from each clone were incubated on all plates. The growth of Y2H yeast cells (clones harbouring the stronger binders) on 3-AT-containing selective medium increased significantly compared to the original binder.
The increasing incidence of antibiotic-resistant strains of S. aureus defines the need for alternatives to the current arsenal of antibiotics. To infect and colonize a host, and then cause disease, S. aureus uses several proteins, both its own and host-derived. Protein-protein interactions (PPI) are the basis of this human-pathogen “conversation” resulting in infection and disease. Revealing the complete protein interface (‘interactome’) between S. aureus and its host (man) would offer targets for new therapeutic and diagnostic approaches. Hence a research objective has been the search for the human proteins interacting with S. aureus surface adhesins, which are bacterial proteins involved in early steps of the infection.
S. aureus can adhere to components of the extra cellular matrix (ECM) of the host. This is accomplished through surface expressed and cell wall anchored protein adhesins, so called ‘microbial surface components recognising adhesive matrix molecules’ (MSCRAMMs). A structural organization of MSCRAMMs is shown in
Cna is able to bind to collagen substrates and collagenous tissues. Cna is not expressed by the majority of strains, and S. aureus strains which lack Cna expression are less virulent. A 55 kDa domain (AA 30-531) contains the collagen binding site in a 19 kDa sub domain (AA 151-318). Although the 19 kDa sub domain can bind to different types of collagen, (via a GPP containing triple helix in collagen), the 55 kDa domain shows a higher affinity to the substrate than the 19 kDa domain. A synthetic peptide mimicking this sub-domain inhibits collagen binding to bacteria.
EbpS is encoded by a 1461 bp ORF in the S. aureus strain MU50 (source: NCBI database; gi: 47208328). It differs from the above MSCRAMMs by its cell wall anchor (no LPXTG motif in the cell wall anchor). EbpS can bind elastin, which is one of the major protein components of the ECM. Hot spots of elastin expression are the lung, the skin and the blood vessels. EbpS binds to the N-terminal ⅓ of elastin through its N-terminal region (encoded by the first 609 bp). Recombinant EbpS, as well as a Fab fragment of an Ab which was raised against recombinant EbpS, inhibit binding of S. aureus to elastin.
FnbA and FnbB are expressed through two closely linked genes. FnbA and FnbB can bind immobilized fibronectin in vitro, and they contribute to the adherence of S. aureus to plasma clots and to surgical material that has been in contact with the host. The ligand binding domains (D domains) are located very close to the cell wall-spanning anchor in the C-terminal region of the surface protein. They consist of 3-5 repeats of a 37-38 AA motif. Individually, the domains can bind fibronectin with a low micromolar dissociation constant (Kd); in tandem they compose a high affinity domain with a dissociation constant of Kd=1.5 nM. In addition to the D-domains, two other fibronectin-interacting domains have been identified: DuA and DuB. Peptides which mimic domains of the D-region inhibit fibronectin binding by the pathogen. The binding domain interacts simultaneously with multiple sites of the N-terminus of fibronectin.
The bacterial binding site of fibronectin is located in the N-terminus and consists of 5 sequential fibronectin Type 1 modules (F1: a beta-sandwich of 2 antiparallel beta-sheets). The main binding site for the binding D-domains of S. aureus is a F1F1 module pair. This domain is bound by ligand-induced formation of additional antiparallel beta-strands in the binding domain of the FnbA/FnbB, which results in a tandem beta-zipper interaction. Electrostatic interactions are important and an antiparallel orientation for the beta-sheets of fibronectin
The aim of the following examples is to show that a protein interacting with another protein can be reduced to a “minimal interacting peptide” (i.e. the smallest peptide still binding specifically to the target protein), which can compete with the full length protein in binding to the target protein and therefore act as an inhibitor, as follows:
Experimental Approach:
Exemplified by
Exemplified by
Tested by In Vitro Pull-Down Analysis:
With the above approach, peptide binders of defined target proteins, exemplified by the bacterial adhesins, can be identified and can be subjected to modification and subsequent selection using the Y2H system. Therefore, the approach can be extended to other components of the host-pathogen interaction and even beyond that to interacting proteins within the bacterium or within the host cell, which are implicated in establishing a successful infection and colonization of the host. As the approach is of a general nature, it can be applied to any PPI in any system as a method to identify peptides binding to defined proteins.
1.1. Primary Y2H-PPI Hits
A Y2H-screen with the virulence factors of S. aureus (clumping factor B (ClfB), fibronectin binding protein (Fnbp), elastin binding protein (EbpS), and collagen binding protein (CNA)) was initiated.
Two different Y2H-screening libraries were used for this purpose, a genomic library of S. aureus and a cDNA library of human keratinocytes. Both libraries were made in-house by general library cloning strategies.
ClfB and FnbB were screened against the human cDNA library, whereas EbpS and CNA were screened against the S. aureus genomic library.
Each screen was performed separately and resulted in “positive Y2H colonies”. Positive Y2H colonies are yeast colonies which contain a pair of plasmids encoding proteins or protein fragments that can interact with each other. Interacting pairs of proteins or protein fragments can turn on a reporter gene. The reporter gene encodes a protein (e.g. the HIS3 protein product) which enables a yeast cell deficient in a distinct essential gene (e.g. HIS3, ADE2) to grow on yeast media lacking a substance, e.g. the amino acid histidine, or adenine.
The Y2H-positive colonies were lysed, and the plasmids responsible for interaction were isolated, amplified and re-introduced into a Y2H reporter strain. The re-introduction is done to reproduce the results in order to exclude artifactual activation of the reporter genes (e.g. genomic mutations that enable cell growth without PPI).
Plasmids from reproduced Y2H-positive colonies were sequenced to identify the gene translated in yeast into a protein capable of interacting with the virulence factors. Protein and gene identification is done using the BLAST alignment/search tool (http://www.ncbi.nlm.nih.gov/BLAST/).
The screenings delivered PPIs, which were not publicly known to interact with the virulence factors ClfB, FnbB, EbpS or CNA. A summary of the genes encoding for PPI-partners can be seen in Table 1.
Homo sapiens chaperonin containing
1.2. Evaluation and Confirmation of the Quality of Identified PPIs
The candidate PPI-partners were analysed within the Y2H-system to determine whether a) the correct reading frame of the gene is maintained, b) the orientation of the gene insert is correct, c) the gene insert corresponds to gene-coding regions (5' or 3' untranslated regions are not relevant for proteins), and d) the insert corresponds to a genomic region that has not been annotated as a gene so far.
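The first three checks (reading frame, orientation, coding region) can be expressed as simple coordinate arithmetic once the sequenced insert has been aligned to an annotated gene. The following is a sketch under stated assumptions and not the procedure actually used: the class `InsertAlignment`, the function `evaluate_insert`, the parameter `junction_phase` (the number of insert bases, 0 to 2, consumed to complete the last codon begun in the vector) and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InsertAlignment:
    cds_start: int           # 0-based start of the annotated coding sequence
    cds_end: int             # end of the coding sequence (exclusive)
    insert_start: int        # 0-based start of the insert on the same reference
    insert_end: int          # end of the insert (exclusive)
    sense_orientation: bool  # True if the insert runs in the sense direction of the gene


def evaluate_insert(aln, junction_phase=0):
    """Check reading frame, orientation and overlap with the coding region."""
    overlap = min(aln.cds_end, aln.insert_end) - max(aln.cds_start, aln.insert_start)
    in_coding_region = overlap > 0
    # The first complete insert codon read by the fusion starts here.
    first_codon_start = aln.insert_start + junction_phase
    in_frame = (first_codon_start - aln.cds_start) % 3 == 0
    return {
        "in_frame": in_frame,
        "sense_orientation": aln.sense_orientation,
        "in_coding_region": in_coding_region,
        "passes": in_frame and aln.sense_orientation and in_coding_region,
    }


# Hypothetical coordinates: an insert starting 60 bases into the coding sequence.
good = evaluate_insert(InsertAlignment(100, 400, 160, 300, True))
shifted = evaluate_insert(InsertAlignment(100, 400, 161, 300, True))
utr_only = evaluate_insert(InsertAlignment(100, 400, 10, 90, True))
```

An insert that passes these checks can still carry a premature stop codon, so translating the fused sequence remains a sensible additional control.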
For elimination of one distinct class of Y2H false-positives (spurious activation of the reporter system within a Y2H-system that leads to the arising of Y2H-positive colonies without an interaction, e.g. false conclusion due to proteins interacting with the transcriptional apparatus of the yeast cells) a confirming experiment was performed. The best method for the elimination of a Y2H false-positive is to use a technique that depends on a completely different mechanism (e.g. confirming a molecular genetic result by a biochemical approach, e.g. confirming Y2H-results by a so-called pull-down approach or immunoprecipitation experiment). This procedure enables the elimination of one class of Y2H-false positives, the class of “technical” false positives. Recombinant proteins were produced from the isolated plasmids from the PPI-partners in a rabbit reticulocyte-based translation system and additionally in E. coli cells. The produced recombinant proteins were used for pull-down and immunoprecipitation experiments. The PPIs confirmed by these techniques are listed in Table 2.
1.3 Characterization of the Interacting Proteins and Amino Acid Sequences Sufficient For an Interaction in the Y2H-System.
A Y2H-system is based on the interaction of two polypeptides, which can be either complete proteins (encoded by a full-length gene), domains of a protein or even fragments of a protein corresponding to peptides. Standard cDNA-libraries and genomic libraries used for Y2H-screening contain a mixture of nucleic acid sequences, ranging from complete full-length genes to gene-fragments encoding for a small portion of a protein. This diversity is caused by the construction procedure itself, which is a technical limitation of library constructions in general. Thus, a Y2H positive yeast colony can arise from interactions between proteins or fragments thereof (e.g. Protein-Protein, Protein-Protein domain, Protein-Peptide, Protein domain-Peptide etc.). The amino acid constituents of the interacting molecules were identified by comparing the sequence results of the isolated plasmids encoding for the identified PPI-partners with the sequence of the complete open reading frames of the full-length genes annotated in Genbank (http://www.ncbi.nlm.nih.gov/). Table 3 shows the full-length amino acid sequences of proteins used and identified in this Y2H-experiment. The amino acid sequences sufficient for an interaction within the Y2H-system are extracted and listed in Table 3.
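The comparison of an isolated fragment with a longer reference sequence amounts to locating one amino acid string inside another. The sketch below is illustrative only; the function name `locate_fragment` is an assumption. As example data it uses Frag2-CDS (SEQ ID NO: 1) and a longer sequence from the listing of this document that contains it.

```python
def locate_fragment(reference, fragment):
    """Return 1-based (first, last) residue positions of `fragment` in `reference`.

    Returns None if the fragment does not occur in the reference.
    """
    index = reference.find(fragment)
    if index < 0:
        return None
    return (index + 1, index + len(fragment))


# Longer sequence as it appears in the sequence listing (two lines joined).
reference = (
    "RKLLEGEESRLESGMQNMSIHTKTTSGYAGGLSSAYGGLTSPGLSYSLGSSFGSG"
    "AGSSSFSRTSSSRAVVVKKIETRDGKLVSESSDVLPK"
)
FRAG2_CDS = "YSLGSSFGSGAGSSSFSRTSSSRAVVVKKIETRDGKLVSESSDVLPK"  # SEQ ID NO: 1

position = locate_fragment(reference, FRAG2_CDS)
```

Exact string matching is sufficient only when the fragment is identical to the reference; sequencing errors or allelic variants would call for an alignment tool such as BLAST instead.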
MVGGGGVGGGLLENANPLIYQRSGERPVTAGEEDEQVPDSIDAREIFDLIRSINDP
EHPLTLEELNVVEQVRVQVSDPESTVAVAFTPTIPHCSMATLIGLSIKVKLLRSLPQ
RFKMDVHITPGTHASEHAVNKQLADKERVAAALENTHLLEVVNQCLSARS
aureus]
MDNCLAAAALNGVDRRSLQRSARLALEVLERAKRRAVDWHALERPKGCMGVLAR
EAPHLEKQPAAGPQRVLPGEKYYSSVPEEGGATHVYRYHRGESKLHMCLDIGNG
QRKDRKKTSLGPGGSYQISEHAPEASQPAENISKDLYIEVYPGTYSVTVGSNDLTK
KTHVVAVDSGQSVDLVFPV
aureus]
MDPNCSCATGGSCTCAGSCKCKECKCTSCKKSCCSCCPVGCAKCAQGCVCKGA
SEKCSCCA
aureus]
PRINSQLVAQQVAQQYATPPPPKKEKKEKVEKQDKEKPEKDKEISPSVTKKNTNKK
TKPKSDILKDPPSEANSIQSANATTKTSETNHTSRPRLKNVDRSTAQQLAVTVGNV
TVIITDFKEKTRSSSTSSSTVTSSAGSEQQNQSSSGSESTDKGSSRSSTPKGDMS
AVNDESF
aureus]
HGTLERDGVFCLLSDDHGASWRYGSGVSGIPYGQPKQENDFNPDECQPYELPD
GSVVINARNQNNYHCHCRIVLRSYDACDTLRPRDVTFDPELVDPVVAAGAVVTSS
aureus]
GIVFFSNPAHPEFRVNLTLRWSFSNGTSWRKETVQLWPGPSGYSSLATLEGSMD
GEEQAPQLYVLYEKGRNHYTESISVAKISVYGTL
aureus]
NNQQKTRILQGALEQGSNSQLMAVQYTETTSSISRNSGSELQVYYASPRSYQDFF
EAIRRRGDTFYVVSFRRDHLLLPATTHNKTTRPKMSIVLPAININENVINGQDYEVM
MQIDCQVMDTRILHIKSSSVPPYLRDQQRNQTNTFFGSPPAATEATHVVSTIPESL
Q
MAELMLLSEIADPTRFFTDNLLSPEDWGLQNSTLYSGLDEVAEEQTQLFRCPEQD
VPFDGSSLDVGMDVSPSEPPWELLPIFPDLQVKSEPSSPCSSSSLSSESSRLSTEP
SSEALGVGEVLHVKTESLAPPLCLLGDDPTSSFETVQINVIPTSDDSSDVQTKIEPV
SPCSSVNSEASLLSADSSSQAFIGEEVLEVKTESLSPSGCLLWDVPAPSLGAVQIS
MGPSLDGSSGKALPTRKPPLQPKPVVLTTVPMPSRAVPPSTTVLLQSLVQPPPVS
PVVLIQGAIRVQPEGPAPSLPRPERKSIVPAPMPGNSCPPEVDAKLLKRQQRMIKN
aureus]
RESACQSRRKKKEYLQGLEARLQAVLADNQQLRRENAALRRRLEALLAENSELKL
GSGNRKVVCIMVFLLFIAFNFGPVSISEPPSAPISPRMNKGEPQPRRHLLGFSEQE
PVQGVEPLQGSSQGPKEPQPSPTDQPSFSNLTAFPGGAKELLLRDLDQLFLSSDC
RHFNRTESLRLADELSGWVQRHQRGRRKIPQRAQERQKSQPRKKSPPVKAVPIQ
PPGPPERDSVGQLQLYRHPDRSQPAFLDAIDRREDTFYVVSFRRDHLLLPAISHNK
TSRPKMSLVMPAMAPNETLSGRGAPGDYEEMMQIECEVMDTRVIHIKISTVPPSLR
KQPSPTPGNATGGPLPVSAASQAHQASHQPLYLNHP
aureus]
SSSEKKLKT
MPEPAKSAPAPKKGSKKAVTKAQKKDGKKRKRSRKESYSVYVYKVLKQVHPDTGI
SSKAMGIMNSFVNDIFERIAGEASRLAHYNKRSTITSREIQTAVRLLLPGELAKHAV
SEGTKAVTKYTSAK
aureus]
YGKVSRRGGHQNSYKPY
aureus]
EAEEAPGARPQLQDAWRGPREPGPAGRGDGDSGRSQREGQGEGETQEAAAAA
RRQEQTLRDATMEVQRGQFQGRPVSVWDVLFSSYLSEAHRDELLAQHAAGALGL
PDLVAVLTRVIEETEERLSKVSFRGLRRQVSASELHTSGILGPETLRDLAQGTKTLQ
EVTEMDSVKRYLEGTSCIAGVLVPAKDQPGRQEKMSIYQAMWKGVLRPGTALVLL
EAQAATGFVIDPVRNLRLSVEEAVAAGVVGGEIQEKLLSAERAVTGYTDPYTGQQI
SLFQAMQKDLIVREHGIRLLEAQIATGGVIDPVHSHRVPVDVAYRRGYFDEEMNRV
aureus]
LADPSDDTKGFFDPNTHENLTYLQLLQRATLDPETGLLFLSLSLQ
MSSTLHSVFFTLKVSILLGSLLGLCLGLEFMGLPNQWARYLRWDASTRSDLSFQFK
TNVSTGLLLYLDDGGVCDFLCLSLVDGRVQLRFSMDCAETAVLSNKQVNDSSWH
FLMVSRDRLRTVLMLDGEGQSGELQPQRPYMDVVSDLFLGGVPTDIRPSALTLDG
VQAMPGFKGLILDLKYGNSEPRLLGSRGVQMDAEGPCGERPCENGGICFLLDGH
PTCDCSTTGYGGKLCSEDVSQDPGLSHLMMSEQ
aureus]
MAPSRNGMVLKPHFHKDWQRRVATWFNQPARKIRRRKARQAKARRIAPRPASG
PIRPIVRCPTVRYHTKVRAGRGFSLEELRVAGIHKKVARTIGISVDPRRRNKSTESL
QANVQRLKEYRSKLILFPRKPSAPKKGDSSAEELKLATQLTGPVMPVRNVYKKEKA
RVITEEEKNFKAFASLRMARANARLFGIRAKRAKEAAEQDVEKKK
aureus]
aureus]
DQRGNLSGNSHKHKGEAKEQERKKERSRSIDKDRKKKDKEREREQDKRKEKQK
REEKDFKFSSQDDRLKRKRESERTFSRSGSISVKIIRHDSRQDSKKSTTKDSKKHS
GSDSSGRSSSESPGSSKEKKAKKPKHSRSRSAEKSQRSGKKASRKHKSKSRSR
QRLIFAGKQLEDGRTLSDYNIQKESTLHLVLRLRGG
MQIFVKTLTGKTITLEVEPSD
aureus]
aureus]
FFFLKEFQVCADKVLGIESHHDFLVKVKVGKFMAKLAEHMFPKSQE
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQQRLIFAGKQLEDGRTLSD
YNIQKESTLHLVLRLRGG
MQIFVKTLTGKTITLEVEPSDTIENVKAKIQDKEGIPPDQ
aureus]
EQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHETTF
NSIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPERKY
aureus]
SVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF
GNVAGDSKNDPPMEAAGFTAQVIILNHPGQISAGYAPVLDCHTAHIACKFAELKEKI
DRRSGKKLEDGPKFLKSGDAAIVDMVPGKPMCVESFSDYPPLGRFAVRDMRQTV
aureus]
AVGVIKAVDKKAAGAGKVTKSAQKAQKAK
GPFVMRSTCRRCGGRGSIIISPCVVCRGAGQAKQKKRVMIPVPAGVEDGQTVRM
PVGKREIFITFRVQKSPVFRRDGADIHSDLFISIAQALLGGTARAQGLYETINVTIPP
GTQTDQKIRMGGKGIPRINSYGYGDHYIHIKIRVPKRLTSRQQSLILSYAEDETDVE
GTVNGVTLTSSGKRSTGN
aureus]
TQIAVPQPVAPSYSYATPTPQASFQSTSAPYPVIKELVVSAGESVQITLPKNEVQLN
aureus]
AYVLQEPPKGETYTYDWQLITHPRDYSGEMEGKHSQILKLSKLTPGLYEFKVIVEG
QNAHGEGYVNVTVKPEPRKNRPPIAIVSPQFQEISLPTTSTVIDGSQSTDDDKIVQY
HWEELKGPLREEKISEDTAILKLSKLVPGNYTFSLTVVDSDGATNSTTANLTVNKAV
DYPPVANAGPNQVITLPQNSITLFGNQSTDDHGITSYEWSLSPSSKGKVVEMQGV
RTPTLQLSAMQEGDYTYQLTVTDTIGQQATAQVTVIVQPENNKPPQADAGPDKEL
TLPVDSTTLDGSKSSDDQKIISYLWEKTQGPDGVQLENANSSVATVTGLQVGTYV
FTLTVKDERNLQSQSSVNVIVKEEINKPPIAKITGNVVITLPTSTAELDGSKSSDDKGI
VSYLWTRDEGSPAAGEVLNHSDHHPILFLSNLVEGTYTFHLKVTDAKGESDTDRT
TVEVKPDPRKNNLVEIILDINVSQLTERLKGMFIRQIGVLLGVLDSDIIVQKIQPYTEQ
STKMVFFVQNEPPHQIFKGHEVAAMLKSELRKQKADFLIFRALEVNTVTCQLNCSD
HGHCDSFTKRCICDPFWMENFIKVQLRDGDSNCEWSVLYVIIATFVIVVALGILSWT
VICCCKRQKGKPKRKSKYKILDATDQESLELKPTSRAGIKQKGLLLSSSLMHSESEL
DSDDAIFTWPDREKGKLLHGQNGSVPNGQTPLKARSPREEIL
MDLVRSAPGGILDLNKVATKLGVRKRRVYDITNVLDGIDLVEKKSKNHIRWIGSDLS
NFGAVPQQKKLQEELSDLSAMEDALDELIKDCAQQLFELTDDKENERLAYVTYQDI
HSIQAFHEQIVIAVKAPAETRLDVPAPREDSITVHIRSTNGPIDVYLCEVEQGQTSNK
RSEGVGTSSSESTHPEGPEEEENPQQSEELLEVSN
aureus]
aureus]
GHGGDPGLVSAYGAGLEGGVTGNPAEFVVNTSNAGAGALSVTIDGPSKVKMDCQ
ECPEGYRVTYTPMAPGSYLISIKYGGPYHIGGSPFKAKVTGPRLVSNHSLHETSSV
FVDSLTKATCAPQHGAPGPGPADASKVVAKGLGLSKAYVGQKSSFTVDCSKAGN
NMLLVGVHGPRTPCEEILVKHVGSRLYSVSYLLKDKGEYTLVVKWGDEHIPGSPY
RVVVP
SRQQVQLKAECNKGYVKVKQVGVNPTSIDSVVIGKDQEVKLQPGQVLHMVNELYP
YIVEFEEEAKNPGLETHRKRKRSGNSDSIERDAAQEAEAGTGLEPGSNSGQCSVP
LKKGKDAPIKKESLGHWSQGLKISMQDPKMQVYKDEQVVVIKDKYPKARYHWLVL
PWTSISSLKAVAREHLELLKHMHTVGEKVIVDFAGSSKLRFRLGYHAIPSMSHVHL
HVISQDFDSPCLKNKKHWNSFNTEYFLESQAVIEMVQEAGRVTVRDGMPELLKLP
aureus]
LRCHECQQLLPSIPQLKEHLRKHWTQ
SRQQVQLKAECNKGYVKVKQVGVNPTSIDSVVIGKDQEVKLQPGQVLHMVNELYP
YIVEFEEEAKNPGLETHRKRKRSGNSDSIERDAAQEAEAGTGLEPGSNSGQCSVP
LKKGKDAPIKKESLGHWSQGLKISMQDPKMQVYKDEQVVVIKDKYPKARYHWLVL
PWTSISSLKAVAREHLELLKHMHTVGEKVIVDFAGSSKLRFRLGYHAIPSMSHVHL
HVISQDFDSPCLKNKKHWNSFNTEYFLESQE
aureus]
VMEGKHSQILKLSKLTPGLYEFKVIVEGQNAHGEGYVNVTVKPEPRKNRPPIAIVSP
QFQEISLPTTSTVIDGSQSTDDDKIVQYHWEELKGPLREEKISEDTAILKLSKLVPG
NYTFSLTVVDSDGATNSTTANLTVNKAVDYPPVANAGPNQVITLPQNSITLFGNQS
TDDHGITSYEWSLSPSSKGKVVEMQGVRTPTLQLSAMQEGDYTYQLTVTDTIGQQ
ATAQVTVIVQPENNKPPQADAGPDKELTLPVDSTTLDGSKSSDDQKIISYLWEKTQ
GPDGVQLENANSSVATVTGLQVGTYVFTLTVKDERNLQSQSSVNVIVKEEINKPPI
aureus]
AKITGNVVITLPTSTAELDGSKSSDDKGIVSYLWTRDEGSPAAGEVLNHSDHHPILF
LSNLVEGTYTFHLKVTDAKGESDTDRTTVEVKPDPRKNNLVEIILDINVSQLTERLK
GMFIRQIGVLLGVLDSDIIVQKIQPYTEQSTKMVFFVQNEPPHQIFKGHEVAAMLKS
ELRKQKADFLIFRALEVNTVTCQLNCSDHGHCDSFTKRCICDPFWMENFIKVQLRD
GDSNCEWSVLYVIIATFVIVVALGILSWTVICCCKRQKGKPKRKSKYKILDATDQES
LELKPTSRAGRGPGCQSF
MNARGLGSELKDSIPVTELSASGPFESHDLLRKGFSCVKNELLPSHPLELSEKNFQ
LNQDKMNFSTLRNIQGLFAPLKLQMEFKAVQQVQRLPFLSSSNLSLDVLRGNDETI
GFEDILNDPSQSEVMGEPHLMVEYKLGYCNSVLFMETEGCILFIVIFVL
aureus]
aureus]
MQTAGALFISPALIRCCTRGLIRPVSASFLNSPVNSSKQPSYSNFPLQVARREFQT
SVVSRDIDTAAKFIGAGAATVGVAGSGAGIGTVFGSLIIGYARNPSLKQQLFSYAILG
FALSEAMGLFCLMVAFLILFAM
aureus]
MYSRKAMYKRKYSAAKSKVEKKKKEKVLATVTKPVGGDKNGGTRVVKLRKMPRY
YPTEDVPRKLLSHGKKPFSQHVRKLRASITPGTILIILTGRHRGKRVVFLKQLASGLL
LVTGPLVSIEFLYEEHTRNLSLPLQPKSISAIVKIPKHLTDAYFKKKKLRKPRHQEGEI
FDTEKEKYEITEQRKIDQKLWTHKFYQKSKLFLSSS
aureus]
MRALGQNPTNAEVLKVLGNPKSDEMNVKVLDFEHFLPMLQTVAKNKDQGTYEDY
VEGLRVFDKEGNGTVMGAEIRHVLVTLGEKMTEEEVEMLVAGHEDSNGCINYEEL
VRMVLNG
aureus]
aureus]
KEAGGSDEEQEKGSSSEKEGSEDEHSGSESEREEGDRDEASDKSGSGEDESSE
DEARAARDKEEIFGSDADSEDDADSDDEDRGQAQGGSDNDSDSGSNGGGQRS
RSHSRSASPFPSGSEHSAQEDGSEAAASDSSEADSDSD
MKGSAITGPVAKECADLWPRIASNAGSIA
aureus]
INYIKQSKVVLLEDLASQVGLRTQDTINRIQDLLAEGTITGVIDDRGKFIYITPEELAA
VANFIRQRGRVSIAELAQASNSLIAWGRESPAQAPA
aureus]
GV
aureus]
ADA
aureus]
aureus]
FMKLDTPATSDPLSEEKGGKKRKKQKQKLLFSTSVVHTK
RSRSRDRRKIDDQRGNLSGNSHKHKGEAKEQERKKERSRSIDKDRKKKDKERER
EQDKRKEKQKREEKDFKFSSQDDRLKRKRESERTFSRSGSISVKIIRHDSRQDSK
KSTTKDSKKHSGSDSSGRSSSESPGSSKEKKAKKPKHSRSRSAEKSQRSGKKAS
RKHKSKSRSR
aureus]
aureus subsp.
aureus Mu50].
RKLLEGEESRLESGMQNMSIHTKTTSGYAGGLSSAYGGLTSPGLSYSLGSSFGSG
AGSSSFSRTSSSRAVVVKKIETRDGKLVSESSDVLPK
aureus subsp.
aureus Mu50].
aureus subsp.
aureus Mu50].
RDPNAPQKGDDLQYTMTLTFEEAVFGTTKEISIRKDVTCETCHGDGAKPGTSKKT
CSYCNGAGHVAVEQNTILGRVRTEQVCPKCNGSGQEFEEACPTCHGKGTENKTV
KLEVKVPEGVDNEQQIRLAGEGSPGVNGGPAGDLYVVFRVKPSETFKRDGDDIYY
aureus]
NTKNEARDNHLKSGDFFGTDEFDKITFVTKSITESKVVGDLTIKGITNEETFDVEFN
GVSKNPMDGSQVTGIIVTGIINREKYGINFNQTLETGGVMLGKDVKFEASAEFSISE
aureus]
aureus]
MQYTMMKYAREHGATTYDFGGTDNDPDKDSEHYGLWAFKKVWGTYLSEKIGEF
DYVLNQPLYQLIEQVKPRLTKAKIKISRKLKRK
The sum of all open reading frames (ORFs) of all annotated genes of a given organism is called an ORFeome. By systematically cloning an ORFeome, one can construct a complete collection of genes for organism-wide research; the ORFeome thus serves as a resource. Constructing an ORFeome is a challenging undertaking, because each gene has to be amplified from a cDNA or genomic DNA source with specific primers (e.g. about 6000 primers are needed for S. aureus) and cloned one by one into plasmids. However, it enables more comprehensive and systematic research, because the role of each gene (more precisely, the role of each protein produced from a given gene) is investigated in one experiment and no gene is skipped. Furthermore, there is no bias due to unequal representation of expressed genes, as there is in cDNA libraries (genes expressed at low or high abundance). Libraries made from the ORFeome can be perfectly normalized for gene distribution and gene level. The ORFeome resource is a collection of individual genes cloned into a plasmid and arranged in a micro-well plate format. From each ORF different samples are maintained for short- and long-term storage:
The ORFeome can be used gene by gene, as a complete collection, sub-divided into parts (partitioned by expert knowledge into sets of genes with similar molecular or biochemical functions according to gene-ontology criteria, e.g. transcription factors, DNA repair, glycolysis) in a matrix format, or pooled into mixtures of ORF-containing plasmid libraries. The libraries resulting from such pooling are normalized for gene content and distribution and can be used for downstream applications, e.g. Y2H screening, gene-expression analysis or microarray analysis.
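One way to obtain a pool that is normalized for gene content is to combine the individual plasmid preparations in equal molar amounts. The following Python sketch is illustrative only and rests on assumptions not stated in this description: equimolar pooling as the normalization strategy, an average mass of about 660 g/mol per base pair of double-stranded DNA, and hypothetical sample names, concentrations and plasmid lengths.

```python
# Assumption: average molar mass of one double-stranded DNA base pair (g/mol).
DSDNA_G_PER_MOL_PER_BP = 660.0


def fmol_per_ul(conc_ng_per_ul, length_bp):
    """Convert a plasmid concentration from ng/ul to fmol/ul."""
    return conc_ng_per_ul * 1e6 / (length_bp * DSDNA_G_PER_MOL_PER_BP)


def pooling_volumes(plasmids, target_fmol):
    """Return the volume (ul) of each prep that contains target_fmol of plasmid."""
    return {
        name: target_fmol / fmol_per_ul(conc, length)
        for name, (conc, length) in plasmids.items()
    }


# Hypothetical preps: name -> (concentration in ng/ul, total plasmid length in bp)
preps = {"orf_0001": (100.0, 6600), "orf_0002": (50.0, 9900)}
volumes = pooling_volumes(preps, target_fmol=10.0)

for name, volume in volumes.items():
    print(f"{name}: {volume:.4f} ul")
```

Because every plasmid contributes the same number of molecules, longer or more dilute preparations simply require a larger volume.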
2.1. ORFeome Design
2.2. ORFeome Cloning
The aim was to construct a comprehensive and normalized Y2H library which contains all genes from an organism and is unique in composition.
3.1. Unspecific and Unselected ORFomer-Library
ORFomers are peptides derived from proteins encoded by the ORFeome. ORFomer libraries are used to find binders for a certain target protein or group of target proteins. Thus, the functionality of the library is determined by its ability to bind to (i.e. to have an affinity for) a protein or peptide. The complexity of an ORFomer library is comparable to that of random peptide libraries or antibody Fab-fragment libraries. However, ORFomers represent the diversity within all proteins of a given organism and are not restricted to a single class of proteins (such as antibodies).
The S. aureus ORFeome was used for the construction of an unspecific and unselected ORFomer library. The term "unspecific" means that the library contains a broad spectrum of peptides from protein binders but is not yet specific for a certain target. The term "unselected" means that the library has not been subjected to any selection or maturation steps (no iterative rounds of mutagenesis and selection). This type of ORFomer library is the basis for deriving all other ORFomer library types, and is produced as follows:
Exemplified by systematic mutagenesis: shortening of the gene encoding the interacting protein, resulting in N- and C-terminal truncations, until the interaction is lost, in order to identify the smallest peptide derived from the binding proteins that still interacts with the target protein.
This mini-library exemplifies a specific ORFomer library (specific because it is specialized for a distinct target, ClfB). The ORFomer library is mutagenized (5′ deletions) but has not undergone iterative rounds of mutagenesis and selection. All constituents of this mini-library are tested directly in a binary manner against ClfB.
ATG
TCCATCAGGGTGACCCAGAAGTCCTACAAGGTGTCCACCTCTGGCCCCCG
GGCCTTCAGCAGCCGCTCCTACACGAGTGGGCCCGGTTCCCGCATCAGCTCC
TCGAGCTTCTCCCGAGTGGGCAGCAGCAACTTTCGCGGTGGCCTGGGCGGCG
GCTATGGTGGGGCCAGCGGCATGGGAGGCATCACCGCAGTTACGGTCAACCA
GAGCCTGCTGAGCCCCCTTGTCCTGGAGGTGGACCCCAACATCCAGGCCGTG
CGCACCCAGGAGAAGGAGCAGATCAAGACCCTCAACAACAAGTTTGCCTCCTT
CATAGACAAGGTACGGTTCCTGGAGCAGCAGAACAAGATGCTGGAGACCAAGT
GGAGCCTCCTGCAGCAGCAGAAGACGGCTCGAAGCAACATGGACAACATGTTC
GAGAGCTACATCAACAACCTTAGGCGGCAGCTGGAGACTCTGGGCCAGGAGA
AGCTGAAGCTGGAGGCGGAGCTTGGCAACATGCAGGGGCTGGTGGAGGACTT
CAAGAACAAGTATGAGGATGAGATCAATAAGCGTACAGAGATGGAGAACGAAT
TTGTCCTCATCAAGAAGGATGTGGATGAAGCTTACATGAACAAGGTAGAGCTG
GAGTCTCGCCTGGAAGGGCTGACCGACGAGATCAACTTCCTCAGGCAGCTATA
TGAAGAGGAGATCCGGGAGCTGCAGTCCCAGATCTCGGACACATCTGTGGTG
CTGTCCATGGACAACAGCCGCTCCCTGGACATGGACAGCATCATTGCTGAGGT
CAAGGCACAGTACGAGGATATTGCCAACCGCAGCCGGGCTGAGGCTGAGAGC
ATGTACCAGATCAAGTATGAGGAGCTGCAGAGCCTGGCTGGGAAGCACGGGG
ATGACCTGCGGCGCACAAAGACTGAGATCTCTGAGATGAACCGGAACATCAGC
CGGCTCCAGGCTGAGATTGAGGGCCTCAAAGGCCAGAGGGCTTCCCTGGAGG
CCGCCATTGCAGATGCCGAGCAGCGTGGAGAGCTGGCCATTAAGGATGCCAA
CGCCAAGTTGTCCGAGCTGGAGGCCGCCCTGCAGCGGGCCAAGCAGGACATG
GCGCGGCAGCTGCGTGAGTACCAGGAGCTGATGAACGTCAAGCTGGCCCTGG
ACATCGAGATCGCCACCTACAGGAAGCTGCTGGAGGGCGAGGAGAGCCGGCT
GGAGTCTGGGATGCAGAACATGAGTATTCATACGAAGACCACCAGCGGCTATG
CAGGTGGTCTGAGCTCGGCCTATGGGGGCCTCACAAGCCCCGGCCTCAGCTA
CAGCCTGGGCTCCAGCTTTGGCTCTGGCGCGGGCTCCAGCTCCTTCAGCCGC
ACCAGCTCCTCCAGGGCCGTGGTTGTGAAGAAGATCGAGACACGTGATGGGA
AGCTGGTGTCTGAGTCCTCTGACGTCCTGCCCAAG
TGA
ACAGCTGCGGCAGC
CCCTCCCAGCCTACCCCTCCTGCGCTGCCCCAGAGCCTGGGAAGGAGGCCG
CTATGCAGGGTAGCACTGGGAACAGGAGACCCACCTGAGGCTCAGCCCTAGC
CCTCAGCCCACCTGGGGAGTTTACTACCTGGGGACCCCCCTTGCCCATGCCT
CCAGCTACAAAACAATTCAATTGCTTTTTTTTTTTGGTCCAAAATAAAACCTCAG
CTAGCTCTGCCAAACCC
CAGGAGCTGATGAACGTCAAGCTGGCCCTGGACATCGAGATCGCCACCTACA
GGAAGCTGCTGGAGGGCGAGGAGAGCCGGCTGGAGTCTGGGATGCAGAACA
TGAGTATTCATACGAAGACCACCAGCGGCTATGCAGGTGGTCTGAGCTCGGCC
TATGGGGGCCTCACAAGCCCCGGCCTCAGCTACAGCCTGGGCTCCAGCTTTG
GCTCTGGCGCGGGCTCCAGCTCCTTCAGCCGCACCAGCTCCTCCAGGGCCGT
GGTTGTGAAGAAGATCGAGACACGTGATGGGAAGCTGGTGTCTGAGTCCTCTG
ACGTCCTGCCCAAG
TGA
ACAGCTGCGGCAGCCCCTCCCAGCCTACCCCTCCT
GCGCTGCCCCAGAGCCTGGGAAGGAGGCCGCTATGCAGGGTAGCACTGGGA
ACAGGAGACCCACCTGAGGCTCAGCCCTAGCCCTCAGCCCACCTGGGGAGTT
TACTACCTGGGGACCCCCCTTGCCCATGCCTCCAGCTACAAAACAATTCAATT
GCTTTTTTTTTTTGGTCCAAAATAAAACCTCAGCTAGCTCTGCCAAACCC
ACCTACAGGAAGCTGCTGGAGGGCGAGGAGAGCCGGCTGGAGTCTGGGATG
CAGAACATGAGTATTCATACGAAGACCACCAGCGGCTATGCAGGTGGTCTGAG
CTCGGCCTATGGGGGCCTCACAAGCCCCGGCCTCAGCTACAGCCTGGGCTCC
AGCTTTGGCTCTGGCGCGGGCTCCAGCTCCTTCAGCCGCACCAGCTCCTCCA
GGGCCGTGGTTGTGAAGAAGATCGAGACACGTGATGGGAAGCTGGTGTCTGA
GTCCTCTGACGTCCTGCCCAAG
TGAACAGCTGCGGCAGCCCCTCCCAGCCTA
TACAGCCTGGGCTCCAGCTTTGGCTCTGGCGCGGGCTCCAGCTCCTTCAGCC
GCACCAGCTCCTCCAGGGCCGTGGTTGTGAAGAAGATCGAGACACGTGATGG
GAAGCTGGTGTCTGAGTCCTCTGACGTCCTGCCCAAG
TGA
ACAGCTGCGGCA
GCCCCTCCCAGCCTACCCCTCCTGCGCTGCCCCAGAGCCTGGGAAGGAGGC
CGCTATGCAGGGTAGCACTGGGAACAGGAGACCCACCTGAGGCTCAGCCCTA
GCCCTCAGCCCACCTGGGGAGTTTACTACCTGGGGACCCCCCTTGCCCATGC
CTCCAGCTACAAAACAATTCAATTGCTTTTTTTTTTTGGTCCAAAATAAAACCTC
AGCTAGCTCTGCCAAACCC
TACAGCCTGGGCTCCAGCTTTGGCTCTGGCGCGGGCTCCAGCTCCTTCAGCC
GCACCAGCTCCTCCAGGGCCGTGGTTGTGAAGAAGATCGAGACACGTGATGG
GAAGCTGGTGTCTGAGTCCTCTGACGTCCTGCCCAAG
TGA
TACAGCCTGGGCTCCAGCTTTGGCTCTGGCGCGGGCTCCAGCTCCTTCAGCC
GCACCAGCTCCTCCAGGGCCGTGGTTGTGAAGA
Mutagenesis followed by selection, and further rounds of mutagenesis and selection, of minimal interacting peptides in an ORFomer library results in specific and selected ORFomers, i.e. ORFomers selected for binding to a specific target.
5.1. Exemplified by Systematic Mutagenesis:
Shortening of the gene encoding the interacting protein, resulting in N- and C-terminal truncations, until the interaction is lost, in order to identify the smallest peptide derived from the binding proteins that still interacts with the target protein.
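The truncation strategy can be sketched in code. The following Python sketch is illustrative only: it enumerates in-frame N-terminal (5′) and C-terminal (3′) deletions of a coding sequence and translates each fragment with the standard genetic code. The function names, the `dN`/`dC` labels and the step size are hypothetical. The input is the 84-nucleotide stretch of the sequences listed in this description that translates to the 28-amino-acid peptide of SEQ ID NO: 59; the deletion end points actually used for the mini-library are not reproduced here.

```python
# Standard genetic code, built in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence, ignoring any incomplete trailing codon."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def truncation_series(cds, step_codons=1, min_codons=1):
    """Return (label, sequence) pairs for in-frame 5' and 3' deletions."""
    n_codons = len(cds) // 3
    series = []
    for removed in range(step_codons, n_codons - min_codons + 1, step_codons):
        series.append((f"dN{removed}", cds[3 * removed:3 * n_codons]))
        series.append((f"dC{removed}", cds[:3 * (n_codons - removed)]))
    return series


# Coding stretch for the 28-amino-acid peptide (taken from the listed sequences).
PARENT_CDS = (
    "TACAGCCTGGGCTCCAGCTTTGGCTCTGGC"
    "GCGGGCTCCAGCTCCTTCAGCCGCACCAGC"
    "TCCTCCAGGGCCGTGGTTGTGAAG"
)

print(translate(PARENT_CDS))  # YSLGSSFGSGAGSSSFSRTSSSRAVVVK
for label, fragment in truncation_series(PARENT_CDS, step_codons=2, min_codons=20):
    print(label, translate(fragment))
```

Each fragment would then be tested against the target in the Y2H system; the shortest fragment that still interacts defines the minimal binding peptide.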
The mini-library constructed in Example 4 was mutagenized but has not undergone iterative rounds of selection and mutagenesis. To demonstrate that repeated rounds of selection can separate binders from non-binders, two rounds of selection within the Y2H system were performed.
The plasmid harbouring the sequence encoding the shortest KRT fragment still capable of interaction was Frag2-binding domain, which encodes a peptide of 28 amino acids. This clone was identified 18 times among the 100 clones, and Frag2-CDS was identified 28 times among the 100 plasmids. Not a single clone of the plasmid encoding the 26-amino-acid peptide (Frag3) was identified. This experiment demonstrates that two rounds of selection within the Y2H system are sufficient to eliminate non-binders from a binder population within a functional (here, ClfB-specific) ORFomer library (see Table 8).
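The clone counts reported above can be tallied as follows. This Python sketch is illustrative only: the function name and the labels `Frag2-BD` and `other` are hypothetical, and the 54 clones not itemized in the text are lumped together as `other`, which is an assumption made solely to bring the total to 100.

```python
from collections import Counter


def clone_frequencies(clone_labels, fragments_of_interest):
    """Return the fraction of sequenced clones carrying each fragment."""
    counts = Counter(clone_labels)
    total = len(clone_labels)
    return {name: counts[name] / total for name in fragments_of_interest}


# Counts from the text: 18 x Frag2-binding domain, 28 x Frag2-CDS, 0 x Frag3.
# Assumption: the remaining 54 of the 100 clones are grouped as "other".
sequenced = ["Frag2-BD"] * 18 + ["Frag2-CDS"] * 28 + ["other"] * 54

freqs = clone_frequencies(sequenced, ["Frag2-BD", "Frag2-CDS", "Frag3"])
for name, fraction in freqs.items():
    print(f"{name}: {fraction:.2f}")
```

A fragment whose frequency falls to zero after selection, as for Frag3 here, is consistent with a non-binder having been eliminated from the pool.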
5.2. Exemplified by Random Mutagenesis
A Y2H-compatible ‘restriction-fragment gene-library’ was produced from cDNA encoding the interacting human proteins, with the aim of identifying (using the Y2H system) the smallest peptide derived from the binding proteins that still interacts with the target protein. Chemical synthesis is used to confirm the nature of the binding peptide. The peptide can be used for a variety of applications.
Here, both iterative rounds of selection and mutagenesis from a specific but unselected (still containing non-binders and weak binders) ORFomer library were performed. The resulting peptide is a selected mature ORFomer.
5.2.1. Mutagenesis
E. coli colonies
E. coli transformants on Petri-dishes
5.2.2. Iterative Rounds of Selection
6.1. Specificity of the Peptides in the Y2H-System
The Following Yeast Transformations Were Performed:
6.2. Specificity of the Peptides in the In Vitro-Pull Down Analysis
6.2.1. Recombinant Protein Production
6.2.2. Pull-Down Experiment Demonstrating Specificity of the Peptide
The results show that the GST-tagged peptide of 48 amino acids is pulled down (co-purified) only with the His-tagged ClfB protein. Neither BSA-coated nor GST-coated magnetic beads could pull down the GST-tagged 48-amino-acid peptide. The results clearly demonstrate that the recombinant peptide of 48 amino acids binds specifically to its target, the ClfB protein recombinantly produced in E. coli (see
7.1. Specific Binding of the PPI-Inhibitory Peptide to the Target ClfB Demonstrated by Dot-Blot Far-Western Approach.
Peptides were spotted onto a protein binding membrane and incubated with recombinant His-tagged ClfB protein. The irreversibly bound peptides can be targeted by additionally added ClfB. Bound ClfB protein is detected by a His-tag specific antibody (see
7.2. PPI-Inhibition by the Inhibitory Peptide
The results show that the recombinant GST-tagged keratin fragment (recombinant ORFomer) binds strongly to living pathogen cells, with affinities comparable to those of published virulence factor substrates, as shown by an adhesion assay. The results also show that IPEP-21 SA (the synthetic ORFomer) binds strongly to living pathogen cells compared to the synthetic control peptide.
The method according to the present invention was performed on the peptide YSLGSSFGSGAGSSSFSRTSSSRAVVVK (SEQ ID NO: 59), the original binding domain to ClfB, in order to provide molecules with increased relative protein interaction strength.
1. Construction of a Mutagenesis-Peptide Library
1.1 The nucleic acid sequence which encodes for the original binder peptide (amino acid sequence: YSLGSSFGSGAGSSSFSRTSSSRAVVVK (SEQ ID NO: 59)) was amplified from the Y2H prey plasmid pGADT7 by PCR.
1.2 The PCR product was mutagenized by error-prone PCR with the commercially available product Diversify® PCR Random Mutagenesis Kit from Clontech.
1.3 The mutagenized PCR product was cloned into the Y2H prey vector pGADT7 by restriction enzyme based cloning. At least 8×10^6 mutagenized clones were generated.
1.4. The mutagenesis procedure was repeated to gain a highly diverse peptide library. Random E. coli clones which harbour a mutagenized peptide encoding nucleic acid sequence were analysed by plasmid isolation and subsequent sequencing. Equal distribution of mutations can be seen over the complete length of the peptide (Table 11).
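To put the library size in perspective, the number of distinct nucleotide-substitution variants of the 28-codon (84-nucleotide) template can be compared with the number of clones, reading the clone count in step 1.3 as 8×10^6. The following Python sketch is a rough estimate under stated assumptions: it counts substitutions only (no insertions or deletions), ignores the mutational bias of error-prone PCR, and does not imply that every clone in the library is distinct.

```python
from math import comb


def variants_with_k_substitutions(length_nt, k):
    """Number of sequences differing from the template at exactly k positions."""
    return comb(length_nt, k) * 3 ** k


LENGTH_NT = 28 * 3        # 28 codons encoding the original binder peptide
LIBRARY_SIZE = 8_000_000  # clone count from step 1.3, read as 8x10^6

cumulative = 0
coverage = {}
for k in range(1, 5):
    cumulative += variants_with_k_substitutions(LENGTH_NT, k)
    coverage[k] = cumulative
    print(f"up to {k} substitution(s): {cumulative} variants")
```

Under these assumptions a library of this size exceeds the number of variants carrying up to three substitutions but is far smaller than the number carrying up to four.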
2. Selection in the Y2H System
2.1. The mutagenesis library was introduced into the Y2H strain AH109 by lithium acetate based transformation and a total number of 8.6×10^6 individual yeast transformants were generated. Yeast transformations were plated out onto protein-interaction selective media (yeast selective media lacking tryptophan, leucine, adenine, histidine and supplemented with 50 mM 3AT). Colonies that appeared on the plates were harvested from the plates by scraping and pooled to yield a mixture of pre-selected differently mutagenized original binders expressed in AH109 yeast cells (7×10^6 cells/μl).
2.2 Two aliquots of the pooled colonies were taken (corresponding to 7×10^7 cells) and inoculated in 2 different liquid Y2H selection media (25 ml of yeast selective media lacking tryptophan, leucine, histidine, adenine supplemented with 250 mM 3AT and additionally 25 ml of yeast selective media lacking tryptophan, leucine, histidine, adenine and 3AT).
2.3 The yeast liquid cultures were incubated 1 week at 28° C. and reached a cell density of ~10^11 cells in 25 ml in 0 mM 3AT and 5×10^9 in 250 mM 3AT.
2.4. Steps 2.2 and 2.3 were repeated (again an aliquot of the selected cells was inoculated into the same selection media and grown for 1 week at 28° C.).
2.5 Plasmids were isolated from the yeast cell pool and randomly taken clones were sequenced (Tables 12, 13). The observed mutations are not randomly distributed over the complete sequence.
2.6 Additionally, a third aliquot was taken from step 2.1, plated out onto Y2H selective media (yeast selective media lacking tryptophan, leucine, adenine, histidine and containing additionally 50 mM 3AT) and incubated at the very stringent temperature of 37° C. (an unusually high selection temperature compared to the standard Y2H selection temperature of 28° C.).
2.7 After 1 week incubation the plasmids from grown yeast colonies were isolated and sequenced (Table 14).
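Sequenced clones can be compared with the original binder position by position to see where substitutions accumulate. The following Python sketch is illustrative only; the function names are hypothetical. The sequences are the original binder (SEQ ID NO: 59) and the two selected peptides named in this example (SEQ ID NO: 97 and SEQ ID NO: 73); a real analysis would run over all sequenced clones.

```python
from collections import Counter

PARENT = "YSLGSSFGSGAGSSSFSRTSSSRAVVVK"  # SEQ ID NO: 59
SEQ_97 = "YSLGSSFGSGAGSSSLGRTSSSRAVVVK"  # SEQ ID NO: 97
SEQ_73 = "YSLGSSFGSGAGSSSSSRTSPSRAVVVK"  # SEQ ID NO: 73


def substitutions(parent, variant):
    """List substitutions as e.g. 'F16L' (parent residue, 1-based position, new residue)."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        f"{p}{pos}{v}"
        for pos, (p, v) in enumerate(zip(parent, variant), start=1)
        if p != v
    ]


def position_counts(parent, variants):
    """Count how many variants carry a substitution at each position."""
    counts = Counter()
    for variant in variants:
        for pos, (p, v) in enumerate(zip(parent, variant), start=1):
            if p != v:
                counts[pos] += 1
    return counts


print(substitutions(PARENT, SEQ_97))  # ['F16L', 'S17G']
print(substitutions(PARENT, SEQ_73))  # ['F16S', 'S21P']
print(sorted(position_counts(PARENT, [SEQ_97, SEQ_73]).items()))
```

Positions hit in several independent clones, such as position 16 in these two peptides, are candidates for residues that influence the interaction strength.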
3. Monitoring of the Increased Relative Protein Interaction Strength in the Y2H System
3.1. The plasmid harbouring the nucleic acid sequence that encodes for the peptide (YSLGSSFGSGAGSSSLGRTSSSRAVVVK (SEQ ID NO: 97)) was re-introduced into AH109 cells and tested for the relative interaction strength to ClfB by X-Gal overlay assays and growth strength on 3AT media (
4. Construction of a Mutagenesis-Peptide Library from Selected Clones in 2.5
4.1. The same strategy was used to construct a further mutagenesis-peptide library, however, instead of using a single nucleic acid sequence a mixture of nucleic acids was used as template for the error-prone PCR.
4.2. The plasmid preparation from step 2.5 contains a pool of pre-selected strong binders to ClfB; it was used as template for the error-prone PCR, and the resulting PCR product was again cloned into the Y2H prey plasmid pGADT7. A total of 10^7 mutagenized clones were obtained.
4.3. This mutagenesis procedure was repeated.
5. Selection in the Y2H System
5.1. Steps from 2.1 to 2.5 were repeated twice. Sequenced clones can be seen in table 5.
6. Monitoring of the Increased Relative Protein Interaction Strength in the Y2H System
6.1. The plasmid harbouring the nucleic acid sequence that encodes for the peptide (YSLGSSFGSGAGSSSSSRTSPSRAVVVK (SEQ ID NO: 73)) was re-introduced into AH109 cells and tested for the relative interaction strength to ClfB by X-Gal overlay assays and growth strength on 3AT media (
FSLGSSFGSGAG
FVK
PSSFSRTSSSRA
PSRAVVVK
Number | Date | Country | Kind |
---|---|---|---|
07450150.3 | Aug 2007 | EP | regional |
This application is a national phase application under 35 U.S.C. §371 of International Application No. PCT/EP2008/061362 filed 29 Aug. 2008, which claims priority to European Application No. 07450150.3 filed 30 Aug. 2007. The entire text of each of the above-referenced disclosures is specifically incorporated by reference herein without disclaimer.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP2008/061362 | 8/29/2008 | WO | 00 | 2/26/2010 |