The present application claims priority on EP 01 11 6756.6 filed on Jul. 19th, 2001, which hereby is incorporated by reference in its entirety.
Because of their high degree of specificity and broad target range, antibodies have found numerous applications in a variety of settings in basic research, clinical and industrial use, where they serve as tools to selectively recognize virtually any kind of substrate. However, despite their versatility there are intrinsic limitations in the use of antibody molecules for some important applications. For example, therapeutic or in vivo diagnostic antibody fragments require a long serum half-life in human patients to accumulate at the desired target, and they must, therefore, be resistant to precipitation and degradation by proteases (Willuda et al., 1999). Industrial applications often demand antibodies that can function in organic solvents, surfactants or at high temperatures, all of which pose severe challenges to the stability of these molecules (Dooley et al., 1998; Harris et al., 1994). There is also a size consideration, especially in clinical applications. Enhanced tumor penetration favors smaller molecules, thus making the large size of whole antibodies a potential liability in some treatment regimens. Furthermore, the high demand for, and the increasing number of, applications of antibodies require more efficient methods for their high-level production.
Single-chain Fv (scFv) fragments are one antibody format designed to circumvent some of these limitations (Bird et al., 1988; Huston et al., 1988). The size of these molecules is reduced to the antigen binding part of an antibody, and they contain the variable domains of the heavy and light chain connected via a flexible linker. Most scFv fragments can be easily obtained from recombinant expression in E. coli in sufficient amounts (Glockshuber et al., 1992; Plückthun et al., 1996). As production yields of these fragments are influenced by their stability, as well as solubility and folding efficiency, considerable efforts have been made to identify positions in scFv fragments critical for influencing their expression behavior (Knappik & Plückthun, 1995; Forsberg et al., 1997; Kipriyanov et al., 1997; Nieba et al., 1997).
The factors influencing the stability of antibody molecules have been studied mostly with scFv fragments (Wörn & Plückthun, 2001). The overall stability of scFv fragments depends on the intrinsic structural stability of VL and VH as well as on the extrinsic stabilization provided by their interaction (Wörn & Plückthun, 1999). For some scFvs, the stabilities of isolated VH and VL domains, as well as of the whole scFv fragment, have been measured and compared recently (Jäger et al., 2001; Jäger & Plückthun, 1999a; Wörn & Plückthun, 1999). The VH domain of the anti-HER2 scFv hu4D5-8, which was generated by loop grafting on a human VH3 consensus framework (Carter et al., 1992; Rodrigues et al., 1992), shows a free energy of unfolding of 14.4 kJ/mol (Jäger et al., 2001). This low thermodynamic stability is surprising at first glance, but there are several differences in framework residues of the VH3 consensus sequence introduced after the loop grafting to increase affinity to HER2 (Carter et al., 1992). The VH domain IcaH-01 of a catalytic antibody (Ohage et al., 1999) was engineered for stability by converting it to the consensus sequence (Steipe et al., 1994). Because of the frequent usage of VH3 domains, this overall consensus is heavily biased towards the VH3 consensus. Seven positions were identified and separately exchanged (Wirtz & Steipe, 1999).
ScFv fragments, as well as complete human antibodies against a broad variety of tailored antigens, can now be obtained from several antibody libraries (Griffiths et al., 1994; Vaughan et al., 1996; Knappik et al., 2000). The libraries are enriched by panning for antibody fragments that bind the desired target molecule, but the selection procedure is biased for additional factors such as expression behavior, toxicity of the expressed antibody construct to the bacterial host, protease sensitivity, folding efficiency, and stability. There are two conceivable solutions to make a diverse library of stable frameworks. The first is to use a single stable framework (Holt et al., 2000; Pini et al., 1998; Söderlind et al., 2000). These libraries use the germ line gene DP47 (Tomlinson et al., 1992) as the master framework for the VH domain, since this gene is well expressed in bacterial systems (Griffiths et al., 1994) and most frequently expressed in vivo in human individuals (de Wildt et al., 1999). The Griffiths library is built from a germline VH bank using in vitro generated CDR3 and FR4 sequences (Griffiths et al., 1994). The diversity has been reached by introducing various point mutations in the CDRs (Holt et al., 2000; Pini et al., 1998) or sampled CDRs from in vivo-processed gene sequences (Söderlind et al., 2000).
The second possibility to achieve a structurally diverse library of stable frameworks is to optimize the human consensus antibody frameworks further. Frameworks that differ in their framework 1 conformation (Honegger & Plückthun, 2001a; Jung et al., 2001; Saul & Poljak, 1993) may access a different range of CDR2 conformations (Saul & Poljak, 1993), while different framework 4 sequences affect CDR3 conformation. The Human Combinatorial Antibody Library (HuCAL, Knappik et al., 2000) consists of combinations of seven VH and seven VL synthetic consensus frameworks connected via a linker region forming 49 master genes (Knappik et al., 2000).
The basis for this library is a set of consensus sequences of the framework regions of the major VH- and VL-subfamilies (VH1, VH2, VH3, VH4, VH5, and VH6, Vκ1, Vκ2, Vκ3, Vκ4, Vλ1, Vλ2 and Vλ3). These subfamilies were identified from known germline sequences (VBASE, Cook & Tomlinson, 1995) with the VH1 subfamily further divided into VH1a and VH1b because of different CDR-H2 conformations. For each of the subfamilies, a consensus sequence for the framework regions was calculated from a database of all known rearranged antibody sequences belonging to that subfamily.
These 14 consensus sequences ideally represent the structural repertoire of human variable domain frameworks.
These consensus sequences, containing germline CDR1 and CDR2 sequences of the corresponding germline variable domain and identical CDR3s, were used for expression studies (Knappik et al., 2000). Thus, it could be shown that the individual VH and VL domains are well expressed and stable in E. coli. However, these studies, and studies on their individual performance in recombinant libraries (Hanes et al., 2000), showed that there are nevertheless striking differences between the individual variable domains when compared to each other.
Enhanced overall expression and stability of antibodies or fragments thereof is highly desirable for most applications of antibody libraries.
Thus, the technical problem of the present invention is to improve the relative stability, overall expression and solubility of antibodies or fragments thereof. The solution to the above mentioned technical problem is achieved by providing the embodiments characterized in the claims and disclosed hereinafter.
The technical approach of the present invention i.e. modifying one or more framework residues in a human variable heavy or light chain antibody domain of a particular subclass with reference to a VH or a VL domain, respectively, of another subclass, is neither provided nor suggested by the prior art.
The present invention provides antibodies having, inter alia, a modified framework region, using methods described and contemplated herein. Methods for mutating nucleic acid sequences are well known to the practitioner skilled in the art, including but not limited to cassette mutagenesis, site-directed mutagenesis, mutagenesis by PCR (see for example Sambrook et al., 1989; Ausubel et al., 1999).
In one aspect, the present invention provides isolated polypeptides (and isolated nucleic acid sequences encoding the same) that contain a VH domain selected from the group consisting of (i) a VH domain belonging to the VH1a subclass, wherein the VH domain contains an amino acid residue F at position 29 and/or L at position 89; (ii) a VH domain belonging to the VH1b subclass, wherein the VH domain contains the amino acid residue L at position 89; (iii) a VH domain belonging to the VH2 subclass, wherein the VH domain contains at least one amino acid residue selected from the group consisting of G at position 16, V at position 44, A at position 47, G at position 76, F at position 78, Y at position 90, R at position 97, E at position 99, wherein if R is at position 97, then E is at position 99; (iv) a VH domain belonging to the VH4 subclass, wherein the VH domain contains at least one amino acid residue selected from the group consisting of G at position 16, A at position 47, F at position 78, Y at position 90, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99; (v) a VH domain belonging to the VH5 subclass, wherein the VH domain contains at least one amino acid residue selected from the group consisting of L at position 89, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99; and (vi) a VH domain belonging to the VH6 subclass, wherein the VH domain contains at least one amino acid residue selected from the group consisting of V at position 5, G at position 16, I at position 58, F at position 78, Y at position 90 and R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99.
The present invention also provides isolated polypeptides (and isolated nucleic acid sequences encoding the same) that contain a VL domain selected from the group consisting of (i) a VL domain belonging to the VLκ2 subclass, wherein the VL domain contains the amino acid residue R at position 18, and wherein if R is at position 18, then T is at position 92; and (ii) a VL domain belonging to the VLλ1 subclass, wherein the VL domain contains the amino acid residue K at position 47.
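The residue lists in the two preceding paragraphs can be read as a lookup table keyed by subclass. The following is a minimal sketch in Python, assuming that a domain is supplied as a mapping from position number (in the numbering scheme used in this application) to one-letter residue code. The names CLAIMED_RESIDUES, PAIRED_RULES, matching_positions and paired_rules_ok, the ASCII subclass keys, and the example domain are illustrative assumptions and not part of the disclosure.

```python
# Residues recited above, keyed by subclass: position -> one-letter residue.
# "VLk2" and "VLl1" are ASCII stand-ins for the VLκ2 and VLλ1 subclasses.
CLAIMED_RESIDUES = {
    "VH1a": {29: "F", 89: "L"},
    "VH1b": {89: "L"},
    "VH2": {16: "G", 44: "V", 47: "A", 76: "G", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VH4": {16: "G", 47: "A", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VH5": {89: "L", 97: "R", 99: "E"},
    "VH6": {5: "V", 16: "G", 58: "I", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VLk2": {18: "R"},
    "VLl1": {47: "K"},
}

# Conditional pairs: if the first (position, residue) is present,
# the second (position, residue) must be present as well.
R97_E99 = ((97, "R"), (99, "E"))
PAIRED_RULES = {
    "VH2": [R97_E99],
    "VH4": [R97_E99],
    "VH5": [R97_E99],
    "VH6": [R97_E99],
    "VLk2": [((18, "R"), (92, "T"))],
}


def matching_positions(subclass, domain):
    """Return the sorted recited positions at which `domain` carries the recited residue."""
    recited = CLAIMED_RESIDUES[subclass]
    return sorted(pos for pos, res in recited.items() if domain.get(pos) == res)


def paired_rules_ok(subclass, domain):
    """Check every 'if X is at position i, then Y is at position j' condition."""
    for (pos_a, res_a), (pos_b, res_b) in PAIRED_RULES.get(subclass, []):
        if domain.get(pos_a) == res_a and domain.get(pos_b) != res_b:
            return False
    return True


# Hypothetical partial VH5 domain (toy data, not a sequence from the disclosure).
example = {89: "L", 97: "R", 99: "E"}
print(matching_positions("VH5", example), paired_rules_ok("VH5", example))
```

In this sketch a domain carries at least one recited residue when matching_positions returns a non-empty list, and the conditional clauses (R at 97 with E at 99, R at 18 with T at 92) are checked separately by paired_rules_ok.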
The nucleic acid sequences encoding the polypeptides of the invention can be used, e.g., for the construction of libraries of antibodies or fragments thereof. Libraries of antibodies or fragments thereof have been described in various publications (see, e.g., Vaughan et al., 1996; Knappik et al., 2000; U.S. Pat. No. 6,300,064, which are incorporated by reference in their entirety), and are well-known to one of ordinary skill in the art.
In the context of the present invention, the term “VH domain” refers to the variable part of the heavy chain of an immunoglobulin molecule. The term “VH . . . subclass” includes the subclass defined by the corresponding “VH . . . ” consensus sequence taken from the HuCAL (VH1a, VH1b, VH2, VH3, VH4, VH5, and VH6; Knappik et al., 2000) generated as described above. In this context, the term “subclass” refers to a group of variable domains sharing a high degree of identity and similarity represented by a consensus sequence of the major VH-subfamilies, wherein the term “subfamily” is used as a synonym for “subclass.” In the context of the present invention, the term “consensus sequence” refers to the HuCAL consensus genes. The determination whether a given VH domain is “belonging to a VH subclass” is made by alignment of the VH domain with all known human VH germline segments (VBASE, Cook & Tomlinson, 1995) and determination of the highest degree of homology using a homology search matrix such as BLOSUM (Henikoff & Henikoff, 1992). Methods for determining homologies and grouping of sequences according to homologies are well known to one of ordinary skill in the art. The grouping of the individual germline sequences into subclasses is done according to Knappik et al. (2000).
In the context of the present invention the term “VL domain” refers to the variable part of the light chain of an immunoglobulin molecule. The term “VL . . . subclass” refers to the subclass defined by the corresponding VL . . . consensus sequence taken from the HuCAL (Vκ1, Vκ2, Vκ3 and Vκ4 as well as Vλ1, Vλ2 and Vλ3; Knappik et al., 2000) generated as described above.
In this library, a consensus sequence for each of the major VL-subfamilies was generated from known antibody sequences (VBASE, Cook & Tomlinson, 1995). In the context of the present invention, the numbering of the amino acid residues is according to the structurally adjusted scheme of Honegger & Plückthun (2001b).
In the context of the present invention, the term “antibody” is used as a synonym for “immunoglobulin”. Antibodies or fragments thereof according to the present invention may be Fv (Skerra & Plückthun, 1988), scFv (Bird et al., 1988; Huston et al., 1988), disulfide-linked Fv (Glockshuber et al., 1992; Brinkmann et al., 1993), Fab, (Fab′)2 fragments, single VH domains or other fragments well-known to the practitioner skilled in the art, which comprise at least one variable domain of an immunoglobulin or immunoglobulin fragment and have the ability to bind to a target.
The invention provides novel immunoglobulin sequences and methods for making the same. The present inventors surprisingly discovered a scheme for optimizing certain framework regions of an immunoglobulin of any variable heavy or light chain subclass, using the sequences of another subclass (i.e., subfamily) as a reference point. The present invention also relates to a method for the further modification of such optimized human variable domains comprising the steps of: (i) identifying for said domain the corresponding amino acid consensus sequence selected from the group of VH consensus sequences consisting of VH1a, VH1b, VH2, VH4, VH5, and VH6, and (ii) substituting one or more codons corresponding to amino acid residues of said consensus sequence into a corresponding position(s) in said nucleic acid sequence of said domain.
The following procedure describes a generally applicable method for improving the properties of any given human immunoglobulin heavy chain variable domain while keeping binding activity. (This method can be readily modified, using the guidance provided herein, to improve the properties of any given human immunoglobulin light chain variable domain). The first task is to compare each residue of the given domain to different subsets of immunoglobulin sequences. As the binding activity preferably is retained, residues of CDR1 (25-40), CDR2 (57-77), CDR3 (109-137) and the outer loop (84-87) are generally not considered (numbering scheme according to Honegger and Plückthun (2001b)). After determination of the framework 1 class, the subtype-determining (6, 7, 9, 10) and subtype-corresponding (19, 74, 78, 93) residues are compared to the consensus of sequences falling into the same class (Honegger and Plückthun, 2001a). The other residues are then compared to the consensus sequences of the VH domains with favorable properties (families 1, 3 and 5) (see Example 1, Knappik et al., 2000). Next, the differences in residues are analyzed using structure models (see Example 2). Mutations that increase the expression yield of soluble protein and/or thermodynamic stability, as seen in this study, include: (i) mutations which replace a non-glycine residue in a loop with a positive phi-angle to glycine, (ii) mutations of residues in a β-strand with low β-sheet propensity to a residue with high β-sheet propensity, (iii) mutations of solvent exposed hydrophobic residues to hydrophilic ones, and (iv) replacement of residues with unsatisfied H-bonds.
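The comparison step of this procedure can be sketched as follows. This is a minimal illustration in Python, assuming that the given domain and the reference consensus are both supplied as mappings from position number (Honegger and Plückthun numbering) to one-letter residue code. The function names and the toy residues are hypothetical and do not reproduce any sequence of the disclosure; the structure-based analysis of the differences is not modelled.

```python
# Position ranges generally not considered, as listed above (inclusive bounds):
# CDR1, CDR2, CDR3 and the outer loop.
EXCLUDED_RANGES = [(25, 40), (57, 77), (109, 137), (84, 87)]

# Positions compared against the consensus of the same framework 1 class,
# and therefore handled separately from the remaining framework residues.
SUBTYPE_DETERMINING = {6, 7, 9, 10}
SUBTYPE_CORRESPONDING = {19, 74, 78, 93}


def is_excluded(position):
    """True if the position lies in a CDR or in the outer loop."""
    return any(start <= position <= end for start, end in EXCLUDED_RANGES)


def candidate_differences(domain, consensus):
    """Return {position: (domain residue, consensus residue)} for the remaining
    framework positions at which the domain differs from the reference consensus."""
    differences = {}
    for position, residue in sorted(domain.items()):
        if is_excluded(position):
            continue
        if position in SUBTYPE_DETERMINING or position in SUBTYPE_CORRESPONDING:
            continue
        reference = consensus.get(position)
        if reference is not None and reference != residue:
            differences[position] = (residue, reference)
    return differences


# Toy data only (hypothetical residues).
domain = {5: "Q", 16: "S", 30: "T", 78: "V", 90: "F"}
consensus = {5: "V", 16: "G", 30: "S", 78: "F", 90: "Y"}
print(candidate_differences(domain, consensus))
```

Each difference returned by such a comparison is only a candidate; whether it is exchanged depends on the subsequent analysis with structure models.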
In a preferred embodiment, the present invention relates to a method for the modification of certain human VH domains belonging to a VH subclass which is not VH3, comprising the steps of: (a) identifying certain amino acid residues of said VH domain being different compared to the corresponding amino acid residues of the HuCAL VH3 domain, (b) replacing at least one of the differing amino acid residues by the corresponding amino acid residues of the HuCAL VH3 domain, provided that the replacing amino acid residue is not the consensus amino acid residue of said subclass.
This basic method is, in principle, also applicable to VL domains. For example, Vκ domains can be compared to the consensus sequence of Vκ3, as this domain displays the highest thermodynamic stability and expression yield of Vκ domains. The physical principles for rational design of Vλ domains are the same as with VH domains described above.
In a preferred embodiment, the present invention relates to an isolated polypeptide comprising a VH domain belonging to the VH1a subclass, wherein said VH domain comprises an amino acid residue F at position 29 and L at position 89.
In yet a further embodiment, the invention relates to an isolated polypeptide comprising a VH domain belonging to the VH1b subclass, wherein said VH domain comprises the amino acid residue L at position 89.
In a further preferred embodiment, the invention relates to an isolated polypeptide comprising a VH domain belonging to the VH2 subclass, wherein said VH domain comprises at least one amino acid residue selected from the group consisting of G at position 16, V at position 44, A at position 47, G at position 76, F at position 78, Y at position 90, R at position 97, E at position 99, wherein if R is at position 97, then E is at position 99.
In yet a further preferred embodiment, the invention relates to an isolated polypeptide comprising a VH domain belonging to the VH4 subclass, wherein said VH domain comprises at least one amino acid residue selected from the group consisting of G at position 16, A at position 47, F at position 78, Y at position 90, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99.
In yet a further preferred embodiment, the invention relates to an isolated polypeptide comprising a VH domain belonging to the VH5 subclass, wherein said VH domain comprises at least one amino acid residue selected from the group consisting of L at position 89, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99.
In a further preferred embodiment, the present invention relates to an isolated polypeptide comprising a VH domain belonging to the VH6 subclass, wherein said VH domain comprises at least one amino acid residue selected from the group consisting of V at position 5, G at position 16, I at position 58, F at position 78, Y at position 90, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99.
In yet a further preferred embodiment, the invention relates to an antibody or functional fragment thereof comprising any VH domain according to the present invention. Further preferred is a library of antibodies or functional fragments thereof comprising one or more antibodies or functional fragments thereof according to the present invention.
A library according to the present invention could be generated, starting from the HuCAL library (Knappik et al., 2000) by optimizing one or more of the VH and/or VL consensus sequences in accordance with the teaching of the present invention, and by introducing diversity into at least one CDR region in said optimized sequence, e.g. by using oligonucleotide cassettes synthesized using trinucleotide-directed mutagenesis as described in Knappik et al., 2000.
In yet a further preferred embodiment, the present invention relates to an isolated polypeptide comprising a VL domain belonging to the VLκ2 subclass, wherein said VL domain comprises the amino acid residue R at position 18, and wherein if R is at position 18, then T is at position 92.
In a further preferred embodiment, the present invention relates to an isolated polypeptide comprising a VL domain belonging to the VLλ1 subclass, wherein said VL domain comprises the amino acid residue K at position 47.
In yet a further preferred embodiment, the present invention relates to an antibody or a functional fragment thereof comprising a VL domain according to the present invention.
In a most preferred embodiment, the present invention relates to libraries of antibodies or functional fragments thereof comprising one or more antibodies or functional fragments thereof according to the present invention.
In a further preferred embodiment, the present invention relates to a method for the modification of a human VH domain belonging to the VH1a subclass by generating a modified VH domain comprising at least one amino acid residue exchange taken from the list of: (a) 29 to F and (b) 89 to L.
In yet a further embodiment, the invention provides for a method for the modification of a human VH domain belonging to the VH1b subclass by generating a modified VH domain comprising the amino acid residue exchange: 89 to L.
In a further embodiment, the invention relates to a method for the modification of a human VH domain belonging to the VH2 subclass by generating a modified VH domain comprising at least one amino acid residue exchange taken from the list of: (a) 16 to G; (b) 44 to V; (c) 47 to A; (d) 76 to G; (e) 78 to F; (f) 97 to R, provided that the amino acid residue 99 is, or is exchanged to E; and (g) 99 to E. Further preferred is a method for the modification of a VH domain belonging to the VH2 subclass, by generating a modified VH domain comprising the amino acid residue exchange 90 to Y.
In a further preferred embodiment, the invention relates to a method for the modification of a human VH domain belonging to the VH4 subclass by generating a modified VH domain comprising at least one amino acid residue exchange taken from the list of: (a) 16 to G; (b) 44 to V; (c) 47 to A; (d) 76 to G; (e) 78 to F; (f) 97 to R, provided that the amino acid residue 99 is, or is exchanged to E; and (g) 99 to E. Further preferred is a method for the modification of a human VH domain belonging to the VH4 subclass, by generating a modified VH domain comprising the amino acid residue exchange 90 to Y.
In a further preferred embodiment, the invention provides for a method for the modification of a human VH domain belonging to the VH5 subclass by generating a modified VH domain comprising at least one amino acid residue exchange taken from the list of: (a) 77 to R; (b) 89 to L; (c) 97 to R, provided that the amino acid residue 99 is, or is exchanged to E; and (d) 99 to E.
In yet a further embodiment, the invention provides for a method for the modification of a human VH domain belonging to the VH6 subclass by generating a modified VH domain comprising at least one amino acid residue exchange taken from the list of: (a) 5 to V; (b) 16 to G; (c) 44 to V; (d) 58 to I; (e) 72 to D; (f) 76 to G; (g) 78 to F and (h) 97 to R, provided that the amino acid residue 99 is, or is exchanged to E. Further preferred is a method for the modification of a VH domain belonging to the VH6 subclass, by generating a modified VH domain comprising the amino acid residue exchange 90 to Y.
In another embodiment, the present invention relates to a method for the modification of a VH domain, wherein 2 or more amino acid residues are exchanged.
In a further embodiment, the present invention provides for a method for the modification of a VH domain comprising the steps of (i) providing a nucleic acid molecule encoding said VH domain; (ii) mutating said nucleic acid molecule resulting in a modified nucleic acid molecule encoding said modified VH domain.
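The exchange lists recited above for the individual VH subclasses, together with the proviso attached to the exchange 97 to R, can be sketched as follows. This is a minimal illustration in Python that operates on the amino acid level for brevity, whereas the method itself mutates the encoding nucleic acid molecule. It assumes a domain supplied as a mapping from position number to one-letter residue code; EXCHANGES, apply_exchanges and the toy residues are hypothetical names and data.

```python
# Exchanges recited above, keyed by subclass: position -> replacing residue.
EXCHANGES = {
    "VH1a": {29: "F", 89: "L"},
    "VH1b": {89: "L"},
    "VH2": {16: "G", 44: "V", 47: "A", 76: "G", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VH4": {16: "G", 44: "V", 47: "A", 76: "G", 78: "F", 90: "Y", 97: "R", 99: "E"},
    "VH5": {77: "R", 89: "L", 97: "R", 99: "E"},
    # For VH6, 99 to E is taken from the proviso attached to 97 to R.
    "VH6": {5: "V", 16: "G", 44: "V", 58: "I", 72: "D", 76: "G", 78: "F",
            90: "Y", 97: "R", 99: "E"},
}


def apply_exchanges(domain, subclass, positions):
    """Return a modified copy of `domain` with the recited exchanges applied
    at the requested positions; the input mapping is left unchanged."""
    allowed = EXCHANGES[subclass]
    modified = dict(domain)
    for position in positions:
        if position not in allowed:
            raise ValueError(f"no exchange recited for position {position} in {subclass}")
        modified[position] = allowed[position]
    # Proviso: 97 to R requires that residue 99 is, or is exchanged to, E.
    if 97 in positions and modified.get(99) != "E":
        raise ValueError("exchange 97 to R requires E at position 99")
    return modified


# Toy VH5 fragment (hypothetical residues): exchange 97 and 99 together.
print(apply_exchanges({16: "S", 97: "K", 99: "Q"}, "VH5", [97, 99]))
```

Requesting several positions at once corresponds to the embodiment in which two or more amino acid residues are exchanged.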
In a preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention, comprising the step of substituting in a VH1a subclass domain at least one amino acid residue selected from the group consisting of F at position 29 and L at position 89.
In yet a further preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention, comprising the step of substituting in a VH1b subclass domain the amino acid residue L at position 89.
In a further preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention, comprising the step of substituting in a VH2 subclass domain at least one amino acid residue selected from the group consisting of G at position 16, V at position 44, A at position 47, G at position 76, F at position 78, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99. Further preferred is a method for obtaining the polypeptide according to the present invention, comprising the step of substituting in a VH2 subclass domain the amino acid residue Y at position 90.
In a further preferred embodiment, the present invention relates to a method for obtaining the polypeptide according to the present invention, comprising the step of substituting in a VH4 subclass domain at least one amino acid residue selected from the group consisting of G at position 16, V at position 44, A at position 47, G at position 76, F at position 78, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99. Further preferred is a method for obtaining the polypeptide according to the present invention, comprising the step of substituting in a VH4 subclass domain the amino acid residue Y at position 90.
In yet a further preferred embodiment, the present invention relates to a method for obtaining the polypeptide according to the present invention, comprising the step of substituting in a VH5 subclass domain at least one amino acid residue selected from the group consisting of R at position 77, L at position 89, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99.
In a further preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention, comprising the step of substituting in a VH6 subclass domain at least one amino acid residue selected from the group consisting of V at position 5, G at position 16, V at position 44, I at position 58, D at position 72, G at position 76, F at position 78, R at position 97, and E at position 99, wherein if R is at position 97, then E is at position 99. Further preferred is a method for obtaining a polypeptide according to the present invention, comprising the step of substituting in a VH6 subclass domain the amino acid residue Y at position 90.
In a further preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention, wherein 2 or more amino acid residues are substituted.
In yet a further preferred embodiment, the present invention relates to a method for obtaining the polypeptide according to the present invention, comprising the step of substituting in a VLκ2 subclass domain at least one amino acid residue selected from the group consisting of S at position 12, Q at position 45, and R at position 18, and wherein if R is at position 18, then T is at position 92.
In yet a further preferred embodiment, the present invention relates to a method for obtaining the polypeptide according to the present invention, comprising the step of substituting in a VLλ1 subclass domain at least one amino acid residue selected from the group consisting of K at position 47.
In a further preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention, comprising the step of substituting in a VLλ1, VLλ2 and VLλ3 domain the amino acid residue P at position 8. Further preferred is a method for obtaining a polypeptide according to the present invention, wherein P is at position 8, and further comprising the substitutions S at positions 7 and 9.
In a further preferred embodiment, the present invention relates to a method according to the present invention, wherein 2 or more amino acid residues are substituted.
In a further preferred embodiment, the present invention relates to a method for obtaining a polypeptide according to the present invention further comprising the step of expressing a modified nucleic acid molecule.
In a further preferred embodiment, the present invention relates to an isolated nucleic acid molecule encoding an inventive VH domain, an antibody or a functional fragment thereof, as disclosed or contemplated herein.
In a further preferred embodiment, the present invention relates to an isolated nucleic acid molecule encoding an inventive VL domain, an antibody or a functional fragment thereof, as disclosed or contemplated herein.
In a further preferred embodiment, the present invention relates to a method for producing a VL domain, antibody or a functional fragment thereof, as described or contemplated herein, comprising the step of expressing an isolated nucleic acid molecule of the present invention.
The invention also provides for conservative amino acid variants of the molecules of the invention. Variants according to the invention also may be made that conserve the overall molecular structure of the encoded proteins. Given the properties of the individual amino acids comprising the disclosed protein products, some rational substitutions will be recognized by the skilled worker. Amino acid substitutions, i.e. “conservative substitutions,” may be made, for instance, on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved.
For example: (a) nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; (b) polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; (c) positively charged (basic) amino acids include arginine, lysine, and histidine; and (d) negatively charged (acidic) amino acids include aspartic acid and glutamic acid. Substitutions typically may be made within groups (a)-(d). In addition, glycine and proline may be substituted for one another based on their ability to disrupt α-helices. Similarly, certain amino acids, such as alanine, cysteine, leucine, methionine, glutamic acid, glutamine, histidine and lysine are more commonly found in α-helices, while valine, isoleucine, phenylalanine, tyrosine, tryptophan and threonine are more commonly found in β-pleated sheets. Glycine, serine, aspartic acid, asparagine, and proline are commonly found in turns. Some preferred substitutions may be made among the following groups: (i) S and T; (ii) P and G; and (iii) A, V, L and I. Given the known genetic code, and recombinant and synthetic DNA techniques, the skilled scientist readily can construct DNAs encoding the conservative amino acid variants.
As used herein, “sequence identity” between two polypeptide sequences indicates the percentage of amino acids that are identical between the sequences. “Sequence similarity” indicates the percentage of amino acids that either are identical or that represent conservative amino acid substitutions.
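The two definitions above can be illustrated with a short calculation. The following is a minimal sketch in Python, assuming two sequences that have already been aligned to equal length without gaps, and assuming that a substitution counts as conservative when both residues fall within the same one of groups (a)-(d) listed above. The function names and the toy sequences are hypothetical.

```python
# Groups (a)-(d) from the text, in one-letter code.
GROUPS = [
    set("ALIVPFWM"),  # (a) nonpolar (hydrophobic)
    set("GSTCYNQ"),   # (b) polar neutral
    set("RKH"),       # (c) positively charged (basic)
    set("DE"),        # (d) negatively charged (acidic)
]


def same_group(a, b):
    """True if both residues belong to the same group (a)-(d)."""
    return any(a in group and b in group for group in GROUPS)


def identity_and_similarity(seq1, seq2):
    """Return (percent identity, percent similarity) for two aligned,
    ungapped sequences of equal length."""
    if len(seq1) != len(seq2) or not seq1:
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(zip(seq1, seq2))
    identical = sum(a == b for a, b in pairs)
    similar = sum(a == b or same_group(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)


# Toy sequences: A/A identical, S/T and K/R conservative, D/V neither.
print(identity_and_similarity("ASKD", "ATRV"))  # (25.0, 75.0)
```

Because identical residues also count towards similarity, the similarity value is never lower than the identity value.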
The invention also provides nucleic acids that hybridize under high stringency conditions to the VH and/or VL domains, antibodies or functional fragments thereof, according to the present invention. As used herein, highly stringent conditions are those that are tolerant of up to about 5-20% sequence divergence, preferably about 5-10%. Without limitation, examples of highly stringent (−10° C. below the calculated Tm of the hybrid) conditions use a wash solution of 0.1×SSC (standard saline citrate) and 0.5% SDS at the appropriate Ti below the calculated Tm of the hybrid. The ultimate stringency of the conditions is primarily due to the washing conditions, particularly if the hybridization conditions used are those that allow less stable hybrids to form along with stable hybrids. The wash conditions at higher stringency then remove the less stable hybrids. A common hybridization condition that can be used with the highly stringent to moderately stringent wash conditions described above is hybridization in a solution of 6×SSC (or 6×SSPE), 5×Denhardt's reagent, 0.5% SDS, 100 μg/ml denatured, fragmented salmon sperm DNA at an appropriate incubation temperature Ti. See generally Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d edition, Cold Spring Harbor Press (1989) for suitable high stringency conditions.
Stringency conditions are a function of the temperature used in the hybridization experiment and washes, the molarity of the monovalent cations in the hybridization solution and in the wash solution(s) and the percentage of formamide in the hybridization solution. In general, sensitivity by hybridization with a probe is affected by the amount and specific activity of the probe, the amount of the target nucleic acid, the detectability of the label, the rate of hybridization, and the duration of the hybridization. The hybridization rate is maximized at a Ti (incubation temperature) of 20-25° C. below Tm for DNA:DNA hybrids and 10-15° C. below Tm for DNA:RNA hybrids. It is also maximized by an ionic strength of about 1.5M Na+. The rate is directly proportional to duplex length and inversely proportional to the degree of mismatching.
Specificity in hybridization, however, is a function of the difference in stability between the desired hybrid and “background” hybrids. Hybrid stability is a function of duplex length, base composition, ionic strength, mismatching, and destabilizing agents (if any).
The Tm of a perfect hybrid may be estimated for DNA:DNA hybrids using the equation of Meinkoth et al (1984), as
Tm=81.5° C.+16.6(log M)+0.41(% GC)−0.61(% form)−500/L
and for DNA:RNA hybrids, as
Tm=79.8° C.+18.5(log M)+0.58(% GC)−11.8(% GC)²−0.56(% form)−820/L
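where M is the molarity of monovalent cations (0.01-0.4 M NaCl), % GC is the percentage of G and C nucleotides, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.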
Tm is reduced by 0.5-1.5° C. (an average of 1° C. can be used for ease of calculation) for each 1% mismatching.
The Tm may also be determined experimentally. As increasing length of the hybrid (L) in the above equations increases the Tm and enhances stability, the full-length rat gene sequence can be used as the probe.
Filter hybridization is typically carried out at 68° C., and at high ionic strength (e.g., 5-6×SSC), which is non-stringent, and followed by one or more washes of increasing stringency, the last one being of the ultimately desired high stringency. The equations for Tm can be used to estimate the appropriate Ti for the final wash, or the Tm of the perfect duplex can be determined experimentally and Ti then adjusted accordingly.
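As a worked illustration, the DNA:DNA estimate and the mismatch correction described above can be sketched in Python. Only the DNA:DNA equation is implemented. The function names, the reading of "log" as the base-10 logarithm, and the example inputs (0.1 M monovalent cations, 50% GC, no formamide, a 500 bp hybrid, 5% mismatching) are assumptions chosen for this sketch.

```python
import math


def tm_dna_dna(molarity, percent_gc, percent_formamide, length_bp):
    """Estimated Tm (deg C) of a perfect DNA:DNA hybrid.

    Implements Tm = 81.5 + 16.6(log M) + 0.41(% GC) - 0.61(% form) - 500/L,
    with log taken as log10 (an assumption of this sketch).
    """
    return (81.5
            + 16.6 * math.log10(molarity)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def tm_with_mismatch(tm_perfect, percent_mismatch, degrees_per_percent=1.0):
    """Lower Tm by 0.5-1.5 deg C per 1% mismatching (default: the 1 deg C average)."""
    return tm_perfect - degrees_per_percent * percent_mismatch


def incubation_temperature(tm, degrees_below_tm):
    """Ti chosen a given number of degrees below the estimated Tm."""
    return tm - degrees_below_tm


if __name__ == "__main__":
    tm = tm_dna_dna(molarity=0.1, percent_gc=50.0,
                    percent_formamide=0.0, length_bp=500)
    print(round(tm, 1))                         # perfect duplex
    print(round(tm_with_mismatch(tm, 5.0), 1))  # hybrid with 5% mismatching
    print(round(incubation_temperature(tm, 10.0), 1))
```

With these example inputs the perfect-duplex estimate is about 84.4° C., and allowing for 5% mismatching lowers it to about 79.4° C.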
In a further preferred embodiment, the present invention relates to a method for producing a VH domain, antibody or a functional fragment thereof, as described or contemplated herein, comprising the step of expressing an isolated nucleic acid molecule of the present invention.
In particular, such method comprises the steps of: (i) providing a nucleic acid molecule encoding a VH domain; (ii) mutating said nucleic acid molecule resulting in a modified nucleic acid molecule encoding a modified VH domain comprising at least one amino acid residue exchange. Methods for mutating nucleic acid sequences are well known to the practitioner skilled in the art, including but not limited to cassette mutagenesis, site-directed mutagenesis, and mutagenesis by PCR (see for example Sambrook et al., 1989; Ausubel et al., 1999).
Further preferred is a vector comprising an isolated nucleic acid molecule according to the present invention.
In yet a further preferred embodiment, the invention relates to a host cell harboring an isolated nucleic acid molecule according to the present invention or a vector according to the present invention.
In a further preferred embodiment, the VH domains according to the present invention can be used for all applications of antibodies including but not limited to the construction, generation, expression and screening of antibody libraries.
In a further preferred embodiment, the VL domains according to the present invention can be used for all applications of antibodies including but not limited to the construction, generation, expression and screening of antibody libraries.
In yet a further preferred embodiment, the present invention relates to an antibody or a functional fragment thereof (and methods of making the same), that contains any combination of a VH and VL domain described herein. For example, an antibody may comprise (i) a VH domain belonging to the VH1a subclass, wherein said VH domain comprises an amino acid residue F at position 29 and/or L at position 89; and (ii) a VL domain belonging to the VLκ2 subclass, wherein said VL domain comprises one or more of the following substitutions: S at position 12, Q at position 45, or R at position 18, provided that if R is at position 18, then T is at position 92.
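The residue conditions of the example above can be written as a small check, shown here as a Python sketch. It represents a domain as a mapping from position number to one-letter residue code. That representation, the function names, and the reading of the proviso (R at position 18 is acceptable only together with T at position 92, regardless of the other substitutions) are assumptions of this sketch and only one possible reading of the text.

```python
def vh1a_example_matches(residues):
    """True if the VH domain has F at position 29 and/or L at position 89.

    `residues` maps position number -> one-letter amino acid code.
    """
    return residues.get(29) == "F" or residues.get(89) == "L"


def vl_kappa2_example_matches(residues):
    """True if the VL domain carries at least one of S12, Q45 or R18,
    with R18 accepted only when T is present at position 92."""
    has_s12 = residues.get(12) == "S"
    has_q45 = residues.get(45) == "Q"
    has_r18 = residues.get(18) == "R"
    if has_r18 and residues.get(92) != "T":
        return False  # proviso violated under the reading assumed here
    return has_s12 or has_q45 or has_r18


if __name__ == "__main__":
    print(vh1a_example_matches({29: "F"}))
    print(vl_kappa2_example_matches({18: "R", 92: "T"}))
    print(vl_kappa2_example_matches({18: "R"}))
```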
In still a further preferred embodiment, the present invention relates to a library of antibodies or functional fragments thereof comprising one or more antibodies or functional fragments thereof, according to the present invention.
In a further preferred embodiment, the present invention relates to an isolated nucleic acid molecule encoding an antibody or functional fragment thereof according to the present invention.
In the following examples, all molecular biology experiments are performed according to standard protocols (Ausubel et al., 1999).
Construction of Expression Vectors
The starting points for all expression vectors were the scFv master genes of the HuCAL library in the orientation VH-(Gly4Ser)4-VL in the expression vector pBS13 (Knappik et al., 2000), which all carried H-CDR3 and L-CDR3 of the antibody hu4D5-8 (Carter et al., 1992).
The seven isolated human consensus VH domains were PCR amplified from the master genes and the CDR3 region between the BssHII and StyI restriction sites was then exchanged to code for a CDR-H3 found by metabolic selection (J. Burmester et al., unpublished results): YNHEADMLIRNWLYSDV. The final expression plasmids were derivatives of the vector pAK400 (Krebber et al., 1997), in which the expression cassette of the seven different VH domains had been introduced between the XbaI and HindIII restriction sites, and where the skp cassette (Bothmann & Plückthun, 1998) had been introduced at the NotI restriction site. The expression cassette consists of a phoA signal sequence, the short FLAG-tag (DYKD), one of the seven VH domains and a hexahistidine-tag.
The seven isolated human consensus VL domains were cut out from the master genes with the restriction enzymes EcoRV and EcoRI and ligated into a pAK400 derivative with these restriction sites. The L-CDR3 of the Vλ domains between the BbsI and MscI restriction sites was exchanged to QSYDSSLSGVV (107-138). This λ-like L-CDR3 is a consensus L-CDR3 from sequences found in the Kabat database (Kabat et al., 1991) for Vλ domains, in contrast to the κ-like L-CDR3 of hu-4D5-8 with the conserved cis-proline in position 136. The chosen length of the consensus λ-like L-CDR3 is found in 20% of the sequences, representing the highest percentage. The tryptophan at position 109, which is the most frequent residue with 54%, was exchanged to tyrosine, which is present in 20% of the sequences, to avoid interference with the native state fluorescence signal of the conserved unique tryptophan. The final expression cassette consists of a pelB signal sequence, one of the seven VL domains and a hexahistidine-tag.
The scFv fragments were cloned via the restriction sites XbaI and EcoRI into the expression plasmid pMX7. The κ-like L-CDR3 was exchanged in the Vλ domains as reported above. The final expression cassette consists of a phoA signal sequence, the short FLAG-tag (DYKD), one of the seven VH domains, a (Gly4Ser)4 linker, one of the seven VL domains, the long FLAG-tag (DYKDDDD) and a hexahistidine-tag.
Soluble Periplasmic Expression
dYT medium (30 ml containing 30 μg/mL chloramphenicol, 1.0% glucose) was inoculated with a single bacterial colony and incubated overnight at 25° C. One liter of dYT medium (30 μg/mL chloramphenicol, 50 mM K2HPO4) was inoculated with the preculture and incubated at 25° C. (5 L flask with baffles, 105 rpm). Expression was induced at an OD550 of 1.0 by addition of IPTG to a final concentration of 0.5 mM. Incubation was continued for 18 hours, when the cell density reached an OD550 between 8.0 and 11.0. Cells were collected by centrifugation (8000 g, 10 minutes at 4° C.), suspended in 40 ml of 50 mM Tris-HCl (pH 7.5) and 500 mM NaCl and disrupted by French Press lysis. The crude extract was centrifuged (48,000 g, 60 minutes at 4° C.), the supernatant passed through a 0.2 μm filter and directly applied to IMAC chromatography.
Preparative Two-Column Purification
The proteins were purified using the two-column coupled in-line procedure (Plückthun et al., 1996). In this strategy, the eluate of an immobilized metal ion affinity chromatography (IMAC) column, which exploits the C-terminal His-tag, was directly loaded onto an ion-exchange column. Elution from the ion-exchange column was achieved with a 0-800 mM NaCl gradient. The VH and Vκ domains were purified with an HS cation-exchange column in 10 mM MES (pH 6.0) and the Vλ domains and the scFv fragments with an HQ anion-exchange column in 10 mM Tris-HCl (pH 8.0). Pooled fractions were dialyzed against 50 mM Na-phosphate, pH 7.0, 100 mM NaCl.
Insoluble Periplasmic Expression
LB medium (30 ml, containing 30 μg/ml chloramphenicol, 1% glucose) was inoculated with a single colony and incubated overnight at 37° C. One liter of SB medium (10 μg/ml chloramphenicol, 0.1% glucose, 0.4 M sucrose) was inoculated with 10 ml of the preculture and incubated at 25° C. Expression was induced at an OD550 of 0.8 by addition of IPTG to a final concentration of 0.05 mM. Incubation was continued for about 15 hours at 25° C. After centrifugation, cells were suspended in 100 mM Tris-HCl, pH 8.0, 2 mM MgCl2 and disrupted by French Press lysis. Inclusion bodies were isolated following a standard protocol (Buchner & Rudolph, 1991). The inclusion body pellet from 1 l bacterial culture was solubilized at room temperature in 10 ml of solubilization buffer (0.2 M Tris-HCl, pH 8.0, 6 M guanidine hydrochloride (GdnHCl), 10 mM EDTA, 50 mM DTT). The resulting solution was centrifuged and the supernatant dialyzed against solubilization buffer without DTT at 10° C. The sample was loaded on a nitrilotriacetic acid column (Qiagen), which had been charged with Ni2+, and IMAC under denaturing conditions was performed. The eluate was diluted (1:10) into refolding buffer (0.5 M Tris-HCl, pH 8.5, 0.4 M arginine, 5 mM EDTA, 20% glycerol, 0.5 mM ε-amino-caproic acid, 0.5 mM benzamidinium-HCl) at 16° C. at a final protein concentration of 1 μM. The formation of disulfide bonds was catalyzed by the presence of reduced and oxidized glutathione in the refolding buffer at concentrations of [GSH]:[GSSG] of either 0.2:1 mM (oxidizing conditions) or 5:1 mM (reducing conditions). The refolding mixture was incubated at 16° C. for 20 hours and dialyzed against 50 mM Na-phosphate, pH 7.0, 100 mM NaCl.
Ni-NTA Batch Purification
Twenty mL of the supernatant of the French press lysis of the scFv fragments was incubated with 2 mL of a 50% Ni-NTA slurry for 30 min at room temperature. The suspension was applied to an empty column with a diameter of 1.5 cm and washed extensively with 50 mM sodium-phosphate (pH 7.0) and 1 M NaCl. To remove nonspecifically bound proteins, the column was washed with 30 mM imidazole. The scFv fragments were eluted by adding 250 mM imidazole. The purity of the samples was checked by SDS-PAGE analysis and the concentration was determined by absorbance at 280 nm. Four scFv fragments were purified in parallel with H3κ3 always as a control. The yield was normalized to the yield of H3κ3 and to a 1 L expression culture with an OD550 of 10.
Determination of Insoluble Protein Ratio
An aliquot of a French press lysis extract of a 1 L scFv fragment expression experiment was centrifuged at 4° C. for 30 minutes at 16000 g. The supernatant (soluble fraction) and the precipitate (insoluble fraction), which was resuspended in 50 mM Tris-HCl (pH 7.5) and 500 mM NaCl, were analyzed by SDS-PAGE followed by Western Blot with the anti-His antibody 3D5 as described (Lindner et al., 1997). Chemiluminescence was detected using a ChemiImager™ 4400 (Alpha Innotech Corporation) and the density of the bands was determined with the software ChemiImager™ 5500 (Alpha Innotech Corporation). As the method involves many steps, the error is possibly high, and therefore we give the values as a percentage of insoluble material, rounded to tens, with an estimated error of 10%.
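The reported quantity can be illustrated with a short Python sketch. It assumes that the percentage of insoluble material is the insoluble band density divided by the sum of the soluble and insoluble band densities. The function name and the example densities are hypothetical and are not measured values.

```python
def percent_insoluble(soluble_density, insoluble_density):
    """Percentage of insoluble material, rounded to the nearest ten.

    Assumes percentage = insoluble / (soluble + insoluble) * 100, with both
    band densities measured on the same blot in the same arbitrary units.
    """
    total = soluble_density + insoluble_density
    if total <= 0:
        raise ValueError("band densities must sum to a positive value")
    percent = 100.0 * insoluble_density / total
    # Round half up to a multiple of ten (avoids round-half-to-even surprises).
    return int(percent / 10.0 + 0.5) * 10


if __name__ == "__main__":
    # Hypothetical band densities in arbitrary units.
    print(percent_insoluble(soluble_density=74.0, insoluble_density=26.0))
```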
Gel Filtration Chromatography
Samples of purified proteins were analyzed on a gel filtration column equilibrated with 50 mM Na-phosphate, pH 7.0, 500 mM NaCl. The isolated VH domains and the scFv fragments at a concentration of 5 μM were injected on a Superdex-75 column (Pharmacia) and the isolated Vκ domains at a concentration of 50 and 5 μM on a Superose-12 column (Pharmacia) in a volume of 50 μL and a flow-rate of 60 μL/min on a SMART-system (Pharmacia). The Vλ domains were injected on a silica based TSK-Gel® G3000SWXL column (TosoH) on a HPLC system (HP) in a volume of 50 μL at a concentration of 5 μM and a flow rate of 0.5 mL/min. Lysozyme (14 kDa), carbonic anhydrase (29 kDa) and bovine serum albumin (66 kDa) were used as molecular standards. Elution was followed by detection of the absorbance at 280 nm in the case of the SMART-system and at 220 nm in the case of the HPLC system.
Ultracentrifugation
Sedimentation equilibria were determined with an XL-A analytical ultracentrifuge (Beckman). The samples were dialyzed against 10 mM sodium-phosphate (pH 7.0) and 100 mM NaCl overnight and loaded into a standard 6 channel 12 mm pathlength cell at a sample OD280 of 0.4. The fluorocarbon FC43 was added to each cell sector to provide a false bottom. The samples were run for 24 h at 20° C. at 19000 rpm. Data were collected at 280 nm at a radial spacing of 0.001 cm and a minimum of 10 scans were averaged for each sample. Data were analyzed with software provided by the instrument manufacturer using models that assumed either the presence of a single species or of a monomer-dimer equilibrium as described previously (Liu et al., 1998). Solvent densities and sample partial volumes were calculated using standard methods.
Expression and Protein Purification of VH Domains
The seven HuCAL consensus VH domains representing the major framework subclasses were expressed with the same CDR-H3 to enable the comparison of their biophysical properties. First the VH domains were investigated with the CDR3 from the antibody hu4D5-8 (WGGDGFYAMDY) (Carter et al., 1992), but the VH domains were insoluble when expressed on their own, and only a small inclusion body pellet was obtained. This was not surprising, as many if not most VH domains by themselves are insoluble upon periplasmic expression (Jäger et al., 2001; Jäger & Plückthun, 1999b; Wirtz & Steipe, 1999), since they contain an exposed large hydrophobic interface which is usually covered by VL. However, recently three isolated VH domains from the HuCAL (with framework classes VH1a, VH1b, and VH3) have been selected in a metabolic selection experiment. These could be expressed in the periplasm of E. coli and purified from the soluble fraction of the cell extracts. The main feature of the selected VH domains is the length of the CDR3, as all three selected and soluble VH fragments contain a longer CDR3. This long CDR3 may cover the hydrophobic interface of VH, thereby preventing aggregation. After introducing the CDR3 from one of the selected VH3 domains (YNHEADMLIRNWLYSDV), VH1a, VH1b and VH3 could be expressed in soluble form in the periplasm of E. coli and purified from the soluble fraction of the cell extracts with a yield of 2 mg/l.
In contrast, VH2, VH4, VH5 and VH6 were still insoluble in the E. coli periplasm. These domains were purified from the insoluble fraction with IMAC under denaturing conditions, and the eluted fractions were subjected to in vitro refolding. Approximately 1 mg soluble, refolded VH5 domain could be obtained from 1 l E. coli culture using an oxidizing glutathione redox shuffle. VH2, VH4 and VH6 could only be refolded using a redox shuffle with an excess of reduced glutathione and yielded about 0.2 mg soluble, refolded protein from 1 l E. coli. VH1a, VH1b, VH3 and VH5 remained in solution at 4° C. and no degradation was observed. In contrast, VH2, VH4 and VH6 have a high tendency to aggregate upon standing at 4° C. Therefore, all subsequent experiments were performed with freshly purified proteins.
Analytical Gel Filtration
Samples of purified VH domains were analyzed on a Superdex-75 column equilibrated with 50 mM Na-phosphate, pH 7.0, 100 mM NaCl, on a SMART-system (Pharmacia). The VH domains were injected at a concentration of 2 μM in a volume of 50 μl, and the flow-rate was 50 μl/min. Lysozyme (14 kDa), carbonic anhydrase (29 kDa) and bovine serum albumin (66 kDa) were used as molecular standards.
To analyze the oligomeric state of the purified domains in solution, analytical gel filtration experiments were performed. VH1b, VH3, and VH5 elute at the expected size of a monomer (
Equilibrium Denaturation Experiments of VH Fragments
Fluorescence spectra were recorded at 25° C. with a PTI Alpha Scan spectrofluorimeter (Photon Technologies, Inc., Ontario, Canada). Slit widths of 2 and 5 nm were used for excitation and emission, respectively. Protein/GdnHCl-mixtures (2 ml) containing a final protein concentration of 0.5 μM and denaturant concentrations ranging from 0 to 5 M GdnHCl were prepared from freshly purified protein and a GdnHCl stock solution (7.2 M, in 50 mM NaPO4, pH 7.0, 100 mM NaCl). Each final concentration of GdnHCl was determined from its refractive index. After overnight incubation at 10° C., the fluorescence emission spectra of the samples were recorded from 320 to 370 nm with an excitation wavelength of 280 nm. With increasing denaturant concentrations, the maxima of the recorded emission spectra shifted from about 342 to 348 nm. The fluorescence emission maximum was determined by fitting the fluorescence emission spectrum to a Gaussian function (isolated VH domain and scFv fragments), or the fluorescence intensity at 345 nm (isolated VL domains) was plotted versus the GdnHCl concentration. Protein stabilities for the isolated human consensus VH and VL domains were calculated as described (Jäger et al., 2001). To compare VH, VL and scFv denaturation curves in one plot, relative emission maxima and fluorescence intensities were scaled by setting the highest value to 1 and the lowest to 0.
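The scaling step used to overlay VH, VL and scFv curves in one plot can be sketched in Python as a simple min-max rescaling. The function name and the example values are illustrative assumptions.

```python
def scale_to_unit_range(values):
    """Rescale a denaturation curve so its lowest value is 0 and its highest is 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("curve is flat; cannot scale")
    return [(v - lowest) / (highest - lowest) for v in values]


if __name__ == "__main__":
    # Hypothetical emission maxima (nm) at increasing GdnHCl concentration.
    emission_maxima = [342.0, 342.0, 345.0, 348.0, 348.0]
    print(scale_to_unit_range(emission_maxima))
```

The same function applies unchanged to fluorescence intensities, since only the relative position between the lowest and highest value is retained.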
The thermodynamic stability of the seven human consensus VH domains was examined by GdnHCl equilibrium denaturation experiments. Unfolding of the VH domains was monitored by the shift of the fluorescence emission maximum as a function of denaturant concentration.
Expression and Protein Purification of VL Fragments
The four human consensus Vκ domains (Vκ 1, Vκ 2, Vκ 3 and Vκ 4) carrying the κ-like L-CDR3 from the antibody hu4D5-8 (sequence: HYTTP; Carter et al., 1992) were expressed in soluble form in the periplasm of E. coli. After purification with IMAC followed by a cation exchange column the Vκ domains could be obtained in high amounts, ranging from 17.1 mg/L bacteria culture normalized to an OD550 of 10 for Vκ3 to 4.5 mg/L for Vκ1 (Table 1).
The κ-like L-CDR3 has a conserved cis-proline at position 136 (numbering scheme for variable domain residues according to Honegger & Plückthun, 2001). The amino acid sequences of Vλ domains never show a proline at this position. Therefore, we used for these domains a human consensus λ-like CDR3 (sequence: YDSSLSGV). The three human consensus Vλ domains (Vλ1, Vλ2 and Vλ3) were also expressed in soluble form in the periplasm of E. coli, but the yield after purification with IMAC and anion exchange column was much smaller than for the Vκ domains, ranging from 1.9 mg/L bacteria culture normalized to an OD550 of 10 for Vλ2 to 0.3 mg/L for Vλ1 (Table 1).
Analytical Gel Filtration of VL Fragments
While the monomeric VH fragments elute at the expected molecular weight around 13 kDa (
To interpret these results from analytical gel filtration, the samples were also analyzed by equilibrium ultracentrifugation. The method was used to calibrate the elution values of the different columns for VL domains: Vκ3 and Vλ2 give results consistent with a monomer, while λ3 shows a dimer (shown in
Equilibrium Transition Experiments of VL Fragments
Most VL domains have only one tryptophan (the highly conserved Trp43), which is buried in the core in the native state. In GdnHCl denaturation under native conditions no emission maxima could be determined, because the fluorescence is fully quenched by the disulfide bond Cys23-Cys106. During unfolding the tryptophan becomes solvent exposed, giving a steep increase in fluorescence intensity. Therefore, the thermodynamic parameters were calculated using the 6-parameter fit (Pace & Scholtz, 1997) on the plot of concentration of GdnHCl vs. fluorescence intensity, giving curves consistent with two-state behavior. All VL domains show reversible unfolding behavior (data not shown). FIGS. 3(a) and 3(b) show relative fluorescence intensity plots against GdnHCl concentration of Vκ and Vλ domains. Vκ3 is the most stable VL domain with a ΔGN-U of 34.5 kJ mol−1, followed by Vκ1 with 29.0 kJ mol−1 and Vκ2 and Vλ1 with 24.8 and 23.7 kJ mol−1, respectively (Table 1). The least stable VL domains are Vλ2 and Vλ3 with a ΔGN-U of 16.0 and 15.1 kJ mol−1. All VL domains show m-values between 11.1 and 16.2 kJ mol−1 M−1, indicating that they have the cooperativity expected for a two-state transition (Myers et al., 1995). The human consensus Vκ4 carries an exposed tryptophan at position 58 in addition to the conserved Trp43, which is not quenched in the native state. The denaturation curve is fully reversible, but shows a steep pre-transition baseline followed by a non-cooperative transition. Because of this uncertainty, no ΔGN-U values for Vκ4 but only the midpoint of transition are reported, which is at 1.5 M GdnHCl. For the Vκ4 domain Len, a stability of 32 kJ/mol has been reported (Raffen et al., 1999).
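For orientation, a commonly used form of a six-parameter two-state model with linear native and unfolded baselines can be sketched in Python as follows. Whether this form matches the cited fit in every detail is an assumption, as are the function names, the temperature of 298.15 K, and the m-value of 15.0 kJ mol−1 M−1 used in the example. That m-value is hypothetical and was merely chosen from within the reported range. The sketch evaluates the model and does not perform a fit.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def fraction_unfolded(denaturant, dg_water, m_value, temp_k=298.15):
    """Fraction of unfolded protein for a two-state transition.

    Assumes the free energy of unfolding decreases linearly with denaturant:
    dG(D) = dg_water - m_value * D.
    """
    dg = dg_water - m_value * denaturant
    k_unfold = math.exp(-dg / (R_KJ * temp_k))
    return k_unfold / (1.0 + k_unfold)


def two_state_signal(denaturant, y_n, m_n, y_u, m_u, dg_water, m_value,
                     temp_k=298.15):
    """Observed signal with linear native (y_n, m_n) and unfolded (y_u, m_u) baselines.

    The six fitted parameters are y_n, m_n, y_u, m_u, dg_water and m_value.
    """
    f_u = fraction_unfolded(denaturant, dg_water, m_value, temp_k)
    native = y_n + m_n * denaturant
    unfolded = y_u + m_u * denaturant
    return (1.0 - f_u) * native + f_u * unfolded


def transition_midpoint(dg_water, m_value):
    """Denaturant concentration at which half of the protein is unfolded."""
    return dg_water / m_value


if __name__ == "__main__":
    # dG of 34.5 kJ/mol is the reported value for Vk3; m = 15.0 is hypothetical.
    print(round(transition_midpoint(34.5, 15.0), 2))
    print(round(two_state_signal(2.3, 0.0, 0.0, 1.0, 0.0, 34.5, 15.0), 2))
```

In practice the six parameters would be obtained by nonlinear least-squares fitting of this function to the measured intensities.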
Analysis of Primary Sequence and Model Structures
In the group of isolated VH fragments large differences are seen: VH3 shows the highest yield of soluble protein and thermodynamic stability, VH1a, VH1b and VH5 show intermediate yield and intermediate or low stability, while VH2, VH4 and VH6 show more aggregation-prone behavior and low cooperativity during denaturant-induced unfolding. The properties of Vκ and Vλ domains are more homogeneous. The thermodynamic stabilities differ by only approximately 10 kJ/mol within the group of Vκ domains and within the group of Vλ domains. In general, the stability and soluble yield are higher in isolated Vκ domains than in Vλ domains. To analyze possible structural reasons for this different behavior of the variable antibody domains, the primary sequence and the modeled structures of the seven human consensus VH and VL domains were analyzed. The models have been published previously (Knappik et al., 2000) for the VH domains (PDB entries: 1DHA (H1a), 1DHO (H1b), 1DHQ (H2), 1DHU (H3), 1DHV (H4), 1DHW (H5), and 1DHZ (H6)) and the VL domains (PDB entries: 1DGX (κ1), 1DH4 (κ2), 1DH5 (κ3), 1DH6 (κ4), 1DH7 (λ1), 1DH8 (λ2), 1DH9 (λ3)). The quality of the models varies for the different domains. Many antibody structures in the Protein Data Bank use, for example, the VH3 framework, and the chosen template structure for building the model shares 86% sequence identity excluding the CDR3 region (PDB entry: 1IGM) and the structural differences between templates could be traced to distinct sequence differences. In the case of VH6, the closest templates were human VH4 and murine VH8 domains, since no crystal structure of a member of the VH6 germline family is available in the PDB. Both germline families encode a different framework 1 structural subtype (I) than VH6 (III) (Honegger & Plückthun, 2001). The chosen template for VH6 (PDB entry: 7FAB) shares 62% sequence identity, excluding the CDR3 region, and belongs to human VH4.
Three questions regarding the domains in isolation came up: Why is VH3 so extraordinarily stable, why do VH2, VH4 and VH6 behave comparatively poorly concerning expression and aggregation, and why do Vκ domains give higher yields and show higher stability than Vλ domains?
Salt Bridges
Salt bridges between positively and negatively charged amino acids and repulsions between equally charged amino acids play an important role in protein stability (Nakamura, 1996).
In VL domains (
Hydrophobic Core Packing
Another important stabilizing factor is hydrophobic core packing (Pace, 1990). All model structures were checked for cavities, which would indicate improper packing leading to fewer van der Waals interactions and reduced thermodynamic stability. A van der Waals contact surface was generated for a water radius of 1.4 Å with the program Molmol (Koradi et al., 1996). When cavities were found, the surrounding residues were checked whether they would contribute hydrophobic surface area to the cavity. A cavity lined with hydrophobic residues would be less favorable as a water molecule would be energetically unfavorable at such a position. Based on these cavities and sequence comparisons between the different variable domain frameworks, positions in the hydrophobic core could be identified, which may lead to sub-optimal packing. In
Upper Core
The residues 2, 4, 25, 29, 31, 41, 80, 82, 89, and 108 form the upper core. In the sequence alignment shown in Table 2 these residues have been compared for the variable domains. In VH domains two sequence motifs can be distinguished: the VH3-like motif with two bulky aromatic residues at positions 29 and 31 (VH1b, VH3, VH5), the alternative location of the aromatic residues at 25 and 29 (VH2) and the VH4/VH6 motif with Trp at position 41 and a big aliphatic residue at position 25.
Lower Core
Within VH domains an interesting correlation is seen between stability and framework 1 classification after Honegger and Plückthun (Honegger & Plückthun, 2001), which influences hydrophobic core packing of the lower core (Saul & Poljak, 1993) and is determined by the type of amino acid in positions 6, 7 and 10 (Table 3). The most stable VH3 domain falls into subgroup II, while VH1a, VH1b and VH5 with intermediate properties fall into subgroup III (Table 3). The VH domains showing high inclusion body propensity and no cooperative denaturation, VH2 and VH4, fall into subgroup I. VH6 is a member of subgroup III because of its Gln at position 6 and the absence of Pro in position 7. However, previous experiments (Jung et al., 2001) have shown that Pro in position 10 destabilizes the domain.
Residues 19, 74, 78, 93, and 104 (Table 2) are part of the lower core, which is built of residues 13, 19, 21, 45, 55, 74, 77, 78, 91, 93, 96, 100, 102, 104 and 145. Only VH3, the most stable framework, has a bulky aromatic residue (Phe) at position 78. However, VH1a, VH1b, and VH5 have Phe at position 74, thereby simply switching the residues in positions 74 and 78, probably leading to similar interactions (
In VL domains only one framework 1 subtype is found (Honegger & Plückthun, 2001), and as a consequence, the lower core residues of Vκ and Vλ domains are almost the same and have similar orientations (Table 2 and
Residues Possibly Influencing Solubility and Folding Efficiency
Residues that could correlate with poor expression behavior and a high tendency to aggregate due to kinetic rather than thermodynamic reasons (Fink, 1998) were further examined. The analysis was started from a sequence alignment of the human consensus VH domains grouped by VH with good biophysical properties (VH1a, VH1b, VH3, VH5) and more aggregation prone VH domains (VH2, VH4, VH6) (Table 3).
It was shown previously that mutations of exposed hydrophobic residues do not change the solubility of the native scFv fragment, as determined by salting-out, but have a profound effect on the in vivo folding yield (Nieba et al., 1997). Position 5 is exposed to solvent and therefore the hydrophilic residue Gln or Lys of VH2, VH4, and VH6 might be thought to decrease the aggregation tendency in contrast to the hydrophobic Val in VH1a, VH1b, VH3, and VH5. Nevertheless, in a selection experiment favoring stability (Jung et al., 1999), Val was selected out of Val, Gln, Leu, and Glu in the scFv 4D5Flu, possibly indicating the importance of local secondary structure propensity.
VH2, VH4 and VH6 have a non-glycine residue with a conserved positive phi angle at position 16 (
For the antibody McPC603, it has been shown that the exchange of Pro47 to Ala, adjacent to another Pro at position 48, does not result in better thermodynamic stability, but enhances folding efficiency (Knappik & Plückthun, 1995). VH2 and VH4 also carry Pro at position 47. In VH6, the highly conserved hydrophobic core residue Ile is exchanged to Thr at position 58, which buries an unsatisfied hydrogen bond donor.
A proline residue in position H10 can have a strong influence on FR 1 conformation. VH structures can be classified into four subtypes with distinct FR 1 conformation and correlated differences in the packing of the lower core depending on the type of amino acid found in positions H6, H7 and H10 (Honegger & Plückthun, 2001a). To prove that these residues indeed cause the different conformations, Jung et al. (2001) introduced different H6/H7/H10 residue combinations into the same VH domain and determined the effect on the structure by X-ray crystallography. In their system, all combinations containing Pro in position 10 were destabilized compared to molecules containing a Gly, Ala or Ser in this position. While these constructs contained Pro in an “unnatural” combination with a VH-domain normally containing a different amino acid in this position, and therefore the destabilizing effect could also be due to a mismatch between local sequence and overall sequence context, the poorly behaved VH2, VH4 and VH6 all contain Pro10, while VH1a, VH1b, VH3 and VH5 have a Gly or Ala in this position.
At position 44 the even numbered VH domains carry Ile in contrast to Val of the odd numbered VH domains. This position is located at the interface to VL and should have no effect on the isolated domains, but it should have an effect when in complex with VL.
The exposed CDR 2 residue 60 of the even numbered VH domains is an aromatic bulky amino acid (Trp and Tyr) and probably decreases folding efficiency. This residue cannot be exchanged because of possible participation in antigen binding.
The solvent exposed residue 72 was changed in the antibody McPC603 from a hydrophobic residue Ala to Asp, which increases the soluble/insoluble ratio 20-fold but does not alter the thermodynamic stability (Knappik et al., 1995). VH6 carries a hydrophobic Val at this position.
The odd numbered VH domains have Gly at position 76 in contrast to the even numbered VH domains, which carry Thr or Ser. In half of the antibody structures found in the PDB, the residue at this position has a positive phi angle, indicating that glycine could be preferable at this position.
The semi-buried position 90 of VH1a, VH1b, VH3, and VH5 is occupied with Tyr, whereas VH2, VH4, and VH6 have Val or Ser. The influence of this substitution on the poor behavior of the even numbered domains can only be tested experimentally.
As the VL domains can be primarily grouped in κ and λ domains the analysis was concentrated on a comparison between these two groups. At the solvent exposed C-terminal end at positions 146, 148 and 149 Vκ domains have charged amino acids in contrast to Vλ domains, which have Thr, Leu and Gly, respectively, at these positions (Table 4,
Proline is an α-helix and β-strand breaker and thus destabilizes those secondary structures. Positions 12 and 18 in VL domains are both part of a β-sheet structure. Only Vκ2 has Pro at both positions while Ser and Arg, respectively, are the dominant residues at these positions in the other VL domains (Table 4,
Expression and Protein Purification of scFv Fragments
After the biophysical characterization of the isolated human consensus VH and VL domains, systematic combinations of VH and VL were also tested to understand their mutual influence on biophysical properties. For this purpose the scFv format was chosen, in which the VH domain is linked via a flexible peptide linker to the VL domain. To limit the number of combinations tested out of the 49 possible VH-VL pairs, the most stable VH domain, VH3, was combined with each of the seven human consensus VL domains and, conversely, the most stable VL domain, Vκ3, with each of the seven human consensus VH domains. The aim was to examine whether the individual biophysical properties of the isolated variable domains compensate each other or add up in the scFv fragment, or whether even synergistic effects can occur.
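The selection of combinations described above can be enumerated in a short sketch. The VH labels follow the consensus families named in the text; the three Vλ labels (λ1 to λ3) and all variable names are assumptions made for illustration only.

```python
from itertools import product

# Seven human consensus VH domains named in the text.
VH_DOMAINS = ["H1a", "H1b", "H2", "H3", "H4", "H5", "H6"]
# Seven consensus VL domains; the labels λ1 to λ3 are an assumption.
VL_DOMAINS = ["κ1", "κ2", "κ3", "κ4", "λ1", "λ2", "λ3"]

# Full combinatorial space: every VH paired with every VL.
all_pairs = list(product(VH_DOMAINS, VL_DOMAINS))

# Tested subset: VH3 with every VL, plus every VH with Vκ3.
tested = {("H3", vl) for vl in VL_DOMAINS} | {(vh, "κ3") for vh in VH_DOMAINS}

print(len(all_pairs))  # 49 possible scFv combinations
print(len(tested))     # 13, because H3κ3 belongs to both series
```

The two series share the fragment H3κ3, which is why 13 rather than 14 distinct scFv fragments result.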
All VH domains within the scFv fragment carry the same H-CDR3, which is derived from the VH domain of the well expressing antibody 4D5 (Knappik et al., 2000; Carter et al., 1992). The Vκ and Vλ domains in the scFv fragments carry the κ- and λ-like L-CDR3, respectively. All scFv fragments could be expressed in soluble form in the periplasm and purified with IMAC, followed by an anion exchange column. Purity of the fragments was over 98%, confirmed by SDS-PAGE analysis (data not shown) and the subsequent measurements were all carried out with freshly purified proteins. To compare the expression yield of the scFv fragments with the different VH or VL domains, we additionally isolated the scFvs with a batch method. To test the error inherent in the yield determination the scFv H3κ3 was purified 4 times independently. The yield of purified H3κ3 was 6.5±0.2 mg from a 1 L bacterial culture normalized to an OD550 of 10, which is approximately the final cell density in a shaken flask under these conditions. Yields of all scFv fragments tested were normalized to the yield of H3κ3 and were in the range of 2.6 to 12.4 mg/L (Table 5). H1aκ3 and H1bκ3 with 11.1 mg/L and 12.4 mg/L, respectively (1.7 and 1.9 fold the amount of H3κ3), show the highest yield, and H2κ3, H4κ3 and H6κ3 show the lowest yield of the scFv fragments with the Vκ3 domain, with 0.6, 0.4 and 0.6 fold that of H3κ3, respectively. All scFv fragments with VH3 but different VL domains show yields below that of H3κ3. The percentage of insoluble protein was determined for H3κ3 in 4 independent measurements to be (30±10) %. The other scFv fragments tested show a percentage of insoluble protein between 10% and 50% with the exception of H2κ3, H4κ3 and H6κ3, which show a percentage of insoluble protein between 80% and 90% (Table 5).
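The yield normalization used above can be illustrated with a minimal sketch. It assumes that normalization to an OD550 of 10 is a simple linear scaling with final cell density, which the text does not state explicitly; the function names and the raw example values (5.2 mg at an OD550 of 8.0) are hypothetical.

```python
REFERENCE_OD550 = 10.0   # cell density to which yields are normalized
H3K3_YIELD_MG = 6.5      # mean yield of H3κ3 quoted in the text (mg per L)

def normalize_to_od(raw_yield_mg, final_od550):
    """Scale a yield from a 1 L culture linearly to an OD550 of 10 (assumed rule)."""
    if final_od550 <= 0:
        raise ValueError("final OD550 must be positive")
    return raw_yield_mg * REFERENCE_OD550 / final_od550

def fold_of_reference(yield_mg, reference_mg=H3K3_YIELD_MG):
    """Express a normalized yield as a multiple of the H3κ3 yield."""
    return yield_mg / reference_mg

print(round(fold_of_reference(11.1), 1))  # 1.7 (H1aκ3)
print(round(fold_of_reference(12.4), 1))  # 1.9 (H1bκ3)
print(round(fold_of_reference(2.6), 1))   # 0.4 (lower end of the quoted range)
```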
Analytical Gel Filtration of scFv Fragments
H3κ3 elutes from an analytical gel filtration column Superdex-75 at a protein concentration of 5 μM in 50 mM sodium phosphate (pH 7.0) and 500 mM NaCl with an apparent molecular weight of 29 kDa, which indicates that H3κ3 is monomeric in solution. The other scFv fragments with VLκ3 as the VL domain are also monomeric under these conditions, with the exception of H1aκ3, which shows besides the monomer peak also smaller dimer and multimer peaks. H4κ3 shows in addition a small amount of dimer of less than 10%.
Equilibrium Unfolding Experiments of scFv Fragments
Unfolding and refolding of the scFv fragments as a function of denaturant concentration was monitored by the shift of the maximum of the fluorescence emission after excitation at 280 nm. Each scFv fragment shows reversible unfolding behavior (data not shown). The denaturation of the scFv fragments is usually not a two-state process (Wörn & Plückthun, 2001), because the scFv fragments are built from two domains, which may have different intrinsic stabilities and interact over an interface region and can potentially stabilize each other. Therefore, no ΔGN-U values are reported, but instead the midpoints of the transitions of denaturation are given, which are a semi-quantitative measure for the stability of the scFv fragments. The assignment of the transitions to VH or VL domain results from the determination of the transition of single domains (Table 1). In Table 5 the midpoints are listed for the VH and VL domain within the scFv fragments. If only one transition is visible, the midpoint is assigned to both the VH and VL domain.
With the knowledge of the denaturation properties of the isolated VH and VL domains and the combinations of these domains in the scFv fragments it is now possible to systematically study the influence of the interface interaction on the stability of the scFv fragments. Different cases can be distinguished (Wörn & Plückthun, 1999): If the stability of the isolated VH and VL domains is very similar, the resulting scFv has also the same stability (see
The scFv fragments with Vλ domains show an interesting behavior (
Vλ domains were also cloned and purified with the κ-like L-CDR3. The isolated Vλ domains with the κ-like CDR3 gave very poor yields. They do not show reversible behavior in denaturant induced equilibrium denaturation and have lower midpoints of denaturation than the corresponding Vλ domain with the λ-like L-CDR3. The combinations of VH3 with Vλ domains carrying the κ-like CDR3 show similar yield and dimer/monomer ratios in analytical gel filtration as the ones carrying the λ-like CDR3 (data not shown) but a different behavior in GdnHCl denaturation. As an example,
In summary, the most stable scFv fragments found to denature only starting above 2 M GdnHCl are H3κ3, H1bκ3, H5κ3 and H3κ1. Although the isolated Vλ domains are rather unstable by themselves, in combination with VH3 they can build very stable scFv fragments, but depend on the L-CDR3 for this effect. Most likely this CDR is responsible for a favorable orientation of VL to VH and thus enables a tighter interaction through the interface. ScFv fragments with an intermediate stability starting denaturation above 1 M GdnHCl are H1aκ3, H2κ3, H3κ2 and H3κ4, while H4κ3 and H6κ3 are scFv fragments with a modest stability, starting denaturation under 1 M GdnHCl.
Structure-Based Improvement of the Biophysical Properties of Immunoglobulin VH Domains with a Generalizable Approach
CDR, complementarity determining region; GdnHCl, guanidine hydrochloride; HuCAL, Human Combinatorial Antibody Library; IMAC, immobilized metal ion affinity chromatography; IPTG, isopropyl-β-D-thiogalactopyranoside; scFv, single-chain antibody fragment consisting of the variable domains of the heavy and of the light chain connected by a peptide linker; VH, variable domain of the heavy chain of an antibody; VL, variable domain of the light chain of an antibody.
In a systematic study of V gene families carried out with consensus VH and VL domains alone and in combinations in scFv fragments, we found comparatively low expression yields and lower cooperativity in equilibrium unfolding in antibody fragments containing VH domains of human germline families 2, 4 and 6. From an analysis of the packing of the hydrophobic core, the completeness of charge clusters, the occurrence of unsatisfied hydrogen bonds, and residues with low β-sheet propensity, positive Φ angle and exposed hydrophobic side chains, we pinpointed residues potentially responsible for the unsatisfactory properties of these germline-encoded sequences. Several of those are in common between the domains of the even-numbered subgroups, but do not occur in the odd-numbered ones. In this study, we have systematically exchanged those residues alone and in combination in two different scFv fragments using the VH6 framework and we describe their effect on equilibrium stability and folding yield. We improved the stability by 20.9 kJ/mol and the expression yield by a factor of 4, and can now use these data to rationally engineer antibodies derived from this and similar germline families for better biophysical properties. Furthermore, we provide an improved design for libraries exploiting the significant additional diversity provided by these frameworks. Both antibodies studied here completely retain their binding affinity, demonstrating that the CDR conformations were not affected.
Recombinant antibodies are used in an ever increasing number of applications from biological research to therapy. In addition to showing high antigen specificity and affinity, such recombinant antibodies should also be obtainable in high yield, have low tendency to aggregate and be stable against high denaturant concentrations, elevated temperatures and proteases, depending on the requested task. A popular format for many of these applications is the single-chain Fv (scFv) fragment, where the variable domain of the heavy chain (VH) is connected via a flexible linker to the variable domain of the light chain (VL) or vice versa (1-3). This format contains the complete antigen binding site and can be expressed in a wide range of hosts including bacteria (4) and yeast (5). While we chose to investigate these questions with scFv fragments, as their simple structure makes an untangling of domain interactions much easier, differences in physical properties are also manifest in Fab fragments and whole antibodies, which contain the same domains.
Mutations important for the biophysical behavior can either influence the equilibrium thermodynamic stability or the aggregation tendency during folding or both. While these properties are distinguishable and mutations are known (see below) which influence only one of these properties, frequently they are related and amino acid exchanges can have an effect on both. Mutations influencing thermodynamic stability can make contributions to many different types of interactions, such as packing of the hydrophobic core, secondary structure propensity, charge interactions, hydrogen bonding, desolvation upon unfolding, compatibility with the enforced local structure, and many more (6, 7). Mutations that influence folding efficiency can also be part of this list, as the stability of intermediates is an important component. Additionally, however, natural proteins use “negative design” (8) to avoid aggregation. In its simplest form, this avoids hydrophobic patches on the surface. In the case of antibodies, such hydrophobic patches were found to have almost no effect on the solubility of the native protein, correctly defined as the maximal concentration of the soluble native protein (9). The hydrophobic patches can have a very dramatic effect on the folding yield and thus the yield of functional protein in E. coli, which is colloquially but incorrectly often termed “solubility”, as the yield describes the overall process of producing soluble protein, but not its solubility.
In the case of scFv fragments, a further complication is introduced by their two-domain nature. The two domains can stabilize each other and unfold either cooperatively or with an equilibrium intermediate, depending on the relative intrinsic stability of the domains and their interface (10). However, from these studies of domain interactions and a systematic study of isolated domains and their interactions (see Example 1, 11), we can now untangle this system. We can thus pinpoint the problem spots, and in the present study we wish to provide the evidence that a correction of these small defects indeed leads to a marked improvement of phenotypes.
It is thus important to distinguish expression yield from thermodynamic stability. In the periplasmic expression of antibodies, the most important limitation on the observed expression level of functional protein is the periplasmic folding yield (4). Antibodies with poor yield of functional protein give rise to periplasmic aggregates. There are three principal mechanisms leading to an increased expression yield of soluble proteins: Increasing the total expression level (provided the folding yield stays constant), increasing the folding yield in E. coli or decreasing degradation by E. coli proteases. All three mechanisms can be somewhat influenced by extrinsic factors including the choice of bacterial strain, expression vector, media composition, and expression temperature (summarized in ref. (4)) and coexpression of periplasmic chaperones (12, 13). Nevertheless, the major contribution to changes of the expression yield of folded protein is due to changes in the protein sequence itself. In the case of secreted proteins placed in the same vector, the translation initiation region and the beginning of the protein sequence (the signal sequence) are identical between different variants. Therefore, sequence changes are extremely unlikely to influence translation per se. Mutations leading to higher thermodynamic stability often also decrease protease digestion of the protein, as the E. coli proteases usually prefer unfolded protein as a substrate. Nevertheless, mutations removing potential cutting sites for E. coli proteases may also prevent degradation. Mutations may thus also influence the efficiency of folding, independent of influencing the equilibrium thermodynamic stability of the protein. Side reactions of the folding process often lead to aggregated protein, which is enriched in inclusion bodies.
The kinetic partitioning into productive folding and aggregation can be influenced by mutations increasing either the thermodynamic stability of intermediates or removing a solvent-exposed hydrophobic residue or otherwise making the surface less suitable for aggregate growth (“negative design” (8)). In addition, the mutations increasing folding efficiency can also indirectly lead to a higher total expression level by preventing the formation of toxic side-products, most likely soluble aggregates, which lead to leakiness of the outer membrane and eventually decrease the viability of E. coli.
There are different approaches to finding residues that improve the thermodynamic stability and yield of soluble protein of scFv fragments (reviewed by Wörn & Plückthun (7)). Previously, most work had concentrated on the optimization of individual antibodies. If the three-dimensional (3D) structure of the antibody to be improved is known, a detailed analysis can identify problematic residues, which can then be exchanged by site-directed mutagenesis (14-16). A second approach uses random mutagenesis followed by selection with a bias toward the improvement of the desired property (17-19). The consensus approach as a third approach (20) uses the sequence information from antibodies naturally encoded by the immune system. The genes of immunoglobulin variable domains, as is assumed for all gene families, have diverged by multiple gene duplications and mutations. Selected genes are further subjected to an accelerated “local” evolution by somatic mutations that optimize the capacity of the antibody to bind to antigen structures with high affinity, but these mutations are not propagated in the germline. In contrast, mutations acquired during the duplication of the primordial V gene to make the present-day Ig-locus are manifest as germline family-specific differences. In this study, we wanted to explore a generic approach for improving antibodies for their biophysical properties combining the above knowledge with our knowledge of the biophysical properties of the germline-encoded VH, Vκ and Vλ families (see Example 1, 11). Since we focus on genes with initially germline-encoded sequences, our approach is not limited to improving individual molecules, and thus to removing changes introduced by somatic mutations, but particularly addresses problematic residues encoded by different germline genes.
Destabilizing mutations may be highly probable but are selectively neutral as long as the overall domain stability does not fall below a certain threshold (20). Conversely, random mutations resulting in increased thermodynamic stability are highly improbable in the absence of a positive selection. Consequently, the most frequent amino acid at any position in an alignment of homologous immunoglobulin variable domains should be most favorable for the stability of the protein domain. This method was tested on a Vκ domain and of ten proposed mutations six increased the stability. Nevertheless, the simplification inherent in this approach is that all frameworks are averaged to a single “ideal” sequence. The different germline genes or frameworks have an important function for antibody diversity. First, framework residues in the outer loop and close to the 2-fold axis can contribute important interactions to protein- and hapten-antigens, respectively. Second, several framework regions can influence the conformation of the CDRs and thereby indirectly modulate antigen binding. Third, different frameworks carry mutually incompatible residues, which cannot simply be exchanged to those of other frameworks. It follows that family-specific solutions are needed to create a variety of different frameworks with superior properties. In this paper we provide the basis for this approach.
Recently, we analyzed the biophysical properties of human germline family-specific consensus domains (see Example 1, 11) derived from the Human Combinatorial Antibody Library (HuCAL™) (21). In case of the VH domains we found that the VH3 germline family-specific consensus domain was the most stable VH domain, followed by the VH1a, VH1b and VH5 consensus domains with intermediate stabilities and only little or no aggregation-prone behavior. VH2, VH4 and VH6 domains, on the other hand, showed low cooperativity during denaturant-induced unfolding, lower yield and a higher tendency to aggregate. The detailed analysis of hydrophobic core packing and formation of salt bridges revealed that the VH3 domain had always found the optimal solution while all other VH domains had some shortcomings, explaining the higher thermodynamic stability of VH3. Furthermore, with the help of a sequence alignment grouped by VH domains with favorable properties (families 1, 3 and 5) and unfavorable properties (families 2, 4 and 6), residues of the even-numbered VH domains were identified and structurally analyzed that potentially decrease the folding efficiency and may thus be the reason for the unfavorable properties.
In this study, we used a structure-based approach exploiting the knowledge of the biophysical properties of the human germline family-specific consensus VH domains (see Example 1, 11), and in addition, resorting to tables of published and in-house selection experiments (A. Honegger et al., unpublished) to improve the VH6 framework as a model. We chose the VH6 framework, because it shows a somewhat aggregation-prone behavior and the lowest midpoint of denaturation, compared to the other human VH domains, indicating that VH6 is the VH domain with the lowest thermodynamic stability. These properties were observed with isolated domains as well as in the scFv format with Vκ3 (see Example 1, 11). We used two scFv fragments containing the VH6 framework which had been selected from the HuCAL (21): 2C2, binding the peptide M18 coupled to transferrin and 6B3, binding myoglobin (see Materials and Methods for details). With site-directed mutagenesis and based on our structural analysis we introduced six mutations (Q5V, S16G, T58I, V72D, S76G and S90Y) alone and in several combinations, which were hypothesized to be independently acting, individually exchangeable and were also a feature distinguishing the group of VH families with favorable properties from the families with less favorable properties. We compared these mutants to the wild-type scFv fragments for effects on folding yield and, independently, the free energy of unfolding as a measure for the thermodynamic stability and determined the additivity of these mutations.
Construction of Expression Vectors
The scFv fragment 2C2 (A. Hahn et al., MorphoSys AG, unpublished results) with the human consensus domains VH6 and VLκ3 (H-CDR3: QRGHYGKGYKGFNSGFFDF and L-CDR3: QYYNIPT) was obtained by panning against the peptide M18 with the sequence CDAFRSEKSRQELNTIASKPPRDHVF coupled to transferrin (Jerini GmbH, Berlin), while the scFv fragment 6B3 (S. Müller et al., MorphoSys AG, unpublished results) with VH6 and VLλ3 (H-CDR3: SYFISFFSFDY and L-CDR3: SYDSGFSTV) was obtained by panning against myoglobin from horse skeletal muscle (Sigma). Both scFv fragments were subcloned via the restriction sites XbaI and EcoRI into the expression plasmid pMX7 (21). The different mutations were introduced with the QuikChange™ site-directed mutagenesis kit from Stratagene according to the manufacturer's instructions. Multiple mutations were constructed by exchanging restriction fragments using unique XbaI, XhoI, BsaBI and EcoRI sites in the antibody. The final expression cassettes consist of a phoA signal sequence, a short FLAG-tag (DYKD), the scFv fragment in the orientation VH6 domain-(Gly4Ser)4 linker-VL domain, followed by a long FLAG-tag (DYKDDDD) and a hexahistidine-tag.
Expression and Purification
Thirty mL dYT medium (containing 30 μg/mL chloramphenicol, 1.0% glucose) was inoculated with a single bacterial colony and shaken overnight at 25° C. One liter of dYT medium (containing 30 μg/mL chloramphenicol, 50 mM K2HPO4) was inoculated with this preculture and incubated at 25° C. (5 L flask with baffles, 105 rpm). Expression was induced at an OD550 of 1.0 by addition of IPTG to a final concentration of 0.5 mM. Incubation was continued for 18 hours while the cell density reached an OD550 between 8.0 and 11.0. Cells were collected by centrifugation (8000 g, 10 min at 4° C.), resuspended in 40 ml of 50 mM Tris-HCl (pH 7.5) and 500 mM NaCl and disrupted by French Press lysis. The crude extract was centrifuged (48,000 g, 60 minutes at 4° C.) and the supernatant passed through a 0.2 μm filter. The proteins were purified using the two column coupled in-line procedure (4). In this strategy, the eluate of an immobilized metal ion affinity chromatography (IMAC) column, which exploits the C-terminal His-tag, was directly loaded onto an ion-exchange column. Elution from the ion-exchange column was achieved with a 0-800 mM NaCl gradient. The constructs derived from the scFv 2C2 were purified with a HS cation-exchange column in 10 mM MES (pH 6.0) and those derived from 6B3 with an HQ anion-exchange column in 10 mM Tris-HCl (pH 8.0). Pooled fractions were dialyzed against 50 mM Na-phosphate, pH 7.0, 100 mM NaCl. Protein concentrations were determined by OD280. The soluble yield was normalized to a one liter bacterial culture with an OD550 of 10.
Gel Filtration Chromatography
Samples of purified scFv fragments were analyzed on a Superdex-75 column equilibrated with 50 mM Na-phosphate, pH 7.0, 500 mM NaCl, on a SMART-system (Pharmacia). The samples were injected at a concentration of 5 μM in a volume of 50 μl, and the flow-rate was 60 μl/min. Lysozyme (14 kDa), carbonic anhydrase (29 kDa) and bovine serum albumin (66 kDa) were used as molecular weight standards.
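How an apparent molecular weight is read off such a calibration can be sketched as follows. The sketch assumes the common practice of fitting log10(molecular weight) of the standards linearly against elution volume; the elution volumes below are hypothetical placeholders, not measured values, and the function names are illustrative.

```python
import math

# (molecular weight in kDa, elution volume in mL) for the three standards.
# The molecular weights are from the text; the volumes are HYPOTHETICAL.
STANDARDS = [(66.0, 1.00), (29.0, 1.20), (14.0, 1.40)]

def fit_calibration(standards):
    """Least-squares line through log10(MW) versus elution volume."""
    volumes = [volume for _, volume in standards]
    log_mws = [math.log10(mw) for mw, _ in standards]
    n = len(standards)
    v_mean = sum(volumes) / n
    l_mean = sum(log_mws) / n
    sxx = sum((v - v_mean) ** 2 for v in volumes)
    sxy = sum((v - v_mean) * (l - l_mean) for v, l in zip(volumes, log_mws))
    slope = sxy / sxx
    intercept = l_mean - slope * v_mean
    return slope, intercept

def apparent_mw(elution_volume, slope, intercept):
    """Apparent molecular weight (kDa) for a given elution volume."""
    return 10 ** (slope * elution_volume + intercept)

slope, intercept = fit_calibration(STANDARDS)
# A sample eluting together with carbonic anhydrase is assigned roughly 29 kDa.
print(apparent_mw(1.20, slope, intercept))
```

With real data, a monomeric scFv fragment would be expected to elute close to the carbonic anhydrase standard, and a dimer at a smaller elution volume.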
Equilibrium Denaturation Experiments
Fluorescence spectra were recorded at 25° C. with a PTI Alpha Scan spectrofluorimeter (Photon Technologies, Inc., Ontario, Canada). Slit widths of 2 nm were used both for excitation and emission. Protein/GdnHCl-mixtures (1.6 ml) containing a final protein concentration of 0.5 μM and denaturant concentrations ranging from 0 to 5 M GdnHCl were prepared from freshly purified protein and a GdnHCl stock solution (8 M, in 50 mM Na-phosphate, pH 7.0, 100 mM NaCl). Each final concentration of GdnHCl was determined by measuring the refractive index. After overnight incubation at 10° C., the fluorescence emission spectra of the samples were recorded from 320 to 370 nm with an excitation wavelength of 280 nm. With increasing denaturant concentrations, the maxima of the recorded emission spectra shifted from about 340 to 350 nm. The fluorescence emission maximum was determined by fitting the fluorescence emission spectrum to a Gaussian function and was plotted versus the GdnHCl concentration. Protein stabilities were calculated as described (22, 23). To compare scFv denaturation curves in one plot the emission maxima were scaled by setting the highest value to 1 and the lowest to 0 to give normalized emission maxima.
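The scaling of emission maxima for overlay plots can be sketched as below. The min-max scaling follows the description above; the midpoint helper, which linearly interpolates the denaturant concentration at a normalized signal of 0.5, is an added assumption (the source does not state how midpoints were read off), and the data points are hypothetical.

```python
def normalize_emission_maxima(maxima_nm):
    """Scale emission maxima so the lowest value is 0 and the highest is 1."""
    lowest, highest = min(maxima_nm), max(maxima_nm)
    if highest == lowest:
        raise ValueError("no shift in emission maximum; cannot normalize")
    return [(m - lowest) / (highest - lowest) for m in maxima_nm]

def transition_midpoint(denaturant_molar, normalized):
    """Denaturant concentration at which the normalized signal first reaches 0.5.

    Uses linear interpolation between neighboring points (an assumption);
    only meaningful for a curve with a single visible transition.
    """
    points = list(zip(denaturant_molar, normalized))
    for (c0, y0), (c1, y1) in zip(points, points[1:]):
        if y0 < 0.5 <= y1:
            return c0 + (0.5 - y0) * (c1 - c0) / (y1 - y0)
    raise ValueError("curve does not cross 0.5")

# Hypothetical data: GdnHCl concentration (M) and fitted emission maxima (nm).
gdnhcl = [0.0, 1.0, 2.0, 3.0, 4.0]
maxima = [340.0, 340.5, 343.0, 348.0, 350.0]

normalized = normalize_emission_maxima(maxima)
midpoint = transition_midpoint(gdnhcl, normalized)
print(normalized)
print(midpoint)
```

For scFv fragments that show two separate transitions, each transition would need its own treatment; the helper above only reports the first crossing.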
Enzyme Linked Immunosorbent Assay (ELISA)
Myoglobin from horse skeletal muscle (Sigma) and peptide M18 coupled to transferrin (Jerini GmbH, Berlin) at a concentration of 5 μg/ml in 50 mM Na-phosphate, 100 mM NaCl, pH 7.0 were coated overnight at 4° C. on Maxisorb 96-well plates (Nunc). Plates were blocked in 2.0% sucrose, 0.1% bovine serum albumin (Sigma), 0.9% NaCl for 2 h at room temperature. After incubation of samples at concentrations from 2 μM to 0.125 μM, bound scFv fragments were detected using an α-tetra-his antibody (Qiagen) followed by an anti-mouse antibody conjugated with alkaline phosphatase (Sigma).
BIAcore Measurements
BIAcore analysis was performed using a CM5-chip (Amersham Pharmacia) with one lane coated with 2,700 resonance units (RU) of myoglobin from horse skeletal muscle (Sigma), one coated with 2,500 RU peptide M18 coupled to transferrin (Jerini GmbH, Berlin) and one blank lane as a control surface. Each binding-regeneration cycle was performed at 25° C. with a constant flow rate of 25 μL/min with different antibody concentrations ranging from 5 μM to 0.08 μM in 20 mM HEPES (pH 7.0), 150 mM NaCl and 0.005% Tween 20 and 2 M NaSCN for regeneration. Determination of the antigen dissociation constant in solution was performed with competition BIAcore (24, 25) with the same chip, buffer and regeneration conditions. ScFv fragments at constant concentration and variable amounts of antigen were preincubated at least for one hour at 10° C. and injected in a sample volume of 100 μL. Data were evaluated by using BIAevaluation software (Pharmacia) and SigmaPlot (SPSS Inc.). Slopes of the association phase of linear sensorgrams were plotted against the corresponding total antigen concentrations and the dissociation constant was calculated as described previously (26).
Properties of the Wild Type scFv Fragments
We chose the VH6 framework as the model system to test our strategy for improving the biophysical properties by a structure-based design and used two scFv fragments selected from the HuCAL as model systems: 2C2, which binds the peptide M18 coupled to transferrin, and consists of VH6 paired with Vκ3, and 6B3, which binds myoglobin, consisting of VH6 paired with Vλ3. The two antibodies differ in CDR3 (see Materials and Methods), but otherwise the VH6 sequence is identical. The wild-type (wt) scFv fragments 2C2 and 6B3 were expressed in the periplasm of E. coli. The scFv fragments were purified from the soluble fraction of the cell extract by immobilized metal affinity chromatography (IMAC), followed by an ion-exchange column. The purity of the scFv fragments was greater than 98%, as determined by SDS-PAGE (data not shown). The soluble yield after purification of a one liter bacterial culture normalized to OD550 of 10 of 2C2-wt and 6B3-wt was 1.2±0.1 mg and 0.4±0.1 mg, respectively. Approximately 10% and 25%, respectively, of the total amount of expressed protein was found in insoluble form, as determined by Western Blot. The oligomeric state was determined by analytical gel filtration. Both proteins elute with an apparent molecular weight of 29 kDa, indicating that they are monomeric (
Structural Rationale for the Selection of Mutations
The first set of mutants to improve the properties of scFv fragments 2C2 and 6B3 containing the human VH6 framework was chosen from the analysis of the structural model, guided by the sequence alignment of the human consensus VH domains grouped by VH domains with favorable biophysical properties (families 1, 3 and 5) and VH domains with less favorable properties (families 2, 4 and 6) (
Q5V: In a selection experiment of the scFv 4D5Flu favoring stability, Val was selected at this position out of Val, Gln, Leu, and Glu (18). Position 5 is part of the first β-strand and Val has a higher β-sheet propensity than Gln (31). Nevertheless, it was shown previously that mutations of exposed hydrophobic residues have a profound effect on the in vivo folding yield (9).
S16G: VH2, VH4 and VH6 carry a non-glycine residue, nevertheless with a conserved positive phi angle at position 16 in the loop of framework 1 (
T58I: The residue at position 58, which is the highly conserved Ile, points into the hydrophobic core (
V72D: The solvent exposed residue 72 (
S76G: The odd numbered VH domains have Gly at position 76 in framework 2 (
S90Y: The semi-buried position 90 (
In position 20 and 88 group-specific differences are seen, too (
Single Mutations
The six mutations (Q5V, S16G, T58I, V72D, S76G and S90Y) described above were introduced into 2C2-wt and 6B3-wt by site-directed mutagenesis. All scFv fragments carrying one mutation were expressed and purified in an identical manner to the wild type scFv fragments and were monomeric in solution (data not shown). In all single and subsequently constructed multiple mutants the proportion of soluble to insoluble protein in the periplasm stayed constant, even in those cases where the total expression level increased. The biophysical data are summarized in Table 7. To compare the improvements caused by the mutations in 2C2 and 6B3, the expression yield of soluble protein is normalized to the yield of the corresponding wild-type scFv fragments and the free energy of unfolding (ΔGN-U) is given as the difference (ΔΔGN-U) to the corresponding scFv-wt. The denaturant-induced unfolding curves are shown in
Both single mutations exchanging the non-glycine residues with positive phi-angles (S16G and S76G) increased the yield of soluble protein by a factor of approximately two. The thermodynamic stability was also increased in both single mutations with ΔΔGN-U of 6.2 and 7.3 kJ/mol for 2C2-S16G and 6B3-S16G and ΔΔGN-U of 3.7 and 3.5 kJ/mol for 2C2-S76G and 6B3-S76G, respectively, compared to the wild-type scFv fragments. The mutation to Gly in a loop region causes a higher flexibility, which enables the optimal orientation of the anti-parallel β-sheet stabilizing the whole domain. The higher yield of these mutants is probably due to the increased protease resistance and folding efficiency caused by the stabilized folded state of the protein.
The mutation of the OH-carrying Thr58 to Ile, pointing into the hydrophobic core, did not alter the yield of soluble protein but caused a marked increase of thermodynamic stability with ΔΔGN-U of 7.9 and 6.8 kJ/mol for 2C2-T58I and 6B3-T58I, respectively. This remarkable improvement in stability is due to the additional van der Waals interaction of the hydrophobic Ile within the hydrophobic core and to the absence of the desolvation necessary when burying Thr. Interestingly, this mutation does not have an effect on the yield of soluble protein, indicating that the folding efficiency is not increased.
Both mutations exchanging a residue in a β-sheet to a residue with higher β-sheet propensity (Q5V and S90Y) resulted in an approximately 1.8-fold increase in yield of soluble protein. In addition, the thermodynamic stability is slightly increased with the exception of 2C2-S90Y, which shows even a very small decrease in comparison to the wild-type scFv fragment. The analysis of these constructs shows that mutations of residues, which participate in a β-sheet, to a residue with higher β-sheet building propensity can increase yield of soluble protein due to a higher folding efficiency. Depending on the scFv fragment the thermodynamic stability is also increased probably because of better orientation of the mutated residue, facilitating the orientation of stabilizing hydrogen bonds in the β-sheet.
The last single mutation exchanges a solvent-exposed hydrophobic residue with a hydrophilic one (V72D). The yield of soluble protein in 2C2-V72D and 6B3-V72D is increased 3.2 and 1.8 fold, respectively. The thermodynamic stability in 2C2-V72D is not changed, while in 6B3-V72D it is slightly increased with ΔΔGN-U of 2.2 kJ/mol.
Multiple Mutations
To determine whether the improvements were additive, we cloned combinations of the single mutations. The scFv fragments with multiple mutations were expressed and purified as above and were also monomeric in solution, as demonstrated by analytical gel filtration (2C2- and 6B3-all as examples in
The details of the yield of soluble protein and thermodynamic stability determinations are listed in Table 7. In summary, the effect on yield and stability of the single mutations is almost fully additive. The scFv fragments carrying all six mutations, 2C2-all and 6B3-all, show an increase in yield of 4.3 and 4.2 fold, respectively, compared to the wild-type scFv fragments. The absolute values for 2C2-all are a yield of 5.1 mg/L, which is 3.9 mg/L more than for 2C2-wt, and a thermodynamic stability of 72.3 kJ/mol. In the case of 6B3-all, a yield of 1.7 mg/L was obtained, which is 1.3 mg/L more than for 6B3-wt.
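The relation between the fold increases and the absolute yields quoted above can be checked with a minimal sketch. It assumes only the wild-type yields given in the Table 7 footnote (1.2 mg and 0.4 mg per 1 L culture) and the fold increases stated in the text; the function and variable names are hypothetical and are not part of the original work. Because the inputs are rounded, the recomputed values agree with the quoted ones only to within about 0.1 mg.

```python
# Wild-type yields (mg per 1 L culture at OD550 of 10) and fold increases
# for the scFv fragments carrying all six mutations, as quoted in the text.
WT_YIELD_MG = {"2C2": 1.2, "6B3": 0.4}
FOLD_ALL = {"2C2": 4.3, "6B3": 4.2}


def absolute_yield(wt_yield_mg: float, fold: float) -> float:
    """Absolute yield of a mutant from the wild-type yield and the fold increase."""
    return wt_yield_mg * fold


def yield_gain(wt_yield_mg: float, fold: float) -> float:
    """Additional yield of the mutant over the wild type."""
    return absolute_yield(wt_yield_mg, fold) - wt_yield_mg


if __name__ == "__main__":
    for name, wt in WT_YIELD_MG.items():
        fold = FOLD_ALL[name]
        print(
            f"{name}-all: {absolute_yield(wt, fold):.2f} mg/L "
            f"(gain {yield_gain(wt, fold):.2f} mg/L over {name}-wt)"
        )
```

The same two functions can be applied to any single mutant by substituting its fold increase from Table 7.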
Analysis of Framework 1 Subtype
VH structures can be divided into four distinct framework 1 conformations depending on the type of amino acids at positions 6, 7 and 10 (32) (numbering scheme according to Honegger & Plückthun (33)). Residues at positions 19, 74, 78 and 93, which are part of the hydrophobic core of the lower part of the domain and thus influence thermodynamic stability and folding efficiency, are correlated with this structural subtype (32). While the VH domains with the most favorable properties fall into subtype II (VH3) and subtype III (VH1a, VH1b and VH5), the VH domains with less favorable properties, VH2 and VH4, fall into subtype I. VH6, which we want to improve, can be assigned to subtype III, which is defined by Gln at position 6 and the absence of Pro at position 7 (32). Analysis of the subtype III defining and correlated residues of human VH domains (32) shows that the VH6 fragment carries rarely used residues at positions 10, 74 and 78 (Table 8). Pro at position 10 is used in 8% of the sequences, whereas Ala is used in 76% of the sequences. Pro allows a more limited number of conformations than Ala. In a mutagenesis experiment (34), Pro at position 10 was shown to destabilize a VH domain in a subtype IV context (only occurring in murine, not in human sequences). Val at position 74 and Ile at position 78 have frequencies of 1% and 8%, respectively, in VH subtype III sequences. Val74 was exchanged in 2C2 and 6B3 to the more frequently found Phe, as the bulky aromatic amino acid probably increases the packing density of the hydrophobic core. Ile78 was not exchanged to the subtype III consensus residues Ala or Val, which are, like Ile, non-aromatic aliphatic residues, as the effect on the packing density would probably be small. In
The mutations to the framework 1 subtype III consensus, P10A alone and in combination with V74F, were introduced into the wild-type scFv fragments by site-directed mutagenesis. 2C2-P10A and 6B3-P10A showed a 2.9- and 4.2-fold increase in yield of soluble protein compared to the wild-type scFv fragments, respectively, while the double mutants with P10A and V74F showed a lower increase of 1.9- and 1.7-fold, respectively. All biophysical data are summarized in Table 7. The analysis of the soluble and insoluble fractions of the periplasmic expression in E. coli of the single and double mutants showed that both the total expression level and the level of soluble protein were increased by the mutations, and thus the ratio between soluble and insoluble scFv fragment remained constant (data not shown). The thermodynamic stability of the scFv fragments 2C2 and 6B3 is not increased by the mutation P10A, and is only slightly increased (ΔΔGN-U of 0.5 kJ/mol and 0.4 kJ/mol, respectively) by the double mutation P10A and V74F (Table 7,
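The subtype III criterion and the residue frequencies quoted in this section can be expressed as a small sketch. It encodes only what the text states (Gln at position 6, no Pro at position 7, and the quoted percentages for positions 10, 74 and 78); the function names, the 10% cut-off for "rarely used", and the Ser placed at position 7 of the example are illustrative assumptions, not values taken from the original analysis.

```python
# Percentage use of residues in subtype III sequences, as quoted in the text.
# Keys are (position, one-letter amino acid); numbering per Honegger & Plückthun.
SUBTYPE_III_FREQ = {
    (10, "P"): 8,
    (10, "A"): 76,
    (74, "V"): 1,
    (78, "I"): 8,
}


def is_subtype_iii(residues: dict) -> bool:
    """Subtype III as described in the text: Gln at position 6, no Pro at position 7."""
    return residues.get(6) == "Q" and residues.get(7) != "P"


def rare_positions(residues: dict, threshold: float = 10.0) -> list:
    """Positions whose residue has a quoted frequency below `threshold` percent.

    Residues without a quoted frequency are skipped rather than guessed.
    """
    rare = []
    for pos in sorted(residues):
        freq = SUBTYPE_III_FREQ.get((pos, residues[pos]))
        if freq is not None and freq < threshold:
            rare.append(pos)
    return rare


# VH6 residues named in the text; Ser at position 7 is an illustrative assumption.
vh6 = {6: "Q", 7: "S", 10: "P", 74: "V", 78: "I"}

if __name__ == "__main__":
    print("subtype III:", is_subtype_iii(vh6))
    print("rarely used positions:", rare_positions(vh6))
```

A lookup of this kind only flags candidate positions; whether an exchange is worthwhile still depends on the structural reasoning given in the text, as the decision to leave Ile78 unchanged shows.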
Determination of Binding Activity
The goal of the study was to show that yield and stability of VH6 containing scFv fragments can be improved by the structure-based approach, guided by the family-specific analysis, while the binding activity is retained. We analyzed the binding activity with two independent methods: ELISA and BIAcore. For the ELISA, we coated the corresponding antigen and applied various concentrations of scFv fragments. We tested all single mutations including scFv-P10A and the multiple mutations scFv-all and scFv-all+P10A. All mutants show similar concentration dependence, which indicates that they have the same binding affinity (data not shown).
BIAcore experiments were performed with different concentrations of scFv fragments flowing over an antigen-coated chip.
The aim of this study was to demonstrate the validity of the structure-based, family-consensus-based predictions. We chose scFv fragments containing the human germline family VH6 consensus domain as a model system to improve the expression yield of soluble protein and the thermodynamic stability. Potential mutations improving these biophysical properties were identified by comparing the residues which define the framework 1 subtype, and other interacting residues, to the consensus found within the same subtype. The next set of potential mutations was found by an analysis of the structure for potential imperfections, guided by a comparison to the consensus sequences of those VH domains with known favorable biophysical properties (families 1, 3 and 5). We excluded CDR residues from this analysis. We could pinpoint such residues because we had previously systematically determined the biophysical properties of consensus sequences of all human variable domain subgroups (see Example 1, 11). The experiment shows that all seven proposed single mutations fall into three categories. They increase either only the expression yield of soluble protein, only the thermodynamic stability, or both. This distinction helps to understand the role of these residues in determining the biophysical properties of these proteins. For scFv 2C2, three of these seven mutations improve both biophysical properties; for scFv 6B3, as many as five do. These results illustrate that structure-based analysis, guided by family alignments, is a powerful way to improve the properties of immunoglobulin variable domains. Since our analysis (see Example 1, 11) covers all human families, we now have a general strategy for this task.
The analysis of different combinations of the single mutations to the consensus of VH domains with favorable properties showed that the improvements in free energy were almost perfectly additive, indicating that they act independently. The mutant with the highest yield and thermodynamic stability compared to the wild-type scFv fragments is indeed the mutant with all six mutations. In the case of the scFv 2C2, the properties of the best mutant are comparable to the properties of a model scFv fragment consisting of the most stable VH domain, VH3, and the same VL domain Vκ3 with a different CDR3, which was part of the systematic biophysical characterization of human variable antibody domains (see Example 1, 11), indicating that it is indeed possible to turn an antibody with unfavorable properties into one with very favorable properties by changing only a few residues. Most importantly, both the CDRs and those framework residues that are important for binding are maintained.
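The additivity statement can be written out as a short worked relation. The notation below (the superscripts and the index i over the single mutations) is an assumption introduced here for illustration; the numerical values are the free energies of unfolding quoted in the text and in the Table 7 footnotes for 2C2. The block assumes the amsmath package.

```latex
% Change in free energy of unfolding caused by a mutation
\[
  \Delta\Delta G_{N\text{-}U} \;=\; \Delta G_{N\text{-}U}^{\text{mutant}} - \Delta G_{N\text{-}U}^{\text{wt}}
\]
% Additivity: the combined mutant is approximately the sum of the single mutations
\[
  \Delta\Delta G_{N\text{-}U}^{\text{all}} \;\approx\; \sum_{i} \Delta\Delta G_{N\text{-}U}^{(i)}
\]
% Worked value for 2C2-all from the quoted absolute stabilities
\[
  \Delta\Delta G_{N\text{-}U}^{\text{2C2-all}} \;=\; 72.3 - 51.3 \;=\; 21.0\ \text{kJ/mol}
\]
```

Independent action of the mutations corresponds to the second relation holding closely, which is the comparison reported in parentheses in Table 7.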
The addition of the mutation P10A to the scFv fragments carrying six mutations decreases both expression yield and thermodynamic stability, although in the wild-type scFv fragments this mutation increased the soluble yield 2.9-fold in the case of 2C2-P10A and 4.2-fold in the case of 6B3-P10A and left the thermodynamic stability unchanged. The mutations Q5V and S16G, which are close to position 10, should still be beneficial to the VH6 framework, as they are independent of the type of amino acid at position 10. The reason for the decline in biophysical properties caused by this mutation in the context of the improved framework can probably only be explained with the help of the experimentally determined 3D structure.
The improvements seem to be independent of the VL domain and of the sequence and length of CDR3, as 2C2 with Vκ3 and 6B3 with Vλ3 and different H-CDR3 loops gave similar results. There were only two minor exceptions, as the thermodynamic stability of the 6B3 mutants V72D and S90Y is slightly increased, while in 2C2 no stability increase could be observed. It was shown previously that in scFv fragments Vλ domains, in contrast to Vκ domains, are able to form very stable VH-VL interfaces, increasing the stability of the whole scFv fragment even above the intrinsic stabilities of the isolated domains (see Example 1, 11). The residue at position 72 is not involved in the interface interactions but is in close proximity to it (
Although we did not exchange residues of the CDR with possible direct contact to the antigen, it could not be a priori excluded that changes in the framework might affect the orientation of the CDRs and, thereby, antigen binding. Therefore, we experimentally determined the binding properties. However, in the case of the examined mutations, antigen binding was fully retained as demonstrated by three independent methods.
In this study we show that it is possible to rationally transform antibody frameworks with less favorable properties into those with very favorable properties while retaining their binding activity and the binding characteristics of the framework. It could be argued that an easier approach would be to use directly the very stable VH3 framework with a suitable VL domain. Nevertheless, framework residues can affect the orientation of CDRs, can be part of the hapten-binding cavity located in the VH-VL interface and build the “outer loop”, which was seen in some cases to be involved in antigen binding. These “framework” residues can thereby contribute greatly to affinity and diversity and it is unlikely that a single framework can provide the ideal solution in all cases. Therefore, we believe that the preferred approach to achieve a structurally diverse library of stable frameworks is to optimize the human consensus antibody frameworks further in the way we presented here, as it would give access to a whole range of stable scaffolds covering all natural families.
In this study we focused on the improvement of the VH6 framework. However, because of the sequence similarity, five of the mutations studied (Q5V, S16G, V72D, S76G and S90Y) should give similar results for VH domains belonging to families VH2 and VH4. While this approach is useful for the design of antibody libraries, in many cases given human antibodies, e.g. those obtained from transgenic mice (35, 36), by humanization (37) or by phage display from a library of natural sequences (38-40), may also benefit from improvement.
These results also show that some human germline genes do not encode an optimal version of the protein, regarding its biophysical properties. Since the biophysical properties of natural domains cover a wide range, it cannot be argued that limited stability is a desirable property for the immune system. Rather, the stability of VH2, VH4 and VH6 may simply be good enough to be tolerated by the immune system. For those biomedical or biotechnological applications where it is not good enough, however, we have now provided a pathway to improve these properties in a straightforward way.
42. Koradi, R., Billeter, M., and Wüthrich, K. (1996) MOLMOL: a program for display and analysis of macromolecular structures, J. Mol. Graph. 14, 51-55, 29-32.
(a) data from Ewert et al., 2002
(b) long CDR3, sequence: YNHEADMLIRNWLYSDV
(c) short CDR3, sequence: WGGDGFYAMDY
(d) κ-like CDR3, sequence: QQHYTTPPT
(e) λ-like CDR3, sequence: QSYDSSLSGVV
(f) no soluble protein obtained, purification via refolding of inclusion bodies.
(g) monomer in 50 mM sodium-phosphate (pH 7.0) and 500 mM NaCl, in case of VH1a with 0.9 M GdnHCl
(h) not determined
(i) dimer and monomer equilibrium
(a) Numbering according to the structurally based scheme of Honegger & Plückthun (2001)
(a) Numbering according to the structurally based scheme of Honegger & Plückthun (2001)
(a) Numbering according to the structurally based scheme of Honegger & Plückthun (2001)
(a) sequence of H-CDR3 (short, WGGDGFYAMDY) / L-CDR3 (κ-like: QQHYTTPPT)
(b) sequence of H-CDR3 (short, WGGDGFYAMDY) / L-CDR3 (λ-like: QSYDSSLSGVV)
(c) given in mg per 1 L bacterial culture at an OD550 of 10, and compared, in parentheses, to the soluble yield of H3κ3
(d) oligomeric state in 50 mM sodium-phosphate (pH 7.0) and 500 mM NaCl, with M: monomer; D: dimer; m: multimer.
(e) within the scFv fragment
(f) only one transition is visible
(a) Taken from VBASE; 51 human germline segments for VH and 76 for VL.
(b) Taken from Griffiths et al. (1994); originally 215 binders were sequenced, but only 137 of the sequences are unique. The Griffiths library is built from an in vitro rearranged germline bank; the theoretical distribution is therefore given by the percentage of germline segments present in the human genome, as given in column 3.
(c) Theoretical distribution is corrected for the size of the sublibraries and the percentage of correct clones in the original HuCAL-1 scFv library (Knappik et al., 2000).
(d) Taken from Knappik et al. (2000).
(e) including DP-21 (VH7)
(f) one germline segment
(a) yield of soluble protein after IMAC and ion-exchange column, normalized to the yield of the respective wild-type scFv fragments 2C2 and 6B3. Absolute values: 2C2-wt: 1.2 ± 0.1 mg and 6B3-wt: 0.4 ± 0.1 mg per 1 L bacterial culture at an OD550 of 10.
(b) Absolute values of the free energy of unfolding of the wild-type scFv fragments: 2C2-wt: ΔGN-U = 51.3 kJ/mol and 6B3-wt: ΔGN-U = 42.4 kJ/mol
(c) in parentheses: sum of the free energy contributions of the individual mutations to equilibrium stability
(d) not determined because of low cooperativity (see text for details)
(a) according to ref. (32)
(b) using the numbering scheme of Honegger & Plückthun (33)
(c) Ala is used in 76% of subtype III sequences (32)
(d) percentage use of specified amino acid in subtype III sequences, regardless of VH family (32)
Number | Date | Country | Kind |
---|---|---|---|
01 11 6756.6 | Jul 2001 | EP | regional |
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/EP02/08094 | 7/19/2002 | WO | | 7/12/2004 |