The present disclosure relates to stabilized forms of human immunodeficiency virus gp120 envelope protein, specifically to crystalline forms of gp120, high resolution structures obtained from these crystals, and use thereof.
The primary immunologic abnormality resulting from infection by human immunodeficiency virus (HIV) is the progressive depletion and functional impairment of T lymphocytes expressing the CD4 cell surface glycoprotein. The loss of CD4 helper/inducer T cell function probably underlies the profound defects in cellular and humoral immunity leading to the opportunistic infections and malignancies characteristic of the acquired immunodeficiency syndrome (AIDS) (Lane et al., Ann. Rev. Immunol. 3:477, 1985). Studies of HIV-1 infection of fractionated CD4 and CD8 T cells from normal donors and AIDS patients have revealed that depletion of CD4 T cells results from the ability of HIV-1 to selectively infect, replicate in, and ultimately destroy this T lymphocyte subset (Klatzmann et al., Science 225:59, 1984). The possibility that CD4 itself is an essential component of the cellular receptor for HIV-1 was first indicated by the observation that monoclonal antibodies directed against CD4 block HIV-1 infection and syncytia induction (Dalgleish et al., Nature 312:767, 1984; McDougal et al., J. Immunol. 135:3151, 1985). This hypothesis has been confirmed by the demonstration that a molecular complex forms between CD4 and the major envelope glycoprotein of HIV-1 (McDougal et al., Science 231:382, 1986).
The major envelope protein of HIV-1 is a glycoprotein of approximately 160 kD (gp160). During infection, proteases of the host cell cleave gp160 into gp120 and gp41. gp41 is an integral membrane protein, while gp120 protrudes from the mature virus. Together, gp120 and gp41 make up the HIV envelope spike.
The HIV envelope spike mediates binding to receptors and virus entry (Wyatt and Sodroski, Science 280:188, 1998). The spike is trimeric and composed of three gp120 exterior and three gp41 transmembrane envelope glycoproteins. CD4 binding to gp120 in the spike induces conformational changes that allow binding to a coreceptor, either CCR5 or CXCR4, which is required for viral entry (Dalgleish et al., Nature 312:763, 1984; Sattentau and Moore, J. Exp. Med. 174:407, 1991; Feng et al., Science 272:872, 1996; Wu et al., Nature 384:179, 1996; Trkola et al., Nature 384:184, 1996).
The mature gp120 glycoprotein is approximately 470-490 amino acids long depending on the HIV strain of origin. N-linked glycosylation at approximately 20-25 sites makes up nearly half of the mass of the molecule. Sequence analysis shows that the polypeptide is composed of five conserved regions (C1-C5) and five regions of high variability (V1-V5).
With the number of individuals infected by HIV-1 approaching 1% of the world's population, an effective vaccine is urgently needed. An enveloped virus, HIV-1 hides from humoral recognition behind a protective lipid bilayer. An available viral target for neutralizing antibodies is the envelope spike. Genetic, immunologic and structural studies of the HIV-1 envelope glycoproteins have revealed extraordinary diversity as well as multiple overlapping mechanisms of humoral evasion, including self-masquerading glycan, immunodominant variable loops, and conformational masking. These evolutionarily honed barriers of diversity and evasion have confounded traditional means of vaccine development. It is believed that immunization with effectively immunogenic HIV gp120 envelope glycoprotein can elicit a neutralizing response directed against gp120, and thus HIV. The need exists for immunogens that are capable of eliciting an immunogenic response in a suitable subject. In order to be effective, the antibodies raised must be capable of neutralizing a broad range of HIV strains and subtypes.
Disclosed herein are gp120 polypeptides and nucleic acid molecules encoding gp120 polypeptides, which are useful to induce an immunogenic response to a lentivirus, such as SIV or HIV (for example HIV-1 and HIV-2), in a subject. In several embodiments, the gp120 polypeptides are stabilized in a CD4 bound conformation by the introduction of a plurality of non-naturally occurring cross-linking cysteine residues. In other examples, the gp120 polypeptide has the V3 loop in an extended conformation.
Immunogenic compositions containing a therapeutically effective amount of gp120 polypeptides and nucleic acid molecules encoding gp120 polypeptides are also disclosed. Also disclosed are methods for eliciting and/or enhancing an immune response in a subject, for example by administering an immunogenic composition.
Crystalline forms of gp120 are disclosed, as are crystal structures of gp120 polypeptides obtained from these crystals. Methods are also disclosed for identifying an immunogen that induces an immune response to gp120 using these crystal structures. Also provided by this disclosure is a machine readable data storage medium including a data storage material encoded with machine readable data corresponding to the coordinates of the crystal structures disclosed herein. A computer system is disclosed for displaying the coordinate data from these crystal structures of gp120, such as the atomic positions, surface, domain, or region of the gp120 polypeptide.
The foregoing and other objects, features, and advantages of the invention will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.
The nucleic and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand.
SEQ ID NO: 1 is the amino acid sequence of gp120 HXBc2 Core New 9c.
SEQ ID NO: 2 is the amino acid sequence of the gp120 with an extended V3 loop.
SEQ ID NO: 3 is a nucleotide sequence of gp120 HXBc2 DM.
SEQ ID NOs: 4-18 are nucleotide sequences of stabilized HXBc2 Core gp120.
SEQ ID NO: 19 is a nucleotide sequence of wild type (WT) HXBc2.
SEQ ID NO: 20 is the amino acid sequence of wild type (WT) HXBc2.
SEQ ID NOs: 21-25 are amino acid sequences of V3 loops.
SEQ ID NO: 26 is a nucleotide sequence of gp120 HXBc2 Core New 9c.
SEQ ID NO: 27 is the amino acid sequence of gp120 HXB2CG.
Unless otherwise noted, technical terms are used according to conventional usage. Definitions of common terms in molecular biology can be found in Benjamin Lewin, Genes V, published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8). Terms describing protein structure and structural elements of proteins can be found in Creighton, Proteins, Structures and Molecular Properties, W.H. Freeman & Co., New York, 1993 (ISBN 0-717-7030) which is incorporated by reference herein in its entirety.
Unless otherwise explained, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. The singular terms “a,” “an,” and “the” include plural referents unless context clearly indicates otherwise. Similarly, the word “or” is intended to include “and” unless the context clearly indicates otherwise. It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of this disclosure, suitable methods and materials are described below. The term “comprises” means “includes.” The abbreviation, “e.g.” is derived from the Latin exempli gratia, and is used herein to indicate a non-limiting example. Thus, the abbreviation “e.g.” is synonymous with the term “for example.”
All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including explanations of terms, will control. In addition, all the materials, methods, and examples are illustrative and not intended to be limiting. In order to facilitate review of the various embodiments of the disclosure, the following explanations of specific terms are provided:
Adjuvant: A vehicle used to enhance antigenicity; such as a suspension of minerals (alum, aluminum hydroxide, aluminum phosphate) on which antigen is adsorbed; or water-in-oil emulsion in which antigen solution is emulsified in oil (MF-59, Freund's incomplete adjuvant), sometimes with the inclusion of killed mycobacteria (Freund's complete adjuvant) to further enhance antigenicity (inhibits degradation of antigen and/or causes influx of macrophages). Adjuvants also include immunostimulatory molecules, such as cytokines, costimulatory molecules, and for example, immunostimulatory DNA or RNA molecules, such as CpG oligonucleotides.
Administration: The introduction of a composition into a subject by a chosen route. For example, if the chosen route is intravenous, the composition is administered by introducing the composition into a vein of the subject.
Antibody: A polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof, which specifically binds and recognizes an analyte (antigen) such as gp120 or an antigenic fragment of gp120. Immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as the myriad immunoglobulin variable region genes.
Antibodies exist, for example, as intact immunoglobulins and as a number of well characterized fragments produced by digestion with various peptidases. For instance, Fabs, Fvs, and single-chain Fvs (scFvs) that bind to gp120 or fragments of gp120 would be gp120-specific binding agents. This includes intact immunoglobulins and the variants and portions of them well known in the art, such as Fab′ fragments, F(ab′)2 fragments, single chain Fv proteins (“scFv”), and disulfide stabilized Fv proteins (“dsFv”). A scFv protein is a fusion protein in which a light chain variable region of an immunoglobulin and a heavy chain variable region of an immunoglobulin are bound by a linker, while in dsFvs, the chains have been mutated to introduce a disulfide bond to stabilize the association of the chains. The term also includes genetically engineered forms such as chimeric antibodies (such as humanized murine antibodies) and heteroconjugate antibodies (such as bispecific antibodies). See also, Pierce Catalog and Handbook, 1994-1995 (Pierce Chemical Co., Rockford, Ill.); Kuby, J., Immunology, 3rd Ed., W.H. Freeman & Co., New York, 1997.
Antibody fragments are defined as follows: (1) Fab, the fragment which contains a monovalent antigen-binding fragment of an antibody molecule produced by digestion of whole antibody with the enzyme papain to yield an intact light chain and a portion of one heavy chain; (2) Fab′, the fragment of an antibody molecule obtained by treating whole antibody with pepsin, followed by reduction, to yield an intact light chain and a portion of the heavy chain; two Fab′ fragments are obtained per antibody molecule; (3) (Fab′)2, the fragment of the antibody obtained by treating whole antibody with the enzyme pepsin without subsequent reduction; (4) F(ab′)2, a dimer of two Fab′ fragments held together by two disulfide bonds; (5) Fv, a genetically engineered fragment containing the variable region of the light chain and the variable region of the heavy chain expressed as two chains; and (6) single chain antibody (“SCA”), a genetically engineered molecule containing the variable region of the light chain, the variable region of the heavy chain, linked by a suitable polypeptide linker as a genetically fused single chain molecule. The term “antibody,” as used herein, also includes antibody fragments either produced by the modification of whole antibodies or those synthesized de novo using recombinant DNA methodologies.
Typically, a naturally occurring immunoglobulin has heavy (H) chains and light (L) chains interconnected by disulfide bonds. There are two types of light chain, lambda (λ) and kappa (κ). There are five main heavy chain classes (or isotypes) which determine the functional activity of an antibody molecule: IgM, IgD, IgG, IgA and IgE.
Each heavy and light chain contains a constant region and a variable region, (the regions are also known as “domains”). In combination, the heavy and the light chain variable regions specifically bind the antigen. Light and heavy chain variable regions contain a “framework” region interrupted by three hypervariable regions, also called “complementarity-determining regions” or “CDRs.” The extent of the framework region and CDRs have been defined (see, Kabat et al., Sequences of Proteins of Immunological Interest, U.S. Department of Health and Human Services, 1991, which is hereby incorporated by reference). The Kabat database is now maintained online. The sequences of the framework regions of different light or heavy chains are relatively conserved within a species. The framework region of an antibody, that is the combined framework regions of the constituent light and heavy chains, serves to position and align the CDRs in three-dimensional space.
The CDRs are primarily responsible for binding to an epitope of an antigen. The CDRs of each chain are typically referred to as CDR1, CDR2, and CDR3, numbered sequentially starting from the N-terminus, and are also typically identified by the chain in which the particular CDR is located. Thus, a VH CDR3 is located in the variable domain of the heavy chain of the antibody in which it is found, whereas a VL CDR1 is the CDR1 from the variable domain of the light chain of the antibody in which it is found. Light chain CDRs are sometimes referred to as CDR L1, CDR L2, and CDR L3. Heavy chain CDRs are sometimes referred to as CDR H1, CDR H2, and CDR H3.
References to “V_H” or “VH” refer to the variable region of an immunoglobulin heavy chain, including that of an Fv, scFv, dsFv or Fab. References to “V_L” or “VL” refer to the variable region of an immunoglobulin light chain, including that of an Fv, scFv, dsFv or Fab.
A “monoclonal antibody” is an antibody produced by a single clone of B-lymphocytes or by a cell into which the light and heavy chain genes of a single antibody have been transfected. Monoclonal antibodies are produced by methods known to those of skill in the art, for instance by making hybrid antibody-forming cells from a fusion of myeloma cells with immune spleen cells. These fused cells and their progeny are termed “hybridomas.” Monoclonal antibodies include humanized monoclonal antibodies.
A “humanized” immunoglobulin is an immunoglobulin including a human framework region and one or more CDRs from a non-human (such as a mouse, rat, or synthetic) immunoglobulin. The non-human immunoglobulin providing the CDRs is termed a “donor,” and the human immunoglobulin providing the framework is termed an “acceptor.” In one embodiment, all the CDRs are from the donor immunoglobulin in a humanized immunoglobulin. Constant regions need not be present, but if they are, they must be substantially identical to human immunoglobulin constant regions, such as at least about 85-90%, such as about 95% or more identical. Hence, all parts of a humanized immunoglobulin, except possibly the CDRs, are substantially identical to corresponding parts of natural human immunoglobulin sequences. A “humanized antibody” is an antibody comprising a humanized light chain and a humanized heavy chain immunoglobulin. A humanized antibody binds to the same antigen as the donor antibody that provides the CDRs. The acceptor framework of a humanized immunoglobulin or antibody may have a limited number of substitutions by amino acids taken from the donor framework. Humanized or other monoclonal antibodies can have additional conservative amino acid substitutions which have substantially no effect on antigen binding or other immunoglobulin functions. Humanized immunoglobulins can be constructed by means of genetic engineering (for example, see U.S. Pat. No. 5,585,089).
Antigenic gp120 polypeptide: An “antigenic gp120 polypeptide” includes a gp120 molecule or a portion thereof that is capable of provoking an immune response in a mammal, such as a mammal with or without an HIV infection. Administration of an antigenic gp120 polypeptide that provokes an immune response preferably leads to protective immunity against HIV.
Antigenic surface: A surface of a molecule, for example a protein such as a gp120 protein or polypeptide, capable of eliciting an immune response. An antigenic surface includes the defining features of that surface, for example the three-dimensional shape and the surface charge. An antigenic surface includes both surfaces that occur on gp120 polypeptides as well as surfaces of compounds that mimic the surface of a gp120 polypeptide (mimetics).
CD4: Cluster of differentiation factor 4 polypeptide, a T-cell surface protein that mediates interaction with the MHC class II molecule. CD4 also serves as the primary receptor site for HIV on T-cells during HIV-1 infection.
The known sequence of the CD4 precursor has a hydrophobic signal peptide, an extracellular region of approximately 370 amino acids, a highly hydrophobic stretch with significant identity to the membrane-spanning domain of the class II MHC beta chain, and a highly charged intracellular sequence of 40 residues (Maddon, Cell 42:93, 1985).
The term “CD4” includes polypeptide molecules that are derived from CD4, including fragments of CD4 generated either by chemical (for example enzymatic) digestion or genetic engineering means. Such a fragment may be one or more entire CD4 protein domains. The extracellular domain of CD4 consists of four contiguous immunoglobulin-like regions (D1, D2, D3, and D4, see Sakihama et al., Proc. Natl. Acad. Sci. 92:6444, 1995; U.S. Pat. No. 6,117,655), and amino acids 1 to 183 have been shown to be involved in gp120 binding. For instance, a binding molecule or binding domain derived from CD4 would comprise a sufficient portion of the CD4 protein to mediate specific and functional interaction between the binding fragment and a native or viral binding site of CD4. One such binding fragment includes both the D1 and D2 extracellular domains of CD4 (D1D2 is also a fragment of soluble CD4 or sCD4, which is comprised of D1, D2, D3, and D4), although smaller fragments may also provide specific and functional CD4-like binding. The gp120-binding site has been mapped to D1 of CD4.
CD4 polypeptides also include “CD4-derived molecules,” which encompass analogs (non-protein organic molecules), derivatives (chemically functionalized protein molecules obtained starting with the disclosed protein sequences) or mimetics (three-dimensionally similar chemicals) of the native CD4 structure, as well as protein sequence variants or genetic alleles that maintain the ability to functionally bind to a target molecule.
CD4BS antibodies: Antibodies that bind to or substantially overlap the CD4 binding surface of a gp120 polypeptide. The antibodies interfere with or prevent CD4 from binding to a gp120 polypeptide.
CD4i antibodies: Antibodies that bind to a conformation of gp120 induced by CD4 binding.
Contacting: Placement in direct physical association; includes both in solid and liquid form.
Computer readable media: Any medium or media, which can be read and accessed directly by a computer, so that the media is suitable for use in a computer system. Such media include, but are not limited to: magnetic storage media such as floppy discs, hard disc storage medium and magnetic tape; optical storage media such as optical discs or CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media.
Computer system: Hardware that can be used to analyze atomic coordinate data. The minimum hardware of a computer-based system typically comprises a central processing unit (CPU), an input device (for example a mouse, keyboard, and the like), an output device, and a data storage device. Desirably a monitor is provided to visualize structure data. The data storage device may be RAM or other means for accessing computer readable media. Examples of such systems are microcomputer workstations available from Silicon Graphics Incorporated and Sun Microsystems running Unix-based, Windows NT, or IBM OS/2 operating systems.
Degenerate variant and conservative variant: A polynucleotide encoding a polypeptide or an antibody that includes a sequence that is degenerate as a result of the genetic code. For example, a polynucleotide encoding a gp120 polypeptide or an antibody that binds gp120 that includes a sequence that is degenerate as a result of the genetic code. There are 20 natural amino acids, most of which are specified by more than one codon. Therefore, all degenerate nucleotide sequences are included as long as the amino acid sequence of the gp120 polypeptide or antibody that binds gp120 encoded by the nucleotide sequence is unchanged. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified within a protein encoding sequence, the codon can be altered to any of the corresponding codons described without altering the encoded protein. Such nucleic acid variations are “silent variations,” which are one species of conservative variations. Each nucleic acid sequence herein that encodes a polypeptide also describes every possible silent variation. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine) can be modified to yield a functionally identical molecule by standard techniques. Accordingly, each “silent variation” of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
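The degeneracy described above can be illustrated with a short sketch. It assumes the standard genetic code written with RNA bases in the conventional U, C, A, G ordering; the function names (synonymous_codons, translate, count_silent_variants) are illustrative and are not part of this disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(rna):
    """Translate an RNA coding sequence, ignoring any trailing partial codon."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def count_silent_variants(rna):
    """Count coding sequences (including the input) that encode the same
    polypeptide, i.e. the product of the codon choices at each position."""
    total = 1
    for aa in translate(rna):
        total *= len(synonymous_codons(aa))
    return total


if __name__ == "__main__":
    print(synonymous_codons("R"))           # the six arginine codons
    print(synonymous_codons("M"))           # methionine has a single codon
    print(count_silent_variants("AUGCGU"))  # Met-Arg
```

Because the count is a product over positions, it grows very quickly with sequence length, which is why the text treats every silent variation as implicit in a described sequence rather than listing them.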
Furthermore, one of ordinary skill will recognize that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (for instance less than 5%, in some embodiments less than 1%) in an encoded sequence are conservative variations where the alterations result in the substitution of an amino acid with a chemically similar amino acid.
Conservative amino acid substitutions providing functionally similar amino acids are well known in the art. The following six groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Serine (S), Threonine (T);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V); and
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
Not all residue positions within a protein will tolerate an otherwise “conservative” substitution. For instance, if an amino acid residue is essential for a function of the protein, even an otherwise conservative substitution may disrupt that activity.
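As a minimal sketch of how the six groups listed above could be applied, the following compares two aligned sequences of equal length and labels each difference as conservative or not. The function names are hypothetical. Residues that appear in none of the six groups (for example cysteine, glycine, histidine and proline) are never reported as conservative substitutions under this simple rule, and the sketch does not account for position-specific functional constraints.

```python
# The six conservative substitution groups, as listed in the text above.
CONSERVATIVE_GROUPS = [
    set("AST"),
    set("DE"),
    set("NQ"),
    set("RK"),
    set("ILMV"),
    set("FYW"),
]


def is_conservative(original, replacement):
    """True if the two one-letter residues differ and share a group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (1-based position, reference residue, variant residue,
    conservative?) for every position where two aligned sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i, r, v, is_conservative(r, v))
        for i, (r, v) in enumerate(zip(reference.upper(), variant.upper()), start=1)
        if r != v
    ]


if __name__ == "__main__":
    for row in classify_substitutions("ARNDC", "SKQNC"):
        print(row)
```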
Epitope: An antigenic determinant. These are particular chemical groups or peptide sequences on a molecule that are antigenic, such that they elicit a specific immune response. An antibody binds a particular antigenic epitope, such as an epitope of a gp120 polypeptide.
Expression: Translation of a nucleic acid into a protein. Proteins may be expressed and remain intracellular, become a component of the cell surface membrane, or be secreted into the extracellular matrix or medium.
Expression Control Sequences: Nucleic acid sequences that regulate the expression of a heterologous nucleic acid sequence to which it is operatively linked. Expression control sequences are operatively linked to a nucleic acid sequence when the expression control sequences control and regulate the transcription and, as appropriate, translation of the nucleic acid sequence. Thus expression control sequences can include appropriate promoters, enhancers, transcription terminators, a start codon (ATG) in front of a protein-encoding gene, splicing signal for introns, maintenance of the correct reading frame of that gene to permit proper translation of mRNA, and stop codons. The term “control sequences” is intended to include, at a minimum, components whose presence can influence expression, and can also include additional components whose presence is advantageous, for example, leader sequences and fusion partner sequences. Expression control sequences can include a promoter.
A promoter is a minimal sequence sufficient to direct transcription. Also included are those promoter elements which are sufficient to render promoter-dependent gene expression controllable for cell-type specific, tissue-specific, or inducible by external signals or agents; such elements may be located in the 5′ or 3′ regions of the gene. Both constitutive and inducible promoters are included (see for example, Bitter et al., Methods in Enzymology 153:516-544, 1987). For example, when cloning in bacterial systems, inducible promoters such as pL of bacteriophage lambda, plac, ptrp, ptac (ptrp-lac hybrid promoter) and the like may be used. In one embodiment, when cloning in mammalian cell systems, promoters derived from the genome of mammalian cells (such as metallothionein promoter) or from mammalian viruses (such as the retrovirus long terminal repeat; the adenovirus late promoter; the vaccinia virus 7.5K promoter) can be used. Promoters produced by recombinant DNA or synthetic techniques may also be used to provide for transcription of the nucleic acid sequences.
A polynucleotide can be inserted into an expression vector that contains a promoter sequence which facilitates the efficient transcription of the inserted genetic sequence of the host. The expression vector typically contains an origin of replication, a promoter, as well as specific nucleic acid sequences that allow phenotypic selection of the transformed cells.
gp120: The envelope protein from Human Immunodeficiency Virus (HIV). The envelope protein is initially synthesized as a longer precursor protein of 845-870 amino acids in size, designated gp160. Gp160 forms a homotrimer and undergoes glycosylation within the Golgi apparatus. It is then cleaved by a cellular protease into gp120 and gp41. Gp41 contains a transmembrane domain and remains in a trimeric configuration; it interacts with gp120 in a non-covalent manner. Gp120 contains most of the external, surface-exposed, domains of the envelope glycoprotein complex, and it is gp120 which binds both to the cellular CD4 receptor and to the cellular chemokine receptors (such as CCR5).
The mature gp120 wildtype polypeptides have about 500 amino acids in the primary sequence. Gp120 is heavily N-glycosylated, giving rise to an apparent molecular weight of 120 kD. The polypeptide is comprised of five conserved regions (C1-C5) and five regions of high variability (V1-V5). Exemplary sequences of wt gp160 polypeptides are available in GENBANK, for example under accession numbers AAB05604 and AAD12142.
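Potential N-linked glycosylation sites can be located in a primary sequence by scanning for the consensus sequon. The sketch below assumes the commonly used N-X-S/T rule, where X is any residue except proline; that rule is general biochemistry background and is not stated in this disclosure, and a sequon only marks a potential site, not a confirmed glycan. The function name and the example strings are illustrative.

```python
import re

# N, then any residue except P, then S or T. The lookahead lets
# overlapping sequons (e.g. "NNTS") each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines in N-X-S/T sequons (X != P)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_sequons("ANVTENFNMW"))  # toy sequence, not a disclosed one
    print(find_sequons("NPSNNTS"))
```

Positions are reported relative to the start of the string passed in, so they correspond to HXB2 numbering only if the input is offset accordingly.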
The gp120 core has a unique molecular structure, which comprises two domains: an “inner” domain (which faces gp41) and an “outer” domain (which is mostly exposed on the surface of the oligomeric envelope glycoprotein complex). The two gp120 domains are separated by a “bridging sheet” that is not part of either domain. The gp120 core comprises 25 beta strands, 5 alpha helices, and 10 defined loop segments.
“Stabilized gp120” is a form of gp120 polypeptide from HIV-1, characterized by an increase in Tm over the wild type gp120. In some examples the gp120 is stabilized by the replacement of at least two amino acids of gp120 with cysteines such that a disulfide bond can form, wherein the gp120 protein has a Tm of greater than about 53.8° C. The stabilized gp120 mutants may contain amino acid substitutions that fill cavities present in the core of native gp120. The stabilized gp120 can bind CD4. Stabilized forms of gp120 may include forms that have synthetic amino acids. Several exemplary stabilized gp120 proteins are disclosed herein.
Gp120 polypeptides also include “gp120-derived molecules,” which encompass analogs (non-protein organic molecules), derivatives (chemically functionalized protein molecules obtained starting with the disclosed protein sequences) or mimetics (three-dimensionally similar chemicals) of the native gp120 structure, as well as protein sequence variants (such as mutants), genetic alleles, fusion proteins of gp120, or combinations thereof.
The third variable region referred to herein as the V3 loop is a loop of about 35 amino acids critical for the binding of the co-receptor and determination of which of the co-receptors will bind. In certain examples the V3 loop comprises residues 296-331.
The numbering used in gp120 polypeptides disclosed herein is relative to the HXB2 numbering scheme as set forth in Numbering Positions in HIV Relative to HXB2CG Bette Korber et al., Human Retroviruses and AIDS 1998: A Compilation and Analysis of Nucleic Acid and Amino Acid Sequences. Korber B, Kuiken C L, Foley B, Hahn B, McCutchan F, Mellors J W, and Sodroski J, Eds. Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, N. Mex. which is incorporated by reference herein in its entirety. For reference, the amino acid sequence of HXB2CG is given below as SEQ ID NO: 27: lwvtvyygvpvwkeatttlfcasdakaydtevhnvwathacvptdpnpqevvlvnvtenfnmwkndmveqmhediislwdqslkpcvkltplcvs lkctdlkndtntnsssgrmimekgeikncsfnistsirgkvqkeyaffykldiipidndttsykltscntsvitqacpkvsfepipihycapagfailkcnnkt fngtgpctnvstvqcthgirpvvstqlllngslaeeevvirsvnftdnaktiivqlntsveinctrpnnntrkririqrgpgrafvtigkignmrqahcnisrak wnntlkqiasklreqfgnnktiifkqssggdpeivthsfncggeffycnstqlfnstwfnstwstegsnntegsdtitlpcrikqiinmwqkvgkamyapp isgqircssnitgllltrdggnsnneseifrpgggdmrdnwrselykykvvkieplgvaptkakrrvvqrekr (SEQ ID NO: 27). HXB2 is also known as: HXBc2, for HXB clone 2; HXB2R, in the Los Alamos HIV database, with the R for revised, as it was slightly revised relative to the original HXB2 sequence; and HXB2CG in GenBank, for HXB2 complete genome.
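Working with a reference numbering scheme amounts to converting residue numbers into string offsets. The following sketch extracts a numbered, inclusive residue range from a sequence fragment. The fragment is copied from the SEQ ID NO: 27 listing above; the number assigned to its first residue (292) is an assumption made for this illustration, chosen so that the first cysteine of the fragment falls at position 296, consistent with the V3 loop residues 296-331 given above. The function name is hypothetical.

```python
def slice_by_numbering(sequence, first_number, start, end):
    """Return residues start..end (inclusive) from a sequence whose first
    residue carries the reference number `first_number`."""
    lo = start - first_number
    hi = end - first_number + 1
    if start > end or lo < 0 or hi > len(sequence):
        raise ValueError("requested range lies outside the sequence")
    return sequence[lo:hi]


# Fragment of the HXB2CG sequence listed above. FIRST_NUMBER is an
# assumption for illustration only (see the lead-in paragraph).
FRAGMENT = "veinctrpnnntrkririqrgpgrafvtigkignmrqahcnisrak"
FIRST_NUMBER = 292

if __name__ == "__main__":
    v3 = slice_by_numbering(FRAGMENT, FIRST_NUMBER, 296, 331)
    print(v3, len(v3))
```

An inclusive range from 296 to 331 spans 36 residues, which agrees with the "about 35 amino acids" stated for the V3 loop.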
Heavy atom derivatization: A method of producing a chemically modified form of a protein crystal, for example a crystal containing gp120. In practice, a crystal is soaked in a solution containing heavy metal atom salts, or organometallic compounds, such as lead chloride, gold thiomalate, thimerosal or uranyl acetate, which can diffuse through the solvent channels of the crystal and bind the surface of the protein. The location(s) of the bound heavy metal atom(s) can be determined by X-ray diffraction analysis of the soaked crystal. This information, in turn, is used to generate the phase information used to construct the three-dimensional structure of the protein (see Blundell and Johnson, Protein Crystallography, Academic Press, 1976).
Host cells: Cells in which a vector can be propagated and its DNA expressed. The cell may be prokaryotic or eukaryotic. The term also includes any progeny of the subject host cell. It is understood that all progeny may not be identical to the parental cell since there may be mutations that occur during replication. However, such progeny are included when the term “host cell” is used.
In silico: A process performed virtually within a computer. For example, using a computer, a virtual compound can be screened for surface similarity or, conversely, surface complementarity to a virtual representation of the atomic positions of at least a portion of a gp120 polypeptide, for example a stabilized gp120, such as defined in Table 1, or a gp120 with an extended V3 loop, such as defined in Table 2.
Immune response: A response of a cell of the immune system, such as a B cell, T cell, or monocyte, to a stimulus. In one embodiment, the response is specific for a particular antigen (an “antigen-specific response”). In one embodiment, an immune response is a T cell response, such as a CD4+ response or a CD8+ response. In another embodiment, the response is a B cell response, and results in the production of specific antibodies.
Immunogenic peptide: A peptide which comprises an allele-specific motif or other sequence, such as an N-terminal repeat, such that the peptide will bind an MHC molecule and induce a cytotoxic T lymphocyte (“CTL”) response, or a B cell response (for example antibody production) against the antigen from which the immunogenic peptide is derived.
In one embodiment, immunogenic peptides are identified using sequence motifs or other methods, such as neural net or polynomial determinations known in the art. Typically, algorithms are used to determine the “binding threshold” of peptides to select those with scores that give them a high probability of binding at a certain affinity and will be immunogenic. The algorithms are based either on the effects on MHC binding of a particular amino acid at a particular position, the effects on antibody binding of a particular amino acid at a particular position, or the effects on binding of a particular substitution in a motif-containing peptide. Within the context of an immunogenic peptide, a “conserved residue” is one which appears in a significantly higher frequency than would be expected by random distribution at a particular position in a peptide. In one embodiment, a conserved residue is one where the MHC structure may provide a contact point with the immunogenic peptide. In one specific non-limiting example, an immunogenic polypeptide includes a region of gp120, or a fragment thereof.
Immunogenic composition: A composition comprising an immunogenic peptide that induces a measurable CTL response against virus expressing the immunogenic peptide, or induces a measurable B cell response (such as production of antibodies) against the immunogenic peptide. In one example, an “immunogenic composition” is a composition comprising a gp120 polypeptide that induces a measurable CTL response against virus expressing gp120 polypeptide, or induces a measurable B cell response (such as production of antibodies) against a gp120 polypeptide. The term further refers to isolated nucleic acids encoding an immunogenic peptide, such as a nucleic acid that can be used to express the gp120 polypeptide (and thus be used to elicit an immune response against this polypeptide).
For in vitro use, an immunogenic composition may consist of the isolated protein, peptide epitope, or nucleic acid encoding the protein, or peptide epitope. For in vivo use, the immunogenic composition will typically comprise the protein or immunogenic peptide in pharmaceutically acceptable carriers, and/or other agents. Any particular peptide, such as a gp120 polypeptide, or nucleic acid encoding the polypeptide, can be readily tested for its ability to induce a CTL or B cell response by art-recognized assays. Immunogenic compositions can include adjuvants, which are well known to one of skill in the art.
Immunologically reactive conditions: Includes reference to conditions which allow an antibody raised against a particular epitope to bind to that epitope to a detectably greater degree than, and/or to the substantial exclusion of, binding to substantially all other epitopes. Immunologically reactive conditions are dependent upon the format of the antibody binding reaction and typically are those utilized in immunoassay protocols or those conditions encountered in vivo. The immunologically reactive conditions employed in the methods are “physiological conditions” which include reference to conditions (such as temperature, osmolarity, pH) that are typical inside a living mammal or a mammalian cell. While it is recognized that some organs are subject to extreme conditions, the intra-organismal and intracellular environment is normally about pH 7 (such as from pH 6.0 to pH 8.0, more typically pH 6.5 to 7.5), contains water as the predominant solvent, and exists at a temperature above 0° C. and below 50° C. Osmolarity is within the range that is supportive of cell viability and proliferation.
Immunotherapy: A method of evoking an immune response against a virus based on its production of target antigens. Immunotherapy based on cell-mediated immune responses involves generating a cell-mediated response to cells that produce particular antigenic determinants, while immunotherapy based on humoral immune responses involves generating specific antibodies to a virus that produces particular antigenic determinants.
Inhibiting or treating a disease: Inhibiting the full development of a disease or condition, for example, in a subject who is at risk for a disease such as acquired immune deficiency syndrome (AIDS), AIDS related conditions, HIV-1 infection, or combinations thereof. “Treatment” refers to a therapeutic intervention that ameliorates a sign or symptom of a disease or pathological condition after it has begun to develop. The term “ameliorating,” with reference to a disease or pathological condition, refers to any observable beneficial effect of the treatment. The beneficial effect can be evidenced, for example, by a delayed onset of clinical symptoms of the disease in a susceptible subject, a reduction in severity of some or all clinical symptoms of the disease, a slower progression of the disease, a reduction in the number of metastases, an improvement in the overall health or well-being of the subject, or by other parameters well known in the art that are specific to the particular disease. A “prophylactic” treatment is a treatment administered to a subject who does not exhibit signs of a disease or exhibits only early signs for the purpose of decreasing the risk of developing pathology.
Isolated: An “isolated” biological component (such as a nucleic acid, peptide or protein) has been substantially separated, produced apart from, or purified away from other biological components in the cell of the organism in which the component naturally occurs, such as, other chromosomal and extrachromosomal DNA and RNA, and proteins. Nucleic acids, peptides and proteins which have been “isolated” thus include nucleic acids and proteins purified by standard purification methods. The term also embraces nucleic acids, peptides, and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
Kd: The dissociation constant for a given interaction, such as a polypeptide-ligand interaction. For example, for the bimolecular interaction of CD4 and gp120 it is the product of the concentrations of the individual components of the bimolecular interaction divided by the concentration of the complex.
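By way of illustration only, the ratio described in the Kd definition can be written as a short Python sketch. The function name and the concentration values are hypothetical and are not measurements from this disclosure; the sketch assumes a simple one-to-one binding equilibrium with all concentrations expressed in molar units.

```python
def dissociation_constant(free_a_molar, free_b_molar, complex_molar):
    """Return Kd (molar) for A + B <-> AB from equilibrium concentrations.

    Kd = [A][B] / [AB], where [A] and [B] are the free (unbound)
    concentrations and [AB] is the concentration of the complex.
    """
    if complex_molar <= 0:
        raise ValueError("complex concentration must be positive")
    return (free_a_molar * free_b_molar) / complex_molar


# Hypothetical equilibrium concentrations, chosen only to show the arithmetic.
kd = dissociation_constant(free_a_molar=2e-9, free_b_molar=5e-9, complex_molar=1e-9)
print(f"Kd = {kd:.2e} M")
```

A smaller Kd corresponds to tighter binding, since less free material is needed to maintain a given amount of complex.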
Leukocyte: Cells in the blood, also termed “white cells,” that are involved in defending the body against infective organisms and foreign substances. Leukocytes are produced in the bone marrow. There are 5 main types of white blood cell, subdivided between 2 main groups: polymorphonuclear leukocytes (neutrophils, eosinophils, basophils) and mononuclear leukocytes (monocytes and lymphocytes).
Ligand: Any molecule which specifically binds a protein, such as a gp120 protein, and includes, inter alia, antibodies that specifically bind a gp120 protein. In alternative embodiments, the ligand is a protein or a small molecule (one with a molecular weight less than 6 kiloDaltons).
Mimetic: A molecule (such as an organic chemical compound) that mimics the activity of an agent, such as the activity of a gp120 protein, for example by inducing an immune response to gp120. Peptidomimetic and organomimetic embodiments are within the scope of this term, whereby the three-dimensional arrangement of the chemical constituents of such peptido- and organomimetics mimic the three-dimensional arrangement of the peptide backbone and component amino acid side chains in the peptide, resulting in such peptido- and organomimetics of the peptides having substantial specific activity. For computer modeling applications, a pharmacophore is an idealized, three-dimensional definition of the structural requirements for biological activity. Peptido- and organomimetics can be designed to fit each pharmacophore with computer modeling software (using computer assisted drug design or CADD). See Walters, “Computer-Assisted Modeling of Drugs”, in Klegerman & Groves, eds., 1993, Pharmaceutical Biotechnology, Interpharm Press: Buffalo Grove, Ill., pp. 165-174 and Principles of Pharmacology (ed. Munson, 1995), chapter 102 for a description of techniques used in computer assisted drug design.
Molecular Replacement: A method that involves generating a preliminary model, such as a model of a gp120 polypeptide, whose structure coordinates are unknown, by orienting and positioning a molecule whose structure coordinates are known (such as coordinates from Table 1) within the unit cell of the unknown crystal so as best to account for the observed diffraction pattern of the unknown crystal. Phases can then be calculated from this model and combined with the observed amplitudes to give an approximate Fourier synthesis of the structure whose coordinates are unknown. This, in turn, can be subject to any of the several forms of refinement to provide a final, accurate structure of the unknown molecule (see Lattman, Methods in Enzymology, 115:55-77, 1985; Rossmann, ed., “The Molecular Replacement Method”, Int. Sci. Rev. Ser., No. 13, Gordon & Breach, New York, 1972). Using the structure coordinates of gp120, such as a stabilized gp120 provided herein, molecular replacement may be used to determine the structure coordinates of a crystalline mutant or homologue of gp120, a different crystal form of gp120, or gp120 in complex with another molecule, such as an antibody, cell surface receptor, or combination thereof.
Naturally Occurring Amino Acids: L-isomers of the naturally occurring amino acids. The naturally occurring amino acids are glycine, alanine, valine, leucine, isoleucine, serine, methionine, threonine, phenylalanine, tyrosine, tryptophan, cysteine, proline, histidine, aspartic acid, asparagine, glutamic acid, glutamine, gamma-carboxyglutamic acid, arginine, ornithine and lysine. Unless specifically indicated, all amino acids referred to in this application are in the L-form. “Synthetic amino acids” refers to amino acids that are not naturally found in proteins. Examples of synthetic amino acids used herein include racemic mixtures of selenocysteine and selenomethionine. In addition, unnatural amino acids include the D or L forms of nor-leucine, para-nitrophenylalanine, homophenylalanine, para-fluorophenylalanine, 3-amino-2-benzylpropionic acid, homoarginine, and D-phenylalanine. The term “positively charged amino acid” refers to any naturally occurring or synthetic amino acid having a positively charged side chain under normal physiological conditions. Examples of positively charged naturally occurring amino acids are arginine, lysine and histidine. The term “negatively charged amino acid” refers to any naturally occurring or synthetic amino acid having a negatively charged side chain under normal physiological conditions. Examples of negatively charged naturally occurring amino acids are aspartic acid and glutamic acid. The term “hydrophobic amino acid” refers to any amino acid having an uncharged, nonpolar side chain that is relatively insoluble in water. Examples of naturally occurring hydrophobic amino acids are alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan and methionine. The term “hydrophilic amino acid” refers to any amino acid having an uncharged, polar side chain that is relatively soluble in water. Examples of naturally occurring hydrophilic amino acids are serine, threonine, tyrosine, asparagine, glutamine, and cysteine.
Nucleic acid: A polymer composed of nucleotide units (ribonucleotides, deoxyribonucleotides, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof) linked via phosphodiester bonds, related naturally occurring structural variants, and synthetic non-naturally occurring analogs thereof. Thus, the term includes nucleotide polymers in which the nucleotides and the linkages between them include non-naturally occurring synthetic analogs, such as, for example and without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs), and the like. Such polynucleotides can be synthesized, for example, using an automated DNA synthesizer. The term “oligonucleotide” typically refers to short polynucleotides, generally no greater than about 50 nucleotides. It will be understood that when a nucleotide sequence is represented by a DNA sequence (i.e., A, T, G, C), this also includes an RNA sequence (i.e., A, U, G, C) in which “U” replaces “T.”
“Nucleotide” includes, but is not limited to, a monomer that includes a base linked to a sugar, such as a pyrimidine, purine or synthetic analogs thereof, or a base linked to an amino acid, as in a peptide nucleic acid (PNA). A nucleotide is one monomer in a polynucleotide. A nucleotide sequence refers to the sequence of bases in a polynucleotide. A gp120 polynucleotide is a nucleic acid encoding a gp120 polypeptide.
Conventional notation is used herein to describe nucleotide sequences: the left-hand end of a single-stranded nucleotide sequence is the 5′-end; the left-hand direction of a double-stranded nucleotide sequence is referred to as the 5′-direction. The direction of 5′ to 3′ addition of nucleotides to nascent RNA transcripts is referred to as the transcription direction. The DNA strand having the same sequence as an mRNA is referred to as the “coding strand;” sequences on the DNA strand having the same sequence as an mRNA transcribed from that DNA and which are located 5′ to the 5′-end of the RNA transcript are referred to as “upstream sequences;” sequences on the DNA strand having the same sequence as the RNA and which are 3′ to the 3′ end of the coding RNA transcript are referred to as “downstream sequences.”
“cDNA” refers to a DNA that is complementary or identical to an mRNA, in either single stranded or double stranded form.
“Encoding” refers to the inherent property of specific sequences of nucleotides in a polynucleotide, such as a gene, a cDNA, or an mRNA, to serve as templates for synthesis of other polymers and macromolecules in biological processes having either a defined sequence of nucleotides (for example, rRNA, tRNA and mRNA) or a defined sequence of amino acids and the biological properties resulting therefrom. Thus, a gene encodes a protein if transcription and translation of mRNA produced by that gene produces the protein in a cell or other biological system. Both the coding strand, the nucleotide sequence of which is identical to the mRNA sequence and is usually provided in sequence listings, and non-coding strand, used as the template for transcription, of a gene or cDNA can be referred to as encoding the protein or other product of that gene or cDNA. Unless otherwise specified, a “nucleotide sequence encoding an amino acid sequence” includes all nucleotide sequences that are degenerate versions of each other and that encode the same amino acid sequence. Nucleotide sequences that encode proteins and RNA may include introns.
“Recombinant nucleic acid” refers to a nucleic acid having nucleotide sequences that are not naturally joined together. This includes nucleic acid vectors comprising an amplified or assembled nucleic acid which can be used to transform a suitable host cell. A host cell that comprises the recombinant nucleic acid is referred to as a “recombinant host cell.” The gene is then expressed in the recombinant host cell to produce a product, such as a “recombinant polypeptide.” A recombinant nucleic acid may serve a non-coding function (such as a promoter, origin of replication, ribosome-binding site, etc.) as well.
A first sequence is an “antisense” with respect to a second sequence if a polynucleotide whose sequence is the first sequence specifically hybridizes with a polynucleotide whose sequence is the second sequence.
Terms used to describe sequence relationships between two or more nucleotide sequences or amino acid sequences include “reference sequence,” “selected from,” “comparison window,” “identical,” “percentage of sequence identity,” “substantially identical,” “complementary,” and “substantially complementary.”
For sequence comparison of nucleic acid sequences and amino acids sequences, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters are used. Methods of alignment of sequences for comparison are well known in the art. Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482, 1981, by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443, 1970, by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444, 1988, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see for example, Current Protocols in Molecular Biology (Ausubel et al., eds 1995 supplement)).
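As a minimal sketch of the global alignment approach of Needleman & Wunsch referred to above, the following Python aligns a reference sequence with a test sequence and reports a percent identity. The scoring values (match +1, mismatch -1, linear gap -1), the function names, and the convention of dividing the number of identical positions by the alignment length are illustrative assumptions; they are not the default parameters of GAP, BESTFIT, or any other program named herein.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Toy sequences used only to exercise the functions.
ref_aln, test_aln = needleman_wunsch("ACGT", "ACT")
print(ref_aln, test_aln, percent_identity(ref_aln, test_aln))
```

Other definitions of percent identity (for example, dividing by the length of the shorter sequence) give different values for the same alignment, so the convention should be stated whenever a threshold is applied.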
One example of a useful algorithm is PILEUP. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol. 25:351-360, 1987. The method used is similar to the method described by Higgins & Sharp, CABIOS 5:151-153, 1989. Using PILEUP, a reference sequence is compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained from the GCG sequence analysis software package, such as version 7.0 (Devereux et al., Nuc. Acids Res. 12:387-395, 1984).
Other examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and the BLAST 2.0 algorithms, which are described in Altschul et al., J. Mol. Biol. 215:403-410, 1990 and Altschul et al., Nucleic Acids Res. 25:3389-3402, 1997. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (ncbi.nlm.nih.gov). The BLASTN program (for nucleotide sequences) uses as defaults a word length (W) of 11, alignments (B) of 50, expectation (E) of 10, M=5, N=−4, and a comparison of both strands. The BLASTP program (for amino acid sequences) uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915, 1992).
Another indication of sequence similarity between two nucleic acids is the ability to hybridize. The more similar the sequences of the two nucleic acids are, the more stringent the conditions at which they will hybridize. The stringency of hybridization conditions is sequence-dependent and differs under different environmental parameters. Thus, hybridization conditions resulting in particular degrees of stringency will vary depending upon the nature of the hybridization method of choice and the composition and length of the hybridizing nucleic acid sequences. Generally, the temperature of hybridization and the ionic strength (especially the Na+ and/or Mg++ concentration) of the hybridization buffer will determine the stringency of hybridization, though wash times also influence stringency. Generally, stringent conditions are selected to be about 5° C. to 20° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Conditions for nucleic acid hybridization and calculation of stringencies can be found, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001; Tijssen, Hybridization With Nucleic Acid Probes, Part I. Theory and Nucleic Acid Preparation, Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Ltd., NY, N.Y., 1993 and Ausubel et al. Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, Inc., 1999.
“Stringent conditions” encompass conditions under which hybridization will only occur if there is less than 25% mismatch between the hybridization molecule and the target sequence. “Stringent conditions” may be broken down into particular levels of stringency for more precise definition. Thus, as used herein, “moderate stringency” conditions are those under which molecules with more than 25% sequence mismatch will not hybridize; conditions of “medium stringency” are those under which molecules with more than 15% mismatch will not hybridize, and conditions of “high stringency” are those under which sequences with more than 10% mismatch will not hybridize. Conditions of “very high stringency” are those under which sequences with more than 6% mismatch will not hybridize. In contrast, nucleic acids that hybridize under “low stringency” conditions include those with much less sequence identity, or with sequence identity over only short subsequences of the nucleic acid. For example, a nucleic acid construct can include a polynucleotide sequence that hybridizes under high stringency or very high stringency, or even higher stringency conditions to a polynucleotide sequence that encodes SEQ ID NO: 1.
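The mismatch limits recited for each stringency level can be summarized, purely as an illustrative sketch, by a small Python lookup. The function name is hypothetical, and treating a mismatch exactly equal to a limit as still able to hybridize is an assumption about the boundary cases; the percentages themselves are those given in the definition above.

```python
# (level, maximum percent mismatch at which hybridization may still occur),
# ordered from most to least stringent.
STRINGENCY_LIMITS = [
    ("very high", 6.0),
    ("high", 10.0),
    ("medium", 15.0),
    ("moderate", 25.0),
]


def most_stringent_level(percent_mismatch):
    """Return the most stringent level at which a duplex with the given
    percent mismatch would still be expected to hybridize."""
    if not 0.0 <= percent_mismatch <= 100.0:
        raise ValueError("percent mismatch must be between 0 and 100")
    for level, limit in STRINGENCY_LIMITS:
        if percent_mismatch <= limit:
            return level
    # More than 25% mismatch: only low stringency conditions remain.
    return "low"


for mismatch in (5, 8, 12, 20, 40):
    print(mismatch, most_stringent_level(mismatch))
```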
Operably linked: A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Generally, operably linked DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same reading frame.
Peptide Modifications: The present disclosure includes mutant gp120 peptides, as well as synthetic embodiments. In addition, analogues (non-peptide organic molecules), derivatives (chemically functionalized peptide molecules obtained starting with the disclosed peptide sequences) and variants (homologs) of gp120 can be utilized in the methods described herein. The peptides disclosed herein include a sequence of amino acids that can be either L- and/or D-amino acids, naturally occurring and otherwise.
Peptides can be modified by a variety of chemical techniques to produce derivatives having essentially the same activity as the unmodified peptides, and optionally having other desirable properties. For example, carboxylic acid groups of the protein, whether carboxyl-terminal or side chain, may be provided in the form of a salt of a pharmaceutically-acceptable cation or esterified to form a C1-C16 ester, or converted to an amide of formula NR1R2 wherein R1 and R2 are each independently H or C1-C16 alkyl, or combined to form a heterocyclic ring, such as a 5- or 6-membered ring. Amino groups of the peptide, whether amino-terminal or side chain, may be in the form of a pharmaceutically-acceptable acid addition salt, such as the HCl, HBr, acetic, benzoic, toluene sulfonic, maleic, tartaric and other organic salts, or may be modified to C1-C16 alkyl or dialkyl amino or further converted to an amide.
Hydroxyl groups of the peptide side chains can be converted to C1-C16 alkoxy or to a C1-C16 ester using well-recognized techniques. Phenyl and phenolic rings of the peptide side chains can be substituted with one or more halogen atoms, such as F, Cl, Br or I, or with C1-C16 alkyl, C1-C16 alkoxy, carboxylic acids and esters thereof, or amides of such carboxylic acids. Methylene groups of the peptide side chains can be extended to homologous C2-C4 alkylenes. Thiols can be protected with any one of a number of well-recognized protecting groups, such as acetamide groups. Those skilled in the art will also recognize methods for introducing cyclic structures into the peptides of this disclosure to select and provide conformational constraints to the structure that result in enhanced stability. For example, a C- or N-terminal cysteine can be added to the peptide, so that when oxidized the peptide will contain a disulfide bond, generating a cyclic peptide. Other peptide cyclizing methods include the formation of thioethers and carboxyl- and amino-terminal amides and esters.
Peptidomimetic and organomimetic embodiments are also within the scope of the present disclosure, whereby the three-dimensional arrangement of the chemical constituents of such peptido- and organomimetics mimic the three-dimensional arrangement of the peptide backbone and component amino acid side chains, resulting in such peptido- and organomimetics of the proteins of this disclosure. For computer modeling applications, a pharmacophore is an idealized, three-dimensional definition of the structural requirements for biological activity. Peptido- and organomimetics can be designed to fit each pharmacophore with current computer modeling software (using computer assisted drug design or CADD). See Walters, “Computer-Assisted Modeling of Drugs”, in Klegerman & Groves, eds., 1993, Pharmaceutical Biotechnology, Interpharm Press: Buffalo Grove, Ill., pp. 165-174 and Principles of Pharmacology Munson (ed.) 1995, Ch. 102, for descriptions of techniques used in CADD. Also included within the scope of the disclosure are mimetics prepared using such techniques. In one example, a mimetic mimics the antigenic activity generated by gp120, or a mutant, variant, fragment, or fusion thereof.
Pharmaceutical agent: A chemical compound or composition capable of inducing a desired therapeutic or prophylactic effect when properly administered to a subject or a cell. “Incubating” includes a sufficient amount of time for a drug to interact with a cell. “Contacting” includes incubating a drug in solid or in liquid form with a cell. An “anti-viral agent” or “anti-viral drug” is an agent that specifically inhibits a virus from replicating or infecting cells. Similarly, an “anti-retroviral agent” is an agent that specifically inhibits a retrovirus from replicating or infecting cells.
A “therapeutically effective amount” is a quantity of a chemical composition or an anti-viral agent sufficient to achieve a desired effect in a subject being treated. For instance, this can be the amount necessary to inhibit viral replication or to measurably alter outward symptoms of the viral infection, such as an increase of T cell counts in the case of an HIV-1 infection. In general, this amount will be sufficient to measurably inhibit virus (for example, HIV) replication or infectivity. When administered to a subject, a dosage will generally be used that will achieve target tissue concentrations (for example, in lymphocytes) that have been shown to achieve in vitro inhibition of viral replication.
Pharmaceutically acceptable carriers: The pharmaceutically acceptable carriers of use are conventional. Remington's Pharmaceutical Sciences, by E. W. Martin, Mack Publishing Co., Easton, Pa., 15th Edition, 1975, describes compositions and formulations suitable for pharmaceutical delivery of the fusion proteins herein disclosed.
In general, the nature of the carrier will depend on the particular mode of administration being employed. For instance, parenteral formulations usually comprise injectable fluids that include pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced salt solutions, aqueous dextrose, glycerol or the like as a vehicle. For solid compositions (such as powder, pill, tablet, or capsule forms), conventional non-toxic solid carriers can include, for example, pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate. In addition to biologically neutral carriers, pharmaceutical compositions to be administered can contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate.
Polypeptide: Any chain of amino acids, regardless of length or post-translational modification (such as glycosylation or phosphorylation). “Polypeptide” applies to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers, as well as to polymers in which one or more amino acid residues is a non-natural amino acid, for example an artificial chemical mimetic of a corresponding naturally occurring amino acid. In one embodiment, the polypeptide is a gp120 polypeptide, such as a stabilized gp120. A “residue” refers to an amino acid or amino acid mimetic incorporated in a polypeptide by an amide bond or amide bond mimetic. A polypeptide has an amino terminal (N-terminal) end and a carboxy terminal (C-terminal) end. “Polypeptide” is used interchangeably herein with peptide or protein to refer to a polymer of amino acid residues.
Protein core: The protein core refers to the interior of a folded protein, which is substantially free of solvent exposure, such as solvent in the form of water molecules in solution. Typically, the protein core is predominately composed of hydrophobic or apolar amino acids. In some examples, a protein core may contain charged amino acids, for example aspartic acid, glutamic acid, arginine, and/or lysine. The inclusion of uncompensated charged amino acids (a compensated charged amino acid can be in the form of a salt bridge) in the protein core can lead to a destabilized protein, that is, a protein with a lower Tm than a similar protein without an uncompensated charged amino acid in the protein core. In other examples, a protein core may have a cavity within the protein core. Cavities are essentially voids within a folded protein where amino acids or amino acid side chains are not present. Such cavities can also destabilize a protein relative to a similar protein without a cavity. Thus, when creating a stabilized form of a protein, for example a stabilized form of gp120, it may be advantageous to substitute amino acid residues within the core in order to fill cavities present in the wild-type protein.
Purified: The term purified does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified protein is one in which the protein is more enriched than the protein is in its natural environment within a cell. Preferably, a preparation is purified such that the protein represents at least 50% of the protein content of the preparation.
The gp120 polypeptides disclosed herein, or antibodies that specifically bind gp120, can be purified by any of the means known in the art. See for example Guide to Protein Purification, ed. Deutscher, Meth. Enzymol. 185, Academic Press, San Diego, 1990; and Scopes, Protein Purification: Principles and Practice, Springer Verlag, New York, 1982. Substantial purification denotes purification from other proteins or cellular components. A substantially purified protein is at least 60%, 70%, 80%, 90%, 95% or 98% pure. Thus, in one specific, non-limiting example, a substantially purified protein is 90% free of other proteins or cellular components.
Space Group: The arrangement of symmetry elements of a crystal.
Structure coordinates: Mathematical coordinates derived from mathematical equations related to the patterns obtained on diffraction of a monochromatic beam of X-rays by the atoms (scattering centers) of a molecule, such as a gp120, a gp120:CD4 complex, a gp120:antibody complex, or combinations thereof, in crystal form. The diffraction data are used to calculate an electron density map of the repeating unit of the crystal. The electron density maps are used to establish the positions of the individual atoms within the unit cell of the crystal. In one example, the term “structure coordinates” refers to Cartesian coordinates derived from mathematical equations related to the patterns obtained on diffraction of a monochromatic beam of X-rays, such as by the atoms of a stabilized form of gp120 in crystal form.
Atomic coordinate data, such as that in Table 1 and Table 2, lists each atom by a unique number (column 2); the atom name in the context of the residue to which it belongs (column 3), for example CA refers to the alpha carbon of the peptide backbone (detailed descriptions of the atom identifiers for each residue can be found for example in Creighton, Proteins, Structures and Molecular Properties, W.H. Freeman & Co., New York, 1993); the amino acid residue in which the atom is located (column 4); the chain identifier (column 4′), which may or may not be included; the number of the residue (column 5); the coordinates (for example, X, Y, Z) which define with respect to the crystallographic axes the atomic position (in Å) of the respective atom (columns 6, 7, and 8); the occupancy of the atom in the respective position (column 9); and the “B-factor”, which is the isotropic displacement parameter (in Å²) and accounts for movement of the atom around its atomic center (column 10).
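By way of non-limiting illustration only, the column layout described above can be read by a short program. The following is a minimal sketch, not the format used to produce Table 1 or Table 2. It assumes whitespace-delimited fields, assumes that column 1 is a record label (such as ATOM), and treats the chain identifier as optional. The names AtomRecord and parse_atom_line and the example line are hypothetical.

```python
from typing import NamedTuple, Optional


class AtomRecord(NamedTuple):
    serial: int            # column 2: unique atom number
    name: str              # column 3: atom name, e.g. "CA" for the alpha carbon
    residue: str           # column 4: amino acid residue
    chain: Optional[str]   # column 4': chain identifier, if present
    residue_number: int    # column 5: residue number
    x: float               # columns 6-8: coordinates in Angstroms
    y: float
    z: float
    occupancy: float       # column 9
    b_factor: float        # column 10: isotropic displacement parameter


def parse_atom_line(line: str) -> AtomRecord:
    """Parse one whitespace-delimited coordinate line (assumed layout)."""
    fields = line.split()
    # fields[0] is assumed to be a record label and is not stored.
    if len(fields) == 11:      # chain identifier present
        chain, rest = fields[4], fields[5:]
    elif len(fields) == 10:    # chain identifier omitted
        chain, rest = None, fields[4:]
    else:
        raise ValueError(f"unexpected number of fields: {len(fields)}")
    return AtomRecord(
        serial=int(fields[1]),
        name=fields[2],
        residue=fields[3],
        chain=chain,
        residue_number=int(rest[0]),
        x=float(rest[1]),
        y=float(rest[2]),
        z=float(rest[3]),
        occupancy=float(rest[4]),
        b_factor=float(rest[5]),
    )


# Illustrative line only; the values are made up and not taken from any table.
record = parse_atom_line("ATOM 1 CA GLY A 90 12.345 -6.789 10.111 1.00 35.20")
print(record.name, record.residue_number, record.b_factor)
```

Fixed-width coordinate files, in which adjacent fields may touch, would require column-based slicing rather than whitespace splitting.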
Those of ordinary skill in the art understand that a set of structure coordinates determined by X-ray crystallography is not without standard error. For the purpose of this disclosure, any set of structure coordinates for a stabilized form of gp120 or a gp120 with an extended V3 loop that has a root mean square deviation of protein backbone atoms (N, Cα, C and O) of less than about 1.0 Angstroms when superimposed, such as about 0.75, or about 0.5, or about 0.25 Angstroms, using backbone atoms, on the structure coordinates listed in Table 1 or Table 2 shall (in the absence of an explicit statement to the contrary) be considered identical.
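The backbone comparison described above can be sketched as follows. This is an illustrative example only: it assumes the two coordinate sets have already been superimposed (no fitting step is performed here), and the function names and the dictionary layout keyed by residue number and atom name are hypothetical.

```python
import math

BACKBONE_ATOMS = {"N", "CA", "C", "O"}


def backbone_rmsd(coords_a, coords_b):
    """Root mean square deviation over shared backbone atoms.

    Each argument maps (residue_number, atom_name) -> (x, y, z) in Angstroms.
    The structures are assumed to be superimposed already.
    """
    keys = [k for k in coords_a if k[1] in BACKBONE_ATOMS and k in coords_b]
    if not keys:
        raise ValueError("no shared backbone atoms to compare")
    total = 0.0
    for key in keys:
        total += sum((p - q) ** 2 for p, q in zip(coords_a[key], coords_b[key]))
    return math.sqrt(total / len(keys))


def considered_identical(coords_a, coords_b, cutoff=1.0):
    """Apply the 'less than about 1.0 Angstroms' criterion (cutoff adjustable)."""
    return backbone_rmsd(coords_a, coords_b) < cutoff


# Made-up coordinates: every atom of the second set is shifted 0.5 A along x.
reference = {(90, "N"): (0.0, 0.0, 0.0), (90, "CA"): (1.5, 0.0, 0.0)}
candidate = {(90, "N"): (0.5, 0.0, 0.0), (90, "CA"): (2.0, 0.0, 0.0)}
print(backbone_rmsd(reference, candidate))
```

A stricter cutoff, such as 0.75 or 0.5, can be passed in place of the default to apply the narrower thresholds recited above.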
Subject: Living multi-cellular vertebrate organisms, a category that includes both human and veterinary subjects, including human and non-human mammals.
T Cell: A white blood cell critical to the immune response. T cells include, but are not limited to, CD4+ T cells and CD8+ T cells. A CD4+ T lymphocyte is an immune cell that carries a marker on its surface known as “cluster of differentiation 4” (CD4). These cells, also known as helper T cells, help orchestrate the immune response, including antibody responses as well as killer T cell responses. CD8+ T cells carry the “cluster of differentiation 8” (CD8) marker. In one embodiment, a CD8 T cell is a cytotoxic T lymphocyte. In another embodiment, a CD8 cell is a suppressor T cell.
Therapeutic agent: Used in a generic sense, it includes treating agents, prophylactic agents, and replacement agents.
Tm: The temperature at which a change of state occurs. For example, the temperature at which gp120 undergoes a transition from the folded form to the unfolded form. Essentially this is the temperature at which the structure melts away. Stabilized gp120 has a higher Tm than native gp120. Another example would be the temperature at which a DNA duplex melts.
Transformed: A transformed cell is a cell into which has been introduced a nucleic acid molecule by molecular biology techniques. As used herein, the term transformation encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including transfection with viral vectors, transformation with plasmid vectors, and introduction of DNA by electroporation, lipofection, and particle gun acceleration.
Unit Cell: The smallest building block of a crystal. The entire volume of a crystal may be constructed by regular assembly of such blocks. Each unit cell comprises a complete representation of the unit of pattern, the repetition of which produces a crystal lattice.
Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a transformed host cell. Recombinant DNA vectors are vectors having recombinant DNA. A vector can include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. A vector can also include one or more selectable marker genes and other genetic elements known in the art. Viral vectors are recombinant DNA vectors having at least some nucleic acid sequences derived from one or more viruses.
Virus: Microscopic infectious organism that reproduces inside living cells. A virus consists essentially of a core of a single nucleic acid surrounded by a protein coat, and has the ability to replicate only inside a living cell. “Viral replication” is the production of additional virus by the occurrence of at least one viral life cycle. A virus may subvert the host cells' normal functions, causing the cell to behave in a manner determined by the virus. For example, a viral infection may result in a cell producing a cytokine, or responding to a cytokine, when the uninfected cell does not normally do so.
“Retroviruses” are RNA viruses wherein the viral genome is RNA. When a host cell is infected with a retrovirus, the genomic RNA is reverse transcribed into a DNA intermediate which is integrated very efficiently into the chromosomal DNA of infected cells. The integrated DNA intermediate is referred to as a provirus. The term “lentivirus” is used in its conventional sense to describe a genus of viruses containing reverse transcriptase. The lentiviruses include the “immunodeficiency viruses” which include human immunodeficiency virus (HIV) type 1 and type 2 (HIV-1 and HIV-2), simian immunodeficiency virus (SIV), and feline immunodeficiency virus (FIV).
HIV-1 is a retrovirus that causes immunosuppression in humans (HIV disease), and leads to a disease complex known as the acquired immunodeficiency syndrome (AIDS). “HIV disease” refers to a well-recognized constellation of signs and symptoms (including the development of opportunistic infections) in persons who are infected by an HIV virus, as determined by antibody or western blot studies. Laboratory findings associated with this disease are a progressive decline in T cells.
X5: An antibody that binds a conformation of gp120 induced by the binding of CD4. Antibodies that bind to gp120 in a conformation induced by CD4 binding are termed CD4i antibodies.
ΔS: The change in entropy, such as the change in entropy upon the association of gp120 and CD4 or an antibody or antibody fragment, for example X5.
ΔH: The change in enthalpy, such as the change in enthalpy upon the association of gp120 and CD4 or an antibody.
Provided herein in various embodiments are gp120 polypeptides, which are useful to induce an immunogenic response in vertebrate animals (such as mammals, for example primates, such as humans) to lentivirus, such as SIV or HIV (for example HIV-1 and HIV-2).
In several embodiments, the gp120 polypeptides are stabilized in a CD4 bound conformation. In several disclosed examples, the gp120 polypeptides are stabilized by modification. In certain examples, these modifications can be the introduction of a plurality of non-naturally occurring cross-linking cysteine residues. In certain examples, the modification can be the introduction of at least one amino acid substitution in the protein core of gp120.
In several disclosed examples, cysteines are introduced into the gp120 polypeptide at position 96, 109, 123, 231, 267, 275, 428, 431 or in combinations thereof. In some examples of gp120 polypeptides disclosed herein, the plurality of non-naturally occurring cross-linking cysteine residues are defined by the interaction and crosslinking of at least one of residue pairs 96 and 275; 109 and 428; 123 and 431; and 231 and 267. In some embodiments, all of the residue pairs 96 and 275; 109 and 428; 123 and 431; and 231 and 267 are crosslinked.
In some embodiments, the stabilized gp120 polypeptide contains one or more amino acid substitutions in the protein core. In several examples, the substitution is made at position 95, 257, 375, 433, or a combination thereof. In specific examples, the substitution is a serine to tryptophan substitution at position 95, a threonine to serine substitution at position 257, a serine to tryptophan substitution at position 375, an alanine to methionine substitution at position 433, or a combination thereof.
In specific examples, the stabilized gp120 polypeptide includes the amino acid sequence set forth as SEQ ID NO: 1 or is encoded by one of SEQ ID NO: 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, or degenerate variants thereof. In still other embodiments, the stabilized gp120 contains a portion of the amino acid sequence set forth as SEQ ID NO: 1 or as encoded by any one of SEQ ID NOs: 4-18, for example, a domain such as the outer domain, or a contiguous stretch of about 5 or more amino acids, such as about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 25, or more amino acids.
In other examples, the gp120 polypeptide has the V3 loop in an extended conformation. In one example, the gp120 polypeptide with the V3 loop in an extended conformation contains the amino acid sequence set forth as SEQ ID NO: 2. In other embodiments, the gp120 polypeptide with an extended V3 loop contains a portion of the amino acid sequence set forth as SEQ ID NO: 2, for example, a domain such as the outer domain, or a contiguous stretch of about 5 or more amino acids, such as about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 25, or more amino acids, wherein the domain or contiguous stretch of amino acids includes a portion of the V3 loop.
Other embodiments are compositions containing a therapeutically effective amount of at least one gp120 polypeptide, such as a stabilized gp120 polypeptide (such as set forth as SEQ ID NO: 1 or as encoded by the nucleotide sequence set forth as one of SEQ ID NO: 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, and 18, or a degenerate variant thereof) or a gp120 polypeptide with the V3 loop in an extended conformation, such as the amino acid sequence set forth as SEQ ID NO: 2. In some embodiments, the composition can contain pharmaceutically acceptable carriers, adjuvants, or combinations thereof.
This disclosure further provides methods for eliciting and/or enhancing an immune response in a subject (such as a primate subject, for example a human subject). In some embodiments, these methods involve administering to the subject a composition including a gp120 polypeptide as disclosed herein, for example a stabilized gp120 such as set forth as SEQ ID NO: 1 or as encoded by the nucleotide sequence set forth as one of SEQ ID NO: 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, and 18, or a degenerate variant thereof. In some embodiments, these methods involve administering to the subject a composition including a gp120 polypeptide with an extended V3 loop such as set forth as SEQ ID NO: 2. In one specific, non-limiting example, the subject is infected with a lentivirus, for example SIV or HIV, such as HIV-1 or HIV-2. In some embodiments, the immune response is a B cell response, a T cell response, or a combination thereof.
In other embodiments, the subject is further administered a therapeutically effective amount of a monomeric or trimeric gp140 polypeptide, an unmodified monomeric or trimeric gp120 polypeptide, or a combination thereof.
Other embodiments of this disclosure are isolated polynucleotides (nucleic acid molecules) which encode the gp120 polypeptides described herein. Specific examples of such nucleic acid molecules contain nucleic acids encoding the amino acid sequence set forth as one of SEQ ID NO: 1 or 2, the nucleotide sequences set forth as one of SEQ ID NOs: 4-18, or degenerate variants thereof. In other embodiments, the isolated polynucleotides consist of nucleic acid molecules encoding the amino acid sequence set forth as one of SEQ ID NO: 1 or 2, the nucleotide sequences set forth as one of SEQ ID NOs: 4-18, or degenerate variants thereof. In certain embodiments, the nucleic acid encoding a gp120 polypeptide is operably linked to a promoter. Vectors comprising such polynucleotides are also disclosed, as are host cells transformed with such vectors.
Other embodiments are compositions containing a therapeutically effective amount of a polynucleotide containing a nucleic acid encoding a gp120 polypeptide disclosed herein. In certain embodiments, the nucleic acid encodes the amino acid sequence set forth as SEQ ID NO: 1 or 2. In other embodiments the nucleic acid contains one of the nucleotide sequences set forth as SEQ ID NO: 4-18 or a degenerate variant thereof. In some embodiments, the composition can contain pharmaceutically acceptable carriers, adjuvants, or combinations thereof.
This disclosure further provides methods for eliciting and/or enhancing an immune response in a subject (such as a primate subject, for example a human subject). The methods involve administering to the subject a composition containing a nucleic acid encoding a gp120 polypeptide of this disclosure. In one specific, non-limiting example, the subject is infected with a lentivirus, for example SIV or HIV, such as HIV-1 or HIV-2. In some embodiments, the immune response is a B cell response, a T cell response, or a combination thereof.
In other embodiments, the subject is further administered a therapeutically effective amount of a plasmid vector expressing a polypeptide containing a monomeric or trimeric gp140 polypeptide, an unmodified monomeric or trimeric gp120 polypeptide, or a combination thereof.
Also disclosed herein are methods for identifying an immunogen that induces an immune response to gp120, for example gp120 from a lentivirus, such as SIV or HIV, such as HIV-1 or HIV-2. Typically the immune response is a B cell response, a T cell response, or a combination thereof. These methods involve using a three-dimensional structure of gp120 as defined by atomic coordinates set forth in Table 1, Table 2, or a portion thereof to design or select the immunogen, synthesizing the immunogen, immunizing a subject with the immunogen, and determining if an immune response to gp120 is induced in the subject. In some embodiments, the immunogen is designed from the gp120 amino acid sequence. In certain embodiments, the immunogen is designed or selected using a three-dimensional structure of gp120 as defined by atomic coordinates set forth in Table 1, Table 2, or a portion thereof and an amino acid sequence is assembled to provide an immunogen, for example by synthesizing the amino acid sequence or producing a nucleic acid encoding the immunogen. In other embodiments the immunogen is selected from a database of compounds or is designed de novo.
Also provided by this disclosure is a machine readable data storage medium including a data storage material encoded with machine readable data corresponding to the coordinates of a stabilized form of gp120 as defined by Table 1 or a portion thereof or a form of gp120 having an extended conformation of the V3 loop as defined by Table 2 or a portion thereof.
Also provided are computer systems including data and a data processor, wherein the system forms a representation of the three-dimensional structure of the gp120 protein as defined by Table 1, Table 2, or a portion thereof, such as the atomic positions, surface, domain, or region of the gp120 polypeptide.
Also disclosed herein is the use of stabilized gp120 molecules as crystallization tools. A crystalline form of a stabilized gp120 also is disclosed, for example the crystalline form of gp120 as defined by the coordinates as given in Table 1, or with coordinates having a root mean square deviation therefrom, wherein the distance between the residues is less than about 0.75 Å. A crystalline form of a gp120 with an extended V3 loop also is disclosed, for example the crystalline form of gp120 as defined by the coordinates as given in Table 2, or with coordinates having a root mean square deviation therefrom, wherein the distance between the residues is less than about 0.75 Å.
The present disclosure relates to gp120 polypeptides and nucleic acids encoding these gp120 polypeptides. The gp120 polypeptides of this disclosure are capable of eliciting an immune response to a gp120 protein in a subject, such as a human subject. In some embodiments, the gp120 polypeptides of this disclosure are stabilized in a CD4 bound conformation.
Using a combination of atomic level structural information and biophysical techniques, novel gp120 polypeptides were designed that are stabilized in a conformation substantially identical to that of the CD4 bound polypeptide. For example, the three-dimensional structure of the wild-type polypeptide was analyzed to determine where cysteine residues could be introduced such that they would form disulfide bonds in the folded molecule. This methodology is not specific to cysteine residues; other natural or non-natural amino acids could be used. In some embodiments, the stabilized gp120 has a Kd for CD4 of less than or equal to about 10 nM, such as less than or equal to about 5 nM, less than or equal to about 3 nM, or less than or equal to about 1 nM. In some embodiments the stabilized gp120 has a −TΔS for CD4 binding of less than or equal to about 40 kcal/mol, such as less than or equal to about 30 kcal/mol, less than or equal to about 15 kcal/mol, or less than or equal to about 10 kcal/mol.
The stability of folded polypeptides can be measured using techniques such as thermal denaturation. The temperature of the unfolding transition (Tm) is an accepted measure of the stability of the folded polypeptide, where increases in Tm indicate an increase in the stability of the folded polypeptide. In some embodiments, the stabilized gp120 polypeptide has a Tm value greater than about 52° C., such as greater than about 53° C., greater than about 54° C. (such as 53.8° C.), greater than about 55° C., greater than about 56° C., greater than about 57° C., greater than about 58° C., or even greater than about 59° C.
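For orientation, the binding quantities recited above are related by standard thermodynamic identities: ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state, and ΔG = ΔH − TΔS. The following minimal sketch applies those identities. The temperature of 298.15 K (25° C.) and the function names are assumptions made for illustration, not conditions stated in this disclosure, and the numerical inputs are made up.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temp_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = RT ln(Kd) with a 1 M standard state; the default temperature
    is an assumption for illustration.
    """
    return R_KCAL * temp_k * math.log(kd_molar)


def minus_t_delta_s(delta_g, delta_h):
    """Entropic term -T*dS (kcal/mol), from dG = dH - T*dS."""
    return delta_g - delta_h


# Hypothetical inputs: Kd of 10 nM and a measured binding enthalpy of -40 kcal/mol.
dg = delta_g_from_kd(10e-9)
print(round(dg, 2), round(minus_t_delta_s(dg, -40.0), 2))
```

In practice ΔH would typically be measured directly, for example by calorimetry, and −TΔS obtained by difference as shown.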
In some embodiments, the stabilized gp120 polypeptides are stabilized by a plurality of non-naturally occurring cross-linking cysteine residues. By plurality it is meant that there are at least 2, such as at least 4, at least 6, or at least 8 cysteines introduced by mutation into a gp120 polypeptide, such that pairs of cysteines form at least 1, such as at least 2, at least 3, or at least 4 disulfide bonds. Each disulfide bond is formed by a pair of cysteines.
In some embodiments, the mutationally introduced cysteines are introduced into the gp120 polypeptide at positions 96, 109, 123, 231, 267, 275, 428, 431, or in a sub-combination thereof. In some examples of the stabilized gp120 polypeptides, the plurality of non-naturally occurring cross-linking cysteine residues are defined by the interaction of at least one of residue pairs 96 and 275; 109 and 428; 123 and 431; and 231 and 267. Thus, the stabilized gp120 polypeptides of this disclosure may have any combination of the crosslinked cysteines defined by the interaction of 96 and 275; 109 and 428; 123 and 431; and 231 and 267.
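The statement that any combination of the four crosslinked pairs may be present can be made concrete by enumeration. The sketch below lists every non-empty combination of the residue pairs recited above and the cysteine positions each combination implies. The function names are hypothetical, and the code is illustrative only.

```python
from itertools import combinations

# Residue pairs recited above; each pair forms one disulfide bond.
DISULFIDE_PAIRS = [(96, 275), (109, 428), (123, 431), (231, 267)]


def disulfide_combinations(pairs=DISULFIDE_PAIRS):
    """Return every non-empty combination of disulfide pairs."""
    combos = []
    for size in range(1, len(pairs) + 1):
        combos.extend(combinations(pairs, size))
    return combos


def cysteine_positions(combo):
    """Sorted positions at which cysteines are introduced for a combination."""
    return sorted(position for pair in combo for position in pair)


for combo in disulfide_combinations():
    print(len(combo), "disulfide(s):", cysteine_positions(combo))
```

Four pairs give 2^4 − 1 = 15 non-empty combinations, ranging from a single disulfide (two introduced cysteines) to all four (eight introduced cysteines).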
In some embodiments, the stabilized gp120 polypeptide contains one or more amino acid substitutions in the protein core. In several disclosed examples, the substitution is made at position 95, 257, 375, 433, or a combination thereof. Thus, a stabilized gp120 polypeptide may have one, two, three, or four substitutions in the protein core. In specific examples, the substitution is a serine to tryptophan substitution at position 95, a threonine to serine substitution at position 257, a serine to tryptophan substitution at position 375, an alanine to methionine substitution at position 433, or various combinations thereof.
In one embodiment, the stabilized gp120 polypeptide (new—9c) includes the amino acid sequence set forth as:
In other embodiments, the stabilized gp120 includes the amino acid sequence encoded by one of SEQ ID NO: 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, and 18, or degenerate variants thereof. In still other embodiments, the stabilized gp120 polypeptide consists of the amino acid sequence set forth as SEQ ID NO: 1 or as encoded by the nucleotide sequence set forth as one of SEQ ID NO: 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, and 18, or a degenerate variant thereof. In some embodiments, a stabilized gp120 polypeptide is an immunogenic fragment of SEQ ID NO: 1 or of the amino acid sequence encoded by the nucleotide sequence set forth as one of SEQ ID NO: 4, 5, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, and 18, or a degenerate variant thereof, such that the immunogenic fragment is stabilized in a CD4 binding conformation. In some embodiments, the stabilized gp120 includes the outer domain. In one example, the outer domain includes residues 255-421 and 436-474 of gp120. Thus, the outer domain can contain residues 109-246 and 261-299 of SEQ ID NO: 1, the amino acid sequence encoded by SEQ ID NO: 4-18 or a degenerate variant thereof. In some examples residues 246 and 261 are covalently linked, for example by a peptide linker. In some examples, the peptide linker is residues 247-260 of SEQ ID NO: 1, the amino acid sequence encoded by SEQ ID NO: 4-18 or a degenerate variant thereof. Ideally the linker should be of sufficient length such that the folded protein is in a conformation that can be bound by CD4. In some embodiments, the linker is a peptide linker and the peptide linker is about 2 to about 20 amino acids in length, such as about 2, about 3, about 4, about 5, about 6, about 7, about 8, about 10, about 12, about 15, or about 20 amino acids in length. In some embodiments, the immunogenic fragment of gp120 consists of residues 109-246 and 261-299, and a linker. In some embodiments the linker does not contain a sequence from gp120.
In other embodiments, the stabilized gp120 fragment is truncated on the carboxy terminal end. For example, the carboxy terminal end can be truncated to about amino acid residue 433. In addition, portions of the amino terminus of gp120 can also be eliminated from the stabilized gp120 fragment. The truncated gp120 sequence can be free from the amino terminus through amino acid residue 95. In one embodiment, the truncated gp120 sequence is free from the amino terminus of gp120 through residue 95 and residue 433 through the carboxy terminus of gp120. Thus, in some embodiments the stabilized gp120 contains a portion of the amino acid sequence set forth as SEQ ID NO: 1 or as encoded by any one of SEQ ID NOs: 4-18.
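The assembly of an outer domain fragment from residues 109-246 and 261-299 joined by a linker can be sketched as below. This is an illustrative example only: residue numbers are treated as 1-based, inclusive positions in the supplied sequence string, the function names are hypothetical, and the sequence used is a placeholder rather than SEQ ID NO: 1.

```python
def segment(sequence, start, end):
    """Return residues start..end (1-based, inclusive) of a sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


def outer_domain_fragment(sequence, linker=None):
    """Join residues 109-246 and 261-299 with a linker.

    If no linker is supplied, residues 247-260 of the same sequence are used,
    which is one of the linker options described above.
    """
    first = segment(sequence, 109, 246)
    second = segment(sequence, 261, 299)
    if linker is None:
        linker = segment(sequence, 247, 260)
    return first + linker + second


# Placeholder 299-residue string used only to exercise the indexing.
placeholder = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(299))
print(len(outer_domain_fragment(placeholder)))               # native linker
print(len(outer_domain_fragment(placeholder, linker="GG")))  # hypothetical 2-residue linker
```

With the native linker the fragment is 138 + 14 + 39 = 191 residues long. A heterologous linker of about 2 to about 20 residues changes only the middle term.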
In other embodiments, the gp120 polypeptide has the V3 loop in an extended conformation. An exemplary sequence of a gp120 with an extended loop is set forth as:
Thus, a gp120 polypeptide with an extended V3 loop can contain the amino acid sequence set forth as SEQ ID NO: 2 or a fragment thereof. In one example, the gp120 polypeptide with the V3 loop in an extended conformation consists of the amino acid sequence set forth as SEQ ID NO: 2 or a fragment thereof. In still other embodiments, the gp120 polypeptide with an extended V3 loop contains a portion of the amino acid sequence set forth as SEQ ID NO: 2. In some embodiments, the stabilized gp120 includes the outer domain. In one example, the outer domain includes residues 255-421 and 436-474 of gp120. Thus, the outer domain can include residues 109-246 and 261-299 of SEQ ID NO: 2.
In other embodiments, the gp120 polypeptide having the V3 loop in an extended conformation is truncated on the carboxy terminal end. For example, the carboxy terminal end can be truncated to about amino acid residue 433. In addition, portions of the amino terminus of gp120 can also be eliminated from the fragment of the gp120 polypeptide having the V3 loop in an extended conformation. The truncated gp120 sequence can be free from the amino terminus through amino acid residue 95. In one embodiment, the truncated gp120 sequence is free from the amino terminus of gp120 through residue 95 and residue 433 through the carboxy terminus of gp120. Thus, in some embodiments the gp120 polypeptide having the V3 loop in an extended conformation contains a portion of the amino acid sequence set forth as SEQ ID NO: 2.
In other embodiments, the gp120 polypeptide has an amino acid sequence at least 90% identical to SEQ ID NO: 1, SEQ ID NO: 2, or the amino acid sequence encoded by any one of SEQ ID NO: 4-18, for example a polypeptide that has about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or even higher sequence identity to SEQ ID NO: 1, SEQ ID NO: 2, or the amino acid sequence encoded by any one of SEQ ID NO: 4-18.
The immunogenic gp120 polypeptides or immunogenic fragments of the gp120 polypeptides disclosed herein can be chemically synthesized by standard methods, or can be produced recombinantly. An exemplary process for polypeptide production is described in Lu et al., Federation of European Biochemical Societies Letters. 429:31-35, 1998. They can also be isolated by methods including preparative chromatography and immunological separations.
In other embodiments, fusion proteins are provided including a first and second polypeptide moiety in which one of the protein moieties includes an amino acid sequence as set forth in SEQ ID NO: 1 or 2, or a fragment thereof. In other embodiments, fusion proteins are provided comprising a first and second polypeptide moiety in which one of the protein moieties includes an amino acid sequence encoded by one of the nucleotide sequences as set forth as SEQ ID NO: 4-18, or a fragment thereof. The other moiety is a heterologous protein, such as a carrier protein and/or an immunogenic protein. Such fusions also are useful to evoke an immune response against gp120. In certain embodiments the gp120 polypeptides disclosed herein are modified by the covalent or non-covalent addition of TLR ligands or dendritic cell or B cell targeting moieties.
A gp120 polypeptide can be covalently linked to a carrier, which is an immunogenic macromolecule to which an antigenic molecule can be bound. When bound to a carrier, the bound polypeptide becomes more immunogenic. Carriers are chosen to increase the immunogenicity of the bound molecule and/or to elicit higher titers of antibodies against the carrier which are diagnostically, analytically, and/or therapeutically beneficial. Covalent linking of a molecule to a carrier can confer enhanced immunogenicity and T cell dependence (see Pozsgay et al., PNAS 96:5194-97, 1999; Lee et al., J. Immunol. 116:1711-18, 1976; Dintzis et al., PNAS 73:3671-75, 1976). Useful carriers include polymeric carriers, which can be natural (for example, polysaccharides, polypeptides or proteins from bacteria or viruses), semi-synthetic or synthetic materials containing one or more functional groups to which a reactant moiety can be attached. Bacterial products and viral proteins (such as hepatitis B surface antigen and core antigen) can also be used as carriers, as well as proteins from higher organisms such as keyhole limpet hemocyanin, horseshoe crab hemocyanin, edestin, mammalian serum albumins, and mammalian immunoglobulins. Additional bacterial products for use as carriers include bacterial wall proteins and other products (for example, streptococcal or staphylococcal cell walls and lipopolysaccharide (LPS)).
Most antigenic epitopes of HIV proteins are relatively small in size, such as about 5 to 100 amino acids in size, for example about 5, about 6, about 7, about 8, about 9, about 10, about 15, about 20, about 25, about 30, about 40, about 50, about 60, about 70, about 80, about 90, or about 100. Thus, fragments (for example, epitopes or other antigenic fragments) of a gp120 polypeptide, such as any of the gp120 polypeptides described herein or a fragment thereof, can be used as an immunogen.
In some embodiments, the disclosed gp120 polypeptides are modified by glycosylation, for example by N-linked glycans. Thus, the immune response can be focused on a region of interest of a gp120 polypeptide by masking other regions with non-immunogenic glycans. Glycosylation sites can be introduced into the gp120 polypeptides by site directed mutagenesis. This strategy can be utilized to focus the immune response to regions of interest in the gp120 polypeptide, for example the CD4 binding site or the binding site for a neutralizing antibody, for example the b12 antibody. Examples of glycan masking can be found in Pantophlet and Burton, Trends Mol Med. 9(11):468-73, 2003, which is incorporated by reference herein in its entirety.
Another strategy to focus the immune response on the CD4 binding region or b12 epitope region is to use SIV and HIV gp120 core glycoproteins (such as the stabilized gp120 polypeptides disclosed herein) that possess an endogenous CD4 binding site or to scaffold the heterologous HIV-1 CD4 binding region onto cores derived from selected SIV or HIV-2 strains. The gp120 core can be derived from the envelope glycoproteins of lentivirus, for example SIV such as SIV mac239 and HIV, such as HIV-2 7132A. The residues required for CD4BS antibody recognition, for example the site of b12 binding, are transplanted by site-directed mutagenesis of the appropriate codon-optimized plasmid sequences. In some embodiments, extra N-glycans are added to these cores to eliminate the elicitation of non-cross reactive antibodies directed against regions outside the antibody binding site, for example the binding site of a neutralizing antibody such as CD4BS antibody.
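When planning where N-linked glycosylation sites exist or could be introduced, it can be useful to scan a sequence for the canonical N-linked sequon, Asn-X-Ser/Thr, where X is any residue other than proline. The sketch below performs such a scan. It is illustrative only: the function name and example peptide are hypothetical, and the positions reported are 1-based offsets within the supplied string rather than any particular gp120 numbering scheme.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return 1-based positions of the Asn in each N-X-S/T sequon (X != Pro)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Made-up peptide: NGT and NKS are sequons; NPS is not, because X is proline.
print(find_sequons("AANGTAANPSAANKS"))
```

A sequon indicates only the potential for N-linked glycosylation; whether a given site is actually glycosylated depends on the expression system and structural context.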
The present disclosure concerns nucleic acid constructs including polynucleotide sequences that encode antigenic gp120 polypeptides of HIV-1. These polynucleotides include DNA, cDNA and RNA sequences which encode the polypeptide of interest.
Methods for the manipulation and insertion of the nucleic acids of this disclosure into vectors are well known in the art (see for example, Sambrook et al., Molecular Cloning, a Laboratory Manual, 2d edition, Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1989, and Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing Associates and John Wiley & Sons, New York, N.Y., 1994).
Typically, the nucleic acid constructs encoding the gp120 polypeptides of this disclosure are plasmids. However, other vectors (for example, viral vectors, phage, cosmids, etc.) can be utilized to replicate the nucleic acids. In the context of this disclosure, the nucleic acid constructs typically are expression vectors that contain a promoter sequence which facilitates the efficient transcription of the inserted genetic sequence of the host. The expression vector typically contains an origin of replication, a promoter, as well as specific nucleic acid sequences that allow phenotypic selection of the transformed cells.
More generally, polynucleotide sequences encoding the gp120 polypeptides of this disclosure can be operably linked to any promoter and/or enhancer that is capable of driving expression of the nucleic acid following introduction into a host cell. A promoter is an array of nucleic acid control sequences that directs transcription of a nucleic acid. A promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. A promoter also can include distal enhancer or repressor elements which can be located as much as several thousand base pairs from the start site of transcription. Both constitutive and inducible promoters are included (see, for example, Bitter et al., Methods in Enzymology 153:516-544, 1987).
To produce such nucleic acid constructs, polynucleotide sequences encoding gp120 polypeptides are inserted into a suitable expression vector, such as a plasmid expression vector. Procedures for producing polynucleotide sequences encoding gp120 polypeptides and for manipulating them in vitro are well known to those of skill in the art, and can be found, for example in Sambrook and Ausubel, supra.
In addition to the polynucleotide sequences encoding the polypeptides set forth as SEQ ID NOs:1-2 disclosed herein and nucleic acids encoding gp120 polypeptides as set forth as SEQ ID NOs:4-18 as disclosed herein, the nucleic acid constructs can include variant polynucleotide sequences that encode polypeptides that are substantially similar to SEQ ID NOs: 1-2 and nucleic acids encoding gp120 polypeptides as set forth as SEQ ID NOs: 4-18. Similarly, the nucleic acid constructs can include polynucleotides that encode chimeric polypeptides, for example fusion proteins. For enhanced immunogenicity, it may be advantageous to include the sequence encoding for heterologous T helper sequences derived from HIV or other heterologous sources.
The similarity between amino acid (and polynucleotide) sequences is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity); the higher the percentage, the more similar are the primary structures of the two sequences. In general, the more similar the primary structures of two amino acid sequences, the more similar are the higher order structures resulting from folding and assembly. Thus, the nucleic acid constructs can include polynucleotides that encode polypeptides that are at least about 90%, or 95%, 98%, or 99% identical to one of SEQ ID NOs: 1-2 with respect to amino acid sequence, or that have at least about 90%, 95%, 98%, or 99% sequence identity to one or more of SEQ ID NOs: 4-18 and/or that differ from one of these sequences by the substitution of degenerate codons.
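The identity thresholds and the degenerate-codon relationship described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted over all alignment columns; other conventions for the denominator exist. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NOs: 1-2 or 4-18. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: '-' marks a gap, gaps never count as matches, and the
    denominator is the full alignment length.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def translate(dna):
    """Translate a coding DNA sequence (length divisible by 3) to protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(dna_a, dna_b):
    """True if the sequences differ but encode the same polypeptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


def meets_threshold(aligned_a, aligned_b, threshold=90.0):
    """True if identity is at least the given percentage (e.g. 90, 95, 98, 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In this sketch, two ten-residue toy peptides differing at one position score 90% identity, and the toy coding sequences ATGGCC and ATGGCT count as degenerate variants because GCC and GCT both encode alanine.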
DNA sequences encoding an immunogenic gp120 polypeptide can be expressed in vitro by DNA transfer into a suitable host cell. The cell may be prokaryotic or eukaryotic. The term "host cell" also includes any progeny of the subject host cell. It is understood that not all progeny are necessarily identical to the parental cell, since mutations may occur during replication. Methods of stable transfer, meaning that the foreign DNA is continuously maintained in the host, are known in the art.
The polynucleotide sequences encoding an immunogenic gp120 polypeptide can be inserted into an expression vector including, but not limited to, a plasmid, virus or other vehicle that can be manipulated to allow insertion or incorporation of sequences and can be expressed in either prokaryotes or eukaryotes. Hosts can include microbial, yeast, insect, and mammalian organisms. Methods of expressing DNA sequences having eukaryotic or viral sequences in prokaryotes are well known in the art. Biologically functional viral and plasmid DNA vectors capable of expression and replication in a host are known in the art.
Transformation of a host cell with recombinant DNA can be carried out by conventional techniques that are well known to those of ordinary skill in the art. Where the host is prokaryotic, such as E. coli, competent cells which are capable of DNA uptake can be prepared from cells harvested after exponential growth phase and subsequently treated by the CaCl2 method using procedures well known in the art. Alternatively, MgCl2 or RbCl can be used. Transformation can also be performed after forming a protoplast of the host cell if desired, or by electroporation.
When the host is a eukaryote, such methods of transfection of DNA as calcium phosphate coprecipitates, conventional mechanical procedures such as microinjection, electroporation, insertion of a plasmid encased in liposomes, or virus vectors can be used. Eukaryotic cells can also be co-transformed with polynucleotide sequences encoding an immunogenic gp120 polypeptide, and a second foreign DNA molecule encoding a selectable phenotype, such as the herpes simplex thymidine kinase gene. Another method is to use a eukaryotic viral vector, such as simian virus 40 (SV40) or bovine papilloma virus, to transiently infect or transform eukaryotic cells and express the protein (see for example, Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982).
Any of the gp120 polypeptides and nucleic acid molecules encoding the gp120 polypeptides disclosed herein can be used as immunogens, or to produce immunogens to elicit an immune response (immunogenic compositions) to gp120 such as to a gp120 expressing virus, for example to reduce HIV-1 infection or a symptom of HIV-1 infection. Following administration of a therapeutically effective amount of the disclosed therapeutic compositions, the subject can be monitored for HIV-1 infection, symptoms associated with HIV-1 infection, or both. Disclosed herein are methods of administering the therapeutic molecules disclosed herein (such as gp120 polypeptides and nucleic acids encoding gp120 polypeptides) to reduce HIV-1 infection. In several examples, a therapeutically effective amount of a gp120 polypeptide including SEQ ID NO: 1, a therapeutically effective amount of a gp120 polypeptide including SEQ ID NO: 2, a therapeutically effective amount of a gp120 polypeptide encoded by one of SEQ ID NOs: 4-18 or a degenerate variant thereof, or a combination thereof is administered to a subject.
In certain embodiments, the immunogenic composition includes an adjuvant. An adjuvant can be a suspension of minerals, such as alum, aluminum hydroxide, aluminum phosphate, on which antigen is adsorbed; or water-in-oil emulsion in which antigen solution is emulsified in oil (MF-59, Freund's incomplete adjuvant), sometimes with the inclusion of killed mycobacteria (Freund's complete adjuvant) to further enhance antigenicity (inhibits degradation of antigen and/or causes influx of macrophages). In one embodiment, the adjuvant is a mixture of stabilizing detergents, micelle-forming agent, and oil available under the name PROVAX® (IDEC Pharmaceuticals, San Diego, Calif.). An adjuvant can also be an immunostimulatory nucleic acid, such as a nucleic acid including a CpG motif.
In one example, the immunogenic composition is mixed with an adjuvant containing two or more of a stabilizing detergent, a micelle-forming agent, and an oil. Suitable stabilizing detergents, micelle-forming agents, and oils are detailed in U.S. Pat. No. 5,585,103; U.S. Pat. No. 5,709,860; U.S. Pat. No. 5,270,202; and U.S. Pat. No. 5,695,770, all of which are incorporated by reference herein in their entirety. A stabilizing detergent is any detergent that allows the components of the emulsion to remain as a stable emulsion. Such detergents include polysorbate 80 (TWEEN) (Sorbitan-mono-9-octadecenoate-poly(oxy-1,2-ethanediyl); manufactured by ICI Americas, Wilmington, Del.), TWEEN 40™, TWEEN 20™, TWEEN 60™, ZWITTERGENT™ 3-12, TEEPOL HB7™, and SPAN 85™. These detergents are usually provided in an amount of approximately 0.05 to 0.5%, such as at about 0.2%. A micelle-forming agent is an agent which is able to stabilize the emulsion formed with the other components such that a micelle-like structure is formed. Such agents generally cause some irritation at the site of injection in order to recruit macrophages to enhance the cellular response. Examples of such agents include polymer surfactants described by BASF Wyandotte publications (for example, Schmolka, J. Am. Oil. Chem. Soc. 54:110, 1977, and Hunter et al., J. Immunol. 129:1244, 1981), PLURONIC™ L62LF, L101, and L64, PEG1000, and TETRONIC™ 1501, 150R1, 701, 901, 1301, and 130R1. The chemical structures of such agents are well known in the art. In one embodiment, the agent is chosen to have a hydrophile-lipophile balance (HLB) of between 0 and 2, as defined by Hunter and Bennett, J. Immun. 133:3167, 1984. The agent can be provided in an effective amount, for example between 0.5 and 10%, or in an amount between 1.25 and 5%.
The oil included in the composition is chosen to promote the retention of the antigen in oil-in-water emulsion, to provide a vehicle for the desired antigen, and preferably has a melting temperature of less than 65° C. such that emulsion is formed either at room temperature (about 20° C. to 25° C.), or once the temperature of the emulsion is brought down to room temperature. Examples of such oils include squalene, Squalane, EICOSANE™, tetratetracontane, glycerol, and peanut oil or other vegetable oils. In one specific, non-limiting example, the oil is provided in an amount between 1 and 10%, or between 2.5 and 5%. The oil should be both biodegradable and biocompatible so that the body can break down the oil over time, and so that no adverse effects, such as granulomas, are evident upon use of the oil.
Immunogenic compositions can be formulated with an appropriate solid or liquid carrier, depending upon the particular mode of administration chosen. If desired, the disclosed pharmaceutical compositions can also contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate. Excipients that can be included in the disclosed compositions include flow conditioners and lubricants, for example silicic acid, talc, stearic acid or salts thereof, such as magnesium or calcium stearate, and/or polyethylene glycol, or derivatives thereof.
Immunogenic compositions can be provided as parenteral compositions, such as for injection or infusion. Such compositions are formulated generally by mixing a disclosed therapeutic agent at the desired degree of purity, in a unit dosage injectable form (solution, suspension, or emulsion), with a pharmaceutically acceptable carrier, for example one that is non-toxic to recipients at the dosages and concentrations employed and is compatible with other ingredients of the formulation. In addition, a disclosed therapeutic agent can be suspended in an aqueous carrier, for example, in an isotonic buffer solution at a pH of about 3.0 to about 8.0, preferably at a pH of about 3.5 to about 7.4, 3.5 to 6.0, or 3.5 to about 5.0. Useful buffers include sodium citrate-citric acid and sodium phosphate-phosphoric acid, and sodium acetate/acetic acid buffers. The active ingredient, optionally together with excipients, can also be in the form of a lyophilisate and can be made into a solution prior to parenteral administration by the addition of suitable solvents. Solutions such as those that are used, for example, for parenteral administration can also be used as infusion solutions.
A form of repository or “depot” slow release preparation can be used so that therapeutically effective amounts of the preparation are delivered into the bloodstream over many hours or days following transdermal injection or delivery. Such long acting formulations can be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. The compounds can be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
Immunogenic compositions that include a disclosed therapeutic agent can be delivered by way of a pump (see Langer, supra; Sefton, CRC Crit. Rev. Biomed. Eng. 14:201, 1987; Buchwald et al., Surgery 88:507, 1980; Saudek et al., N. Engl. J. Med. 321:574, 1989) or by continuous subcutaneous infusions, for example, using a mini-pump. An intravenous bag solution can also be employed. One factor in selecting an appropriate dose is the result obtained, as measured by the methods disclosed here, as are deemed appropriate by the practitioner. Other controlled release systems are discussed in Langer (Science 249:1527-33, 1990).
In one example, a pump is implanted (for example see U.S. Pat. Nos. 6,436,091; 5,939,380; and 5,993,414). Implantable drug infusion devices are used to provide patients with a constant and long-term dosage or infusion of a therapeutic agent. Such devices can be categorized as either active or passive.
Active drug or programmable infusion devices feature a pump or a metering system to deliver the agent into the patient's system. An example of such an active infusion device currently available is the Medtronic SYNCHROMED™ programmable pump. Passive infusion devices, in contrast, do not feature a pump, but rather rely upon a pressurized drug reservoir to deliver the agent of interest. An example of such a device includes the Medtronic ISOMED™.
In particular examples, immunogenic compositions including a disclosed therapeutic agent are administered by sustained-release systems. Suitable examples of sustained-release systems include suitable polymeric materials (such as semi-permeable polymer matrices in the form of shaped articles, for example films, or microcapsules), suitable hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, and sparingly soluble derivatives (such as, for example, a sparingly soluble salt). Sustained-release compositions can be administered orally, parenterally, intracisternally, intraperitoneally, topically (as by powders, ointments, gels, drops or transdermal patch), or as an oral or nasal spray. Sustained-release matrices include polylactides (U.S. Pat. No. 3,773,919, EP 58,481), copolymers of L-glutamic acid and gamma-ethyl-L-glutamate (Sidman et al., Biopolymers 22:547-556, 1983), poly(2-hydroxyethyl methacrylate) (Langer et al., J. Biomed. Mater. Res. 15:167-277, 1981; Langer, Chem. Tech. 12:98-105, 1982), ethylene vinyl acetate (Langer et al., Id.), or poly-D-(−)-3-hydroxybutyric acid (EP 133,988).
Polymers can be used for ion-controlled release. Various degradable and nondegradable polymeric matrices for use in controlled drug delivery are known in the art (Langer, Accounts Chem. Res. 26:537, 1993). For example, the block copolymer poloxamer 407 exists as a viscous yet mobile liquid at low temperatures but forms a semisolid gel at body temperature. It has been shown to be an effective vehicle for formulation and sustained delivery of recombinant interleukin-2 and urease (Johnston et al., Pharm. Res. 9:425, 1992; and Pec, J. Parent. Sci. Tech. 44(2):58, 1990). Alternatively, hydroxyapatite has been used as a microcarrier for controlled release of proteins (Ijntema et al., Int. J. Pharm. 112:215, 1994). In yet another aspect, liposomes are used for controlled release as well as drug targeting of the lipid-encapsulated drug (Betageri et al., Liposome Drug Delivery Systems, Technomic Publishing Co., Inc., Lancaster, Pa., 1993). Numerous additional systems for controlled delivery of therapeutic proteins are known (for example, U.S. Pat. No. 5,055,303; U.S. Pat. No. 5,188,837; U.S. Pat. No. 4,235,871; U.S. Pat. No. 4,501,728; U.S. Pat. No. 4,837,028; U.S. Pat. No. 4,957,735; U.S. Pat. No. 5,019,369; U.S. Pat. No. 5,514,670; U.S. Pat. No. 5,413,797; U.S. Pat. No. 5,268,164; U.S. Pat. No. 5,004,697; U.S. Pat. No. 4,902,505; U.S. Pat. No. 5,506,206; U.S. Pat. No. 5,271,961; U.S. Pat. No. 5,254,342; and U.S. Pat. No. 5,534,496).
Immunogenic compositions can be administered for therapeutic treatments. In therapeutic applications, a therapeutically effective amount of the immunogenic composition is administered to a subject suffering from a disease, such as HIV-1 infection or AIDS. The immunogenic composition can be administered by any means known to one of skill in the art (see Banga, A., “Parenteral Controlled Delivery of Therapeutic Peptides and Proteins,” in Therapeutic Peptides and Proteins, Technomic Publishing Co., Inc., Lancaster, Pa., 1995) such as by intramuscular, subcutaneous, or intravenous injection, but even oral, nasal, or anal administration is contemplated. To extend the time during which the peptide or protein is available to stimulate a response, the peptide or protein can be provided as an implant, an oily injection, or as a particulate system. The particulate system can be a microparticle, a microcapsule, a microsphere, a nanocapsule, or similar particle (see, for example, Banga, supra). A particulate carrier based on a synthetic polymer has been shown to act as an adjuvant to enhance the immune response, in addition to providing a controlled release. Aluminum salts can also be used as adjuvants to produce an immune response.
Immunogenic compositions can be formulated in unit dosage form, suitable for individual administration of precise dosages. In pulse doses, a bolus administration of an immunogenic composition that includes a disclosed immunogen is provided, followed by a time-period wherein no disclosed immunogen is administered to the subject, followed by a second bolus administration. A therapeutically effective amount of an immunogenic composition can be administered in a single dose, or in multiple doses, for example daily, during a course of treatment. In specific, non-limiting examples, pulse doses of an immunogenic composition that include a disclosed immunogen are administered during the course of a day, during the course of a week, or during the course of a month.
Immunogenic compositions can be administered whenever the effect (such as decreased signs, symptoms, or laboratory results of HIV-1 infection) is desired. Generally, the dose is sufficient to treat or ameliorate symptoms or signs of disease without producing unacceptable toxicity to the subject. Systemic or local administration can be utilized.
Amounts effective for therapeutic use can depend on the severity of the disease and the age, weight, general state of the patient, and other clinical factors. Thus, the final determination of the appropriate treatment regimen will be made by the attending clinician. Typically, dosages used in vitro can provide useful guidance in the amounts useful for in situ administration of the pharmaceutical composition, and animal models may be used to determine effective dosages for treatment of particular disorders. Various considerations are described, for example in Gilman et al., eds., Goodman and Gilman: The Pharmacological Bases of Therapeutics, 8th ed., Pergamon Press, 1990; and Remington's Pharmaceutical Sciences, 17th ed., Mack Publishing Co., Easton, Pa., 1990. Typically, the dose range for a gp120 polypeptide is from about 0.1 μg/kg body weight to about 100 mg/kg body weight. Other suitable ranges include doses of from about 1 μg/kg to 10 mg/kg body weight. In one example, the dose is about 1.0 μg to about 50 mg, for example, 1 μg to 1 mg, such as 1 mg peptide per subject. The dosing schedule can vary from daily to as seldom as once a year, depending on clinical factors, such as the subject's sensitivity to the peptide and tempo of their disease. Therefore, a subject can receive a first dose of a disclosed therapeutic molecule, and then receive a second dose (or even more doses) at some later time(s), such as at least one day later, such as at least one week later.
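The weight-based ranges above convert to per-subject amounts by simple multiplication, which the following minimal Python sketch illustrates. It is arithmetic only: the function names and the 70 kg example weight are hypothetical, the default bounds restate the range of about 0.1 μg/kg to about 100 mg/kg given above, and, as noted above, the appropriate regimen remains a determination for the attending clinician.

```python
def per_subject_dose_ug(weight_kg, dose_ug_per_kg):
    """Convert a weight-based dose (ug/kg) to a per-subject amount (ug)."""
    if weight_kg <= 0 or dose_ug_per_kg < 0:
        raise ValueError("weight must be positive and dose non-negative")
    return weight_kg * dose_ug_per_kg


def dose_window_ug(weight_kg, low_ug_per_kg=0.1, high_ug_per_kg=100_000.0):
    """Per-subject (low, high) amounts in ug for a ug/kg range.

    Defaults restate the broad range in the text: 0.1 ug/kg to 100 mg/kg
    (100 mg = 100,000 ug).
    """
    return (
        per_subject_dose_ug(weight_kg, low_ug_per_kg),
        per_subject_dose_ug(weight_kg, high_ug_per_kg),
    )


def format_ug(amount_ug):
    """Render an amount in ug, switching to mg at 1000 ug and above."""
    if amount_ug >= 1000:
        return f"{amount_ug / 1000:g} mg"
    return f"{amount_ug:g} ug"


if __name__ == "__main__":
    # Hypothetical 70 kg subject, narrower range of 1 ug/kg to 10 mg/kg.
    low, high = dose_window_ug(70.0, 1.0, 10_000.0)
    print(format_ug(low), "to", format_ug(high))
```

For the hypothetical 70 kg subject, the narrower range of 1 μg/kg to 10 mg/kg corresponds to 70 μg to 700 mg per administration.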
The pharmaceutical compositions disclosed herein can be prepared and administered in dose units. Solid dose units include tablets, capsules, transdermal delivery systems, and suppositories. The administration of a therapeutic amount can be carried out both by single administration in the form of an individual dose unit or else several smaller dose units and also by multiple administrations of subdivided doses at specific intervals. Suitable single or divided doses include, but are not limited to, about 0.01, 0.1, 0.5, 1, 3, 5, 10, 15, 30, or 50 μg protein/kg/day.
The nucleic acid constructs encoding antigenic gp120 polypeptides described herein are used, for example in combination, as pharmaceutical compositions (medicaments) in therapeutic regimens, including prophylactic regimens (such as vaccines), and are administered to subjects (for example, primate subjects such as human subjects) to elicit an immune response against one or more clades or strains of HIV. For example, the compositions described herein can be administered to a human (or non-human) subject prior to infection with HIV to inhibit infection by or replication of the virus. Thus, the pharmaceutical compositions described above can be administered to a subject to elicit a protective immune response against HIV. To elicit an immune response, a therapeutically effective (for example, immunologically effective) amount of the nucleic acid constructs is administered to a subject, such as a human (or non-human) subject.
Immunization by nucleic acid constructs is well known in the art and taught, for example, in U.S. Pat. No. 5,643,578 (which describes methods of immunizing vertebrates by introducing DNA encoding a desired antigen to elicit a cell-mediated or a humoral response), and U.S. Pat. No. 5,593,972 and U.S. Pat. No. 5,817,637 (which describe operably linking a nucleic acid sequence encoding an antigen to regulatory sequences enabling expression). U.S. Pat. No. 5,880,103 describes several methods of delivery of nucleic acids encoding immunogenic peptides or other antigens to an organism. The methods include liposomal delivery of the nucleic acids (or of the synthetic peptides themselves), and immune-stimulating constructs, or ISCOMS™, negatively charged cage-like structures of 30-40 nm in size formed spontaneously on mixing cholesterol and QUIL A™ (saponin).
For administration of gp120 nucleic acid molecules, the nucleic acid can be delivered intracellularly, for example by expression from an appropriate nucleic acid expression vector which is administered so that it becomes intracellular, such as by use of a retroviral vector (see U.S. Pat. No. 4,980,286), or by direct injection, or by use of microparticle bombardment (such as a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface receptors or transfecting agents, or by administering it in linkage to a homeobox-like peptide which is known to enter the nucleus (for example Joliot et al., Proc. Natl. Acad. Sci. USA 1991, 88:1864-8). The present disclosure includes all forms of nucleic acid delivery, including synthetic oligos, naked DNA, plasmid and viral, integrated into the genome or not.
In another approach to using nucleic acids for immunization, an immunogenic gp120 polypeptide can also be expressed by attenuated viral hosts or vectors or bacterial vectors. Recombinant vaccinia virus, adeno-associated virus (AAV), herpes virus, retrovirus, or other viral vectors can be used to express the peptide or protein, thereby eliciting a CTL response. For example, vaccinia vectors and methods useful in immunization protocols are described in U.S. Pat. No. 4,722,848. BCG (Bacillus Calmette Guerin) provides another vector for expression of the peptides (see Stover, Nature 351:456-460, 1991).
In one example, a viral vector is utilized. These vectors include, but are not limited to, adenovirus, herpes virus, vaccinia, or an RNA virus such as a retrovirus. In one example, the retroviral vector is a derivative of a murine or avian retrovirus. Examples of retroviral vectors in which a single foreign gene can be inserted include, but are not limited to: Moloney murine leukemia virus (MoMuLV), Harvey murine sarcoma virus (HaMuSV), murine mammary tumor virus (MuMTV), and Rous Sarcoma Virus (RSV). When the subject is a human, a vector such as the gibbon ape leukemia virus (GaLV) can be utilized. A number of additional retroviral vectors can incorporate multiple genes. All of these vectors can transfer or incorporate a gene for a selectable marker so that transduced cells can be identified and generated. By inserting a nucleic acid sequence encoding a gp120 polypeptide into the viral vector, along with another gene that encodes the ligand for a receptor on a specific target cell, for example, the vector is now target specific. Retroviral vectors can be made target specific by attaching, for example, a sugar, a glycolipid, or a protein. Preferred targeting is accomplished by using an antibody to target the retroviral vector. Those of skill in the art will know of, or can readily ascertain without undue experimentation, specific polynucleotide sequences which can be inserted into the retroviral genome or attached to a viral envelope to allow target specific delivery of the retroviral vector containing the polynucleotide encoding a gp120 polypeptide.
Since recombinant retroviruses are defective, they need assistance in order to produce infectious vector particles. This assistance can be provided, for example, by using helper cell lines that contain plasmids encoding all of the structural genes of the retrovirus under the control of regulatory sequences within the LTR. These plasmids are missing a nucleotide sequence that enables the packaging mechanism to recognize an RNA transcript for encapsidation. Helper cell lines that have deletions of the packaging signal include, but are not limited to, Ψ2, PA317, and PA12. These cell lines produce empty virions, since no genome is packaged. If a retroviral vector is introduced into such cells in which the packaging signal is intact, but the structural genes are replaced by other genes of interest, the vector can be packaged and vector virions produced.
Suitable formulations for the nucleic acid constructs include aqueous and non-aqueous solutions, isotonic sterile solutions, which can contain anti-oxidants, buffers, and bacteriostats, and aqueous and non-aqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives. The formulations can be presented in unit-dose or multi-dose sealed containers, such as ampules and vials, and can be stored in a freeze-dried (lyophilized) condition requiring only the addition of the sterile liquid carrier, for example, water, immediately prior to use. Extemporaneous solutions and suspensions can be prepared from sterile powders, granules, and tablets. Preferably, the carrier is a buffered saline solution. More preferably, the composition for use in the inventive method is formulated to protect the nucleic acid constructs from damage prior to administration. For example, the composition can be formulated to reduce loss of the adenoviral vectors on devices used to prepare, store, or administer the expression vector, such as glassware, syringes, or needles. The compositions can be formulated to decrease the light sensitivity and/or temperature sensitivity of the components. To this end, the composition preferably comprises a pharmaceutically acceptable liquid carrier, such as, for example, those described above, and a stabilizing agent selected from the group consisting of polysorbate 80, L-arginine, polyvinylpyrrolidone, trehalose, and combinations thereof.
In therapeutic applications, a therapeutically effective amount of the composition is administered to a subject prior to or following exposure to or infection by HIV. When administered prior to exposure, the therapeutic application can be referred to as a prophylactic administration (such as in the form of a vaccine). Single or multiple administrations of the compositions are administered depending on the dosage and frequency as required and tolerated by the subject. In one embodiment, the dosage is administered once as a bolus, but in another embodiment can be applied periodically until a therapeutic result, such as a protective immune response, is achieved. Generally, the dose is sufficient to treat or ameliorate symptoms or signs of disease without producing unacceptable toxicity to the subject. Systemic or local administration can be utilized.
In the context of nucleic acid vaccines, naturally occurring or synthetic immunostimulatory compositions that bind to and stimulate receptors involved in innate immunity can be administered along with nucleic acid constructs encoding the gp120 polypeptides. For example, agents that stimulate certain Toll-like receptors (such as TLR7, TLR8 and TLR9) can be administered in combination with the nucleic acid constructs encoding gp120 polypeptides. In some embodiments, the nucleic acid construct is administered in combination with immunostimulatory CpG oligonucleotides.
Nucleic acid constructs encoding gp120 polypeptides can be introduced in vivo as naked DNA plasmids. DNA vectors can be introduced into the desired host cells by methods known in the art, including but not limited to transfection, electroporation (for example, transcutaneous electroporation), microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, use of a gene gun, or use of a DNA vector transporter (See for example, Wu et al. J. Biol. Chem., 267:963-967, 1992; Wu and Wu J. Biol. Chem., 263:14621-14624, 1988; and Williams et al. Proc. Natl. Acad. Sci. USA 88:2726-2730, 1991). As described in detail in the Examples, a needleless delivery device, such as a BIOJECTOR® needleless injection device can be utilized to introduce the therapeutic nucleic acid constructs in vivo. Receptor-mediated DNA delivery approaches can also be used (Curiel et al. Hum. Gene Ther., 3:147-154, 1992; and Wu and Wu, J. Biol. Chem., 262:4429-4432, 1987). Methods for formulating and administering naked DNA to mammalian muscle tissue are disclosed in U.S. Pat. Nos. 5,580,859 and 5,589,466, both of which are herein incorporated by reference. Other molecules are also useful for facilitating transfection of a nucleic acid in vivo, such as a cationic oligopeptide (for example, WO95/21931), peptides derived from DNA binding proteins (for example, WO96/25508), or a cationic polymer (for example, WO95/21931).
Another well-known method that can be used to introduce nucleic acid constructs encoding gp120 immunogens into host cells is particle bombardment (also known as biolistic transformation). Biolistic transformation is commonly accomplished in one of several ways. One common method involves propelling inert or biologically active particles at cells. This technique is disclosed in, for example, U.S. Pat. Nos. 4,945,050, 5,036,006; and 5,100,792, all to Sanford et al., which are hereby incorporated by reference. Generally, this procedure involves propelling inert or biologically active particles at the cells under conditions effective to penetrate the outer surface of the cell and to be incorporated within the interior thereof. When inert particles are utilized, the plasmid can be introduced into the cell by coating the particles with the plasmid containing the exogenous DNA. Alternatively, the target cell can be surrounded by the plasmid so that the plasmid is carried into the cell by the wake of the particle.
Alternatively, the vector can be introduced in vivo by lipofection. For the past decade, there has been increasing use of liposomes for encapsulation and transfection of nucleic acids in vitro. Synthetic cationic lipids designed to limit the difficulties and dangers encountered with liposome-mediated transfection can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417, 1987; Mackey et al., Proc. Natl. Acad. Sci. USA 85:8027-8031, 1988; Ulmer et al., Science 259:1745-1748, 1993). The use of cationic lipids can promote encapsulation of negatively charged nucleic acids, and also promote fusion with negatively charged cell membranes (Felgner and Ringold, Nature 337:387-388, 1989). Particularly useful lipid compounds and compositions for transfer of nucleic acids are described in WO95/18863 and WO96/17823, and in U.S. Pat. No. 5,459,127, herein incorporated by reference.
As with the immunogenic polypeptide, the nucleic acid compositions may be administered in a single dose, or multiple doses separated by a time interval can be administered to elicit an immune response against HIV. For example, two doses, or three doses, or four doses, or five doses, or six doses or more can be administered to a subject over a period of several weeks, several months or even several years, to optimize the immune response.
It may be advantageous to administer the immunogenic compositions disclosed herein with other agents such as proteins, peptides, antibodies, and other anti-HIV agents. Examples of such anti-HIV therapeutic agents include nucleoside reverse transcriptase inhibitors, such as abacavir, AZT, didanosine, emtricitabine, lamivudine, stavudine, tenofovir, zalcitabine, zidovudine, and the like; non-nucleoside reverse transcriptase inhibitors, such as delavirdine, efavirenz, and nevirapine; protease inhibitors, such as amprenavir, atazanavir, indinavir, lopinavir, nelfinavir, fosamprenavir, ritonavir, saquinavir, tipranavir, and the like; and fusion protein inhibitors, such as enfuvirtide and the like. In certain embodiments, immunogenic compositions are administered concurrently with other anti-HIV therapeutic agents. In certain embodiments, the immunogenic compositions are administered sequentially with other anti-HIV therapeutic agents, such as before or after the other agent. One of ordinary skill in the art would know that sequential administration can mean immediately following or after an appropriate period of time, such as hours, days, weeks, months, or even years later.
While not being bound by theory, it is believed that CD4 binding to gp120 triggers the exposure of the immunodominant V3 loop. Thus, co-administration of soluble forms of CD4, such as the fragments described herein, or an antibody that binds to the CD4 binding site, can lead to enhanced elicitation of an immunogenic response to gp120.
In certain embodiments, immunogenic compositions disclosed herein are administered with a soluble portion of CD4, for example a sufficient portion of the CD4 to bind to the CD4 binding site on gp120. Such soluble fragments typically include both the D1 and D2 extracellular domains of CD4 (D1D2) or sCD4 (which is comprised of the D1, D2, D3, and D4 domains of CD4), although smaller fragments may also provide specific and functional CD4-like binding. In certain embodiments, the gp120 polypeptide with an extended V3 loop or a nucleic acid encoding the same is administered concurrently with a soluble portion of CD4. In other embodiments, the gp120 polypeptide with an extended V3 loop or a nucleic acid encoding the same is administered concurrently with an antibody that binds to the CD4 binding site on gp120.
The immunogenic gp120 polypeptides and nucleic acids encoding these polypeptides (such as stabilized gp120 polypeptides or gp120 polypeptides with an extended V3 loop) can be used in a novel multistep immunization regime. Typically, this regime includes administering to a subject a therapeutically effective amount of a gp120 polypeptide as disclosed herein (the prime) and boosting the immunogenic response with stabilized gp140 trimer (Yang et al., J Virol. 76(9):4634-42, 2002) after an appropriate period of time. The method of eliciting such an immune reaction is what is known as “prime-boost.” In this method, a gp120 polypeptide is initially administered to a subject and at periodic times thereafter stabilized gp140 trimer boosts are administered. Examples of stabilized gp140 or gp120 trimers can be found, for example, in U.S. Pat. No. 6,911,205, which is incorporated herein in its entirety.
The prime can be administered as a single dose or multiple doses; for example, two doses, three doses, four doses, five doses, six doses or more can be administered to a subject over a day, a week or months. The boost can be administered as a single dose or multiple doses; for example, two to six doses or more can be administered to a subject over a day, a week or months. Multiple boosts can also be given, such as one to five, or more.
The boosts can be an identical molecule or a somewhat different, but related, molecule. For example, one preferred strategy with the gp120 polypeptides of the present disclosure would be to prime using a stabilized gp120 polypeptide or a gp120 polypeptide with an extended V3 loop and to boost periodically with stabilized trimers in which the gp120 units are designed to come closer and closer to the wild type gp120 over the succession of boosts. For example, the first prime could be a stabilized gp120 polypeptide, followed by a boost with a stabilized trimer form containing the same stabilized gp120 or with a trimer having fewer deletions or changes from the native gp120 conformation, with subsequent boosts using trimers that have still fewer deletions or changes from the native gp120 conformation, until the boosts are finally given with trimers having a gp120 portion based on the native wild type HIV gp120.
One can also use cocktails containing a variety of different HIV strains to prime and boost with trimers from a variety of different HIV strains or with trimers that are a mixture of multiple HIV strains. For example, the first prime could be with a gp120 polypeptide from one primary HIV isolate, with subsequent boosts using trimers from different primary isolates.
In certain embodiments, the prime is a nucleic acid construct comprising a nucleic acid sequence encoding a gp120 immunogen as disclosed herein, for example a nucleotide sequence encoding the amino acid sequence set forth as SEQ ID NO: 1 or SEQ ID NO: 2, or the nucleotide sequence set forth as one of SEQ ID NO: 4-18 or a degenerate variant thereof. In certain embodiments, the boost comprises a nucleic acid sequence encoding a stabilized gp140 trimer.
The stabilized gp120 polypeptides and the gp120 polypeptides with an extended V3 loop disclosed herein can be used to produce detailed models of gp120 polypeptide atomic structure. Exemplary coordinate data is given in Table 1 and Table 2. The atomic coordinate data disclosed herein, or coordinate data derived from homologous proteins, may be used to build a three-dimensional model of a gp120 polypeptide or a portion thereof, for example by providing a sufficient number of atoms of the stabilized form of gp120 or the gp120 with the V3 loop in the extended conformation, as defined by the coordinates of Table 1 or Table 2, which represent a surface or three-dimensional region of interest, such as an antigenic surface or ligand binding site. Thus, there can be provided the coordinates of at least about 5, such as at least about 10, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500 or more atoms of the structure, such as defined by the coordinates of Table 1 or Table 2. Thus, a sub-domain, region, or fragment of interest of the stabilized form of gp120 or the gp120 with the extended V3 loop which is in the vicinity of the antigenic surface can be provided for identifying or rationally designing a compound or drug, such as an immunogen. A “sub-domain,” “region,” or “fragment” can mean at least one, for example, one, two, three, four, or more, element(s) of secondary structure of particular regions of the stabilized form of gp120 or the gp120 with the extended V3 loop, and includes those set forth in Table 1 and Table 2.
Any available computational methods may be used to build the three-dimensional model. As a starting point, the X-ray diffraction pattern obtained from the assemblage of the molecules or atoms in a crystalline version of a gp120 polypeptide can be used to build an electron density map using tools well known to those skilled in the art of crystallography and X-ray diffraction techniques. Additional phase information extracted from the diffraction data, available in the published literature, and/or obtained from supplementary experiments may then be used to complete the reconstruction.
For an overview of the procedures of collecting, analyzing, and utilizing X-ray diffraction data for the construction of electron densities see, for example, Campbell et al., Biological Spectroscopy, The Benjamin/Cummings Publishing Co., Inc., Menlo Park, Calif., 1984; Cantor et al., Biophysical Chemistry, Part II: Techniques for the study of biological structure and function, W.H. Freeman and Co., San Francisco, Calif. 1980; A. T. Brunger, X-plor Version 3.1: A system for X-ray crystallography and NMR, Yale Univ. Pr., New Haven, Conn. 1993; M. M. Woolfson, An Introduction to X-ray Crystallography, Cambridge Univ. Pr., Cambridge, UK, 1997; J. Drenth, Principles of Protein X-ray Crystallography (Springer Advanced Texts in Chemistry), Springer Verlag, Berlin, 1999; Tsirelson et al., Electron Density and Bonding in Crystals: Principles, Theory and X-ray Diffraction Experiments in Solid State Physics and Chemistry, Inst. of Physics Pub., 1996; each of which is herein specifically incorporated by reference in its entirety.
Information on molecular modeling can be found, for example, in M. Schlecht, Molecular Modeling on the PC, 1998, John Wiley & Sons; Gans et al., Fundamental Principles of Molecular Modeling, Plenum Pub. Corp., 1996; N.C. Cohen (editor), Guidebook on Molecular Modeling in Drug Design, Academic Press, 1996; and W. B. Smith, Introduction to Theoretical Organic Chemistry and Molecular Modeling, 1996.
Typically, a well-ordered crystal that will diffract x-rays strongly is used to solve the three-dimensional structure of a protein by x-ray crystallography. The crystallographic method directs a beam of x-rays onto a regular, repeating array of many identical molecules. The x-rays are diffracted from the array in a pattern from which the positions of the atoms that make up the molecule of interest can be determined.
Substantially pure and homogeneous protein samples are usually used for crystallization. Typically, crystals form when molecules are precipitated very slowly from supersaturated solutions. A typical procedure for making protein crystals is the hanging-drop method, in which a drop of protein solution is brought very gradually to supersaturation by loss of water from the droplet to the larger reservoir that contains salt, polyethylene glycol, or other solution that functions as a hydroattractant, although any other method that generates diffraction quality crystals can be used. In some examples diffraction quality crystals are obtained by seeding the supersaturated solution with smaller crystals that serve as templates.
Powerful x-ray beams can be produced from synchrotron storage rings where electrons (or positrons) travel close to the speed of light. These particles emit very strong radiation at all wavelengths from short gamma rays to visible light. When used as an x-ray source, only radiation within a window of suitable wavelengths is channeled from the storage ring.
In diffraction experiments a narrow and parallel beam of x-rays is taken out from the x-ray source and directed onto the crystal to produce diffracted beams. The incident x-ray beam causes damage to both protein and solvent molecules. The crystal is, therefore, usually cooled to prolong its lifetime (for example, to between −220° C. and −50° C.). In some examples, a single crystal is used to obtain a data set, while in other examples, multiple crystals are used to obtain a data set. The x-ray beam must strike the crystal from many different directions to produce all possible diffraction spots, thereby creating a complete data set. Therefore, the crystal is rotated relative to the beam during data collection. The diffracted spots are recorded either on a film or by an electronic detector, both of which are commercially available.
When the primary beam from an x-ray source strikes the crystal, x-rays interact with the electrons on each atom in the crystal and cause them to oscillate. The oscillating electrons serve as a new source of x-rays, which are emitted in almost all directions in a process referred to as scattering. When atoms (and hence their electrons) are arranged in a regular three-dimensional array, as in a crystal, the x-rays emitted from the oscillating electrons interfere with one another. In most cases, these x-rays, colliding from different directions, cancel each other out; those from certain directions, however, will add together to produce diffracted beams of radiation that can be recorded as a pattern on a photographic plate or detector.
The diffraction pattern obtained in an x-ray experiment is related to the crystal that caused the diffraction. X-rays that are reflected from adjacent planes travel different distances, and diffraction only occurs when the difference in distance is equal to the wavelength of the x-ray beam. This distance is dependent on the reflection angle, which is equal to the angle between the primary beam and the planes.
Each atom in a crystal scatters x-rays in all directions, and only those that constructively interfere with one another, according to Bragg's law (2d sin θ=λ), give rise to diffracted beams that can be recorded as a distinct diffraction spot above background. Each diffraction spot is the result of interference of all x-rays with the same diffraction angle emerging from all atoms. To extract information about individual atoms from such a system requires considerable computation. The mathematical tool that is used to handle such problems is called the Fourier transform.
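As a worked illustration of Bragg's law as written above, the following minimal Python sketch computes the diffraction angle θ for a given interplanar spacing d and, conversely, the smallest spacing that still diffracts at or below a given angle. The function names and the numerical values (a 1.0 Å wavelength and a 3.5 Å spacing) are assumptions chosen for illustration and are not taken from the disclosed data.

```python
import math

def bragg_angle_deg(d_spacing, wavelength):
    """Angle theta (degrees) satisfying 2 * d * sin(theta) = wavelength."""
    if d_spacing <= 0.0 or wavelength <= 0.0:
        raise ValueError("spacing and wavelength must be positive")
    ratio = wavelength / (2.0 * d_spacing)
    if ratio > 1.0:
        # sin(theta) cannot exceed 1, so planes closer than wavelength/2 never diffract
        raise ValueError("no diffraction: wavelength exceeds 2d")
    return math.degrees(math.asin(ratio))

def min_d_spacing(wavelength, theta_max_deg):
    """Smallest spacing d that diffracts at or below theta_max (degrees)."""
    return wavelength / (2.0 * math.sin(math.radians(theta_max_deg)))

if __name__ == "__main__":
    wavelength = 1.0  # angstroms; hypothetical value for illustration
    print("theta for d = 3.5 A:", bragg_angle_deg(3.5, wavelength))
    print("d reachable at theta = 30 deg:", min_d_spacing(wavelength, 30.0))
```

The second function reflects why spots recorded at larger angles carry finer detail: a larger maximum θ corresponds to a smaller spacing d.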
Each diffracted beam, which is recorded as a spot on the film, is defined by three properties: the amplitude, which is derived from the measured intensity of the spot; the wavelength, which is determined by the x-ray source; and the phase, which is lost in x-ray experiments and must be calculated. All three properties are needed for all of the diffracted beams in order to determine the positions of the atoms giving rise to the diffracted beams. Methods of determining the phases are well known in the art.
For example, phase differences between diffracted spots can be determined from intensity changes following heavy atom derivatization. Another example would be determining the phases by molecular replacement.
The amplitudes and the phases of the diffraction data from the protein crystals are used to calculate an electron-density map of the repeating unit of the crystal. A model of the particular amino acid sequence is built to approximate the electron density map.
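The role of amplitudes and phases in the map calculation can be sketched with a one-dimensional toy model. The Python example below is a minimal sketch, assuming point atoms in a unit cell of length 1 and omitting the constant scale factor of a real density; the atom positions, weights, and function names are hypothetical and are not taken from Table 1 or Table 2. It computes structure factors from the toy atoms, splits each into an amplitude and a phase, and recombines them by Fourier summation.

```python
import cmath
import math

def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for h = -h_max..h_max."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }

def electron_density(factors, x):
    """rho(x), up to a constant scale: sum_h |F(h)| e^{i*phase(h)} e^{-2*pi*i*h*x}."""
    total = 0j
    for h, f_h in factors.items():
        amplitude, phase = abs(f_h), cmath.phase(f_h)  # both are needed
        total += amplitude * cmath.exp(1j * phase) * cmath.exp(-2j * math.pi * h * x)
    return total.real  # imaginary parts cancel because F(-h) is the conjugate of F(h)

def density_map(atoms, h_max=10, n_grid=100):
    """Sample the toy density on n_grid evenly spaced fractional positions."""
    factors = structure_factors(atoms, h_max)
    return [electron_density(factors, i / n_grid) for i in range(n_grid)]

if __name__ == "__main__":
    # (fractional position, scattering weight); hypothetical toy atoms
    toy_atoms = [(0.2, 6.0), (0.7, 8.0)]
    rho = density_map(toy_atoms)
    peak = max(range(len(rho)), key=rho.__getitem__)
    print("highest density at x =", peak / len(rho))
```

In this toy case the phases are known because the atoms are known; in an experiment only the amplitudes are measured, which is why the phases must be obtained separately.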
The initial model will contain some errors. Provided the protein crystals diffract to high enough resolution (e.g., better than 3.5 Å), most or substantially all of the errors can be removed by crystallographic refinement of the model using computer algorithms. In this process, the model is changed to minimize the difference between the experimentally observed diffraction amplitudes and those calculated for a hypothetical crystal containing the model. This difference is expressed as an R factor (residual disagreement) which is 0.0 for exact agreement and about 0.59 for total disagreement.
Typically, the R factor of a refined model is preferably between 0.15 and 0.35 (such as less than about 0.24-0.28) for a well-determined protein structure. The residual difference is a consequence of errors and imperfections in the data. These derive from various sources, including slight variations in the conformation of the protein molecules, as well as inaccurate corrections both for the presence of solvent and for differences in the orientation of the microcrystals from which the crystal is built. Thus, the final model represents an average of molecules that are slightly different in both conformation and orientation.
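The R factor discussed above can be illustrated with a short calculation. The following Python sketch uses the conventional crystallographic definition, R = Σ| |F_obs| − |F_calc| | / Σ|F_obs|, which the text does not spell out; it assumes the calculated amplitudes have already been placed on the same scale as the observed ones. The function name and the sample amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Residual disagreement between observed and calculated amplitudes."""
    if not f_obs or len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator

if __name__ == "__main__":
    observed = [10.0, 20.0, 30.0]    # hypothetical observed amplitudes
    calculated = [11.0, 18.0, 30.0]  # hypothetical amplitudes from a model
    print(r_factor(observed, calculated))
```

Refinement, in these terms, adjusts the model so that the calculated amplitudes move this ratio toward zero.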
In refined structures at high resolution, there are usually no major errors in the orientation of individual residues, and the estimated errors in atomic positions are usually around 0.1-0.2 Å, provided the amino acid sequence is known.
Most x-ray structures are determined to a resolution between 1.7 Å and 3.5 Å. Electron-density maps within this resolution range are preferably interpreted by fitting the known amino acid sequences into regions of electron density in which individual atoms are not resolved.
The present disclosure also relates to the crystals obtained from stabilized forms of gp120, the crystal structures of the stabilized forms of gp120, the three-dimensional coordinates of the stabilized forms of gp120 polypeptide and three-dimensional structures of models of stabilized forms of gp120. Table 1 provides the atomic coordinates of the crystal structure of the polypeptide encoded by SEQ ID NO: 14.
The present disclosure further relates to the crystal structure of gp120 in which the V3 loop is in an extended conformation. The present disclosure also relates to the crystals obtained from a gp120 polypeptide with an extended V3 loop, the three-dimensional coordinates of a gp120 polypeptide with an extended V3 loop, three-dimensional structures of models of a gp120 polypeptide with an extended V3 loop, and uses of these models. The amino acid sequence of a gp120 polypeptide with an extended V3 loop variant is set forth as SEQ ID NO: 2.
The structure of a gp120 polypeptide with an extended V3 loop was solved in complex with the X5 Fab and the D1D2 domains of the CD4 receptor. Analysis of the structure revealed that the V3 loop was present in an elongated conformation that had not previously been seen in other complexes involving the gp120 protein. An advantageous feature of this crystal structure over previous structures is the organization of the V3 loop in an elongated conformation, compatible with the elicitation of an immunodominant antibody response. Table 2 provides the atomic coordinates of the crystal structure of the polypeptide disclosed in SEQ ID NO: 2.
The present disclosure also provides for a machine-readable data storage medium which comprises a data storage material encoded with machine readable data defined by the structure coordinates of a stabilized gp120 polypeptide or gp120 polypeptide with an extended V3 loop as defined in Table 1 or Table 2 respectively, or a subset thereof, such as at least about 5, such as at least about 10, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500 or more atoms of the structure, such as defined by the coordinates of Table 1 or Table 2.
Those of skill in the art will understand that a set of structure coordinates for a gp120 polypeptide, for example a stabilized gp120 polypeptide, a gp120 polypeptide with an extended V3 loop, or a portion thereof, is a relative set of points that define a shape in three dimensions. Thus, it is possible that an entirely different set of coordinates could define a similar or identical shape. Moreover, slight variations in the individual coordinates will have little effect on overall shape. The variations in coordinates discussed above may be generated because of mathematical manipulations of the structure coordinates. For example, the structure coordinates set forth in Table 1 or Table 2, or a portion thereof, could be manipulated by crystallographic permutations of the structure coordinates, fractionalization of the structure coordinates, integer additions or subtractions to sets of the structure coordinates, deletion of a portion of the coordinates, inversion of the structure coordinates, or any combination of the above.
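The point that different coordinate sets can define the same shape can be checked numerically. The Python sketch below is a minimal illustration, assuming a small set of hypothetical coordinates (not values from Table 1 or Table 2) and hypothetical helper names. It rotates, translates, and inverts the coordinates and then confirms that every interatomic distance is unchanged. Note that inversion preserves all distances but produces the mirror-image arrangement.

```python
import math
from itertools import combinations

def pairwise_distances(coords):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]

def rotate_about_z(coords, angle_deg):
    """Rotate every point about the z axis by angle_deg degrees."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

def translate(coords, shift):
    """Add the same (dx, dy, dz) shift to every point."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]

def invert(coords):
    """Invert through the origin (distances kept, handedness flipped)."""
    return [(-x, -y, -z) for x, y, z in coords]

def same_shape(coords_a, coords_b, tol=1e-6):
    """True if corresponding interatomic distances agree within tol."""
    da, db = pairwise_distances(coords_a), pairwise_distances(coords_b)
    return len(da) == len(db) and all(abs(p - q) <= tol for p, q in zip(da, db))

if __name__ == "__main__":
    # Hypothetical coordinates in angstroms, for illustration only
    original = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0), (0.3, 0.8, 2.0)]
    moved = invert(translate(rotate_about_z(original, 37.0), (10.0, -4.0, 2.5)))
    print(same_shape(original, moved))
```

A comparison based on interatomic distances is only one possible choice; superposition followed by a root-mean-square deviation calculation is a common alternative.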
This disclosure further provides systems, such as computer systems, intended to generate structures and/or perform rational drug or compound design for an antigenic compound capable of eliciting an immune response in a subject. The system can contain one or more or all of: atomic co-ordinate data according to Table 1, Table 2, or a subset thereof and/or the Figures, or data derived therefrom by homology modeling, the data defining the three-dimensional structure of a gp120 or at least one sub-domain thereof; or structure factor data for gp120, the structure factor data being derivable from the atomic co-ordinate data of Table 1, Table 2, or a subset thereof and/or the Figures. This disclosure also involves computer readable media with: atomic co-ordinate data according to Table 1, Table 2, or a subset thereof and/or the Figures, or data derived therefrom by homology modeling, the data defining the three-dimensional structure of a gp120 or at least one sub-domain thereof; or structure factor data for a gp120, the structure factor data being derivable from the atomic co-ordinate data of Table 1, Table 2, or a subset thereof and/or the Figures. By providing such computer readable media, the atomic co-ordinate data can be routinely accessed to model the gp120 or a sub-domain thereof. For example, RASMOL (Sayle et al., TIBS vol. 20 (1995), 374) is a publicly available software package which allows access and analysis of atomic co-ordinate data for structural determination and/or rational drug design. Structure factor data, which are derivable from atomic co-ordinate data (see, for example, Blundell et al., in Protein Crystallography, Academic Press, NY, London and San Francisco (1976)), are particularly useful for calculating electron density maps, for example, difference Fourier electron density maps. Thus, there are additional uses for the computer readable media and/or computer systems and/or atomic co-ordinate data and additional reasons to provide them to users.
The crystals of this disclosure and particularly the atomic structure coordinates obtained from these crystals are particularly useful for identifying compounds that elicit neutralizing antibodies, for example CD4BS and CD4i antibodies. The compounds identified are useful in eliciting antibodies to gp120, such as antibodies to a lentivirus, such as SIV or HIV, for example HIV-1 or HIV-2.
The crystal structure of a stabilized form of gp120 or a gp120 with the V3 loop in the extended conformation allows a novel approach for drug or compound discovery, identification, and design of compounds that mimic the antigenic surfaces of gp120 that bind neutralizing antibodies. Such compounds can be useful as immunogens to elicit an immune response to HIV when administered to a subject, for example by eliciting anti-HIV antibodies, such as neutralizing antibodies, for example CD4BS or CD4i antibodies. Compounds that elicit anti-HIV antibodies are useful in diagnosis, treatment, or prevention of HIV-1 in a subject in need thereof.
The disclosure provides a computer-based method of rational drug or compound design or identification which comprises: providing the structure of a stabilized form of gp120 (for example as defined by the coordinates or a subset of the coordinates in Table 1 and/or in the Figures) or a gp120 with the V3 loop in the extended conformation (for example as defined by the coordinates or a subset of the coordinates in Table 2 and/or in the Figures); providing a structure of a candidate compound; and fitting the structure of the candidate compound to the structure of the stabilized form of gp120 (for example as defined by the coordinates or a subset of the coordinates in Table 1 and/or in the Figures) or the gp120 with the V3 loop in the extended conformation (for example as defined by the coordinates or a subset of the coordinates in Table 2 and/or in the Figures).
In certain embodiments, the coordinates of atoms of interest of the stabilized form of gp120 or the gp120 with the V3 loop in the extended conformation in the vicinity of the antigenic surface are used to model the antigenic surface to which an antibody binds, such as a neutralizing antibody, for example a CD4i or CD4BS antibody. These coordinates may be used to define a space which is then screened “in silico” against a candidate compound. Thus, the disclosure provides a computer-based method of rational drug or compound design or identification which comprises: providing the coordinates of at least two atoms of Table 1 or Table 2; providing the structure of a candidate compound; and fitting the structure of the candidate compound to the coordinates of at least two atoms of Table 1 or Table 2.
In practice, it may be desirable to model a sufficient number of atoms of the stabilized form of gp120 or the gp120 with the V3 loop in the extended conformation, as defined by the coordinates of Table 1 or Table 2, which represent the active site or binding region. Thus, there can be provided the coordinates of at least about 5, such as at least about 10, at least about 20, at least about 30, at least about 40, at least about 50, at least about 60, at least about 70, at least about 80, at least about 90, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 500 atoms of the structure.
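One simple way to pick out a subset of atoms representing a region of interest is to keep the atoms that lie within a chosen distance of a reference point. The Python sketch below is a minimal illustration under stated assumptions: the atom records, the reference point, the 5 Å cutoff, and the function name are all hypothetical, and the coordinates are placeholders rather than values from Table 1 or Table 2.

```python
import math

def atoms_within(atoms, center, radius):
    """Return the atom records whose (x, y, z) lies within radius of center.

    Each record is (atom_name, residue_number, x, y, z).
    """
    return [atom for atom in atoms if math.dist(atom[2:], center) <= radius]

if __name__ == "__main__":
    # Hypothetical records for illustration only
    atoms = [
        ("CA", 368, 1.0, 2.0, 0.5),
        ("CB", 368, 2.1, 2.9, 1.0),
        ("CA", 427, 9.5, 8.0, 7.2),
        ("CA", 473, 15.0, 1.0, 3.0),
    ]
    site_center = (0.0, 0.0, 0.0)  # hypothetical reference point
    for record in atoms_within(atoms, site_center, 5.0):
        print(record)
```

The number of atoms retained depends on the cutoff, so in practice the cutoff would be chosen to capture enough of the surface for the intended modeling.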
The methods disclosed herein can employ a sub-domain, region, or fragment of interest of the stabilized form of gp120 or the gp120 with the extended V3 loop which is in the vicinity of the antigenic surface. Thus, there is provided a computer-based method for identifying or rationally designing a compound or drug, such as an immunogen, which includes: providing the coordinates of at least a sub-domain, region, or fragment of the stabilized form of gp120 or the gp120 with the extended V3 loop; providing the structure of a candidate compound that mimics the antigenic surface of the gp120 with the extended V3 loop; and fitting the structure of the candidate compound to the coordinates of the sub-domain, region, or fragment provided. A “sub-domain”, “region”, or “fragment” can mean at least one, for example, one, two, three, four, or more, element(s) of secondary structure of particular regions of the stabilized form of gp120 or the gp120 with the extended V3 loop, and includes those set forth in Table 1 and Table 2.
These methods can optionally include synthesizing the candidate compound (such as an immunogen) and/or administering the candidate compound to an animal capable of eliciting antibodies and testing whether the candidate compound elicits anti-HIV antibodies. Compounds which elicit anti-HIV antibodies are useful for diagnostic purposes, as well as for immunogenic, immunological or even vaccine compositions, as well as pharmaceutical compositions.
In some embodiments, the candidate compound is designed from the gp120 amino acid sequence, for example an amino acid sequence is assembled to provide a candidate compound, for example by synthesizing the amino acid sequence or producing a nucleic acid encoding the candidate compound.
The step of providing the structure of a candidate compound may involve selecting the candidate compound by computationally screening a database of compounds for surface similarity with an epitope on the stabilized form of gp120 or the gp120 with the extended V3 loop. For example, a 3-D descriptor for the candidate compound may be derived, the descriptor including geometric and functional constraints derived from the architecture and chemical nature of the epitope. The descriptor may then be used to interrogate the compound database, a candidate compound being a compound that has a good match to the features of the descriptor. In effect, the descriptor can be a type of virtual pharmacophore.
The determination of the three-dimensional structure of the gp120 with the extended V3 loop provides a basis for the design of new and specific compounds that are useful for eliciting an immune response. For example, from knowing the three-dimensional structure of the stabilized form of gp120 or the gp120 with the extended V3 loop, computer modeling programs may be used to design or identify different molecules expected to interact with possible or confirmed active sites such as binding sites or other structural or functional features of neutralizing antibodies.
By way of example, a compound that potentially mimics the antigenic surface of the stabilized form of gp120 or the gp120 with the extended V3 loop can be examined through the use of computer modeling using a docking program such as GRAM, DOCK or AUTODOCK (see for example, Walters et al., Drug Discovery Today, 3(4):160-178, 1998; Dunbrack et al., Folding and Design 2:27-42, 1997). This procedure can include computer fitting of potential immunogens to ascertain how well the shape and the chemical structure of the potential immunogen will mimic the antigenic surface. Various other computer programs such as AMBER or CHARMM may be used to further refine the dynamic and electrostatic characteristics of a candidate compound. Programs such as GRID (Goodford, J. Med. Chem. 28:849-57, 1985) may also be used to analyze the antigenic surfaces to predict immunogenic compounds. Alternatively, computer-assisted, manual examination can be used to predict immunogenic compounds from antigenic surfaces.
One problem with the formation of crystals containing wild-type gp120 is that conformationally variable molecules are not amenable to crystallization. For an ordered crystal to form, the molecules forming the crystal must be essentially locked in place. Molecules that are unstable or “floppy,” such as wild-type gp120, must overcome large entropic (ΔS) costs to form a crystal lattice. By using conformationally stabilized forms of gp120, this entropic cost of becoming ordered is lessened and crystals form more easily. Those skilled in the art can take advantage of this by crystallizing their complex of interest with a stabilized form of gp120. For example, stabilized forms of gp120 can be used to crystallize previously uncrystallizable broadly neutralizing antibodies. In one embodiment, the broadly neutralizing antibody does not induce conformational stabilization as measured by −TΔS of less than 28 kcal/mol upon antibody binding to gp120. The use of broadly neutralizing antibodies is disclosed, for example, in Burton, Nature Rev. Immunol. 2:706-713, 2002, herein incorporated by reference. One example of how this can be accomplished is by forming complexes of a stabilized form of gp120 and the antibody of interest in the presence of CD4.
The following examples are provided to illustrate certain particular features and/or embodiments. These examples should not be construed to limit the invention to the particular features or embodiments described.
This example describes the methods used to design stabilized forms of gp120 disclosed herein. Thermodynamic analysis showed that the conformation of gp120 prior to CD4 binding was highly flexible (Myszka et al., Proc Natl Acad Sci USA. 97(16):9026-31, 2000). The CD4-bound state of gp120 consists of an inner domain (containing the N and C termini), an outer domain, and a four-stranded bridging sheet minidomain. Two-thirds of the CD4 contact surface is with the outer domain, the remaining one-third with the bridging sheet. In the unliganded state, the inner domain is radically altered, with most of its secondary structural elements repositioned. The bridging sheet is pulled apart with the two β-hairpins of the sheet separated by 20 Å. The outer domain, by contrast, remains virtually unchanged.
An initial series of mutants was constructed and analyzed. Initial antigenic analysis suggested that a single mutation, 375 S to W, was able to partially stabilize gp120 in the CD4-bound state. Thermodynamic analysis by isothermal titration calorimetry (ITC) confirmed this result, showing that the entropy (−TΔS) of gp120 binding to CD4 was reduced from 40 kcal/mol to roughly 25 kcal/mol (Xiang et al., J Virol. 76(19):9888-99, 2002).
To further reduce the entropy of CD4 binding to a range typical of antibody recognition (5-10 kcal/mol), precise characterization was used to confirm the mutational stabilization of conformation, including: (1) crystallographic determination of the gp120 mutant structure, (2) isothermal titration calorimetric analysis of the entropy of CD4 binding, and (3) precise surface-plasmon resonance analysis of the on/off rates of antibodies to the modified gp120 glycoproteins. This design cycle is shown in
Additional cavity-filling mutations and five different disulfides were modeled. The cavity-filling mutants increased hydrophobic interactions at domain interfaces. The disulfides either tied together the inner domain, outer domain and bridging sheet, or were internal to the bridging sheet. Crystallographic analysis of five of these disulfides showed that four of them formed disulfide bonds. Two of these showed minimal perturbation in structure: 96-275, which tied together the inner and outer domains, and 109-428, which tied together the bridging sheet and outer domain. The 231-267 disulfide, which tied together the inner domain and outer domain, and the 123-431 disulfide, which tied together two strands of the bridging sheet, both showed local perturbations of structure. The potential disulfide formed by mutating 231 to C and 268 to C did not form (
In the core context, each single inter-domain disulfide reduced the entropy of CD4 interaction by roughly 10 kcal/mol, as measured by isothermal titration calorimetry (ITC). Combinations of disulfides were tested. Two-disulfide combinations showed similar antigenic phenotypes, suggesting a partially stabilized gp120 conformation; ITC analysis of several of the different two-disulfide combinations showed that the entropy of CD4 interaction was reduced by roughly 20 kcal/mol. Combinations of three and four disulfides were also tested, although most of these expressed only poorly, perhaps due to complications of folding so many cysteines into the correct disulfide bonds. Removal of an additional core disulfide (such as the second conserved disulfide in the V1/V2 region) and stabilization of the V3 region may enhance folding. A summary of the qualitative Biacore and ITC results for 17 mutants is shown in Table 4.
Qualitative Biacore analysis and ITC of conformationally stabilized mutants. Biacore analyses were carried out on transfected cell supernatants or with purified protein at 10 ug/ml. Yellow rows represent mutants with structures determined by X-ray crystallography. “A” indicates binding, “F” indicates folding, and “N” indicates no binding or folding. The mutants are indicated with the wild-type residue and position followed by the substituted residue as follows: C1:M95W; C2:T275S/S375W; C3:A433M; S1:W96C/V275C; S2:I109C/Q428C; S3:T123C/G431C; S4:K231C/E267C. For example, A433M means that a methionine has been substituted for an alanine to create a C3 mutant protein.
Quantitative surface-plasmon resonance characterization of the binding of the various mutants to CD4, to 17b in the absence of CD4 and to 17b in the presence of CD4 allowed the degree of conformational stabilization to be assessed (Table 5).
The CD4 on-rate did not change much, indicating that initial CD4 binding occurs without conformational stabilization. The off-rate did decrease relative to wild-type, however, indicating that once CD4 was bound, the conformational change was able to lock CD4 into place. A very different effect was seen with the CD4i antibody 17b. With 17b, conformational stabilization greatly increased the on-rate of binding, with little effect on the off-rate. This indicated that 17b cannot bind to its site without the conformational change induced by CD4. In contrast, the initial binding event of CD4 must occur without the conformational change.
Surface-plasmon resonance (SPR) experiments were performed on a Biacore biosensor system at 25° C. Ligands (17b or m6 for the CD4i antibodies; F105, b12, 1.5e, etc. for CD4BS antibodies; b3, b3, b11 etc. for Fab fragments of CD4BS antibodies; and 2-domain CD4 for CD4) were immobilized on research grade CM5 sensor chips using the recommended standard amine coupling. Binding experiments were carried out in HBSP buffer (10 mM HEPES, pH 7.4, 150 mM NaCl and 0.005% surfactant P-20).
During the association phase, gp120 proteins were passed over the buffer-equilibrated chip surface at a rate of 30 ul/min. After the association phase, bound analytes were allowed to dissociate for 5 min. The chip surface was then regenerated by two 25 ul injections of 10 mM Glycine/HCl (pH 3.0) at a flow rate of 50 ul/min. Association and dissociation values were calculated by numerical integration and global fitting to a 1:1 interaction model using BIAevaluation 3.0 software (Biacore, Inc.).
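The 1:1 interaction model referred to above can be illustrated with a minimal sketch. It uses the standard closed-form expressions for a simple 1:1 (Langmuir) interaction and is not the BIAevaluation fitting routine; the function names, parameter names, and numerical values are hypothetical and chosen only for illustration.

```python
import math


def equilibrium_constant(ka: float, kd: float) -> float:
    """Equilibrium dissociation constant KD (M) from the association rate
    constant ka (1/(M*s)) and dissociation rate constant kd (1/s)."""
    return kd / ka


def association_response(t: float, conc: float, ka: float, kd: float, rmax: float) -> float:
    """Response at time t (s) while analyte at concentration conc (M) flows
    over the surface, for a 1:1 interaction starting from zero response."""
    k_obs = ka * conc + kd                 # observed rate constant
    r_eq = rmax * ka * conc / k_obs        # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t: float, r0: float, kd: float) -> float:
    """Response t seconds after the switch to buffer, starting from r0."""
    return r0 * math.exp(-kd * t)


if __name__ == "__main__":
    ka, kd = 1.0e5, 1.0e-3                 # illustrative values, not measured data
    kd_eq = equilibrium_constant(ka, kd)
    print(f"KD = {kd_eq:.2e} M")
    r_end = association_response(180.0, kd_eq, ka, kd, rmax=100.0)
    print(f"Response after 180 s association: {r_end:.1f} RU")
    print(f"Response after 300 s dissociation: {dissociation_response(300.0, r_end, kd):.1f} RU")
```

In this picture, a slower off-rate with an unchanged on-rate (as described for CD4) lowers KD through the numerator, whereas a faster on-rate with an unchanged off-rate (as described for 17b) lowers KD through the denominator.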
This example describes the methods used to obtain crystals of a gp120 with an extended V3 loop.
To increase the probability of obtaining crystals suitable for X-ray structural analysis, 13 different complexes of HIV-1 envelope glycoprotein gp120 core with intact V3 were prepared and screened for crystallization. To ensure that gp120 was in its coreceptor binding conformation, all complexes contained CD4 (2-domain).
1) Protein Production, Purification, and Complex Preparation
Constructs of core+V3 gp120 from clade B HIV-1 isolates, YU2, JR-FL, and HXBc2, were prepared as previously described (Wu et al., Nature 384:179, 1996; Grundner et al., Virology 330:233, 2004). Truncations of the N-terminus and C-terminus, and substitution of the tripeptide GAG for the V1/V2 region, were identical to those previously described (Grundner et al., Virology 330:233, 2004). Wild-type isolates were used for YU2 and HXBc2. For JR-FL, a functional 2-glycan deletion variant was used with mutations 301N/Q and 388T/A (Koch et al., Virology 313:387, 2003). This CCR5-using JR-FL variant was more susceptible to neutralization by CD4-binding site antibodies, but not to CD4-induced antibodies (Koch et al., Virology 313:387, 2003). Constructs were expressed in Drosophila Schneider 2 cells under an inducible metallothionein promoter. The 2-domain CD4 (d1d2), antigen-binding fragments (Fabs) and single-chain variable fragments (scFv) of CD4-induced (CD4i) antibodies, 17b, 48d, 412d, m6, m9 and X5, were prepared as previously described (Ryu et al., Nature 348:419, 1990; Kwong et al., J. Biol. Chem. 274:4115, 1999; Huang et al., Proc. Natl. Acad. Sci. USA 101:2706, 2004; Zhang et al., J. Mol. Biol. 335:209, 2004; Moulard et al., Proc. Natl. Acad. Sci. USA 99:6913, 2002). Preparations of gp120 complexes followed procedures that were essentially the same as previously described (Kwong et al., J. Biol. Chem. 274:4115, 1999). Briefly, glycans were removed by digestion with endoglycosidases H and D to leave only the protein proximal N-acetylglucosamine and 1,6 fucose residues. The 2-domain CD4 was added, the binary complexes were passed through a concanavalin A column to remove any gp120 proteins with uncleaved N-linked glycan, and the complexes were further purified by gel filtration (HiLoad 26/60 Superdex S200 prep grade, Amersham). Fab or scFv of CD4-induced (CD4i) antibodies were added and the ternary complexes purified by Superdex S200 chromatography.
Purified complexes in 0.35 M NaCl, 2.5 mM Tris pH 7.0, 0.02% NaN3 were concentrated to 5-8 mg/ml. The following complexes were made (specified by strain of core+V3 gp120:soluble CD4 domain fragment:CD4-induced antibody type and fragment):
JR-FL:d1d2:17b Fab
JR-FL:d1d2:48d Fab
JR-FL:d1d2:412d Fab
JR-FL:d1d2:X5 Fab
JR-FL:d1d2
YU2:d1d2:48d Fab
YU2:d1d2:X5 Fab
HXBc2:d1d2:17b Fab
HXBc2:d1d2:48d Fab
HXBc2:d1d2:412d Fab
HXBc2:d1d2:X5 Fab
HXBc2:d1d2:m6 scFv
HXBc2:d1d2:m9 scFv
2) Robotic Screening of Crystallization Conditions
The gp120 complexes were screened robotically using vapor-diffusion sitting droplets composed of 50 nl protein combined with 50 nl crystallization solution (Lesley et al., Proc. Natl. Acad. Sci. USA 99:11664, 2002). In each screen, 576 different commercially available crystallization solutions were used. JR-FL complexes were screened with Hampton Research Screen I/II, Emerald Wizard Screen I/II, Emerald Wizard Cryo Screen I/II, Hampton Crystal Screen Cryo, Hampton PEG/Ion Screen, Hampton Grid Screens (ammonium sulfate, PEG 6000, MPD, and PEG/LiCl), and Syrrx Polymer Screen. YU2 and HXBc2 complexes were screened in the same manner except that the Hampton Research Index screen was substituted for the Emerald Wizard Cryo Screens. Pictures of crystallization drops were taken at 0, 1, 3, 7, 14, and 21 days after set-up, and the images were inspected visually for protein crystals.
3) Crystallization Optimization
Initial crystals observed from robotic screens were reproduced and optimized manually using vapor-diffusion hanging droplets. A total of eight different crystal forms were grown to sizes suitable for testing diffraction quality. While most of the crystals diffracted to at best only 6-10 Å, one crystal consisting of JR-FL:d1d2:X5 Fab diffracted to at least 5 Å and was chosen for further optimization. Larger single crystals were produced by macroseeding (Thaller et al., J. Mol. Biol. 147:465, 1981): 1.5 μl of 5 mg/ml JR-FL:d1d2:X5 Fab was mixed with an equal volume of 1.3 M ammonium sulfate and placed over a 0.5 ml reservoir of 1.3 M ammonium sulfate; after 30 minutes, a single crystal was transferred directly to the droplet. Macroseeded crystals grew to 0.1×0.1×0.2 mm in 5-7 days.
This example describes the methods used to determine the structure of a gp120 with an extended loop to atomic resolution.
Crystals were dehydrated (Heras et al., Structure 11:139, 2003) over 3 M ammonium sulfate reservoirs for 2-3 days. Dehydrated crystals were cross-linked over 20 μl of 1.5% glutaraldehyde for 1.5 hr using the procedure of Lusty (Lusty, J. Appl. Cryst. 32:106, 1999), transferred to a cryoprotectant solution containing 2 M ammonium sulfate, 60% (w/v) xylitol, 10% (w/v) erythritol and 5% (v/v) ethylene glycol for 1-2 minutes, covered with paratone-N, loop mounted, and flash-cooled to 100 K for data collection. X-ray data were collected at a wavelength of 1.00 Å, using the intense 3rd generation undulator beam-line (SER-CAT) at the Advanced Photon Source, and processed and reduced with HKL2000 (Otwinowski and Minor, Methods Enzymol. 276:307, 1997). The crystals were found to belong to space group P622 and to contain one complex per asymmetric unit. The diffraction was anisotropic, with stronger diffraction along the 6-fold axis. The crystal structure of JR-FL:d1d2:X5 Fab was solved by molecular replacement with CNS (Brunger et al., Acta Crystallogr. D 54:905, 1998). For gp120:CD4, a binary search model was constructed from YU2 core gp120 complexed to d1d2 as extracted from the previously determined ternary complex with 17b (pdb accession number, 1RZK) (Kwong et al., Structure 8:1329, 2000), with gp120 N-terminus (residues 83-86) and V4 region (residues 399-406) deleted. For X5 Fab, the structure of free X5 was used (pdb accession number, 1RHH) (Darbha et al., Biochemistry 43:1410, 2004). Cross-rotation and translation search with 15-4 Å data yielded Patterson correlation coefficients of 22.3% and 31.1% for YU2core:d1d2 and X5 Fab, respectively. The combined solution gave a Patterson correlation coefficient of 51.7%. By using the programs O (Jones et al., Acta Crystallogr. A 47:110, 1991) for model building and CNS (Brunger et al., Acta Crystallogr. D 54:905, 1998) for refinement, side-chains of the initial models were corrected, and the models were subjected to torsion angle simulated annealing with slow cooling. Iterative manual fittings were carried out in B-value sharpened maps (−75 Å²; 2Fo-Fc) to enhance visual recognition of protein side-chain definition. Refinement in CNS, however, used unsharpened data, with strong 3 geometric constraints to maintain idealized stereochemistry. Statistics summarizing the X-ray crystallographic data and refinement are shown in Table 6.
All superpositions were performed using lsqkab in CCP4 (Collaborative Computational Project, Acta Crystallogr. D50:760, 1994). Molecular surface interactions were calculated using MS (Connolly, J. Mol. Graph. 11:139, 1993). Figures were prepared using PyMOL (DeLano Scientific, San Carlos, Calif., 2002) and GRASP (Nicholls et al., Proteins Struct. Funct. Genet. 11:281, 1991).
Asn-(N-acetylglucosamine)2(mannose)3 N-linked sugar cores were modeled following procedures described previously for the HXBc2 core (Wyatt et al., Nature 393:705, 1998). Briefly, JR-FL core with V3 and the HXBc2 core with modeled glycan were superimposed. Conserved sites of N-linked glycan were transferred, and other sites were built manually, including glycans at 301 and 386. The core was fixed and the Asn and attached glycan were subjected to molecular dynamics.
Analyses were carried out using only sequences with a complete V3, limited to one sequence per individual, extracted from the Los Alamos HIV sequence database (www.hiv.lanl.gov/content/hiv-db.) for all M group sequences that had coreceptor usage specified as either CCR5 or CXCR4. The B clade subset of the M group had the most coreceptor usage information for a single clade, and so it was also analyzed separately. Alignments were made from constant to variable regions, with the β-turn (GPGR analog) of the tip forced into alignment. The Shannon entropy (Shannon, Bell System Tech. J. 27:379, 1948) was calculated for each site, treating gaps inserted to maintain alignment and distinct amino acids as characters, and statistical analysis of the variation at each site comparing R5 and X4 viruses was performed by using a Monte Carlo randomization of the two data sets (Korber et al., J. Virol. 68:7467, 1994), with a Bonferroni correction to contend with multiple tests. An entropy score is a simple measure of the information content of a data set: when considered in this context, as a measure of amino acid diversity in the column of an alignment, it has the virtue of capturing both the range and distribution of observed amino acids. Zero indicates absolute conservation, and a score of 4.4 indicates complete randomness.
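The per-site entropy calculation described above can be sketched as follows. This is a minimal illustration, assuming a base-2 logarithm (which is consistent with the stated maximum of about 4.4 for 20 amino acids plus a gap character, since log2(21) is approximately 4.39); the function names and the toy alignment are hypothetical, and the Monte Carlo randomization and Bonferroni correction are not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(column: str) -> float:
    """Shannon entropy (bits) of one alignment column.

    Each distinct character, including the gap character '-', is treated
    as its own symbol.
    """
    total = len(column)
    counts = Counter(column)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def column_entropies(aligned_sequences: list[str]) -> list[float]:
    """Entropy at every site of an alignment of equal-length sequences."""
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [shannon_entropy("".join(col)) for col in zip(*aligned_sequences)]


if __name__ == "__main__":
    # Toy alignment for illustration only; not real V3 sequence data.
    toy_alignment = ["AG-", "AC-", "AG-"]
    for site, h in enumerate(column_entropies(toy_alignment), start=1):
        print(f"site {site}: {h:.3f} bits")
```

A fully conserved column scores zero, and a column in which all 21 symbols are equally frequent scores log2(21), matching the range given in the text.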
This example describes the analysis of the structural details of a gp120 with an extended loop.
The third variable region (V3) of the HIV-1 gp120 envelope glycoprotein is immunodominant and contains features essential for coreceptor binding. Disclosed herein is the structure of the V3 loop in the context of an HIV-1 gp120 core complexed to the CD4 receptor and to the X5 antibody at 3.5 angstrom resolution. Binding of gp120 to cell-surface CD4 positions V3 so that its coreceptor-binding tip protrudes 30 angstroms from the core toward the target cell membrane. The extended nature and antibody accessibility of V3 explain its immunodominance. Snapshots of the gp120 entry mechanism have been visualized through crystal structures of unliganded and CD4-bound states (Chen et al., Nature 433:834, 2005; Kwong et al., Nature 393:648, 1998). Prior to this disclosure, an essential component of the coreceptor binding site, the third variable region (V3), had been absent from structural characterizations of the gp120 core. The structure of V3 in the context of core gp120 bound to CD4, described herein, reveals the entire coreceptor binding site. The V3 appears to act as a molecular hook, not only for snaring coreceptor but also for modulating subunit associations within the viral spike. Its extended nature is compatible with the elicitation of an immunodominant antibody response and the generation of broadly neutralizing antibodies to V3 epitopes.
The extreme glycosylation and conformational flexibility of gp120 inhibit crystallization. Variational crystallization and various technologies adapted from structural genomics were used to obtain crystals suitable for x-ray structural analysis (Kwong et al., J. Biol. Chem. 274:4115, 1999; Stevens and Wilson, Science 293:519, 2001). The gp120 core with V3 from JR-FL, when complexed to CD4 (two domain) and the antigen-binding fragment (Fab) of the X5 antibody (Koch et al., Virology 313:387, 2003), formed hexagonal crystals that diffracted to approximately 3.5 Å resolution with x-rays provided by an Advanced Photon Source undulator beam line (SER-CAT) (Table 5). The crystallized JR-FL was derived from a JR-FL variant with two point mutations, Asn301Gln and Thr388Ala. These mutations removed two N-linked glycans, and the resultant virus was more sensitive to neutralization but was otherwise functional (Koch et al., Virology 313:387, 2003). The structure was solved by molecular replacement and is shown in
The overall assembly of CD4, X5, and core gp120 resembled the previously determined individual structures of CD4 (Ryu et al., Nature 348:419, 1990; Wang et al., Nature 348:411, 1990) and of free X5 (Darbha et al., Biochemistry 43:1410, 2004) as well as the complex of core gp120 bound to CD4 (Kwong et al., Nature 393:648, 1998; Kwong et al., Structure 8:1329, 2000). For core gp120, some differences were observed in the variable loops and also at the N terminus, regions where variations in gp120 have previously been observed (Chen et al., Nature 433:834, 2005; Kwong et al., Nature 393:648, 1998; Kwong et al., Structure 8:1329, 2000; Huang et al., Structure 13:755, 2005). Structural resemblance was maintained around the base of V3, indicating that the previous truncation (Chen et al., Nature 433:834, 2005; Kwong et al., Nature 393:648, 1998; Kwong et al., Structure 8:1329, 2000; Huang et al., Structure 13:755, 2005) did not distort this region of the core. In X5, a large structural difference was observed for the third complementarity determining loop of the X5 heavy chain (CDR H3). Comparison of the refined structures of free X5 (Darbha et al., Biochemistry 43:1410, 2004) and bound X5 showed Cα movements of up to 17 Å, one of the largest induced fits observed for an antibody (
By integrating the two-site gp120 binding site on the coreceptor with the two-site coreceptor binding site on gp120, it is observed in the structure of gp120 with an extended V3 loop that the N terminus of the coreceptor reaches up and binds to the core and V3 base while the V3 tip of gp120 reaches down to interact with the second extracellular loop of the coreceptor (
virtually all of the neutralizing activity is directed at V3. The conformation of crystal and nuclear magnetic resonance structures of V3-reactive antibody-peptide complexes was examined for clues to this immunodominant response (
This example describes the “prime-boost” immunization scheme used to generate a heightened immune response in a subject.
Based on the biophysical characterization of gp120 stabilized in the CD4-bound conformation, an immunization scheme was performed whereby HXBc2 strain wild-type or cysteine-stabilized core gp120 proteins were used to prime the immune response for subsequent immunization with soluble, stabilized trimeric YU2 strain gp140-foldon molecules (Yang et al., J Virol. 76(9):4634-42, 2002). B cells primed by the stabilized cores were primed for epitopes displayed preferentially only on the stabilized HX core CD4 binding site, or to other stabilized surfaces, efficiently presented only by the cysteine-stabilized cores.
Boosting with the gp140 trimeric molecules “immuno-focuses” primed B cells on shared and conserved determinants between the two immunogens and altering strains would not boost B cells directed at HX- or YU2-specific epitopes. Thus, the only B-cells boosted selectively by the trimer would be those that could bind efficiently both the stabilized core as well as the trimer. Thus, stabilized cores can stimulate B cells that could induce the CD4-bound or the b12 conformation in the gp140 trimers.
Based upon this scheme, HIV gp120 core and trimer proteins were expressed by transient transfection of 293 cells with the relevant plasmid DNA. Soluble proteins were purified from culture supernatants by affinity chromatography and maintained in PBS, pH 7.4. Each rabbit was injected at two sites by the intramuscular route in the hind leg with 50 ug of protein emulsified at a 1:1 ratio in GSK AS01B adjuvant in a total volume of 1 ml. The rabbits were inoculated four times with emulsified HX wild-type or stabilized core proteins followed by two injections with the emulsified YU2 gp140 trimeric proteins. Inoculations were performed at approximately four-week intervals and the immune sera were collected ten days following each injection. The presence of high-titer anti-gp120 antibodies was confirmed by ELISA. The ability to neutralize viral particles derived from selected HIV strains was determined in a luciferase-based HIV entry assay. Virus was incubated with pre- or post-immune sera and the percent neutralization in the immune sera was calculated as the decrease in entry relative to virus incubated with pre-immune sera or an irrelevant BSA protein-emulsified control. The tabulated results of the immunogenicity-neutralization are shown in
This example describes the neutralization of various HIV isolates with CD4 induced triggering.
Plasmid DNA and Ad5-based first-generation (ΔE1, ΔE3) recombinant adenoviruses expressing different V loop deletions of gp140(ΔCFI) were constructed. HIV envelope genes encoding gp145(ΔCFI) (BaL) (Genbank accession No. M68893), gp145(ΔCFI) (clade C) (Genbank accession No. AF286227), gp145(ΔCFI) (CN54) (Genbank accession No. AX149771), and gp145(ΔCFI) (clade A) (Genbank accession No. U08794) were synthesized using human-preferred codons. gp145(ΔCFI)(B)(V3/C/1AB) and gp145(ΔCFI)(B)(V3/A/1AB) were made by replacing the BaL V3 loop with shortened clade C V3(1AB) and clade A V3(1AB) sequences, respectively.
Guinea pigs were intramuscularly immunized with 500 μg (in 400 μl PBS) of the gp145 version of plasmid DNA at weeks 0, 2, and 6. At week 14, the guinea pigs were boosted with 10¹¹ particles (in 400 μl PBS) of recombinant replication-defective adenovirus (rAd) expressing the corresponding gp140 version of the protein. Serum was collected at week −2 and week 16, aliquoted, and frozen at −20° C.
Single-round-of-infection HIV-1 Env pseudoviruses were prepared by cotransfecting 293T cells with an Env expression plasmid containing a full gp160 env gene and an env-deficient HIV-1 backbone vector (pSG3ΔEnv). Virus-containing culture supernatants were harvested 2 days after transfection, centrifuged and filtered through a 0.45-micron filter, and stored at −80° C. Pseudovirus neutralization was measured as a function of Tat-induced luciferase reporter gene expression after a single round of infection in TZM-bl cells. TZM-bl cells express CD4, CXCR4 and CCR5 and contain an integrated reporter gene for firefly luciferase under the control of an HIV-1 LTR. The level of viral infection was quantified by measurement of relative luciferase units (RLU), which are directly proportional to the amount of virus input. Briefly, 40 ul of virus was incubated for 30 minutes at 37° C. with serial dilutions of test serum samples (10 ul) in duplicate wells of a 96-well flat bottom culture plate. The final serum dilution was defined at the point of incubation with virus supernatant. 10,000 TZM-bl cells were then added to each well in a total volume of 20 ul and plates were incubated overnight at 37° C. in a 5% CO2 incubator. One set of eight wells received mock antibody followed by virus and cells (control wells for virus entry) and a set of eight wells received cells with mock virus (to control for luciferase background). Viral input was set at a multiplicity of infection (moi) of approximately 0.1, which generally results in 100,000 to 400,000 RLU. After overnight incubation, 150 ul of fresh medium was added to each well and incubated for 24 hours at 37° C. in a 5% CO2 incubator. To determine RLU, cell culture medium was aspirated from wells followed by addition of 50 ul of cell lysis buffer (Promega, Madison, Wis.).
30 ul of cell lysate was transferred to wells of a black Optiplate (PerkinElmer) for measurement of luminescence using a Perkin-Elmer Victor-light luminometer that injects 50 ul of luciferase substrate reagent to each well just prior to reading RLU. To test for sCD4 triggering, two-domain sCD4 was added to the virus just prior to the addition of sera.
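The conversion of RLU readings into a percent neutralization value can be sketched as below. This is a minimal illustration under stated assumptions: background is taken as the mean of the cells-only wells, the entry control as the mean of the virus-plus-cells wells, and neutralization as the fractional reduction in background-subtracted signal. The exact formula used in the disclosed assay is not given in the text, and the function name and all numbers are hypothetical.

```python
from statistics import mean


def percent_neutralization(sample_rlu, virus_control_rlu, background_rlu):
    """Percent reduction in background-subtracted luciferase signal.

    sample_rlu        -- RLU from replicate wells containing serum + virus + cells
    virus_control_rlu -- RLU from wells containing virus + cells only
    background_rlu    -- RLU from wells containing cells only (mock virus)
    """
    background = mean(background_rlu)
    control_signal = mean(virus_control_rlu) - background
    if control_signal <= 0:
        raise ValueError("virus control signal does not exceed background")
    sample_signal = mean(sample_rlu) - background
    return 100.0 * (1.0 - sample_signal / control_signal)


if __name__ == "__main__":
    # Illustrative readings only; not data from the disclosure.
    duplicate_wells = [50_500, 49_500]
    virus_only = [201_000] * 8
    cells_only = [1_000] * 8
    value = percent_neutralization(duplicate_wells, virus_only, cells_only)
    print(f"{value:.1f}% neutralization")
```

Under this convention, a sample whose signal equals the virus control gives 0%, and a sample at background gives 100%; values below 0% would indicate enhanced entry relative to the control.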
This example describes the selection of immunogenic fragments of stabilized gp120.
A nucleic acid molecule encoding a stabilized gp120 fragment is expressed in a host using standard techniques (see above; see Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, N.Y.: 1989). Preferably, the gp120 fragment is expressed such that it can be isolated or purified in sufficient quantity. The stabilized gp120 fragments that are expressed are analyzed by various techniques known in the art, such as immunoblot and ELISA, and for binding to CD4 and mAbs directed to the CD4 binding site, for example the b12 antibody.
To determine the antigenic potential of stabilized gp120 fragments, subjects such as mice, rabbits or other suitable subjects are immunized with stabilized gp120 fragments. Sera from such immunized subjects are tested for antibody activity, for example by ELISA with the expressed polypeptide. They are also tested in a CD4 binding assay, for example by qualitative Biacore, and for the binding of neutralizing antibodies, for example by using the b12 antibody. Thus, antigenic fragments of stabilized forms are selected to achieve broadly reactive neutralizing antibody responses.
This example describes the strategies to mask portions of a stabilized gp120 polypeptide from non-neutralizing antibodies.
The polypeptide “new 9c” as set forth as SEQ ID NO: 1 includes residues at the base of the V3 loop, and restores recognition of the core by the CD4-induced antibodies, such as 17b. Individual and combination glycan mutations were designed in the context of the stabilized gp120 polypeptides disclosed herein (for example, such as set forth in SEQ ID NO: 2 or encoded by SEQ ID NO: 4-18) to prevent the elicitation of non-neutralizing antibodies. Using site-directed mutagenesis, specific Asn and Ser/Thr residues are incorporated into the 8b core. The Asn-X-Ser/Thr residues mediate the attachment of glycans to the designated asparagine residues by mammalian cell glycosylating enzymes in the endoplasmic reticulum. This scheme is used to mask the immunogenic but non-neutralizing surfaces present in gp120.
Typically, wild-type gp120 cores elicit antibodies in rabbits that bind more efficiently to the core proteins than to full length gp120 glycoproteins. It is likely that the cores, via their truncated loops and N- and C-termini, elicit antibodies to surfaces that are not exposed in monomeric gp120.
As another aspect of an overall strategy to optimize the stabilized core priming of a trimer boost, glycans are designed at selected densities on the stabilized core to dampen or eliminate unwanted core-specific responses based upon the 8b core-b12 structure disclosed herein. The optimized proteins are expressed, purified, analyzed and tested for immunogenicity by themselves or in sequential prime-boost with the YU2 gp140 trimers.
To mask the surface recognized by 17b and other CD4-induced antibodies the following mutations were designed:
To mask surfaces other than the CD4 binding site, which includes the b12 epitope region, the following N-glycan addition sites were designed:
Mutation 1 and Mutation 2 correspond to the N-glycosylation consensus sequence NxT/S, where x is any residue except proline. T is better than S for glycosylation. Blanks indicate positions where no mutations are necessary. These glycosylated peptides are used to induce an immune response in a subject.
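The NxT/S consensus described above can be checked against a candidate sequence with a short scan such as the following. This is a minimal sketch: the function name and the toy peptide are hypothetical, positions are reported relative to the start of the supplied string rather than in any gp120 reference numbering, and the presence of a sequon indicates only a potential glycosylation site, not that a glycan is actually attached.

```python
import re

# N, then any residue except proline, then S or T. The lookahead lets
# overlapping sequons (for example in "NNST") each be reported.
SEQUON = re.compile(r"N(?=([^P][ST]))")


def find_sequons(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position of Asn, three-residue sequon) for each
    potential N-glycosylation site in a one-letter amino acid sequence."""
    seq = sequence.upper()
    return [(m.start() + 1, "N" + m.group(1)) for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    toy_peptide = "ANGTNPSNNSA"  # illustrative string, not a gp120 sequence
    for position, sequon in find_sequons(toy_peptide):
        preferred = "Thr" if sequon.endswith("T") else "Ser"
        print(f"Asn {position}: {sequon} ({preferred} at the third position)")
```

In the toy peptide, the Asn followed by proline is skipped, consistent with the exclusion of proline at the x position.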
In view of the many possible embodiments to which the principles of the disclosed invention may be applied, it should be recognized that the illustrated embodiments are only preferred examples of the invention and should not be taken as limiting the scope of the invention. Rather, the scope of the invention is defined by the following claims. We therefore claim as our invention all that comes within the scope and spirit of these claims.
This application is a continuation application of U.S. patent application Ser. No. 13/232,775, filed Sep. 14, 2011, which is a divisional application of U.S. patent application Ser. No. 12/065,894, filed Mar. 5, 2008, now U.S. Pat. No. 8,044,185, which is the U.S. §371 National Stage of International Application No. PCT/US2006/034681, filed Sep. 6, 2006, published in English under PCT Article 21(2), which in turn claims the benefit of U.S. Provisional Application No. 60/713,725, filed Sep. 6, 2005; U.S. Provisional Application No. 60/729,878, filed Oct. 24, 2005; U.S. Provisional Application No. 60/731,627, filed Oct. 28, 2005; and U.S. Provisional Application No. 60/832,458, filed Jul. 20, 2006. All of the applications are incorporated by reference herein in their entirety.
Number | Date | Country | |
---|---|---|---|
60713725 | Sep 2005 | US | |
60729878 | Oct 2005 | US | |
60731627 | Oct 2005 | US | |
60832458 | Jul 2006 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 12065894 | Mar 2008 | US |
Child | 13232775 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 13232775 | Sep 2011 | US |
Child | 13585700 | US |