All patents, patent applications and publications cited herein are hereby incorporated by reference in their entirety. The disclosures of these publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art as known to those skilled therein as of the date of the invention described herein.
The present invention relates, in general, to a composition suitable for use in inducing anti-HIV-1 antibodies, and, in particular, to immunogenic compositions comprising envelope proteins and nucleic acids to induce cross-reactive neutralizing antibodies and increase their breadth of coverage. The invention also relates to methods of inducing such broadly neutralizing anti-HIV-1 antibodies using such compositions.
The development of a safe and effective HIV-1 vaccine is one of the highest priorities of the scientific community working on the HIV-1 epidemic. While anti-retroviral treatment (ART) has dramatically prolonged the lives of HIV-1 infected patients, ART is not routinely available in developing countries.
In certain embodiments, the invention provides compositions and methods for induction of an immune response, for example induction of cross-reactive (broadly) neutralizing antibodies (Abs). In certain embodiments, the methods use compositions comprising “swarms” of sequentially evolved envelope viruses that occur in the setting of bnAb generation in vivo in HIV-1 infection.
In certain aspects the invention provides compositions comprising a selection of HIV-1 envelopes and/or nucleic acids encoding these envelopes, for example but not limited to the Selections described herein. Without limitation, these selected combinations comprise envelopes which provide representation of the genetic (sequence) and antigenic diversity of the HIV-1 envelope variants which lead to the induction and maturation of the CH103 and CH235 antibody lineages. In certain embodiments, these compositions are used in immunization methods as a prime and/or boost as described in the Selections herein.
In one aspect the invention provides selections of envelopes from individual CH505, which selections can be used in compositions for immunizations to induce lineages of broad neutralizing antibodies. In certain embodiments, there is some variance in the immunization regimen; in some embodiments, the selection of HIV-1 envelopes may be grouped in various combinations of primes and boosts, either as nucleic acids, proteins, or combinations thereof. In certain embodiments the compositions are pharmaceutical compositions which are immunogenic. In certain embodiments, the compositions comprise amounts of envelopes which are therapeutic and/or immunogenic.
In one aspect the invention provides a composition comprising any one of the envelopes described herein, or any combination thereof (selections in Examples). In some embodiments, CH505 transmitted/founder (T/F) Env is administered first as a prime, followed by a mixture of a next group of Envs, followed by a mixture of a next group(s) of Envs, followed by a mixture of the final Envs. In some embodiments, grouping of the envelopes is based on their binding affinity for the antibodies expected to be induced. In some embodiments, grouping of the envelopes is based on chronological evolution of envelope viruses that occurs in the setting of bnAb generation in vivo in HIV-1 infection. In some embodiments Loop D mutants could be included in either prime and/or boost. In some embodiments, the composition comprises an adjuvant. In some embodiments, the composition and methods comprise use of agents for transient modulation of the host immune response.
In one aspect the invention provides a composition comprising an HIV-1 envelope polypeptide or a nucleic acid encoding an HIV-1 envelope selected from the group consisting of M5, M6 and M11, or any combination thereof, wherein the HIV-1 envelope is a loop D mutant envelope.
In another aspect the invention provides a method of inducing an immune response in a subject comprising administering a composition comprising HIV-1 envelope M11, M6 and/or M5 as a prime in an amount sufficient to induce an immune response, wherein the envelope is administered as a polypeptide or a nucleic acid encoding the same. In another aspect the invention provides a method of inducing an immune response in a subject comprising administering a composition comprising any one of the HIV-1 envelopes in Table 1 or any combination thereof as a prime in an amount sufficient to induce an immune response, wherein the envelope is administered as a polypeptide or a nucleic acid encoding the same.
In certain embodiments the methods comprise administering a composition comprising any one of the HIV-1 envelope polypeptides selected from the group consisting of w000.TF, w020.15, w030.13, w020.25, w004.54, w020.11, w078.15, w053.22, w136.B23, w053.31, w136.B2, w100.A13, w100.B4, w160.T4, w030.21, w053.15, w078.17, w136.B10, w053.29, w078.33, w136.B5, w030.36, w030.17, w078.9, w030.20, w100.B2, w078.6, or any combination thereof as a prime.
In certain embodiments the methods comprise administering a composition comprising any one of the nucleic acids encoding an HIV-1 envelope selected from the group consisting of w000.TF, w020.15, w030.13, w020.25, w004.54, w020.11, w078.15, w053.22, w136.B23, w053.31, w136.B2, w100.A13, w100.B4, w160.T4, w030.21, w053.15, w078.17, w136.B10, w053.29, w078.33, w136.B5, w030.36, w030.17, w078.9, w030.20, w100.B2, w078.6, or any combination thereof as a prime.
In certain embodiments the methods comprise administering a composition comprising any one of HIV-1 envelopes in Table 1 or any combination thereof as a boost, wherein the envelope is administered as a polypeptide or a nucleic acid encoding the same.
In certain embodiments the methods comprise administering a composition comprising any one of the HIV-1 envelope polypeptides selected from the group consisting of w000.TF, w020.15, w030.13, w020.25, w004.54, w020.11, w078.15, w053.22, w136.B23, w053.31, w136.B2, w100.A13, w100.B4, w160.T4, w030.21, w053.15, w078.17, w136.B10, w053.29, w078.33, w136.B5, w030.36, w030.17, w078.9, w030.20, w100.B2, w078.6, or any combination thereof as a boost.
In certain embodiments the methods comprise administering a composition comprising any one of the nucleic acids encoding an HIV-1 envelope selected from the group consisting of w000.TF, w020.15, w030.13, w020.25, w004.54, w020.11, w078.15, w053.22, w136.B23, w053.31, w136.B2, w100.A13, w100.B4, w160.T4, w030.21, w053.15, w078.17, w136.B10, w053.29, w078.33, w136.B5, w030.36, w030.17, w078.9, w030.20, w100.B2, w078.6, or any combination thereof as a boost.
In certain embodiments, the compositions contemplate nucleic acid, as DNA and/or RNA, or protein immunogens either alone or in any combination. In certain embodiments, the methods contemplate genetic, as DNA and/or RNA, immunization either alone or in combination with envelope protein(s).
In certain embodiments the nucleic acid encoding an envelope is operably linked to a promoter and inserted in an expression vector. In certain aspects the compositions comprise a suitable carrier. In certain aspects the compositions comprise a suitable adjuvant.
In certain embodiments the induced immune response includes induction of antibodies, including but not limited to autologous and/or cross-reactive (broadly) neutralizing antibodies against HIV-1 envelope. Various assays that analyze whether an immunogenic composition induces an immune response, and the type of antibodies induced are known in the art and are also described herein.
In certain aspects the invention provides an expression vector comprising any of the nucleic acid sequences of the invention, wherein the nucleic acid is operably linked to a promoter. In certain aspects the invention provides an expression vector comprising a nucleic acid sequence encoding any of the polypeptides of the invention, wherein the nucleic acid is operably linked to a promoter. In certain embodiments, the nucleic acids are codon optimized for expression in a mammalian cell, in vivo or in vitro. In certain aspects the invention provides nucleic acids comprising any one of the nucleic acid sequences of the invention. In certain aspects the invention provides nucleic acids consisting essentially of any one of the nucleic acid sequences of the invention. In certain aspects the invention provides nucleic acids consisting of any one of the nucleic acid sequences of the invention. In certain embodiments the nucleic acid of the invention is operably linked to a promoter and is inserted in an expression vector. In certain aspects the invention provides an immunogenic composition comprising the expression vector.
In certain aspects the invention provides a composition comprising at least one of the nucleic acid sequences of the invention. In certain aspects the invention provides a composition comprising any one of the nucleic acid sequences of the invention. In certain aspects the invention provides a composition comprising at least one nucleic acid sequence encoding any one of the polypeptides of the invention.
In certain aspects the invention provides a composition comprising at least one nucleic acid encoding HIV-1 envelope from
In certain aspects, the invention provides a composition comprising any one or at least one nucleic acid encoding HIV-1 envelope selected from the group consisting of w000.TF, w004.31, w004.54, w007.8, w007.21, w007.25, w007.34, w008.20, w009.19, w010.7, w020.15, w020.11, w020.24, w020.25, w022.6, w022.5, w022.9, w022.22, w030.20, w030.17, w030.21, w030.36, w030.26, w030.13, w030.32, w053.15, w053.29, w053.22, w053.8, w053.31, w053.9, w078.6, w078.36, w078.9, w078.26, w078.29, w078.30, w078.33, w078.17, w078.15, w078.27, w100.T3, w100.B10, w100.B2, w100.B4, w100.A11, w100.A13, w136.B10, w136.B5, w136.B2, w136.B23, w160.C1, w160.T3, w160.T4, or any combination thereof.
In certain aspects, the invention provides a composition comprising any one or at least one of the nucleic acids encoding HIV-1 envelope selected from the envelopes in
In certain aspects, the invention provides a composition comprising any one of, or at least one, HIV-1 envelope polypeptide selected from the group consisting of w000.TF, w004.31, w004.54, w007.8, w007.21, w007.25, w007.34, w008.20, w009.19, w010.7, w020.15, w020.11, w020.24, w020.25, w022.6, w022.5, w022.9, w022.22, w030.20, w030.17, w030.21, w030.36, w030.26, w030.13, w030.32, w053.15, w053.29, w053.22, w053.8, w053.31, w053.9, w078.6, w078.36, w078.9, w078.26, w078.29, w078.30, w078.33, w078.17, w078.15, w078.27, w100.T3, w100.B10, w100.B2, w100.B4, w100.A11, w100.A13, w136.B10, w136.B5, w136.B2, w136.B23, w160.C1, w160.T3, w160.T4, or any combination thereof.
In certain aspects, the invention provides a composition comprising any one of or at least one an HIV-1 envelope polypeptide from the envelopes in
In certain embodiments, the compositions and methods employ an HIV-1 envelope as polypeptide instead of a nucleic acid sequence encoding the HIV-1 envelope. In certain embodiments, the compositions and methods employ an HIV-1 envelope as polypeptide, a nucleic acid sequence encoding the HIV-1 envelope, or a combination thereof.
The envelope used in the compositions and methods of the invention can be a gp160, gp150, gp145, gp140, gp120, gp41, N-terminal deletion variants as described herein, cleavage resistant variants as described herein, or codon optimized sequences thereof. In certain embodiments, the envelope used in the compositions and methods of the invention is gp160. In certain embodiments, the envelope used in the compositions and methods of the invention is gp150. In certain embodiments, the envelope used in the compositions and methods of the invention is gp145. In certain embodiments, the envelope used in the compositions and methods of the invention is gp120. In certain embodiments, the envelope used in the compositions and methods of the invention is gp41. In certain embodiments, the envelope used in the compositions and methods of the invention is a gp120 variant. In certain embodiments, the envelope used in the compositions and methods of the invention is gp120D8 variant.
The polypeptide contemplated by the invention can be a polypeptide comprising any one of the polypeptides described herein. The polypeptide contemplated by the invention can be a polypeptide consisting essentially of any one of the polypeptides described herein. The polypeptide contemplated by the invention can be a polypeptide consisting of any one of the polypeptides described herein. In certain embodiments, the polypeptide is recombinantly produced. In certain embodiments, the polypeptides and nucleic acids of the invention are suitable for use as an immunogen, for example to be administered in a human subject.
In certain embodiments the envelope is any of the forms of HIV-1 envelope. In certain embodiments the envelope is gp120, gp140, or gp145 (i.e. with a transmembrane domain). In certain embodiments, the envelope, with a transmembrane domain and a cytoplasmic tail, is in a liposome. In certain embodiments, the nucleic acid comprises a nucleic acid sequence which encodes a gp120, gp140, gp145, gp150 or gp160.
In certain embodiments, where the nucleic acids are operably linked to a promoter and inserted in a vector, the vector is any suitable vector. Non-limiting examples include VSV, replicating rAdenovirus type 4, MVA, Chimp adenovirus vectors, pox vectors, and the like. In certain embodiments, the nucleic acids are administered in NanoTaxi block polymer nanospheres. In certain embodiments, the composition and methods comprise an adjuvant. Non-limiting examples include AS01 B, AS01 E, gla/SE, alum, Poly I poly C, TLR agonists, TLR7/8 and 9 agonists, or a combination of TLR7/8 and TLR9 agonists (see Moody et al. (2014) J. Virol. March 2014 vol. 88 no. 6 3329-3339), or any other adjuvant. Non-limiting examples of TLR7/8 agonists include TLR7/8 ligands, Gardiquimod, Imiquimod and R848 (resiquimod). A non-limiting embodiment of a combination of TLR7/8 and TLR9 agonists comprises R848 and oCpG in STS (see Moody et al. (2014) J. Virol. March 2014 vol. 88 no. 6 3329-3339).
In certain aspects the invention provides a method for selecting a swarm of HIV-1 envelopes, among a population of HIV-1 envelopes isolated over a period of time from an individual who develops bnAbs against HIV-1 wherein the swarm mimics the envelope diversity in a person who made a good antibody response during natural infection, by representing the relevant HIV diversity, capturing evolution of representative sites from within subject diverse populations.
To conform to the requirements for PCT patent applications, many of the figures presented herein are black and white representations of images originally created in color. In the descriptions and examples below, the colored images are described in terms of their appearance in black and white. Different colors are described by different shades of white to grey, with an attempt to match the descriptions of the colors as closely as possible to the figures. The original color versions of some of the Figures can be viewed in Liao, et al., Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus. Nature. 2013; 496 (7446): 469-76 (including the accompanying Supplementary Information) and Haynes et al., B-cell-lineage immunogen design in vaccine development with HIV-1 as a case study. Nat. Biotechnol 2012; 30: 423-33 (including the accompanying Supplementary Information). For the purposes of the PCT, contents of Liao, et al. (2013), including the accompanying “Supplementary Information,” and the contents of Haynes et al. (2012), including the accompanying “Supplementary Information,” are each herein incorporated by reference.
The development of a safe, highly efficacious prophylactic HIV-1 vaccine is of paramount importance for the control and prevention of HIV-1 infection. A major goal of HIV-1 vaccine development is the induction of broadly neutralizing antibodies (bnAbs) (Immunol. Rev. 254: 225-244, 2013). BnAbs are protective in rhesus macaques against SHIV challenge, but as yet, are not induced by current vaccines.
For the past 25 years, the HIV vaccine development field has used single or prime boost heterologous Envs as immunogens, but to date has not found a regimen to induce high levels of bnAbs.
Recently, a new paradigm for design of strategies for induction of broadly neutralizing antibodies was introduced, that of B cell lineage immunogen design (Nature Biotech. 30: 423, 2012), in which the induction of bnAb lineages is recreated. The power of mapping the co-evolution of bnAbs and founder virus for elucidating the Env evolution pathways that lead to bnAb induction was recently demonstrated (Nature 496: 469, 2013). From this type of work has come the hypothesis that bnAb induction will require a selection of antigens to recreate the “swarms” of sequentially evolved viruses that occur in the setting of bnAb generation in vivo in HIV infection (Nature 496: 469, 2013).
A critical question is why the CH505 immunogens are better than other immunogens. The rationale comes from three recent observations. First, a series of immunizations with single putatively “optimized” or “native” trimers have not induced bnAbs. Second, all the chronically infected individuals who do develop bnAbs develop them in plasma after ~2 years. When these individuals have been studied soon after transmission, they do not make bnAbs immediately. Third, now that an individual's virus and bnAb co-evolution has been mapped from the time of transmission to the development of bnAbs, the specific Envs that lead to bnAb development have been identified, thus taking the guesswork out of envelope choice.
Two other considerations are important. The first is that for the CH103 bnAb CD4 binding site lineage, the VH4-59 and Vλ3-1 genes are common, as are the VDJ and VJ recombinations of the lineage (Liao, Nature 496: 469, 2013). In addition, the bnAb sites are so unusual that the same VH and VL usage has been found to recur in multiple individuals. Thus, it can be expected that the CH505 Envs induce CD4 binding site antibodies in many different individuals.
Finally, regarding the choice of gp120 vs. gp160, for the genetic immunization, gp160 would not normally even be considered for use. However, in acute infection, gp41 non-neutralizing antibodies are dominant and overwhelm gp120 responses (Tomaras, G et al. J. Virol. 82: 12449, 2008; Liao, H X et al. JEM 208: 2237, 2011). Recently, it has been found that the HVTN 505 DNA prime, rAd5 vaccine trial that utilized gp140 as an immunogen, also had the dominant response of non-neutralizing gp41 antibodies. Thus, the early-on use of gp160 vs gp120 for gp41 dominance will be explored.
In certain aspects the invention provides a strategy for induction of bnAbs, which is to select and develop immunogens and combinations designed to recreate the antigenic evolution of Envs that occurs when bnAbs do develop in the context of infection.
That broadly neutralizing antibodies (bnAbs) occur in nearly all sera from chronically infected HIV-1 subjects suggests anyone can develop some bnAb response if exposed to immunogens via vaccination. Working back from mature bnAbs through intermediates enabled understanding their development from the unmutated ancestor, and showed that antigenic diversity preceded the development of population breadth. See Liao et al. (2013) Nature 496, 469-476. In this study, an individual “CH505” was followed from HIV-1 transmission to development of broadly neutralizing antibodies. This individual developed antibodies targeted to the CD4 binding site on gp120. In this individual the virus was sequenced over time, and a broadly neutralizing antibody clonal lineage (“CH103”) was isolated by antigen-specific B cell sorts and memory B cell culture, and amplified by VH/VL next generation pyrosequencing. The CH103 lineage began by binding the T/F virus; autologous neutralization evolved through somatic mutation and affinity maturation; escape from neutralization drove rapid (clearly by 20 weeks) accumulation of variation in the epitope; and antibody breadth followed this viral diversification.
Further analysis of envelopes and antibodies from the CH505 individual indicated that a non-CH103 lineage (DH235) participates in driving CH103 bnAb induction. See Gao et al. (2014) Cell 158:481-491. For example, V1 loop, V5 loop and CD4 binding site loop mutations escape from CH103 and are driven by the CH103 lineage. Loop D mutations enhanced neutralization by the CH103 lineage and are driven by another lineage. Transmitted/founder Env, or another early envelope, for example W004.26, triggers naïve B cells with the CH103 Unmutated Common Ancestor (UCA), which develop into intermediate antibodies. Transmitted/founder Env, or another early envelope, for example W004.26, also triggers non-CH103 autologous neutralizing Abs that drive loop D mutations in Env that have enhanced binding to intermediate and mature CH103 antibodies and drive the remainder of the lineage. In certain embodiments, the inventive compositions and methods also comprise loop D mutant envelopes (e.g. but not limited to M10, M11, M19, M20, M21, M5, M6, M7, M8, M9) as immunogens. In certain embodiments, the loop D mutants are included in an inventive composition used to induce an immune response in a subject. In certain embodiments, the loop D mutants are included in a composition used as a prime.
The invention provides various methods to choose a subset of viral variants, including but not limited to envelopes, to investigate the role of antigenic diversity in serial samples. In other aspects, the invention provides compositions comprising viral variants, for example but not limited to envelopes, selected based on various criteria as described herein to be used as immunogens. In some embodiments, the immunogens are selected based on the envelope binding to the UCA, and/or intermediate antibodies. In some embodiments the immunogens are selected based on their chronological appearance and/or sequence diversity during infection.
In other aspects, the invention provides immunization strategies using the selections of immunogens to induce cross-reactive neutralizing antibodies. In certain aspects, the immunization strategies as described herein are referred to as “swarm” immunizations to reflect that multiple envelopes are used to induce immune responses. The multiple envelopes in a swarm could be combined in various immunization protocols of priming and boosting.
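As a purely illustrative sketch of how a chronologically ordered swarm might be arranged into a prime and successive boosts, the following assumes that the numeric part of an envelope name (for example the 020 in w020.15) denotes the sampling week, and that later envelopes are simply split into equal-sized consecutive groups. Both are assumptions made for the example, the function names are hypothetical, and the resulting groups are not the selections of the invention.

```python
import re


def sampling_week(name):
    """Parse the week from a name such as 'w020.15' (assumed naming convention)."""
    match = re.fullmatch(r"w(\d+)\.(\w+)", name)
    if match is None:
        raise ValueError(f"unrecognized envelope name: {name!r}")
    return int(match.group(1))


def chronological_schedule(names, n_boosts):
    """Split envelopes into a week-0 prime and n_boosts consecutive boost groups.

    Hypothetical helper: the equal-sized chunking is an illustrative choice only.
    """
    if n_boosts < 1:
        raise ValueError("n_boosts must be at least 1")
    ordered = sorted(names, key=lambda n: (sampling_week(n), n))
    prime = [n for n in ordered if sampling_week(n) == 0]
    later = [n for n in ordered if sampling_week(n) > 0]
    if not later:
        return {"prime": prime, "boosts": []}
    size = -(-len(later) // n_boosts)  # ceiling division
    boosts = [later[i:i + size] for i in range(0, len(later), size)]
    return {"prime": prime, "boosts": boosts}
```

Other groupings described herein, such as grouping by binding affinity for the antibodies expected to be induced, would need measured binding data and are not captured by this sketch.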
In certain embodiments the invention provides that sites losing the ancestral, transmitted-founder (T/F) state are most likely under positive selection. From acute, homogenous infections with 3-5 years of follow-up, identified herein are sites of interest among plasma single genome analysis (SGA) Envs by comparing the proportion of sequences per time-point in the T/F state with a threshold, typically 5%. Sites with T/F frequencies below threshold are putative escapes. Clones with representative escape mutations were selected. Where more information was available, such as tree-corrected neutralization signatures and antibody contacts from co-crystal structure, additional sites of interest were considered.
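The threshold comparison described above can be sketched as follows. This is a minimal illustration, assuming that the SGA Env sequences are already aligned to the T/F sequence and grouped by time-point; the function name, the input layout and the default threshold are hypothetical choices for the example, and the additional neutralization-signature and structural criteria are not modelled.

```python
def putative_escape_sites(tf_seq, samples, threshold=0.05):
    """Return 0-based alignment positions where the T/F state is rare.

    tf_seq:    aligned transmitted/founder sequence (string)
    samples:   dict mapping a time-point label to a list of aligned sequences
    threshold: a site is flagged if, in at least one time-point, the fraction
               of sequences still in the T/F state is below this value
    """
    sites = set()
    for seqs in samples.values():
        if not seqs:
            continue  # no sequences sampled at this time-point
        for pos, tf_residue in enumerate(tf_seq):
            in_tf_state = sum(1 for s in seqs if s[pos] == tf_residue)
            if in_tf_state / len(seqs) < threshold:
                sites.add(pos)
    return sorted(sites)
```

Clones carrying representative mutations at the returned positions could then be considered for selection.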
Co-evolution of a broadly neutralizing HIV-1 antibody (CH103) and founder virus was previously reported in an African donor (CH505). See Liao et al. (2013) Nature 496, 469-476. In CH505, which had an early antibody that bound autologous T/F virus, 398 envs from 14 time-points over three years (median per sample: 25, range: 18-53) were studied. Thirty-six sites with T/F frequencies under 20% in any sample were found. Neutralization and structure data identified 28 and 22 interesting sites, respectively. Together, six gp41 and 53 gp120 sites were identified, plus six V1 or V5 insertions not in HXB2.
The invention provides an approach to select reagents for neutralization assays and subsequently investigate affinity maturation, autologous neutralization, and the transition to heterologous neutralization and breadth. Given the sustained coevolution of immunity and escape, this antigen selection based on antibody and antigen coevolution has specific implications for selection of immunogens for vaccine design.
In one embodiment, 54 envelopes were selected that represent the selected sites. In another embodiment, 27 envelopes were selected that represent the selected sites. These sets of envelopes represent antigenic diversity by deliberate inclusion of polymorphisms that result from immune selection by neutralizing antibodies, and had a lower clustering coefficient and greater diversity in selected sites than sets sampled randomly. These selections represent various levels of antigenic diversity in the HIV-1 envelope. In some embodiments the selections are based on the genetic diversity of longitudinally sampled SGA envelopes. In some embodiments the selections are based on antigenic and/or neutralization diversity. In some embodiments the selections are based on the genetic diversity of longitudinally sampled SGA envelopes, and correlated with other factors such as antigenic/neutralization diversity and antibody coevolution.
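One way a small set of envelopes could be chosen to represent the polymorphisms at the selected sites is a greedy covering heuristic, sketched below. The heuristic is an assumption made for illustration and is not stated herein to be the procedure by which the 54- and 27-envelope sets were chosen; the function name and inputs are hypothetical, and the sequences are assumed to be aligned to the T/F sequence.

```python
def select_representative_envelopes(tf_seq, envelopes, sites, max_envs):
    """Greedily pick envelopes that add the most not-yet-represented variants.

    envelopes: dict mapping envelope name to aligned sequence
    sites:     0-based alignment positions of interest
    A "variant" is a (position, residue) pair that differs from the T/F state.
    """
    variants = {
        name: {(p, seq[p]) for p in sites if seq[p] != tf_seq[p]}
        for name, seq in envelopes.items()
    }
    chosen, covered = [], set()
    while len(chosen) < max_envs:
        candidates = [n for n in sorted(variants) if n not in chosen]
        if not candidates:
            break
        # max() keeps the first of equally good candidates, so ties break by name
        best = max(candidates, key=lambda n: len(variants[n] - covered))
        if not variants[best] - covered:
            break  # nothing new left to represent
        chosen.append(best)
        covered |= variants[best]
    return chosen
```

A set chosen this way favors diversity at the selected sites over random sampling, but it does not by itself account for neutralization data or antibody coevolution.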
Sequences/Clones
Described herein are nucleic acid and amino acid sequences of HIV-1 envelopes. In certain embodiments, the described HIV-1 envelope sequences are gp160s. In certain embodiments, the described HIV-1 envelope sequences are gp120s. Other sequences are readily derived from the nucleic acid and amino acid gp160 sequences, for example but not limited to gp140s, both cleaved and uncleaved; gp140 Envs with the deletion of the cleavage (C) site, fusion (F) and immunodominant (I) region in gp41, named gp140ΔCFI; gp140 Envs with the deletion of only the cleavage (C) site and fusion (F) domain, named gp140ΔCF; gp140 Envs with the deletion of only the cleavage (C) site, named gp140ΔC (see e.g. Liao et al. Virology 2006, 353, 268-282); gp145s; gp150s; and gp41s. In certain embodiments the nucleic acid sequences are codon optimized for optimal expression in a host cell, for example a mammalian cell, a rBCG cell or any other suitable expression system.
In certain embodiments, the envelope design in accordance with the present invention involves deletion of residues (e.g., 5-11, 5, 6, 7, 8, 9, 10, or 11 amino acids) at the N-terminus. For delta N-terminal design, amino acid residues ranging from 4 residues or even fewer to 14 residues or even more are deleted. These residues are between the maturation (signal peptide, usually ending with CX, X can be any amino acid) and “VPVXXXX . . . ”. In case of CH505 T/F Env as an example, 8 amino acids (italicized and underlined in the below sequence) were deleted: MRVMGIQRNYPQWWIWSMLGFWMLMICNGMWVTVYYGVPVWKEAKTTLFCASDA KAYEKEVHNVWATHACVPTDPNPQE . . . (rest of envelope sequence is indicated as “ . . . ”). In other embodiments, the delta N-design described for CH505 T/F envelope can be used to make delta N-designs of other CH505 envelopes. In certain embodiments, the invention relates generally to an immunogen, gp160, gp120 or gp140, without an N-terminal Herpes Simplex gD tag substituted for amino acids of the N-terminus of gp120, with an HIV leader sequence (or other leader sequence), and without the original about 4 to about 25, for example 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25 amino acids of the N-terminus of the envelope (e.g. gp120). See WO2013/006688, e.g. at pages 10-12, the contents of which publication is hereby incorporated by reference in its entirety.
The general strategy of deletion of N-terminal amino acids of envelopes results in proteins, for example gp120s, expressed in mammalian cells that are primarily monomeric, as opposed to dimeric, and, therefore, solves the production and scalability problem of commercial gp120 Env vaccine production. In other embodiments, the amino acid deletions at the N-terminus result in increased immunogenicity of the envelopes.
In certain embodiments, the invention provides envelope sequences, amino acid sequences and the corresponding nucleic acids, in which the V3 loop is substituted with the following V3 loop sequence TRPNNNTRKSIRIGPGQTFYATGDIIGNIRQAH (SEQ ID NO: 150). This substitution of the V3 loop reduces product cleavage and improves protein yield during recombinant protein production in CHO cells.
In certain embodiments, the CH505 envelopes will have certain amino acids added to enhance binding of various broad neutralizing antibodies. Such modifications could include, but are not limited to, mutations at W680G or modification of glycan sites for enhanced neutralization.
In certain aspects, the invention provides compositions and methods which use a selection of sequential CH505 Envs, as gp120s, gp140s cleaved and uncleaved, gp145s, gp150s and gp160s, as proteins, DNAs, RNAs, or any combination thereof, administered as primes and boosts to elicit an immune response. Sequential CH505 Envs as proteins would be co-administered with nucleic acid vectors containing Envs to amplify antibody induction. In certain embodiments, the compositions and methods include any immunogenic HIV-1 sequences to give the best coverage for T cell help and cytotoxic T cell induction. In certain embodiments, the compositions and methods include mosaic and/or consensus HIV-1 genes to give the best coverage for T cell help and cytotoxic T cell induction. In certain embodiments, the compositions and methods include mosaic group M and/or consensus genes to give the best coverage for T cell help and cytotoxic T cell induction. In some embodiments, the mosaic genes are any suitable gene from the HIV-1 genome. In some embodiments, the mosaic genes are Env genes, Gag genes, Pol genes, Nef genes, or any combination thereof. See e.g. U.S. Pat. No. 7,951,377. In some embodiments the mosaic genes are bivalent mosaics. In some embodiments the mosaic genes are trivalent. In some embodiments, the mosaic genes are administered in a suitable vector with each immunization with Env gene inserts in a suitable vector and/or as a protein. In some embodiments, the mosaic genes, for example bivalent mosaic Gag group M consensus genes, are administered in a suitable vector, for example but not limited to HSV-2, with each immunization with Env gene inserts in a suitable vector, for example but not limited to HSV-2.
In certain aspects the invention provides compositions and methods of Env genetic immunization either alone or with Env proteins to recreate the swarms of evolved viruses that have led to bnAb induction. Nucleotide-based vaccines offer a flexible vector format to immunize against virtually any protein antigen. Currently, two types of genetic vaccination are available for testing—DNAs and mRNAs.
In certain aspects the invention contemplates using immunogenic compositions wherein immunogens are delivered as DNA. See Graham B S, Enama M E, Nason M C, Gordon I J, Peel S A, et al. (2013) DNA Vaccine Delivered by a Needle-Free Injection Device Improves Potency of Priming for Antibody and CD8+ T-Cell Responses after rAd5 Boost in a Randomized Clinical Trial. PLoS ONE 8(4): e59340, page 9. Various technologies for delivery of nucleic acids, as DNA and/or RNA, so as to elicit an immune response, both T-cell and humoral, are known in the art and are under development. In certain embodiments, DNA can be delivered as naked DNA. In certain embodiments, DNA is formulated for delivery by a gene gun. In certain embodiments, DNA is administered by electroporation, or by a needle-free injection technology, for example but not limited to the Biojector® device. In certain embodiments, the DNA is inserted in vectors. The DNA is delivered using a suitable vector for expression in mammalian cells. In certain embodiments the nucleic acids encoding the envelopes are optimized for expression. In certain embodiments DNA is optimized, e.g. codon optimized, for expression. In certain embodiments the nucleic acids are optimized for expression in vectors and/or in mammalian cells. In non-limiting embodiments these are bacterially derived vectors, adenovirus based vectors, rAdenovirus (e.g. Barouch D H, et al. Nature Med. 16: 319-23, 2010), recombinant mycobacteria (e.g. rBCG or M. smegmatis) (Yu, J S et al. Clinical Vaccine Immunol. 14: 886-093, 2007; ibid 13: 1204-11, 2006), and recombinant vaccinia type of vectors (Santra S. Nature Med. 16: 324-8, 2010), for example but not limited to ALVAC, replicating (Kibler K V et al., PLoS One 6: e25674, 2011 Nov. 9.) and non-replicating (Perreau M et al. J. Virology 85: 9854-62, 2011) NYVAC, modified vaccinia Ankara (MVA), adeno-associated virus, Venezuelan equine encephalitis (VEE) replicons, Herpes Simplex Virus vectors, and other suitable vectors.
In certain aspects the invention contemplates using immunogenic compositions wherein immunogens are delivered as DNA or RNA in suitable formulations. Various technologies contemplate using DNA or RNA, or complexes of nucleic acid molecules and other entities, in immunization. In certain embodiments, DNA or RNA is administered as nanoparticles consisting of low dose antigen-encoding DNA formulated with a block copolymer (amphiphilic block copolymer 704). See Cany et al., Journal of Hepatology 2011 vol. 54, 115-121; Arnaoty et al., Chapter 17 in Yves Bigot (ed.), Mobile Genetic Elements: Protocols and Genomic Applications, Methods in Molecular Biology, vol. 859, pp 293-305 (2012); Arnaoty et al. (2013) Mol Genet Genomics 288(7-8):347-63. Nanocarrier technologies called Nanotaxi® for delivery of immunogenic macromolecules (DNA, RNA, protein) are under development. See for example technologies developed by incellart.
In certain aspects the invention contemplates using immunogenic compositions wherein immunogens are delivered as recombinant proteins. Various methods for production and purification of recombinant proteins suitable for use in immunization are known in the art. In certain embodiments recombinant proteins are produced in CHO cells.
The immunogenic envelopes can also be administered as a protein boost in combination with a variety of nucleic acid envelope primes (e.g., HIV-1 Envs delivered as DNA expressed in viral or bacterial vectors).
Dosing of proteins and nucleic acids can be readily determined by a skilled artisan. A single dose of nucleic acid can range from a few nanograms (ng) to a few micrograms (μg) or milligrams of a single immunogenic nucleic acid. Recombinant protein dose can range from a few micrograms (μg) to a few hundred micrograms, or milligrams, of a single immunogenic polypeptide.
Administration: The compositions can be formulated with appropriate carriers using known techniques to yield compositions suitable for various routes of administration. In certain embodiments the compositions are delivered via intramuscular (IM), subcutaneous, intravenous, nasal, or mucosal routes, or any other suitable route of immunization.
The compositions can be formulated with appropriate carriers and adjuvants using known techniques to yield compositions suitable for immunization. The compositions can include an adjuvant, such as, for example but not limited to, alum, poly IC, MF-59 or other squalene-based adjuvant, AS01B, or other liposomal based adjuvant suitable for protein or nucleic acid immunization. In certain embodiments, the adjuvant is GSK AS01E adjuvant containing MPL and QS21. This adjuvant has been shown by GSK to be as potent as the similar adjuvant AS01B but to be less reactogenic using HBsAg as vaccine antigen [Leroux-Roels et al., IABS Conference, April 2013, 9]. In certain embodiments, TLR agonists are used as adjuvants. In other embodiments, adjuvants which break immune tolerance are included in the immunogenic compositions.
In certain embodiments, the compositions and methods comprise any suitable agent or immune modulation which could modulate mechanisms of host immune tolerance and release of the induced antibodies. In non-limiting embodiments modulation includes PD-1 blockade; T regulatory cell depletion; CD40L hyperstimulation; soluble antigen administration, wherein the soluble antigen is designed such that the soluble agent eliminates B cells targeting dominant epitopes; or a combination thereof. In certain embodiments, an immunomodulatory agent is administered at a time and in an amount sufficient for transient modulation of the subject's immune response so as to induce an immune response which comprises broad neutralizing antibodies against HIV-1 envelope. Non-limiting examples of such agents are any of the agents described herein: e.g. chloroquine (CQ); PTP1B Inhibitor (CAS 765317-72-4, Calbiochem) or MSI 1436; clodronate or any other bisphosphonate; a Foxo1 inhibitor, e.g. 344355|Foxo1 Inhibitor, AS1842856 (Calbiochem); Gleevec; anti-CD25 antibody; anti-CCR4 Ab; an agent which binds to a B cell receptor for a dominant HIV-1 envelope epitope; or any combination thereof. In certain embodiments, the methods comprise administering a second immunomodulatory agent, wherein the second and first immunomodulatory agents are different.
There are various host mechanisms that control bnAbs. For example, highly somatically mutated antibodies become autoreactive and/or less fit (Immunity 8: 751, 1998; PLoS Comp. Biol. 6 e1000800, 2010; J. Theoret. Biol. 164:37, 1993); polyreactive/autoreactive naïve B cell receptors (unmutated common ancestors of clonal lineages) can lead to deletion of Ab precursors (Nature 373: 252, 1995; PNAS 107: 181, 2010; J. Immunol. 187: 3785, 2011); and Abs with long HCDR3 can be limited by tolerance deletion (JI 162: 6060, 1999; JCI 108: 879, 2001). BnAb knock-in mouse models are providing insights into the various mechanisms of tolerance control of MPER BnAb induction (deletion, anergy, receptor editing). Other variations of tolerance control likely will be operative in limiting BnAbs with long HCDR3s and high levels of somatic hypermutation.
The invention is described in the following non-limiting examples.
HIV-1 sequences, including envelopes, and antibodies from HIV-1 infected individual CH505 were isolated as described in Liao et al. (2013) Nature 496, 469-476 including supplementary materials; See also Gao et al. (2014) Cell 158:481-491.
Recombinant HIV-1 Proteins
HIV-1 Env genes for subtype B, 63521, subtype C, 1086, and subtype CRF_01, 427299, as well as subtype C, CH505 autologous transmitted/founder Env were obtained from acutely infected HIV-1 subjects by single genome amplification, codon-optimized by using the codon usage of highly expressed human housekeeping genes, de novo synthesized (GenScript) as gp140 or gp120 (AE.427299) and cloned into a mammalian expression plasmid pcDNA3.1/hygromycin (Invitrogen). Recombinant Env glycoproteins were produced in 293F cells cultured in serum-free medium and transfected with the HIV-1 gp140- or gp120-expressing pcDNA3.1 plasmids, purified from the supernatants of transfected 293F cells by using Galanthus nivalis lectin-agarose (Vector Labs) column chromatography, and stored at −80° C. Select Env proteins made as CH505 transmitted/founder Env were further purified by Superose 6 column chromatography to trimeric forms, and used in binding assays that showed results similar to those obtained with the lectin-purified oligomers.
ELISA
Binding of patient plasma antibodies and of CH103 and DH235 (CH235) clonal lineage antibodies (see Gao et al. (2014) Cell 158:481-491) to autologous and heterologous HIV-1 Env proteins was measured by ELISA as described previously. Plasma samples in serial threefold dilutions starting at 1:30 to 1:521,4470 or purified monoclonal antibodies in serial threefold dilutions starting at 100 μg ml-1 to 0.000 μg ml-1 diluted in PBS were assayed for binding to autologous and heterologous HIV-1 Env proteins. Binding of biotin-labelled CH103 at a subsaturating concentration was assayed for cross-competition by unlabeled HIV-1 antibodies and soluble CD4-Ig in serial fourfold dilutions starting at 10 μg ml-1. The half-maximal effective concentrations (EC50) of plasma samples and monoclonal antibodies to HIV-1 Env proteins were determined and expressed as either the reciprocal dilution of the plasma samples or the concentration of monoclonal antibodies.
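By way of non-limiting illustration, the serial dilution series and the EC50 determination described above can be sketched in Python as follows. This is a minimal sketch and not the analysis actually used: the function names, the number of dilution steps, the example optical density readings, and the use of log-linear interpolation between the two readings that bracket the half-maximal signal (in place of a fitted dose-response curve) are all assumptions made for illustration.

```python
import math


def serial_dilutions(start, fold, n_steps):
    """Return n_steps reciprocal dilutions, each `fold` times the previous one.

    For plasma starting at 1:30 in threefold steps: 30, 90, 270, ...
    """
    return [start * fold ** i for i in range(n_steps)]


def ec50_by_interpolation(doses, signals):
    """Estimate the dose at which the signal equals half of its maximum.

    Assumption: the half-maximal signal is taken as max(signals) / 2 and the
    EC50 is interpolated linearly on a log10 dose axis between the two
    adjacent readings that bracket it. Returns None if no pair brackets it.
    """
    half = max(signals) / 2.0
    for i in range(len(doses) - 1):
        s0, s1 = signals[i], signals[i + 1]
        if s0 != s1 and (s0 - half) * (s1 - half) <= 0:
            frac = (half - s0) / (s1 - s0)
            log_d0, log_d1 = math.log10(doses[i]), math.log10(doses[i + 1])
            return 10 ** (log_d0 + frac * (log_d1 - log_d0))
    return None


# Hypothetical example: eight threefold plasma dilutions starting at 1:30,
# with made-up optical density readings that fall as the plasma is diluted.
reciprocal_dilutions = serial_dilutions(30, 3, 8)
optical_density = [3.0, 2.9, 2.5, 1.5, 0.6, 0.2, 0.1, 0.05]
plasma_ec50 = ec50_by_interpolation(reciprocal_dilutions, optical_density)
```

For plasma the returned value is a reciprocal dilution; for a monoclonal antibody the same function can be applied to a concentration series, in which case the result is a concentration in the units supplied.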
Surface Plasmon Resonance Affinity and Kinetics Measurements
Binding Kd and rate constant (association rate (Ka)) measurements of monoclonal antibodies and all candidate UCAs to the autologous Env C.CH505 gp140 and/or the heterologous Env B.63521 gp120 are carried out on BIAcore 3000 instruments as described previously. Anti-human IgG Fc antibody (Sigma Chemicals) is immobilized on a CM5 sensor chip to about 15,000 response units and each antibody is captured to about 50-200 response units on three individual flow cells for replicate analysis, in addition to having one flow cell captured with the control Synagis (anti-RSV) monoclonal antibody on the same sensor chip. Double referencing for each monoclonal antibody-HIV-1 Env binding interaction is used to subtract nonspecific binding and signal drift of the Env proteins to the control surface and blank buffer flow, respectively. Antibody capture level on the sensor surface is optimized for each monoclonal antibody to minimize rebinding and any associated avidity effects. C.CH505 Env gp140 protein is injected at concentrations ranging from 2 to 25 μg ml-1, and B.63521 gp120 is injected at 50-400 μg ml-1 for UCAs and early intermediates IA8 and IA4, 10-100 μg ml-1 for intermediate IA3, and 1-25 μg ml-1 for the distal and mature monoclonal antibodies. All curve-fitting analyses are performed using a global fit to the 1:1 Langmuir model and are representative of at least three measurements. All data analysis is performed using the BIAevaluation 4.1 analysis software (GE Healthcare).
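For orientation only, the 1:1 Langmuir interaction model referred to above can be written out as a short Python sketch. The function and parameter names and all numerical values are hypothetical. The sketch assumes the analyte concentration is expressed in molar units, so the μg ml-1 concentrations given above would first have to be converted using the molecular mass of the Env protein. It does not reproduce the global fitting performed by the BIAevaluation software.

```python
import math


def association_response(t, conc, ka, kd, rmax):
    """Response (RU) at time t (s) after the start of analyte injection.

    1:1 Langmuir model: dR/dt = ka * conc * (rmax - R) - kd * R, with R(0) = 0.
    ka is the association rate constant (1/(M*s)), kd the dissociation rate
    constant (1/s), conc the analyte concentration (M).
    """
    k_obs = ka * conc + kd
    r_eq = rmax * ka * conc / k_obs  # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r0, kd):
    """Response (RU) at time t (s) after the injection ends, starting from r0."""
    return r0 * math.exp(-kd * t)


def equilibrium_dissociation_constant(ka, kd):
    """Equilibrium dissociation constant (M) as the ratio of the rate constants."""
    return kd / ka


# Hypothetical rate constants, for illustration only.
ka_example = 1.0e5   # 1/(M*s)
kd_example = 1.0e-3  # 1/s
kd_equilibrium = equilibrium_dissociation_constant(ka_example, kd_example)
```

A useful sanity check on any fitted parameter set is that, at an analyte concentration equal to the equilibrium dissociation constant, the modelled plateau response is half of rmax.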
Neutralization Assays
Neutralizing antibody assays in TZM-bl cells are performed as described previously. Neutralizing activity of plasma samples in eight serial threefold dilutions starting at 1:20 dilution and of recombinant monoclonal antibodies in eight serial threefold dilutions starting at 50 μg ml-1 is tested against autologous and heterologous HIV-1 Env-pseudotyped viruses in TZM-bl-based neutralization assays using methods known in the art. Neutralization breadth of CH103 is determined using a panel of 196 geographically and genetically diverse Env-pseudoviruses representing the major circulating genetic subtypes and circulating recombinant forms. HIV-1 subtype robustness is derived from the analysis of HIV-1 clades over time. The data are calculated as a reduction in luminescence units compared with control wells, and reported as IC50 in either reciprocal dilution for plasma samples or in micrograms per millilitre for monoclonal antibodies.
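As a hedged illustration of how such luminescence data may be reduced to an IC50, the following Python sketch computes percent neutralization relative to control wells and interpolates the 50% point. The function names and example values are hypothetical. Subtracting a cells-only background from both the sample and the virus-only control is an assumption, as is the log-linear interpolation; neither is stated above.

```python
import math


def percent_neutralization(rlu_sample, rlu_virus_control, rlu_cell_control):
    """Percent reduction in luminescence relative to virus-only control wells.

    Assumption: the cells-only background is subtracted from both readings.
    """
    signal = rlu_sample - rlu_cell_control
    full_signal = rlu_virus_control - rlu_cell_control
    return 100.0 * (1.0 - signal / full_signal)


def ic50_by_interpolation(doses, neutralization):
    """Dose at which neutralization crosses 50%, interpolated on a log10 axis.

    Returns None when no adjacent pair of readings brackets 50%.
    """
    for i in range(len(doses) - 1):
        n0, n1 = neutralization[i], neutralization[i + 1]
        if n0 != n1 and (n0 - 50.0) * (n1 - 50.0) <= 0:
            frac = (50.0 - n0) / (n1 - n0)
            log_d0, log_d1 = math.log10(doses[i]), math.log10(doses[i + 1])
            return 10 ** (log_d0 + frac * (log_d1 - log_d0))
    return None


# Hypothetical antibody titration: eight threefold dilutions from 50 ug/ml,
# with made-up percent neutralization values.
concentrations = [50.0 / 3 ** i for i in range(8)]
neutralization = [99.0, 95.0, 80.0, 50.0, 20.0, 5.0, 1.0, 0.0]
antibody_ic50 = ic50_by_interpolation(concentrations, neutralization)
```

The same functions apply to plasma if reciprocal dilutions are supplied in place of concentrations.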
The GenBank accession numbers for 292 CH505 Env proteins are KC247375-KC247667, and accessions for 459 VHDJH and 174 VLJL sequences of antibody members in the CH103 clonal lineage are KC575845-KC576303 and KC576304-KC576477, respectively.
Binding of sequential envelopes to CH103 CD4 binding site bnAb lineage members. The binding assay was an ELISA with the envelope protein bound to the well surface of a 96-well plate, and the antibody in question incubated with the envelope bound to the plate. After washing, an enzyme-labeled anti-human IgG antibody was added and, after incubation, washed away. The intensity of binding was determined by the intensity of enzyme-activated color in the well.
Combinations of antigens derived from CH505 envelope sequences for swarm immunizations
Provided herein are non-limiting examples of combinations of antigens derived from CH505 envelope sequences for a swarm immunization. Without limitations, these selected combinations comprise envelopes which provide representation of the sequence and antigenic diversity of the HIV-1 envelope variants which lead to the induction and maturation of the CH103 and CH235 antibody lineages.
The selection includes priming with a virus which binds to the UCA, for example a T/F virus or another early (e.g. but not limited to week 004.3, or 004.26) virus envelope. In certain embodiments the prime could include D-loop variants. In certain embodiments the boost could include D-loop variants. In certain embodiments, these D-loop variants are envelope escape mutants not recognized by the UCA. Non-limiting examples of such D-loop variants are envelopes designated as M10, M11, M19, M20, M21, M5, M6, M7, M8, M9, M14 (TF_M14), M24 (TF_24), M15, M16, M17, M18, M22, M23, M24, M25, M26. See Gao et al. (2014) Cell 158:481-491.
Non-limiting embodiments of envelopes selected for swarm vaccination are shown as the selections described below. A skilled artisan would appreciate that a vaccination protocol can include a sequential immunization starting with the “prime” envelope(s) and followed by sequential boosts, which include individual envelopes or combinations of envelopes. In another vaccination protocol, the sequential immunization starts with the “prime” envelope(s) and is followed with boosts of cumulative prime and/or boost envelopes. In certain embodiments, the sequential immunization starts with the “prime” envelope(s) and is followed by boost(s) with all or various combinations of the envelopes in the selection. In certain embodiments, the prime does not include T/F sequence (W000.TF). In certain embodiments, the prime includes w004.03 envelope. In certain embodiments, the prime includes w004.26 envelope. In certain embodiments the prime includes M11. In certain embodiments the prime includes M5. In certain embodiments, the immunization methods do not include immunization with HIV-1 envelope T/F. In certain embodiments, the immunization methods do not include a schedule of four-valent immunization with HIV-1 envelopes T/F, w053.16, w078.33, and w100.B6.
In certain embodiments, there is some variance in the immunization regimen; in some embodiments, the selection of HIV-1 envelopes may be grouped in various combinations of primes and boosts, either as nucleic acids, proteins, or combinations thereof.
In certain embodiments the immunization includes a prime administered as DNA, and MVA boosts. See Goepfert, et al. 2014; “Specificity and 6-Month Durability of Immune Responses Induced by DNA and Recombinant Modified Vaccinia Ankara Vaccines Expressing HIV-1 Virus-Like Particles” J Infect Dis. 2014 Feb. 9. [Epub ahead of print].
HIV-1 Envelope selection A (54 envelopes):
HIV-1 Envelope selection B (27 envelopes): The bolded envelopes from selection A above:
In certain embodiments the selections above could include additional envelopes from later time points. In certain embodiments, the selections above could include a D-loop mutant, or a combination thereof.
The selections of CH505-Envs were down-selected, based on experimental data, from a series of 400 CH505 Envs isolated by single-genome amplification and followed for 3 years after acute infection. The enhanced neutralization breadth that developed in the CD4-binding site (bs) CH103 antibody lineage that arose in subject CH505 developed in conjunction with epitope diversification in CH505's viral quasispecies. It was observed that at 6 months post-infection there was more diversification in the CD4bs epitope region in this donor than in sixteen other acutely infected donors. Population breadth did not arise in the CH103 antibody lineage until the epitope began to diversify. A hypothesis is that the CH103 lineage drove viral escape, but then the antibody adapted to the relatively resistant viral variants. As this series of events was repeated, the emerging antibodies evolved to tolerate greater levels of diversity in relevant sites, and began to be able to recognize and neutralize diverse heterologous forms of the virus and manifest population breadth. In certain embodiments, 54 Envs are selected from CH505 sequences to reflect diverse variants for making Env pseudoviruses, with the goal of recapitulating CH505 HIV-1 antigenic diversity over time, making sure that diversity at the selected sites (i.e. those sites reflecting major antigenic shifts) is represented.
Specifically, for CH505, the evolution of the virus and its envelope was mapped, together with the evolution of the CH103 CD4 binding-site bnAb. In addition, 135 CH505 varied envelope pseudotyped viruses were made and tested for neutralization sensitivity to members of the CH103 bnAb lineage (e.g,
In certain embodiments, the envelopes are selected based on Env mutants with sites under diversifying selection, in which the transmitted/founder (T/F) Env form vanished below 20% in any sample, i.e. escape variants; signature sites based on autologous neutralization data, i.e. Envs with statistically supported signatures for escape from members of the CH103 bnAb lineage; and sites with mutations at the contact sites of the CH103 antibody and HIV Env. In this manner, a sequential swarm of Envs was selected for immunization to represent the progression of virus escape mutants that evolved during bnAb induction and increasing neutralization breadth in the CH505 donor.
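The first of these criteria, loss of the T/F form below 20% in any sample, lends itself to a short illustration. The following Python sketch is a non-limiting example only: the function name, the data layout (aligned amino acid sequences of equal length grouped by sampling time point) and the toy sequences are assumptions, and positions are reported as 0-based alignment columns rather than HXB2 numbering.

```python
def sites_with_tf_loss(tf_sequence, samples, threshold=0.20):
    """Return alignment columns where the T/F residue falls below `threshold`.

    tf_sequence: aligned T/F amino acid sequence.
    samples: dict mapping a time-point label to a list of aligned sequences.
    A column is reported if, in at least one sample, the fraction of sequences
    that still carry the T/F residue is below the threshold.
    """
    selected = []
    for position, tf_residue in enumerate(tf_sequence):
        for sequences in samples.values():
            matching = sum(1 for seq in sequences if seq[position] == tf_residue)
            if matching / len(sequences) < threshold:
                selected.append(position)
                break  # one sample below threshold is enough for this column
    return selected


# Toy three-residue alignment with two hypothetical time points.
toy_tf = "NVS"
toy_samples = {
    "early": ["NVS", "NVS", "NVS", "NVS", "DVS"],
    "late": ["DVS", "DVS", "DVS", "DVS", "DGS", "NVS"],
}
toy_sites = sites_with_tf_loss(toy_tf, toy_samples)
```

In the toy data only the first column is reported, because the T/F residue there is retained by one of six late sequences, whereas the second column still carries the T/F residue in five of six.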
In certain embodiments, additional sequences are selected to contain five additional specific amino acid signatures of resistance that were identified at the global population level. These sequences contain statistically defined resistance signatures, which are common at the population level and enriched among heterologous viruses that CH103 fails to neutralize. When they were introduced into the TF sequence, they were experimentally shown to confer partial resistance to antibodies in the CH103 lineage. Following the reasoning that serial viral escape and antibody adaptation to escape is what ultimately selects for neutralizing antibodies that exhibit breadth and potency against diverse variants, in certain embodiments, inclusion of these variants in a vaccine may extend the breadth of vaccine-elicited antibodies even beyond that of the CH103 lineage. Thus the overarching goal will be to trigger a CH103-like lineage first using the CH505 TF modified M11, which is well recognized by early CH103 ancestral states, then vaccinating with antigenic variants, to allow the antibody lineage to adapt through somatic mutation to accommodate the natural variants that arose in CH505. In certain embodiments, vaccination regimens include a total of 27 sequences (Selection B) that capture the antigenic diversity of CH505. In another embodiment, additional antigenic diversity is added (Selection A), to enable the induction of antibodies by vaccination that may have even greater breadth than those antibodies isolated from CH505.
In some embodiments, the CH505 sequences that represent the accumulation of viral sequence and antigenic diversity in the CD4bs epitope of CH103 in subject CH505 are represented by selection A, or selection B.
M11 is a mutant generated to include two mutations in the loop D (N279D+V281G relative to the TF sequence) that enhanced binding to the CH103 lineage. These were early escape mutations for another CD4bs autologous neutralizing antibody lineage, but might have served to promote early expansion of the CH103 lineage.
In certain embodiments, the two CH103 resistance signature-mutation sequences added to the antigenic swarm are: M14 (TF with S364P), and M24 (TF with S375H+T202K+L520F+G459E). They confer partial resistance to the TF with respect to the CH103 lineage. In certain embodiments, these D-loop mutants are administered in the boost.
Immunization protocols contemplated by the invention include envelope sequences as described herein including but not limited to nucleic acids and/or amino acid sequences of gp160s, gp150s, gp145s, cleaved and uncleaved gp140s, gp120s, gp41s, N-terminal deletion variants as described herein, cleavage resistant variants as described herein, or codon optimized sequences thereof. A skilled artisan can readily modify the gp160 and gp120 sequences described herein to obtain these envelope variants. The swarm immunization protocols can be administered in any subject, for example monkeys, mice, guinea pigs, or human subjects.
In non-limiting embodiments, the immunization includes a nucleic acid which is administered as DNA, for example in a modified vaccinia vector (MVA). In non-limiting embodiments, the nucleic acids encode gp160 envelopes. In other embodiments, the nucleic acids encode gp120 envelopes. In other embodiments, the boost comprises a recombinant gp120 envelope. The vaccination protocols include envelopes formulated in a suitable carrier and/or adjuvant, for example but not limited to alum. In certain embodiments the immunizations include a prime, as a nucleic acid or a recombinant protein, followed by a boost, as a nucleic acid or a recombinant protein. A skilled artisan can readily determine the number of boosts and intervals between boosts.
In certain embodiments an immunization protocol could prime with a bivalent or trivalent Gag mosaic (Gag1 and Gag2; or Gag1, Gag2 and Gag3) in a suitable vector.
In one immunization regimen, the prime is M6, M5, M11 then groups of envelopes from the selection of 54 envelopes are added either sequentially or additively.
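The difference between adding boost groups sequentially and adding them additively can be made concrete with a small Python sketch. It is illustrative only: the function name and the placeholder boost envelope names (`env_A`, `env_B`, `env_C`) are hypothetical and do not correspond to particular envelopes of the selections, and no dosing or interval information is implied.

```python
def build_schedule(prime, boost_groups, additive=False):
    """Return the list of envelope sets given at each immunization.

    prime: envelopes given at the first immunization.
    boost_groups: one list of envelopes per subsequent boost.
    additive=False: each boost contains only its own group (sequential).
    additive=True: each boost contains every envelope given so far plus the
    new group (cumulative).
    """
    schedule = [list(prime)]
    administered = list(prime)
    for group in boost_groups:
        if additive:
            administered = administered + [e for e in group if e not in administered]
            schedule.append(list(administered))
        else:
            schedule.append(list(group))
    return schedule


# Prime taken from the regimen above; boost groups are placeholders.
prime_envelopes = ["M6", "M5", "M11"]
placeholder_boosts = [["env_A", "env_B"], ["env_C"]]
sequential_schedule = build_schedule(prime_envelopes, placeholder_boosts)
additive_schedule = build_schedule(prime_envelopes, placeholder_boosts, additive=True)
```

In the sequential schedule the valency of each boost equals the size of its group, whereas in the additive schedule the valency grows with every boost.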
One of the major obstacles to developing an efficacious preventive HIV-1 vaccine is the challenge of inducing broadly neutralizing antibodies (bnAbs) against the virus. There are several reasons why eliciting bnAbs has been challenging, and these include the conformational structure of the viral envelope, molecular mimicry of host antigens by conserved epitopes which may lead to the suppression of potentially useful antibody responses, and the high level of somatic mutations in the variable domains and the requirement for complex maturation pathways [1-3]. It has been shown that up to 25% of HIV-1-infected individuals develop bnAbs that are detected 2-4 years after infection. To date, all bnAbs have one or more of these unusual antibody traits: high levels of somatic mutation, autoreactivity with host antigens, and long heavy chain third complementarity determining regions (HCDR3s), all of which are traits that are controlled or modified by host immunoregulatory mechanisms. Thus, the hypothesis has been put forth that typical vaccinations of single primes and boosts will not suffice to induce bnAbs; rather, it will take sequential immunizations with Env immunogens, perhaps over a prolonged period of time, to mimic bnAb induction in chronically infected individuals [4].
A process to circumvent host immunoregulatory mechanisms involved in control of bnAbs is termed B cell lineage immunogen design, wherein sequential Env immunogens are chosen that have high affinities for the B cell receptors of the unmutated common ancestor (UCA) or germline gene of the bnAb clonal lineage [4]. Envs for immunization can either be picked randomly for binding or selected, as described herein, from the evolutionary pathways of Envs that actually give rise to bnAbs in vivo. Liao and colleagues recently described the co-evolution of HIV-1 and a CD4 binding site bnAb from the time of seroconversion to the development of plasma bnAb induction, thereby presenting an opportunity to map out the pathways that lead to generation of this type of CD4 binding site bnAb [5]. They showed that the single transmitted/founder virus was able to bind to the bnAb UCA, and identified a series of evolved envelope proteins of the founder virus that were likely stimulators of the bnAb lineage. Thus, this work presents an opportunity to vaccinate with naturally-derived viral envelopes that could drive the desired B-cell responses and induce the development of broad and potent neutralizing antibodies. While the human antibody repertoire is diverse, it has been found that only a few types of B cell lineages can lead to bnAb development, and that these lineages are similar across a number of individuals [6,7]. Thus, it is feasible that use of Envs from one individual will generalize to others.
In certain embodiments the invention provides methods for selecting the Env immunogens, from among the multitude of diverse viruses that induced a CD4 binding site bnAb clonal lineage in an HIV-infected individual, by making sequential recombinant Envs from that individual and using these Envs for vaccination. The B-cell lineage vaccine strategy thus includes designing immunogens based on unmutated ancestors as well as intermediate ancestors of known bnAb lineages. A candidate vaccine could use transmitted/founder virus envelopes to, at first, stimulate the beginning stages of a bnAb lineage, and subsequently boost with evolved Env variants to recapitulate the high level of somatic mutation needed for affinity maturation and bnAb activity. The goal of such a strategy is to selectively drive desired bnAb pathways.
Broadly neutralizing antibodies likely will not be induced by a single Env, and even a mixture of polyvalent random Envs (e.g. HVTN 505) is unlikely to induce bnAbs. Rather, immunogens must be designed to trigger the UCAs of bnAb lineages to undergo initial bnAb lineage maturation, and then use sequential immunogens to fully expand the desired lineages. The proposed trial will represent the first of many experimental clinical trials testing this concept in order to develop the optimal set of immunogens to drive multiple specificities of bnAbs. The HVTN will be at the cutting edge of this effort.
The concept is applicable to driving CD4 binding site lineages in multiple individuals due to the convergence of a few bnAb motifs among individuals. The adjuvant will be the GSK AS01E adjuvant containing MPL and QS21. Other suitable adjuvants can be used. This adjuvant has been shown by GSK to be as potent as the similar adjuvant AS01B but to be less reactogenic using HBsAg as vaccine antigen [Leroux-Roels et al., IABS Conference, April 2013] [9].
1. Mascola J R, Haynes B F. HIV-1 neutralizing antibodies: understanding nature's pathways. Immunol Rev 2013; 254:225-44.
2. Verkoczy L, Kelsoe G, Moody M A, Haynes B F. Role of immune mechanisms in induction of HIV-1 broadly neutralizing antibodies. Curr Opin Immunol 2011; 23:383-90.
3. Verkoczy L, Chen Y, Zhang J, Bouton-Verville H, Newman A, Lockwood B, Scearce R M, Montefiori D C, Dennison S M, Xia S M, Hwang K K, Liao H X, Alam S M, Haynes B F. Induction of HIV-1 broad neutralizing antibodies in 2F5 knock-in mice: selection against membrane proximal external region-associated autoreactivity limits T-dependent responses. J Immunol 2013; 191:2538-50.
4. Haynes B F, Kelsoe G, Harrison S C, Kepler T B. B-cell-lineage immunogen design in vaccine development with HIV-1 as a case study. Nat Biotechnol 2012; 30:423-33.
5. Liao H X, Lynch R, Zhou T, Gao F, Alam S M, Boyd S D, Fire A Z, Roskin K M, Schramm C A, Zhang Z, Zhu J, Shapiro L, Mullikin J C, Gnanakaran S, Hraber P, Wiehe K, Kelsoe G, Yang G, Xia S M, Montefiori D C, Parks R, Lloyd K E, Scearce R M, Soderberg K A, Cohen M, Kamanga G, Louder M K, Tran L M, Chen Y, Cai F, Chen S, Moquin S, Du X, Joyce M G, Srivatsan S, Zhang B, Zheng A, Shaw G M, Hahn B H, Kepler T B, Korber B T, Kwong P D, Mascola J R, Haynes B F. Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus. Nature 2013; 496:469-76.
6. Morris L, Chen X, Alam M, Tomaras G, Zhang R, Marshall D J, Chen B, Parks R, Foulger A, Jaeger F, Donathan M, Bilska M, Grey E S, Abdool Karim S S, Kepler T B, Whitesides J, Montefiori D, Moody M A, Liao H X, Haynes B F. Isolation of a human anti-HIV gp41 membrane proximal region neutralizing antibody by antigen-specific single B cell sorting. PLoS One 2011; 6:e23532.
7. Zhou T, Zhu J, Wu X, Moquin S, Zhang B, Acharya P, Georgiev I S, Altae-Tran H R, Chuang G Y, Joyce M G, Do K Y, Longo N S, Louder M K, Luongo T, McKee K, Schramm C A, Skinner J, Yang Y, Yang Z, Zhang Z, Zheng A, Bonsignori M, Haynes B F, Scheid J F, Nussenzweig M C, Simek M, Burton D R, Koff W C, Mullikin J C, Connors M, Shapiro L, Nabel G J, Mascola J R, Kwong P D. Multidonor analysis reveals structural elements, genetic determinants, and maturation pathway for HIV-1 neutralization by VRC01-class antibodies. Immunity 2013; 39:245-58.
8. Lynch R M, Tran L, Louder M K, Schmidt S D, Cohen M, Dersimonian R, Euler Z, Grey E S, Abdool K S, Kirchherr J, Montefiori D C, Sibeko S, Soderberg K, Tomaras G, Yang Z Y, Nabel G J, Schuitemaker H, Morris L, Haynes B F, Mascola J R. The Development of CD4 Binding Site Antibodies During HIV-1 Infection. J Virol 2012; 86:7588-95.
9. Leroux-Roels I, Koutsoukos M, Clement F, Steyaert S, Janssens M, Bourguignon P, Cohen K, Altfeld M, Vandepapeliere P, Pedneault L, McNally L, Leroux-Roels G, Voss G. Strong and persistent CD4+ T-cell response in healthy adults immunized with a candidate HIV-1 vaccine containing gp120, Nef and Tat antigens formulated in three Adjuvant Systems. Vaccine 2010; 28:7016-24.
Abstract
One strategy for studying broadly neutralizing antibody (bnAb) development is to characterize the coevolution of virus and B-cell clonal lineages during affinity maturation and the development of neutralization breadth. Such longitudinal bnAb studies involve sequencing hundreds to thousands of Envelope (Env) variants from one donor. It is feasible, however, to construct Env reagents for protein expression and detailed analysis for only a fraction of these. Presented herein is a method to select a subset of variants that represents the gradual acquisition of selected mutations from among longitudinal sequences. It uses loss of the transmitted/founder (TF) virus, or the consensus of the first time point in the case of subjects that are enrolled during chronic infection, to identify sites that are under strong positive selective pressure. Visualization tools have been developed to readily track mutations in these sites over time. An algorithm then is used to select Envs that represent the gradual acquisition of all recurrent mutations in the selected sites, sampling them in the context in which they first appear in the subject. A detailed example of a retrospective application of this method to a subject, CH505, who has already been extensively studied, is provided to enable the assessment of how the method performed. Using 398 single-genome amplification (SGA)-derived Envs that spanned three years of infection, the algorithm identified 35 sites under putative immune selection. Encouragingly, these sites corresponded to verified immune targets: a T cell epitope, and epitopes recognized by neutralizing antibodies isolated from CH505: the CD4bs and the V3 loop. Thus, in this case patterns of mutations identified to be under selection were directly indicative of the antibody specificities of the subject. The algorithm identified 54 Envs that represent all recurrent mutations in selected sites. 
The Envs were well dispersed throughout the phylogeny, and represented the development of binding and neutralization in a set of 135 previously handpicked Envs. The algorithm chooses sequence sets with more recurrent mutations and less redundancy than would be chosen randomly or by hand. Thus, the algorithm objectively provides a minimal, manageable number of Envs to represent diversity in natural infection, to help study virus-antibody coevolution. This minimal representation of antigenic diversity is called an “antigen swarm.” Initially, this was developed as a strategy to explore mutational patterns and for reagent design. However, given the emergence of new vaccine technologies that may enable the use of high valency antigen cocktails, this approach could also be used to design a vaccine that mimics viral evolution in an individual who made potent bnAb responses.
Genetic sequencing of samples collected over time gives a dynamic view of how viruses evade host immunity while maintaining replication fitness. HIV-1 is a chronic infection, and persists as a diverse and evolving swarm of viral variants within an infected individual. HIV has a high mutation rate, and viral fitness is achieved by selection in an ever-changing immunological landscape within the host. Identifying mutations essential for immune escape, and simultaneously, those that may be important in eliciting subsequent immune responses, can be biologically and computationally challenging. Given current state-of-the-art experimental practice, far more viral sequences can be obtained from a subject who is studied over time than can be cloned into constructs suitable for testing and analysis, necessitating down-selection for reagent design. Historically, this has typically been performed by inspection, often by picking some designated number of variants (based on resources that can be applied to the problem) that represent different time points and different clades within a phylogenetic tree. Such strategies may miss the most relevant mutations, and may leave a large genetic distance between key variants. A computational strategy has been developed, working only from initial viral sequence data, to identify and visualize viral protein evolutionary “hot spots” and then to select compact virus sets that carry all candidate immune escape mutations. By inference, these sites might also be key in eliciting the ever-broadening immune response. This method uses the loss of the TF form of the virus as a measure of positive selection driven by immune response. An algorithm then chooses sequence variants that represent all recurrent amino acid mutations at each of the selected sites. By capturing each selected mutation as it first arises in the context of earlier and less divergent viruses, the algorithm captures the observed gradual accumulation of mutations as they arise. 
Such epitope diversification in vivo is associated with the development of a broadening immune response. Adjusting parameters fine-tunes how many sequences result. An advantage of the method is that to minimize costs, no more sequences are chosen than are necessary to represent the composite of variants detected. Use on a well-characterized set of Env sequences from a bnAb individual confirmed that the selected sites were concentrated in antibody contact areas, and that selected sequences represented diverse antigenic phenotypes. Such a compact set of variants is referred to as an antigen swarm, and suggest potential use of antigen swarms for reagent design, to characterize the evolving antibody responses, as well as for an antigen swarm vaccine.
Introduction
It is not yet known how to stimulate protective immunity against HIV-1 with broadly cross-reactive neutralizing antibodies (nAbs) via vaccination, and neutralizing antibody induction remains a central focus of the HIV vaccine field. Neutralizing antibodies are immune correlates of protection in all antiviral vaccines licensed to date [1, 2], and administration of neutralizing antibodies can confer protection in SHIV challenge models in rhesus macaques [3, 4]. During the natural course of HIV infection, a single transmitted-founder (TF) virion typically establishes infection, and the virus population grows exponentially, with random mutations that initially follow a Poisson distribution of intersequence distances [5, 6]. The viral load eventually declines and resolves to a quasistationary set-point [7], influenced by both host and viral factors [8]. HIV is maintained as a continuously evolving population throughout chronic infection [9], with diversification driven by adaptive immune responses, including both antibodies [10-18] and T cells [19-21]. Mutations that facilitate immune evasion are positively selected and become more common, while mutations that result in a relative fitness disadvantage do not persist. Neutral mutations may also drift to higher frequency, with rates that depend on the effective population size [22].
Previous studies have revealed that viral diversification precedes the acquisition of breadth, which suggests antigenic diversity may be necessary for acquisition of bnAb breadth in vivo [15, 18], and also that several antibody lineages can concurrently impact selection on the same epitope region [16]. Essentially all HIV-1 infected individuals can elicit antibodies with some cross-reactive neutralization responses during the chronic phase of infection, and neutralization responses vary over a continuous spectrum across individuals [17]. Plasma samples from individuals with the most potent and broad antibody neutralization are frequently singled out for detailed study [23-26]. Such study includes isolation of monoclonal antibodies and investigations of both viral and B cell lineages to understand the immunological processes underlying elicitation of effective immune responses and inform strategies for vaccine design [15, 16, 18, 27-29]. In general, autologous, strain-specific nAbs begin to develop in the initial months after infection, and rapidly select for viral escape variants [11, 14]. High titers of more broadly neutralizing antibodies develop in a subset of cases, but only after years of infection, and perhaps more in cases with persistently high levels of viral replication [30, 31].
Among the subjects with broad cross-reactivity characterized to date, the contemporary co-circulating autologous virus has escaped from an otherwise broadly-reactive neutralizing antibody response [32]. Antibodies that recapitulate much of the potency and breadth of polyclonal sera have been cloned from subjects with high bnAb titers [cite]. The developmental pathway of B cell immunoglobulin genes from early to later infection is an active research frontier, now only beginning to be understood [cite]. It remains unknown what properties of evolving viral Env proteins stimulate or facilitate the important transition from autologous to heterologous reactivity. Understanding these events should ultimately enable new strategies for immunogen design to elicit potent, broadly cross-reactive nAbs.
A continuing research priority has been to characterize virus co-evolution with antibodies in individuals who develop the greatest potency and breadth of neutralization [15, 16, 18, 29, 33, 34]. Working back from mature bnAb clonal lineages, through ancestral intermediates, ultimately to the unmutated germline precursor, has begun to help understand this process of bnAb development [15, 18, 27, 28, 33, 35-37]. To explore antibody/viral co-evolution, mutational patterns that are selected over time in both the antibody population, as it undergoes affinity maturation, and the virus population, as it evolves to evade the ongoing immune responses, are defined by sequencing and sequence analysis of serially obtained, or longitudinal, samples [15, 18].
Described herein is a new approach to such longitudinal sequence analysis, which involves two steps. The first part of our bioinformatics approach allows one to define and visualize sites that are under positive selective pressure in the viral population. Defining the sites under selective pressure can help infer the antibody specificities that are active in the plasma, and to identify key mutations for characterization during experimental follow-up studies. The second part of the approach is a computational method to down-select sequences objectively from a very large sequence sample, yielding a representative subset of viral variants. The resulting set of sequences is an “antigenic swarm,” which captures mutations in the sites that are under the most potent selective pressure as they first emerge in the evolving HIV-1 quasispecies [15, 18, 29]. The size of the sequence subset involves a trade-off between the cost of including more variants and the degree of selection to be represented. Our approach involves two parameters that can be adjusted to balance these two factors, explore the data, and choose the most representative set given experimental feasibility and sample-size limitations. This process has been worked through retrospectively in individual CH505, where extensive information regarding antibody interactions and targeted Env epitopes [15, 16] is available, to determine how well the relevant diversity is captured by our informatics method in this case. This approach can be used to select Envs (or similarly diversifying variants) as reagents, e.g. for Env production for synthesis and use in binding assays, or to generate pseudoviruses for use in neutralization assays. In turn, the resulting reagents can be used to study relationships between viral phenotype and genotype, and to investigate in better detail how neutralizing antibody responses develop by affinity maturation.
Resulting sets of antigens provide a useful baseline for basic research and may also inform immunogen design. A working hypothesis to explain the observation that bnAbs tend to arise late in infection, after antigenic diversification has arisen in the subject, is that serial immune escape in vivo drives antibody lineages to adapt to the emerging viral variants, eventually enabling recognition of the diverse forms of the targeted epitope found in the circulating population. Thus, a polyvalent vaccine that represents Env diversity may be one strategy for inducing antibodies with greater breadth than single, invariant clonal antigens. Related work in vaccine design against HIV-1 has suggested that Env variants sampled during development of heterologous neutralization breadth could be administered as immunogens [38, 39]. Described herein is a method for Env selection to ensure comprehensive representation of antigenic diversity.
Results
A process for antigenic swarm selection has been implemented, which consists of two phases. The first phase identified protein sites most likely to be under positive (diversifying) selection, by considering the extent to which the TF amino acid is “lost” at any one time-point during the longitudinal sampling period. This yielded a list of sites of interest, from which all amino acid mutations that arose over time were tabulated. The second phase selected sequences that represented the mutational variants among sites selected in the first phase. The two phases of analysis, and parameters that influenced the number of sites and sequences thereby obtained, are detailed below.
It is worth noting that the single-genome amplification (SGA) sequences analyzed here were obtained by limiting-dilution PCR, which provides genetic linkage across all of the env gp160, without recombination artifacts, and with limited nucleotide substitution errors in cDNA synthesis [40]. Unlike Sanger sequencing from bulk PCR or large numbers of fragmentary high-throughput reads from unlinked templates, SGA sequences provide high-quality sequence data ideally suited to understand how viruses adapt to host immune responses over time [5, 14, 21, 40, 41].
Site Selection
TF loss varied across sites. Building upon recent studies of antibody/Env coevolution in the study subject CH505 [15, 16], first the strategy was applied to this subject to determine how well the method performed in a case where key epitopes have been defined and well characterized. 398 sequences from 14 time-points sampled over three years were aligned across 953 sites in the Env protein.
Initially dominated by the TF form, mutational variants developed over time, and displayed a variety of dynamics among sites with high TF loss.
Peak TF loss identified selected sites. The “peak” TF loss per site was defined as the highest TF loss in that site over all time-points sampled, and it was used to select candidates for sites under immune selection. In all, 15 sites completely lost the TF form during the three-year sampling period, while the other sites never reached 100% TF loss. The cumulative distribution of peak TF loss per site, depicted in
Selected sites were consistent with antibody-driven selection. As illustrated in
Table 2 lists the 35 selected sites with 80% TF loss, ranked by the earliest time at which any non-TF variant exceeded 10%, with ties resolved by cumulative TF loss sorted in descending order. Most (91%) of the selected sites occurred in gp120. In the context of the Env trimer structure, the selected sites formed localized clusters on the outer domain of gp120 (
The third cluster, in the CD4bs, is the most complex. The CD4bs is the target of both the CH103 bnAb lineage [15] and the CH235 nAb helper lineage [16] in subject CH505. Many of the 32 selected gp120 sites included structurally defined contacts for CD4 [47, 48], and several previously studied CD4bs bnAbs, including VRC01 [47, 48], NIH45-46 [49], and b12 [50]. Although the current study is retrospective, this pattern of mutations would have indicated the presence of CD4bs antibodies in the subject, as well as when they were beginning to exert selective pressure, even prior to isolation of nAb lineages. As expected, CH103 contacts and resistance mutations were well represented among the selected sites [16]. Three selected sites (279, 281, 275) were localized to CH103 light-chain contacts near loop D (
To consider what sites might be missed by the TF-loss criterion, the localization of sites that never reach 80% loss was explored. The 365 sites that varied were dispersed over the entire protein, as expected. Requiring multiple mutations among all 398 available sequences, regional patterns appeared in the spatial distribution of mutations (
A concise representation of selected sites was to string them together to form “concatamers” of 35 amino acids. Sites in concatamers were ordered not by their position in the primary Env sequence, but by when non-TF mutations first emerged and by cumulative TF loss, which suggested a cumulative progression of mutations. Modified sequence logos [52, 53], in which symbol height indicates frequency in a sample, show this progression over time clearly (
Electrostatic charges of amino acid side chains, depicted by symbol colors (
Swarm Selection
Algorithm. The swarm-selection algorithm is outlined as a flow-chart in
The algorithm is deterministic, meaning it will always produce the same set of sequences from a given alignment, because the algorithm does not make any random choices, and does not depend on the order in which sequences are provided in the input alignment. The algorithm was made efficient through use of vector operations and computes distance matrices only when they are needed to choose between otherwise ambiguous alternatives. Its run time is expected to grow no worse than linearly with the number of sequences in the input alignment. That is, doubling the number of input sequences should no more than double the expected run time.
Each mutation observed more than once in a selected site will ultimately be included in the antigenic swarm. The algorithm isolated mutations of interest in the least divergent sequence background possible, among available sequences sampled. It did this by progressively covering mutations that occurred in selected sites in the first time-point they appeared, and by representing them with the sequence most similar to the TF or, to resolve ties, the sequence most similar to those under consideration (lower-right box in
Objective choice of representative variants among selected sites. The algorithm identified 54 Envs that covered variant diversity at the 35 sites selected by TF loss. Table 3 summarizes these as concatamers. Algorithm selection criteria had at least two clear consequences. First, the gradual accumulation of mutations found in early infection was deliberately mimicked using this strategy. Second, the appearance of each new mutation of interest is, by design, relatively isolated from other accumulating mutations emerging in the within-host virus population. Therefore, to the extent possible given sampling, each mutation in each of the selected sites will be expressed in a context as close as possible to the form of the Env in which it was embedded when it first began to appear in the viral population at a high enough level to be sampled. By using the antigenic swarm to characterize variation among neutralization phenotypes in the population, if a particular mutation conferred a phenotypic change in either antigenicity or neutralization susceptibility of an isolate, then that change would be identified relative to the other mutations naturally occurring in the sampled virus population.
Swarm variant frequencies (
Random sequence selection. A resampling experiment was performed to evaluate the swarm-selection algorithm against a null distribution, which might be sampled by less informed methods. To eliminate multiple copies of the same Env sequence, the full-length Envs that had been normalized were randomly sampled. Removing duplicates and excluding Envs with premature stop or incomplete codons gave 260 distinct Envs, from which the same number of sequences as in the swarm set were repeatedly resampled, without replacement.
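As an illustration only, the resampling experiment might be sketched in Python as follows. The function names, the toy sequences, the fixed random seed, and the representation of a mutation as an (alignment column, residue) pair are all assumptions made for this sketch; they are not taken from the swarmtools software.

```python
import random
from statistics import median

def mutations_covered(seqs, targets):
    """Count target (column, residue) mutations carried by at least one sequence."""
    carried = {(i, s[i]) for s in seqs for (i, _) in targets}
    return len(carried & targets)

def null_coverage(envs, targets, set_size, n_draws=1000, seed=0):
    """Coverage for n_draws random sets, each drawn without replacement."""
    distinct = list(dict.fromkeys(envs))  # drop duplicate sequences, keep order
    rng = random.Random(seed)             # fixed seed so the sketch is repeatable
    return [mutations_covered(rng.sample(distinct, set_size), targets)
            for _ in range(n_draws)]

# Toy data (hypothetical): the TF is "NKTA"; targets are non-TF variants of interest.
envs = ["NKTA", "NRTA", "NRTA", "NRSA", "NQSA"]
targets = {(1, "R"), (2, "S"), (1, "Q")}
draws = null_coverage(envs, targets, set_size=2, n_draws=200)
print(median(draws), min(draws), max(draws))
```

The resulting list of coverage values plays the role of the null distribution against which the coverage of an algorithmically selected set of the same size can be compared.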
A comparison was made as to how many of the non-TF mutations tabulated in the first pass of the algorithm through all 398 sequences were covered. The antigen swarm set was designed to cover 92 distinct mutations that arose in the 35 selected sites. As expected, random sampling of Envs gave consistently lower coverage of the mutations needed (median 77; 95% CI: 69 to 84) than the 92 mutations that were included by the swarm-selection algorithm (
Further, hierarchical dendrograms were computed from Hamming distance matrices for swarm and random sets, and the outcomes were summarized as clustering coefficients. The clustering coefficient, a dimensionless quantity, is the mean distance (from 0 to 1) at which sequences cluster together. It summarizes the distribution of terminal branch lengths as the expected similarity (the complement of normalized distance) among terminal branches [54]. To give an intuition for how this coefficient works,
These metrics compared sequence sets from the swarm selection algorithm with null distributions that were obtained by random selection. Because the three metrics are only loosely correlated, they measure different aspects of selected sets of sequences. This appears to be the first attempt to establish criteria to quantify how well any subset of sequences from a larger related set represents diversity (distinct concatamers), polymorphisms (mutations included), and progressive divergence (clustering coefficient) in the larger set.
Phylogenetic and antigenic contexts. The phylogenetic context of Envs represented in the swarm set showed that selected sites persisted against the scattered background of ephemeral mutations (
ELISA binding assays with mAbs from the CH505-derived CH103 CD4 binding site bnAb B cell lineage were available for gp120s synthesized from a subset of 27 selected Envs (
Similarly for neutralization sensitivity, 26 selected Envs were among the 121 Env-pseudotyped viruses tested for sensitivity to neutralization by mAbs of the CH103 lineage (
In
Swarm Size Adjustments
A main goal of this procedure is to enable down-selection, from a large set of Env sequences, of an Env subset that recapitulates development of antigenic diversity in the subject, given realistic experimental and cost constraints. Amino acid sites that were most likely under strong selective pressure were identified first (an important analysis step in its own right), and then sequences were chosen to represent diversity found in those sites. Having more available sequences per time-point allows a user to choose sites with more complete TF loss. To explore how the algorithm functions when applied to larger data sets, it was applied to additional acutely infected study participants with much more extensive sampling, CH694 and CH848.
The cutoff used for loss of the TF form determined how many sites were selected. In turn, this influenced the number of sequences, here Env variants, in the antigenic swarm sets intended for synthesis for phenotypic assays (Table 4). Similarly, the minimum variant count reduced the number of sequences selected by excluding rare mutations. For example, a minimum variant count of two excluded mutations that only ever appeared once. If one did not wish to include sequences that capture each isolated mutation found in selected sites, the size of the resulting reagent set defined by the algorithm could be reduced. By exploring different parameter settings, investigators can evaluate the impact of including variants that represent increasingly rare mutations, in light of resources available for experimental reagents. Unpublished Env sequences from two additional study participants with much greater sequencing depth and more longitudinal samples than individual CH505 provided an opportunity to consider effects of varied parameters (Table 4). In these cases, increasing the TF loss cutoff to 95% or 100% was necessary to preserve a desired swarm size of 100 Envs chosen from over a thousand Env SGA sequences.
A large proportion of variants are required to represent each recurrent mutation among selected sites from hypervariable loop regions. An alternative approach may be to emphasize only those sites that can be mapped onto HXB2, and consider hypervariable regions separately. This approach assumes it is not essential to sample each particular form that appears in disordered regions, such as the hypervariable portions of V1, V2, V4, and V5. Instead, it emphasizes covering all variants among more ordered regions and picking up the linked variants in disordered regions without sampling them completely. If only distinct HXB2 positions were counted and represented, then 80% TF loss with CH848 gives 65 sites and 127 sequences against 970 sequences obtained through peak breadth at d1432. These 127 sequences capture all 209 mutations (including indels) in the 65 HXB2 positions that appear more than once. Similarly, for CH694, the algorithm chose 112 sequences from 1103 Envs to represent 181 mutations that appeared more than once over 59 sites with at least 80% TF loss.
Chronic Infection
These methods were developed initially to select sequences from longitudinal studies beginning early in infection, where the TF virus is known or reliably inferred, and the progression of escape mutations is readily apparent. This is not true for chronic infection. Still, it may be desirable to select a subset of sequences that represent diversity in chronic infection samples. To evaluate the algorithm's ability to select an antigenic swarm from a chronic infection, the algorithm was applied to sequences from a study participant enrolled in chronic infection, designated CH457 [45]. 205 plasma SGA Envs from ten sample time-points were analyzed (median was 20 sequences per time-point; the distribution ranged from 12 to 35). In the chronic enrollment sample, five of twenty Envs exactly matched the within-time-point consensus. One of these (w0.e18) was used as the reference to compute variant frequencies. No variation was detected in 582 of 888 aligned sites, and an 85% cutoff identified 35 sites that were candidates for strong positive selection (
With singleton variants excluded, the algorithm selected a swarm of 44 Envs (
The phylogeny indicated a persistent, divergent secondary clade, represented by 24 of 205 plasma Envs (
Discussion
The task of selecting representative variants from a larger set for follow-up studies from longitudinal samples is routine, but can be complex when choosing from hundreds to thousands of sequences. Furthermore, while methods for isolating bnAbs from HIV-1 infected subjects and vaccinees are rapidly improving, it remains a challenging task. The approach described herein suggests the task can be divided into two main parts, identification and tracking of selected sites within a subject, and identifying sequences that represent the antigenic diversification in that subject. A computational approach (LASS) to automate these tasks has been developed.
First, transmitted-founder loss in one or more samples in a longitudinal study is used as a simple way to identify sites under selective pressure. Despite the existence of a variety of methods to test for positive selection [55, 56], their utility for identifying sites under positive selection in the context of within-subject viral evolution during acute infection is limited by low statistical power for inference. In contrast, the loss of the TF form at any one time-point is a simple and inclusive measure. In CH505, sites selected by this criterion were focused in regions that were highly relevant to the adaptive immune responses that were previously identified in the subject [15]. This suggests that in future studies, structural localization of selected sites could be used to raise hypotheses about specificities of bnAbs in plasma. Furthermore, the timing of TF loss identifies these important mutational events, and could help determine when antibodies exert the most selective pressure. Such information could guide the search for monoclonal antibodies in subjects with potent nAbs, by focusing on antibody specificities that recognize the epitopes under selection, and by aiding in selecting the sample used to isolate new bnAbs in a subject who was sampled over several years of follow-up.
Second, a rational, objective method is provided to guide the selection of Env sets for experimental study from large sequence sets sampled over time. LASS can select sets of sequences that represent gradual antigenic diversification induced during bnAb development, ensuring that all variants in sites identified by TF loss are represented in an Env reagent set. The method is computationally efficient, scaling linearly with the number of sequences, and minimizes redundancy, selecting only as many variants as are necessary to represent diversity in sites selected by TF loss. The algorithm starts with sequences most closely resembling the form that established the infection, and gradually increases diversity in a manner that parallels natural infection.
LASS was used to identify selected sites and representative sequence subsets in longitudinal samples from three acutely infected subjects and one subject sampled only during chronic infection. SGA sequences were analyzed from all four subjects, providing intact env gp160s with no recombination artifacts and minimal error [5, 14, 21, 40, 41]. While this provides optimal conditions, the approach could also be used in other longitudinal study designs and sequencing strategies.
In related research, sequence selection has been represented as a set-coverage problem [57], and networks of covarying sites are identified from a population-level alignment, which represents a particular clade [58], not a within subject alignment as in our case. A limitation of our approach, which will be addressed, is that sites are treated independently, while covariation between sites may influence variant suitability and TF loss. Considering covariation may potentially facilitate identification of smaller representative swarm sets. However, by progressively adopting mutations in the context of variant sequences where they first arise in the sequence sets, the swarm sets, by definition, allow the study of mutations in the context of the natural pairings as they were found in vivo. This strategy also has a potential advantage over site-specific mutagenesis, which necessarily studies mutations in isolation. A mutation observed in a later time-point and introduced into the TF, for example, may not have the same phenotypic consequences as it does in the background of the Env in which it arose, so the ability to study related natural variants isolated serially may be ultimately more informative.
Virus diversification precedes, and thus may drive, the development of neutralization breadth in HIV-1 infection [16, 18], and exposure of a neutralizing antibody lineage during affinity maturation to a gradual increase in antigen diversity could result in selection of antibodies with increased breadth. Thus, mimicking in vivo diversification has been proposed as a possible vaccination strategy [15, 18, 59-61]. With recent technological advances, it is becoming feasible to test vaccine designs that include not only 5-10 antigens, but potentially 50-100 antigens, administered as DNA either in series or in combination [62, 63]. As LASS uses an efficient algorithm to identify candidate sets of antigens with progressively increasing diversity at important sites in polymorphic viral proteins, it could be used to aid in the design of such “antigen-swarm vaccines.” An additional potential use of the algorithm, not described here, is to analyze large antibody sequence data sets to identify and analyze selection, and to select a representative subset of antibody sequences from clonal lineages for detailed study. For example, the algorithm could be used to identify key members of antibody clonal lineages as mutations arise for HIV-antibody co-evolution studies.
In summary, computational methods have been developed to identify and track selected sites in longitudinal data, and to use these selected sites to aid in down-selecting sequence sets for reagent design, or for testing the “antigen swarm” vaccine concept. When applied to longitudinal HIV samples, a retrospective evaluation of viral sequences from the intensely studied subject CH505 showed that LASS provided meaningful results, highlighting selected sites that were indeed under immune selective pressure, and building a non-redundant collection of sequences tailored to characterize the phenotypic consequences of those mutations. LASS may be useful in many contexts, such as assisting in bnAb isolation, as well as studies of other viral infections, and studies of antibody evolution.
Methods
Site Selection
Transmitted-founder (TF) loss is the proportion of sequences sampled per time-point that have lost the ancestral TF state. This is an efficient way to select rapidly evolving sites. Here no information other than TF loss was considered, though such information could be used to select sites. This could include signature sites associated with neutralization assay outcomes and antibody contact residues from structural data, if available.
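The TF-loss definition above, together with the peak TF loss and cutoff used for site selection, can be illustrated with a minimal Python sketch. It assumes an already aligned set of amino-acid sequences grouped by time-point; the function names, the toy data, and the 0-based column indexing are hypothetical choices for illustration rather than the swarmtools implementation.

```python
def tf_loss_by_timepoint(tf, samples):
    """Return {timepoint: [loss per site]}, where loss is the fraction of
    sequences at that time-point whose residue differs from the TF."""
    loss = {}
    for timepoint, seqs in samples.items():
        n = len(seqs)
        loss[timepoint] = [
            sum(seq[i] != tf[i] for seq in seqs) / n
            for i in range(len(tf))
        ]
    return loss

def peak_tf_loss(loss):
    """Highest TF loss per site over all sampled time-points."""
    return [max(per_site) for per_site in zip(*loss.values())]

def select_sites(peak, cutoff=0.8):
    """Alignment columns (0-based) whose peak TF loss meets the cutoff."""
    return [i for i, p in enumerate(peak) if p >= cutoff]

# Toy aligned data (hypothetical): four columns, two time-points.
tf = "NKTA"
samples = {
    "d020": ["NKTA", "NKTA", "NRTA", "NKTA"],
    "d100": ["NRTA", "NRTA", "NRSA", "NRTA", "NRTA"],
}
loss = tf_loss_by_timepoint(tf, samples)
peak = peak_tf_loss(loss)
sites = select_sites(peak, cutoff=0.8)
```

In this toy case only the second column loses the TF residue in every sequence at some time-point, so it is the only column selected at an 80% cutoff. Raising or lowering the cutoff changes how many sites are returned.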
The starting point was env cDNA amplicons sequenced via single-genome amplification (SGA), also known as limiting-dilution PCR, sampled longitudinally, beginning early (3-6 weeks) after infection, with 3-5 years of clinical follow-up. Sequencing effort was intended to obtain about 20 sequences from each of 14 samples. It is common for SGA from homogeneous infections to yield multiple identical sequences, all of which were kept. A naming convention for Env sequences was used to ensure consistency and so sample time-point labels could be parsed from sequence names. To study variant dynamics, the number of elapsed days after the earliest sample from sample dates was computed, and the number of days post-infection estimated from the earliest sample was added. For homogeneous infections sampled before the onset of immune selection, a simple Poisson model of random sequence evolution provides the estimate [5, 6].
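The time axis described above can be sketched as follows, assuming sample dates are available as calendar dates and that the number of days post-infection at the earliest sample has already been estimated. The function name and the example dates are hypothetical.

```python
from datetime import date

def days_post_infection(sample_dates, days_at_first_sample):
    """Elapsed days since the earliest sample, shifted by the estimated
    number of days post-infection at that earliest sample."""
    first = min(sample_dates)
    return [(d - first).days + days_at_first_sample for d in sample_dates]

# Hypothetical sample dates; earliest sample assumed to be 28 days post-infection.
dates = [date(2010, 1, 1), date(2010, 1, 15), date(2010, 3, 1)]
timeline = days_post_infection(dates, days_at_first_sample=28)
```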
The HXB2 reference sequence was added to facilitate numbering positions, the sequences were codon-aligned, then translated, and a phylogeny was inferred. Because no algorithm aligns the HIV envelope perfectly, particularly when a translation is needed, manual alignment was started after preliminary alignment with an HIV-specific hidden Markov model. Aligning all but the hypervariable loops is trivial given such a preliminary alignment. Because hypervariable loops evolve rapidly by tandem duplications, a useful alignment criterion is self-consistency, rather than identification of homologous sites. For example, a putative N-linked glycosylation motif could be placed at either the N- or C-terminal position of an otherwise gapped region. Uniform placement of such motifs, particularly where HXB2 has no corresponding sites, facilitates analysis because the variants appear more clearly as evolutionary signal if aligned consistently.
Maximum-likelihood trees were inferred from translated amino-acid sequences with PhyML v3 and the HIVw (HIV-specific, within-host) substitution model [64-66]. The phylogeny is used to order sequences and is an organizing principle for sequence evolution from the ancestral TF virus. Potential N-linked glycosylation (PNG) sites were annotated by replacing the asparagine in each match to the Nx[ST] motif with O, giving Ox[ST]. In the PNG motif, x indicates any amino acid except proline, and the third position is either serine or threonine. For each aligned site, TF loss per time-point sequenced was computed, the maximum identified, and this peak TF loss was compared with a threshold. The threshold was adjusted and the resulting number of sites was considered. This gave a list of sites, which were considered as interesting evolutionary “hot spots” to be represented by a swarm of Envs.
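The PNG annotation step can be illustrated with a short regular-expression sketch in Python. Skipping alignment gap characters ("-") when matching the motif is an assumption of this sketch, as is the function name; the source does not state how gaps are handled.

```python
import re

# N, then (ignoring gaps) any residue except proline, then (ignoring gaps) S or T.
# A lookahead is used so that overlapping motifs such as "NNTS" are both found.
PNG_MOTIF = re.compile(r"N(?=-*[^P-]-*[ST])")

def annotate_png(seq):
    """Replace the asparagine of each Nx[ST] motif with 'O'."""
    return PNG_MOTIF.sub("O", seq)

print(annotate_png("MNGSANPTN-GT"))
```

Marking glycosylated asparagines with a distinct symbol lets gain or loss of a PNG site be counted as a change of state at that column, even when the residue itself is still N.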
Swarm Selection
Having used the TF-loss criterion to select sites from the alignment, a set of Envs was identified to represent the variants that occur at these sites. By simple combinatoric calculations, there are at least 10^100 distinct ways to choose k representatives from n individuals for n above 427 and k over 100. On the scale of the current example, choosing 50 representatives from 385 candidates gives over 10^63 distinct alternatives. To search such a vast space of possible solutions is intractable for even the fastest computers. Worse, in the regime of interest, the number of possible solutions grows exponentially with n or k, where k<<n/2.
A simple, efficient algorithm was designed and implemented to select sequences that represent variants at sites selected by TF loss. The approach is greedy, meaning it adds variants iteratively, rather than refining the entire set for potentially better solutions. Such a greedy approach is unlikely to give the best possible overall solution, but can efficiently provide reasonably good solutions, and can be refined to include other criteria as needed. It works from the same alignment used to select sites, and assumes that the TF form and sample time-points can be identified from sequence names. A virtue of the greedy approach is that it considers time of sampling, and starts with sequences most like the form that established the infection, then progressively builds diversity in a manner that follows the natural course of infection. In this way, common mutations and mutations that eventually go to fixation are sampled many times.
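The size of this search space can be checked directly with exact integer arithmetic. The following Python sketch uses only the standard library; the helper name is a hypothetical choice for illustration.

```python
import math

def log10_choose(n, k):
    """Base-10 logarithm of the number of ways to choose k items from n."""
    return math.log10(math.comb(n, k))

# Orders of magnitude for the two cases mentioned in the text.
print(round(log10_choose(385, 50), 1))
print(round(log10_choose(428, 101), 1))
```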
As outlined in
Within a time-point, the choice among multiple Envs that carry a needed variant is resolved by a series of criteria. The algorithm first tries to identify the sequence that uniquely minimizes the distance (number of mutations, including gaps) to the TF among selected sites. Then, in case of ties, a sequence is chosen that minimizes distance to the full-length TF. Finally, if ties remain, a sequence is chosen that minimizes the average distance to the current working set of sequences.
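The cascade of tie-breaking criteria described above can be sketched as follows. This Python sketch is illustrative only; the function resolve_ties and its argument layout are hypothetical, and it assumes that the average distance to the working set is computed over the selected sites, which the description does not specify.

```python
def hamming(a, b):
    """Number of differing positions (gaps count) between aligned strings."""
    return sum(x != y for x, y in zip(a, b))

def resolve_ties(cands, tf_sites, tf_full, swarm_sites):
    """Return the candidate names that remain tied after all criteria.

    cands: name -> (selected-site string, full-length aligned string)
    tf_sites, tf_full: the TF at the selected sites and at full length
    swarm_sites: selected-site strings of the current working set
    """
    criteria = [
        # 1. distance to the TF among selected sites
        lambda n: hamming(cands[n][0], tf_sites),
        # 2. distance to the full-length TF
        lambda n: hamming(cands[n][1], tf_full),
        # 3. average distance to the current working set (assumed: selected sites)
        lambda n: sum(hamming(cands[n][0], s) for s in swarm_sites) / len(swarm_sites),
    ]
    tied = sorted(cands)
    for score in criteria:
        best = min(score(n) for n in tied)
        tied = [n for n in tied if score(n) == best]
        if len(tied) == 1:
            break
    return tied
```

A returned list with more than one name corresponds to an unresolved tie, which would call for additional selection criteria.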
The sequence selected to represent the needed mutation is included in the swarm set, the corresponding counts in the table of needed mutations are set to zero, and iteration continues. An option exists to require that specific sequences be included, if desired. Such a sequence is added during iteration rather than beforehand, to ensure inclusion of earlier forms that carry variants found on the specified sequence. Upon iterating over all sample time-points, selected variants, and needed sites, the swarm is complete. This approach is deterministic for a given set of sequences, though unresolved ties may exist among alternative sequences for some data. Any remaining ties would indicate a need for additional selection criteria, though in practice this situation has yet to be encountered. An advantage of this approach is that it selects only as many sequences as are necessary to represent the mutational variants in selected sites, rather than some arbitrary number. However, the greedy approach errs towards inclusion of early point mutations that could be included with later, more divergent, viruses.
The software tools for swarm selection were written as an R package called swarmtools. Example data from CH505 and a tutorial “vignette” are included. Phylogenetic trees have been rooted on the TF virus, ladderized, and rendered as phylograms, and paired with pixel plots (derived from Highlighter plots [5]), which illustrate polymorphisms as either mutations or insertions/deletions relative to the TF sequence. These have been found to be informative representations for understanding evolution of the virus population in an acutely infected host, given the limited genetic diversity that occurs in early infections [15, 45]. Renderings such as in
Example 8 describes a method for swarm immunogen selection.
Neutralization breadths are uniformly distributed across chronic sera. This suggests anyone, not only 10-20%, might develop broadly neutralizing antibodies (bnAbs) if exposed to immunogens via vaccination. Working back from mature bnAbs through intermediates has enabled understanding their development from the unmutated germ-line ancestor, and has shown that viral genetic diversity preceded the development of neutralization breadth. Described herein is the selection of sets of viral variants to investigate the role of antigenic diversity in serial samples. It is hypothesized that sites losing the ancestral, transmitted-founder (TF) virus state are most likely under positive selection, not drift. From acute, homogeneous infections with 3-5 years of follow-up, sites of interest among plasma SGA Envs were identified by comparing the frequency of sequences per time-point having the TF state with a threshold, typically 5%. Sites with TF frequencies below threshold are putative escapes. Additional sites of interest were considered where more information was available, i.e. tree-corrected neutralization signatures and antibody contacts determined from co-crystal structure. Progressive loss of the TF form was used to identify clones carrying representative escape mutations.
In CH505, a study participant with an early antibody that bound autologous TF virus, 398 Envs from 14 time-points over three years were studied (median per sample: 25, range: 18-53). 36 sites had TF frequencies below 20% in at least one sample. Neutralization and structure data identified 28 and 22 interesting sites, respectively. Together, this identified six gp41 and 53 gp120 sites, plus six V1 or V5 insertions not in HXB2. 100 clones that represent the sites of interest were selected. Selected clones had a lower clustering coefficient and greater diversity in selected sites than sets sampled randomly. This approach was developed to select reagents for neutralization assays, then study affinity maturation, autologous neutralization, and the transition to heterologous neutralization and breadth. Specific implications for vaccine design, given sustained coevolution of immunity and escape, are described herein.
Introduction
Neutralizing antibodies are immune correlates of protection in all licensed antiviral vaccines. It is not yet known how to induce broadly cross-reactive neutralizing antibodies (bnAbs) against HIV-1 via vaccination. Variation among proteins that interact directly with antibodies provides evolutionary signal about the effects of immune selection. Motivated by previous findings that early virus diversification drives neutralization breadth in early infection with HIV-1, it was hypothesized that progressively increasing antigen diversity can induce bnAbs. Herein immunogens with progressively increasing diversity at key sites in polymorphic viral proteins are identified. A major innovation of the swarm vaccine concept is rapid turnaround from viral sequence information to immunogen candidates. Another novel aspect is its potential for general utility to promote bnAb development against highly variable viruses, bacteria, and secreted toxins.
Neutralizing antibodies (nAbs) block viruses from entering cells.
All chronic plasmas neutralize some HIV-1 Envs; half neutralize at least 50% of diverse viruses. Virus envelope diversity and ongoing immune escape drive selection for greater breadth. Env diversity precedes nAb breadth.
Typical course in natural infection: autologous nAbs, followed by selection for relative Env resistance, then selection of bNAbs that tolerate the Env variants.
About 80% of new infections are established by single transmitted/founder (TF) virions that diversify randomly until immune selection becomes active.
Longitudinal samples from acute infection through 3-5 years of follow-up enable following bnAb development.
Single-genome sequencing of virus envelopes yields high-quality sequence data.
Diverse swarms of viral variants that induced breadth in a bnAb donor were selected.
Related work in vaccine design has suggested that Env variants sampled during development of heterologous neutralization breadth could be administered as immunogens. Diversity among such Envs is hypothesized to emulate immune selection and induce antibodies with more varied specificities than single, clonal immunogens.
In other related research, Env selection has been formally represented as a set-coverage problem. That work identifies networks of covarying sites that occur in a population-level alignment, which represents a particular clade, subtype, or (in the case of hepatitis C virus) genotype. It distinguishes early, transmissible, transmitted-founder viruses from later, chronic viruses, and utilizes only covarying sites found to occur in the TF stage. Though the underlying vaccine concept in that line of inquiry differs from that used herein, the formalism is related to the approach described herein.
Sequence alone: Using evidence for selection as measured by TF loss, once sequences are obtained, a swarm vaccine can be designed.
Contacts: requires antibody/Env structure, or an analog antibody with a known structure.
Signatures: correlations of mutations with bNAb sensitivity can be identified, yielding sites of interest both inside and outside of contacts.
A simple indicator of immune selection in viral proteins was developed to identify immunogens that represent diversity induced during development of broadly neutralizing antibodies. A simple, efficient algorithm was designed and implemented to select sequences that represent the accumulation of mutations involved in immune recognition for a vaccine sequence cocktail or reagent set. By factoring in time of sampling, the algorithm starts with sequences nearest the form that established the infection, and progressively builds on diversity in a manner that parallels natural infection, so common mutations and mutations that eventually go to fixation are naturally sampled many times. It is deterministic for a given input set. However, unresolved ties may exist among alternative clones for some data.
Results
Site Selection
398 clones from 14 time-points over three years were aligned (median per sample: 25, range: 18-53) across 953 Env sites. TF loss per site was computed for each of 14 sample timepoints, weeks 4 through 160 (
Clone Selection
The selected sites were extracted from aligned sequences and concatenated to review Env variation among candidate clones. This representation as concatamers (sequences formed by concatenating selected sites) formed the basis for clone selection. The greedy swarm-selection algorithm (
Discussion
Because sites are not independent, but covary, information about site covariation could facilitate smaller swarm sets that represent selected sites. Other optimization algorithms are likely to yield smaller swarms, for the small cost of more computing time.
Experimental validation as immunogens will be carried out.
A strategy to identify candidates for Env sites under immune selection from longitudinally sampled sequences was developed. In CH505, two thirds of these selected sites were ultimately related to the CH103 bNAb lineage, by either signature analysis or proximity to structural contacts. Whether this information can guide selection of vaccine antigen sets that recapitulate the evolutionary pressure imposed by Env antigenic diversity on bNAb lineages is being explored. In some embodiments, gradual accumulation of epitope diversity may be key.
Methods
Site Selection
Transmitted-founder (TF) loss is the proportion of sequences sampled per time-point that have lost the ancestral TF state. This is an efficient way to select rapidly evolving sites. Herein, no other information than TF loss was considered, though such information could be used to select sites. This could include signature sites associated with neutralization assay outcomes and antibody contact residues from structural data, if available.
The starting point was SGA env (DNA) sequences, from a minimum of roughly 20 clones per sample, sampled longitudinally, beginning early (3-6 weeks) after infection, with 3-5 years of clinical follow-up. It is common for SGA from homogeneous infections to yield multiple identical sequences, all of which were kept. A naming convention for clones was used to ensure consistency and so that sample time-point labels could be parsed from sequence names. To study variant dynamics, the number of days elapsed since the earliest sample was computed from sample dates, and the estimated number of days between infection and the earliest sample was added. For homogeneous infections sampled before the onset of immune selection, a simple model of sequence evolution provides the estimate (Keele B F, Giorgi E E, Salazar-Gonzalez J F, Decker J M, Pham K T, et al. (2008) Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc Natl Acad Sci (USA) 105: 7552-7557; Giorgi E E, Funkhouser B, Athreya G, Perelson A S, Korber B T, Bhattacharya T (2010) Estimating time since infection in early homogeneous HIV-1 samples using a Poisson model. BMC Bioinformatics 11: 532. doi: 10.1186/1471-2105-11-532).
The HXB2 reference sequence was added to facilitate numbering positions, the sequences were codon-aligned and translated, and a phylogeny was inferred. Though no algorithm aligns the HIV envelope perfectly, a useful starting point for manual alignment uses an HIV-specific hidden Markov model [GeneCutter]. Aligning all but the hypervariable loops is trivial given such a preliminary alignment. Because hypervariable loops evolve rapidly by tandem duplications, a useful alignment criterion is self-consistency, rather than identification of homologous sites. For example, a putative N-linked glycosylation motif could be placed at either the N- or C-terminal position of an otherwise gapped region. Uniform placement of such motifs, particularly where HXB2 has no corresponding sites, facilitates analysis because the variants appear more clearly as evolutionary signal if aligned consistently.
Maximum-likelihood trees were inferred from translated amino-acid sequences with PhyML and the HIVw (HIV-specific, within-host) substitution model. The phylogeny is used to order sequences and is an organizing principle for sequence evolution from the ancestral TF virus.
Potential N-linked glycosylation (PNG) sites were annotated by replacing each asparagine that matches the Nx[ST] motif with O, so that the motif becomes Ox[ST]. (In the PNG motif, x indicates any amino acid except proline, and the third position is either serine or threonine.)
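The PNG annotation can be illustrated with a short regular-expression sketch in Python. It is a sketch only: the function name annotate_png is hypothetical, and it assumes an ungapped amino-acid string, whereas an aligned sequence would first need its gap characters handled.

```python
import re

# An asparagine (N) followed by any residue except proline, then S or T.
# The lookahead leaves the following residues unconsumed, so overlapping
# motifs such as NNTS are both annotated.
PNG_MOTIF = re.compile(r"N(?=[^P][ST])")

def annotate_png(protein):
    """Replace the asparagine of each Nx[ST] motif with O, giving Ox[ST]."""
    return PNG_MOTIF.sub("O", protein)
```

For instance, under these assumptions annotate_png("ANGTW") marks the single motif asparagine, while a string in which proline occupies the x position is returned unchanged.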
For each aligned site, TF loss was computed per time-point sequenced, the maximum was identified, and this “peak” TF loss was compared with a threshold. The TF loss threshold determines the number of sites that are selected; a high TF loss threshold yields fewer sites than a low threshold. The threshold will depend on many variables, such as number of sequences sampled and time since infection. The threshold was adjusted and the resulting number of sites considered. This gave a list of sites in the alignment, which was considered as interesting evolutionary “hot spots” to be represented by a swarm of clones.
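The peak TF loss computation and thresholding can be sketched as follows. This Python sketch is illustrative and is not the swarmtools implementation; the function names and the input layout (a dictionary of time-point labels to lists of aligned sequences) are assumptions, as is the choice to select sites whose peak TF loss is at or above the threshold.

```python
def peak_tf_loss(tf, samples):
    """Peak TF loss per aligned site.

    tf: aligned TF sequence
    samples: time-point label -> list of aligned sequences (same length as tf)
    """
    peaks = [0.0] * len(tf)
    for seqs in samples.values():
        if not seqs:
            continue
        for i, tf_state in enumerate(tf):
            # TF loss: proportion of sequences at this time-point
            # that no longer carry the TF state at site i.
            lost = sum(1 for s in seqs if s[i] != tf_state)
            peaks[i] = max(peaks[i], lost / len(seqs))
    return peaks

def select_sites(tf, samples, threshold):
    """Indices (0-based) of sites whose peak TF loss reaches the threshold."""
    return [i for i, p in enumerate(peak_tf_loss(tf, samples)) if p >= threshold]
```

Raising the threshold in select_sites returns fewer sites, which mirrors the adjustment of the threshold described above.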
Clone Selection
Having used the TF-loss criterion to select sites from the alignment, a set of clones was identified to represent the variants that occur at these sites. Choosing k representatives from n clones gives at least 10^100 possibilities for n above 427 and k over 100. On the scale of the current example, choosing 50 clones from 250 candidates gives over 10^53 alternatives. To search such a vast space of possible solutions is intractable for even the fastest computers.
A simple, efficient algorithm was designed and implemented to select sequences to represent variants at sites selected by TF loss. The approach is greedy in that it adds clones iteratively, rather than refining the entire clone set in search of potentially better solutions. Such a greedy approach is unlikely to give the best possible solution, but can efficiently provide reasonably good solutions, and can be refined to include other criteria as needed. It works from the same alignment used to select sites, and assumes that the TF form and sample timepoints can be identified from clone names.
Clone selection works by initially tabulating amino acid variants among selected sites. This table of variant counts is used to monitor which remaining mutations need to be included in the swarm set. Variants that only ever appear once are disregarded. Candidates for clone selection must be functionally viable, that is, they must lack long deletions, premature stop codons, and incomplete codons, which typically result from frame-shift mutations. Starting with the TF form, the procedure iterates chronologically over timepoints sampled, and identifies a clone to represent each needed variant at each of the selected sites, should such a variant be present. The choice among multiple clones that carry a needed variant is resolved by a series of tie-breaking criteria, first to minimize distance (number of mutations, including gaps) to the TF form among selected sites, then for the full-length clone, and finally to minimize average distance to clones in the current swarm set. Any remaining ties would indicate a need for additional selection criteria. The clone selected to represent the needed variant is included in the swarm set, corresponding counts in the table of needed variants are set to zero, and iteration continues. Upon iterating over all sample timepoints, selected variants, and needed sites, the clone set is complete. A benefit of this approach is that it selects only as many clones as are necessary to represent the variants in selected sites. However, the greedy approach errs towards inclusion of early point mutations that would be included among later variants.
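The greedy procedure described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the swarmtools code: the function select_swarm and its arguments are hypothetical, candidates are assumed to have already been filtered for viability, the average distance to the swarm is assumed to be computed over the selected sites, and any tie left after the three criteria is broken by clone name purely to keep the sketch deterministic.

```python
from collections import Counter

def hamming(a, b):
    """Number of differing positions (gaps count) between aligned strings."""
    return sum(x != y for x, y in zip(a, b))

def select_swarm(tf_name, seqs, timepoint, sites, min_count=2):
    """Greedy clone selection (illustrative sketch).

    seqs: clone name -> aligned amino-acid sequence (viable candidates only)
    timepoint: clone name -> sortable sample time
    sites: alignment columns selected by TF loss
    min_count: variants seen fewer times than this are disregarded
    """
    # Concatamers: each clone reduced to its residues at the selected sites.
    concat = {n: "".join(s[i] for i in sites) for n, s in seqs.items()}
    tf_cat, tf_full = concat[tf_name], seqs[tf_name]

    # Needed variants: (site index, residue) pairs that are common enough
    # and are not already carried by the TF form.
    needed = set()
    for j in range(len(sites)):
        counts = Counter(c[j] for c in concat.values())
        needed |= {(j, aa) for aa, k in counts.items()
                   if k >= min_count and aa != tf_cat[j]}

    swarm = [tf_name]

    def key(n):
        mean_d = sum(hamming(concat[n], concat[m]) for m in swarm) / len(swarm)
        # Tie-breaking order from the text; the final name is an assumption.
        return (hamming(concat[n], tf_cat), hamming(seqs[n], tf_full), mean_d, n)

    for t in sorted(set(timepoint.values())):
        at_t = [n for n in seqs if timepoint[n] == t]
        for j in range(len(sites)):
            for aa in sorted(a for (jj, a) in needed if jj == j):
                cands = [n for n in at_t
                         if concat[n][j] == aa and n not in swarm]
                if not cands:
                    continue  # variant not present at this time-point
                best = min(cands, key=key)
                swarm.append(best)
                # Every variant carried by the chosen clone is now covered.
                needed -= set(enumerate(concat[best]))
    return swarm
```

Because each chosen clone clears every needed variant it carries, a later time-point adds a clone only when it introduces a variant that the swarm does not yet represent.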
Pissani F, Malherbe D C, Robins H, DeFilippis V R, Park B, et al. [Sellhorn G, Stamatatos L, Overbaugh J, Haigwood N L] (2012) Motif-optimized subtype A HIV envelope-based DNA vaccines rapidly elicit neutralizing antibodies when delivered sequentially. Vaccine 30: 5519-5526. doi: 10.1016/j.vaccine.2012.06.042
Malherbe D C, Doria-Rose N A, Misher L, Beckett T, Puryear W B, et al. [Schuman J T, Kraft Z, O'Malley J, Mori M, Srivastava I, Barnett S, Stamatatos L, Haigwood N L] (2011) Sequential immunization with a subtype B HIV-1 envelope quasispecies partially mimics the in vivo development of neutralizing antibodies. J Virol 85: 5262-5274. doi: 10.1128/JVI.02419-10
Pissani F, Malherbe D C, Schuman J T, Robins H, Park B S, et al. [Krebs S J, Barnett S W, Haigwood N L] (2014) Improvement of antibody responses by HIV envelope DNA and protein co-immunization. Vaccine 32: 507-513. doi: 10.1016/j.vaccine.2013.11.022
Giorgi E E, Funkhouser B, Athreya G, Perelson A S, Korber B T, Bhattacharya T (2010) Estimating time since infection in early homogeneous HIV-1 samples using a Poisson model. BMC Bioinformatics 11: 532. doi: 10.1186/1471-2105-11-532
Haynes B F, Kelsoe G, Harrison S C, Kepler T B (2012) B-cell-lineage immunogen design in vaccine development with HIV-1 as a case study. Nature Biotechnol 30: 423-433. doi: 10.1038/nbt.2197
Kaufman L, Rousseeuw P J (2005) Finding groups in data: An introduction to cluster analysis. Hoboken: John Wiley and Sons. 342 p.
Keele B F, Giorgi E E, Salazar-Gonzalez J F, Decker J M, Pham K T, et al. (2008) Identification and characterization of transmitted and early founder virus envelopes in primary HIV-1 infection. Proc Natl Acad Sci (USA) 105: 7552-7557.
Kwong P D, Mascola J R, Nabel G J (2013) Broadly neutralizing antibodies and the search for an HIV-1 vaccine: the end of the beginning. Nat Rev Immunol 13: 693-701.
Liao H-X, Lynch R, Zhou T, Gao F, Alam S M, et al. (2013) Coevolution of a broadly neutralizing HIV-1 antibody and founder virus. Nature 496: 469-476. doi: 10.1038/nature12053
Corti D, Lanzavecchia A (2013) Broadly neutralizing antiviral antibodies. Annu Rev Immunol 31: 705-742.
Burton D R, Desrosiers R C, Doms R W, Koff W C, Kwong P D, et al. (2004) HIV vaccine design and the neutralizing antibody problem. Nat Immunol 5: 233-236.
Burton D R, Ahmed R, Barouch D H, Butera S T, Crotty S, et al. (2012) A Blueprint for HIV vaccine discovery. Cell Host Microbe 12: 396-407. doi: 10.1016/j.chom.2012.09.008
Klein F, Mouquet H, Dosenovic P, Scheid J F, Scharf L, Nussenzweig M C (2013) Antibodies in HIV-1 vaccine development and therapy. Science 341: 1199-1204.
Korber B, Gnanakaran S (2009) The implications of patterns in HIV diversity for neutralizing antibody induction and susceptibility. Curr Opin HIV AIDS 4: 408-417. doi: 10.1097/COH.0b013e32832f129e
Kwong P D, Mascola J R (2012) Human antibodies that neutralize HIV: Identification, structures, and B cell ontogenies. Immunity 37: 412-425.
McGuire A T, Hoot S, Dreyer A M, Lippy A, Stuart A, et al. (2013) Engineering HIV envelope protein to activate germline B cell receptors of broadly neutralizing anti-CD4 binding site antibodies. J Exp Med 210: 655-663.
Murray J M, Moenne-Loccoz R, Velay A, Habersetzer F, Doffol M, et al. (2013) Genotype 1 hepatitis C virus envelope features that determine antiviral response assessed through optimal covariance networks. PLoS ONE 8(6): e67254. doi: 10.1371/journal.pone.0067254
Example 9 describes swarm immunogen concept. Sites are identified by TF loss (
Example 10 describes swarm selection. Diversity in rapidly evolving sites is sampled as a progression of mutations away from the TF. Serially sampled sequences are aligned to TF or UA. For site selection, the number of sites selected depends on TF loss cutoff (
Example 11 describes the selection procedure. The variants seen are tabulated across all sequences. Rare variants are excluded (<minimum variant count). Variant counts are updated while selecting sequences. For each time point sampled (1 . . . t), for each site selected (1 . . . s), and for each variant not yet included (1 . . . v), select the sequence that uniquely minimizes Hamming distance (HD) to the TF among the selected sites; in case of ties, minimize HD to the TF over the full length, and then minimize mean HD to the current swarm. Concatamers from a swarm of 54 env clones that represent selected sites are shown in
For CH103 VH, the sites above cutoff versus the non-UA cutoff is plotted in
Swarms are sequence sets that represent variant diversity from a shared ancestor. Selected sites have the highest peak TF loss. An algorithm selects clones that carry all but rare variants. Applications include reagent selection and immunogens for bnAb induction. R packages are in preparation (pixgramr and swarmtools).
Example 12 describes the structure of antibody CH103 in complex with the outer domain of HIV-1 gp120. Overall structure of the CH103-gp120 complex, with gp120 polypeptide depicted in ribbon and CH103 shown as a molecular surface.
Example 13 describes the time of appearance and VHDJH mutations in the CH103 clonal family. Maximum likelihood phylogram showing the CH103 lineage with the inferred intermediates (circles, I1-4, I7 and I8), and percentage mutated VH sites and timing indicated. Mutation frequency is 4-17% (
Example 14 describes the binding affinity maturation for the CH103 clonal family. Binding affinities (Kd, nM) of antibodies to autologous subtype C CH505 (C.CH505; left box) and heterologous B.63521 (right box) were measured by surface plasmon resonance (
Example 15 describes the development of neutralization breadth in the CH103 clonal lineage. The phylogenetic CH103 clonal lineage tree showing the IC50 (μg/ml) of neutralization of the autologous transmitted/founder (C.CH505), heterologous tier clades A (A.Q842) and B (B.BG1168) viruses as indicated in
Example 16 describes the steps of a B-cell-lineage-based approach to vaccine design (
Example 17 describes how env diversification precedes breadth. At 6 months, divergence in contact residues was greatest for CH505 among 17 subjects followed from acute infection. A comparison of the pace of viral sequence evolution in CH505 (indicated here by the 9-digit anonymous study-participant identifier 703010505) in regions relevant to the CH103 epitope with other subjects is shown in
This application claims the benefit of and priority to the U.S. Provisional Patent Application No. 62/056,822, filed on Sep. 29, 2014, and U.S. Provisional Patent Application No. 62/150,019, filed on Apr. 20, 2015, the contents of each of which are hereby incorporated by reference in their entirety.
This invention was made with government support under Center for HIV/AIDS Vaccine Immunology-Immunogen Design grant UM1-AI100645 from the NIH, NIAID, Division of AIDS. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US15/53004 | 9/29/2015 | WO | 00 |
Number | Date | Country | |
---|---|---|---|
62056822 | Sep 2014 | US | |
62150019 | Apr 2015 | US |