The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Nov. 19, 2019, is named P15406-02_SL.txt and is 172,811 bytes in size.
Antibodies to HIV appear shortly after infection. The titer and avidity of anti-HIV antibodies generally increase over time, but may be impacted by antiretroviral treatment (ART), CD4 T cell decline, and other factors. The breadth and specificity of anti-HIV antibodies also evolve during the course of infection. A detailed understanding of the serologic response to HIV infection is helpful for understanding HIV immune containment and for vaccine development. Multiplexed immunoassays have been used to analyze the specificity of anti-HIV antibodies. These include a microarray assay composed of 15 recombinant HIV env protein targets and five gp41 peptide targets, and an assay based on the Luminex platform that includes six recombinant HIV protein targets. Phage display technology has also been used to screen HIV peptides for binding to immobilized antibodies (6).
HIV incidence is the rate at which new HIV infections occur in populations. While HIV prevalence measures overall disease burden, HIV incidence tracks the leading edge of the HIV/AIDS epidemic. Accurate HIV incidence estimates are critical for monitoring the epidemic, identifying populations at high risk of HIV acquisition, targeting prevention efforts, and evaluating interventions for HIV prevention. HIV incidence can be measured by evaluating HIV seroconversion in longitudinal cohorts and modeling trends in HIV prevalence; however, those approaches have significant practical and methodological limitations. An alternative approach is to use a cross-sectional survey to identify recent infections and estimate HIV incidence. Serologic (antibody-based) assays have been developed for cross-sectional HIV incidence estimation. These assays measure characteristics of the HIV antibody response such as the titer, class, and avidity of anti-HIV antibodies. The United States (US) Centers for Disease Control and Prevention (CDC) has developed two HIV incidence assays: the BED capture immunoassay (BED assay), which measures the proportion of antibody that is HIV-specific; and the newer Limiting Antigen Avidity assay (LAg assay), which measures antibody binding to a limited amount of a target antigen. Unfortunately, the serologic response to HIV infection is highly variable. Some HIV-infected individuals never attain a mature antibody response, and numerous factors, such as advanced HIV disease and viral suppression, can blunt the antibody response. Performance of serologic incidence assays also varies by geographic region and in different sub-populations, reflecting differences in HIV subtype and other factors. The highest levels of misclassification are seen with subtype D HIV, which is associated with reduced serologic responses to HIV infection.
While serologic HIV incidence assays at first seemed promising, it is now clear that these assays provide inaccurate incidence estimates in some settings and populations because of sample misclassification.
One embodiment of the present invention is a method of estimating the cross-sectional incidence or duration of infection of a virus. Method steps include obtaining a biological sample that contains antibodies from a subject who has one or more viral infections; mixing the biological sample with two or more epitopes or peptides from the proteins of viruses responsible for the viral infection; quantifying the amount of antibody binding to the epitopes or peptides; and estimating the cross-sectional incidence or duration of infection for one or more of the viruses. The methods of the present invention estimate the cross-sectional incidence or duration of infection for a virus that infects mammals, including HIV and EBV, as examples. In addition, the epitopes or peptides of the present invention may be derived from, or expressed in, a phage immunoprecipitation sequencing system (PhIP-Seq or VirScan). The epitopes or peptides of the present invention may be modified by site-directed mutagenesis using alanine substitution, or another method, to alter the amino acid sequence of the peptides. The epitopes or peptides of the present invention may be synthesized chemically or used in a biologic system. For example, the epitopes or peptides of the present invention, including SEQ ID:1 to SEQ ID:309, may be used in an assay system including enzyme immunoassay, chemiluminescent assay, microparticle bead assay, electrochemiluminescent assay, and a combination thereof. The assay systems detect and/or quantify binding of antibodies to one or more epitopes or peptides, either individually or in a multiplex (multi-assay) format.
Another embodiment of the present invention is a method of estimating or calculating the cross-sectional incidence or duration of infection of HIV, including HIV subtype C and HIV subtype D infections. HIV proteins including gp41, gp120, gag, and pol, as examples, are used in methods of the present invention. In addition, the plurality of epitopes or peptides may be selected from the group consisting of SEQ ID:1 to SEQ ID:309. Alternatively, the plurality of epitopes or peptides may be selected from the group consisting of SEQ ID:1 to SEQ ID:309 in the range of 2 to 200, 3 to 150, 4 to 125, 5 to 100, 7 to 100, 10 to 100, 4 to 20, 4 to 30, 4 to 50, 8 to 60 or 10 to 70 epitopes or peptides. Alternatively, the epitopes or peptides of the present invention may comprise SEQ ID:3, SEQ ID:22, SEQ ID:159 and SEQ ID:180.
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Marham, The Harper Collins Dictionary of Biology (1991). As used herein, the following terms have the meanings ascribed to them below, unless specified otherwise.
The term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from context, all numerical values provided herein are modified by the term about.
The term “antibody,” as used in this disclosure, refers to an immunoglobulin or a fragment or a derivative thereof, and encompasses any polypeptide comprising an antigen-binding site, regardless of whether it is produced in vitro or in vivo. The term includes, but is not limited to, polyclonal, monoclonal, monospecific, polyspecific, non-specific, humanized, single-chain, chimeric, synthetic, recombinant, hybrid, mutated, and grafted antibodies. Unless otherwise modified by the term “intact,” as in “intact antibodies,” for the purposes of this disclosure, the term “antibody” also includes antibody fragments such as Fab, F(ab′)2, Fv, scFv, Fd, dAb, and other antibody fragments that retain antigen-binding function, i.e., the ability to bind, for example, PD-L1, specifically. Typically, such fragments would comprise an antigen-binding domain.
The terms “antigen-binding domain,” “antigen-binding fragment,” and “binding fragment” refer to a part of an antibody molecule that comprises amino acids responsible for the specific binding between the antibody and the antigen. In instances where an antigen is large, the antigen-binding domain may only bind to a part of the antigen. A portion of the antigen molecule that is responsible for specific interactions with the antigen-binding domain is referred to as an “epitope” or “antigenic determinant.” An antigen-binding domain typically comprises an antibody light chain variable region (VL) and an antibody heavy chain variable region (VH); however, it does not necessarily have to comprise both. For example, a so-called Fd antibody fragment consists only of a VH domain, but still retains some antigen-binding function of the intact antibody.
By “agent” is meant any small molecule chemical compound, antibody, nucleic acid molecule, or polypeptide, or fragments thereof.
By “ameliorate” is meant decrease, suppress, attenuate, diminish, arrest, or stabilize the development or progression of a disease.
“Diagnostic” means identifying the presence or nature of a pathologic condition. Diagnostic methods differ in their sensitivity and specificity. The “sensitivity” of a diagnostic assay is the percentage of diseased individuals who test positive (percent of “true positives”). Diseased individuals not detected by the assay are “false negatives.” Subjects who are not diseased and who test negative in the assay are termed “true negatives.” The “specificity” of a diagnostic assay is 1 minus the false positive rate, where the “false positive” rate is defined as the proportion of those without the disease who test positive. While a particular diagnostic method may not provide a definitive diagnosis of a condition, it suffices if the method provides a positive indication that aids in diagnosis.
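By way of illustration only, the definitions of sensitivity and specificity given above can be expressed as a short calculation. The function names and the counts used in the example are hypothetical and are not data from this disclosure.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased individuals who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """1 minus the false positive rate, where the false positive rate is
    the proportion of those without the disease who test positive."""
    false_positive_rate = false_pos / (false_pos + true_neg)
    return 1.0 - false_positive_rate


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    tp, fn, tn, fp = 90, 10, 180, 20
    print(f"sensitivity = {sensitivity(tp, fn):.2f}")
    print(f"specificity = {specificity(tn, fp):.2f}")
```

Multiplying either value by 100 gives the percentage form used in the definitions.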
By “disease” is meant any condition or disorder that damages or interferes with the normal function of a cell, tissue, or organ. Examples of diseases include HIV.
The term “express” refers to the ability of a gene to express the gene product including for example its corresponding mRNA or protein sequence(s).
By “fragment” is meant a portion of a polypeptide or nucleic acid molecule. This portion contains, preferably, at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the entire length of the reference nucleic acid molecule or polypeptide. A fragment may contain 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1000 nucleotides or amino acids.
By “gag” or “group-specific antigen” is meant a gene that codes for core structural proteins of a retrovirus. For example, HIV gag protein is encoded by the HIV gag gene, HXB2 nucleotides 790-2292. One example of an HIV gag protein has an NCBI database accession number ASM60435.
By “gp41” or “glycoprotein 41” is meant a subunit of the envelope protein complex of retroviruses, including human immunodeficiency virus (HIV). Gp41 is a transmembrane protein that contains several sites within its ectodomain that are required for infection of host cells. As a result of its importance in host cell infection, it has also received much attention as a potential target for HIV vaccines. One example of an HIV gp41 protein has an NCBI database accession number ASV70553.1.
By “gp120” or “Envelope glycoprotein GP120” is meant a glycoprotein exposed on the surface of a retrovirus envelope such as that of HIV. Gp120 is essential for virus entry into cells as it plays a vital role in attachment to specific cell surface receptors. One example of an HIV gp120 protein has an NCBI database accession number AAF69493.1.
By “immunoassay” is meant an assay that uses an antibody to specifically bind an antigen (e.g., a marker). The immunoassay is characterized by the use of specific binding properties of a particular antibody to isolate, target, and/or quantify the antigen.
By “incidence of infection” is meant the frequency of new infections occurring over a specified period of time (e.g., annual HIV incidence is the percentage of individuals who acquire HIV infection during one year).
The term “obtaining,” as in “obtaining an agent,” includes synthesizing, purchasing, or otherwise acquiring the agent.
By “marker” is meant any protein or polynucleotide or antibody having an alteration in expression level or activity that is associated with a disease or disorder. The term “biomarker” is used interchangeably with the term “marker.”
The term “mAb” refers to monoclonal antibody. Antibodies may comprise without limitation whole native antibodies, bispecific antibodies; chimeric antibodies; Fab, Fab′, single chain V region fragments (scFv), fusion polypeptides, and unconventional antibodies.
The term “measuring” means methods which include detecting the presence or absence of marker(s) such as antibodies in the sample, quantifying the amount of marker(s) such as antibodies in the sample, and/or qualifying the type of biomarker or antibody. Measuring can be accomplished by methods known in the art and those further described herein, including but not limited to immunoassay. Any suitable methods can be used to detect and measure one or more of the markers described herein. These methods include, without limitation, ELISA and bead-based immunoassays (e.g., monoplexed or multiplexed bead-based immunoassays, magnetic bead-based immunoassays).
By “pol” is meant a DNA polymerase encoded by a gene in retroviruses, such as HIV. The pol protein is an enzyme that transcribes viral RNA into double-stranded DNA. One example of an HIV pol protein has an NCBI database accession number AAF35355.1.
The terms “polypeptide,” “peptide”, and “protein” are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an analog or mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers. Polypeptides can be modified, e.g., by the addition of carbohydrate residues to form glycoproteins. The terms “polypeptide,” “peptide” and “protein” include glycoproteins, as well as non-glycoproteins.
By “reduces” is meant a negative alteration of at least 10%, 25%, 50%, 75%, or 100%.
A “reference” refers to a standard or control condition, such as a sample (human cells) or a subject that is free, or substantially free, of an agent or disease.
A “reference sequence” is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset of or the entirety of a specified sequence; for example, a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence. For polypeptides, the length of the reference polypeptide sequence will generally be at least about 16 amino acids, preferably at least about 20 amino acids, more preferably at least about 25 amino acids, and even more preferably about 35 amino acids, about 50 amino acids, or about 100 amino acids. For nucleic acids, the length of the reference nucleic acid sequence will generally be at least about 50 nucleotides, preferably at least about 60 nucleotides, more preferably at least about 75 nucleotides, and even more preferably about 100 nucleotides or about 300 nucleotides or any integer thereabout or there between.
As used herein, the term “sensitivity” is the percentage of subjects with a particular disease who are correctly identified as having the disease.
As used herein, the term “specificity” is the percentage of subjects correctly identified as NOT having a particular disease i.e., normal or healthy subjects.
By “specifically binds” is meant an antibody that recognizes and binds a polypeptide of the invention such as a gp41 polypeptide, a gp120 polypeptide, a gag polypeptide, or a pol polypeptide, but which does not substantially recognize and bind other molecules in a sample, for example, a biological sample, which naturally includes a polypeptide of the invention.
As used herein, the term “subject” is intended to refer to any individual or patient on which the method described herein is performed. Generally, the subject is human, although as will be appreciated by those in the art, the subject may be an animal. Thus, other animals, including mammals such as rodents (including mice, rats, hamsters, and guinea pigs), cats, dogs, rabbits, farm animals (including cows, horses, goats, sheep, pigs, etc.), and primates (including monkeys, chimpanzees, orangutans and gorillas), are included within the definition of subject.
By “substantially identical” is meant a polypeptide or nucleic acid molecule exhibiting at least 50% identity to a reference amino acid sequence (for example, any one of the amino acid sequences described herein) or nucleic acid sequence (for example, any one of the nucleic acid sequences described herein). Preferably, such a sequence is at least 60%, more preferably 80% or 85%, and more preferably 90%, 95% or even 99% identical at the amino acid or nucleic acid level to the sequence used for comparison.
Sequence identity is typically measured using sequence analysis software (for example, Sequence Analysis Software Package of the Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710 University Avenue, Madison, Wis. 53705, BLAST, BESTFIT, GAP, or PILEUP/PRETTYBOX programs). Such software matches identical or similar sequences by assigning degrees of homology to various substitutions, deletions, and/or other modifications. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid, asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. In an exemplary approach to determining the degree of identity, a BLAST program may be used, with a probability score between e−3 and e−100 indicating a closely related sequence.
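As a non-limiting illustration of the identity thresholds recited above, the following sketch computes percent identity for two sequences that have already been aligned to equal length. It does not perform the alignment itself, which the software packages named above would carry out, and it does not score conservative substitutions. The function names, the gap character, and the example sequences are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the
    same residue. Columns that are gaps in both sequences are ignored."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               minimum_percent: float = 50.0) -> bool:
    """Apply an identity threshold (50% is the lowest value recited above)."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


if __name__ == "__main__":
    # Hypothetical 10-residue sequences differing at two positions.
    print(percent_identity("ACDEFGHIKL", "ACDEYGHIKV"))
```

The `minimum_percent` argument can be raised to 60, 80, 85, 90, 95 or 99 to apply the more stringent thresholds.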
Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50.
Any compositions or methods provided herein can be combined with one or more of any of the other compositions and methods provided herein.
As used herein, the terms “prevent,” “preventing,” “prevention,” “prophylactic treatment” and the like refer to reducing the probability of developing a disorder or condition in a subject, who does not have, but is at risk of or susceptible to developing a disorder or condition.
As used herein, the terms “treat,” “treating,” “treatment,” and the like refer to reducing or ameliorating a disorder and/or symptoms associated therewith. It will be appreciated that, although not precluded, treating a disorder or condition does not require that the disorder, condition or symptoms associated therewith be completely eliminated.
The use of the terms “a” and “an” and “the” and similar referents in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms “comprising,” “having,” “including,” and “containing” are to be construed as open-ended terms (i.e., meaning “including, but not limited to,”) unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein, is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the invention.
Preferred embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Variations of those preferred embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than as specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.
Abbreviations: ORF: open reading frame; mo: months; yr: years; kb: kilobases.
Abbreviations: Env: envelope; Pol: polymerase; Gag: group-specific antigen; Rev: HIV regulatory protein; Vpu: viral protein U; ART: antiretroviral treatment; mo: months; yr: years; EBV: Epstein Barr virus.
Abbreviations: Ab: antibody; ART: antiretroviral therapy; Decr: decreasing antibody breadth; Non-Decr: stable or increasing antibody breadth.
Abbreviations: nrc: Normalized read count; mo: months; yr: years.
Abbreviations: ART: antiretroviral treatment; VL: viral load; mo: months; yr: years.
Abbreviations: ART: antiretroviral therapy
The inventors used a massively-multiplexed antibody profiling system to analyze the fine specificity of the antibody response to HIV infection. This system is based on phage immunoprecipitation sequencing (PhIP-Seq) (7). Testing was performed by incubating samples with a bacteriophage library that expresses peptides encoded by oligonucleotides generated by high-throughput DNA synthesis. The abundance and specificity of antibodies in test samples were assessed by immunoprecipitating phage-antibody complexes and sequencing the DNA in the captured phage particles. The “VirScan” phage library includes >95,000 peptides that span the genomes of >200 viruses that infect humans (the human “virome”) (8). The inventors performed PhIP-Seq using the VirScan library to analyze HIV antibodies from individuals with known duration of HIV infection, ranging from <1 month to 8.7 years. This allowed them to examine dynamic changes in antibody diversity and the fine specificity of HIV antibodies from individuals with early to late stage infection, including individuals on antiretroviral therapy (ART) and individuals with advanced HIV disease.
HIV incidence is often determined by following cohorts of HIV-uninfected individuals and quantifying the rate of new HIV infections. HIV incidence can also be estimated using a cross-sectional study design, using laboratory assays to identify individuals who are likely to have recent HIV infection. Most serologic assays used for cross-sectional HIV incidence estimation measure general characteristics of the antibody response to HIV infection (e.g., antibody titer, antibody avidity) (9-11), which may be impacted by viral suppression, loss of CD4 T cells, and other factors (12-15). Unlike conventional methods, the inventors used a VirScan assay to identify novel peptide biomarkers associated with the duration of HIV infection, and surprisingly demonstrated that peptide engineering can be used to enhance the properties of peptides for discriminating between early and late-stage infection. This information could be used to develop improved methods for estimating HIV incidence from cross-sectional surveys, for surveillance of the HIV/AIDS epidemic, and for evaluating the impact of interventions for HIV prevention in clinical trials.
Antibody reactivity to HIV peptides.
We used the VirScan assay to characterize anti-HIV antibodies in 403 plasma samples from 57 women with subtype C HIV infection (
Breadth of antibody reactivity.
The inventors next analyzed the diversity of each individual's antibody response to HIV over time. Network graphs were used to determine antibody breadth at each time point; antibody breadth was defined as the number of non-overlapping peptides with high levels of antibody binding.
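The definition of antibody breadth given above can be illustrated with a simplified sketch. The inventors used network graphs; the example below instead assumes that the peptides derive from a single protein with known start and end positions, and counts the largest set of reactive, mutually non-overlapping peptides with a greedy interval-selection pass. The `Peptide` type, the binding threshold, and all numeric values are hypothetical.

```python
from typing import Iterable, NamedTuple


class Peptide(NamedTuple):
    start: int     # first residue position in the protein (inclusive)
    end: int       # last residue position in the protein (inclusive)
    signal: float  # hypothetical antibody-binding measure


def antibody_breadth(peptides: Iterable[Peptide], threshold: float) -> int:
    """Count the maximum number of non-overlapping peptides whose binding
    signal meets the threshold (greedy selection by earliest end)."""
    reactive = sorted((p for p in peptides if p.signal >= threshold),
                      key=lambda p: p.end)
    count = 0
    last_end = None
    for p in reactive:
        # Accept a peptide only if it starts after the last accepted one ends.
        if last_end is None or p.start > last_end:
            count += 1
            last_end = p.end
    return count


if __name__ == "__main__":
    sample = [
        Peptide(1, 56, 12.0),
        Peptide(29, 84, 9.0),    # overlaps the first peptide
        Peptide(57, 112, 15.0),
        Peptide(200, 255, 0.5),  # below the threshold
        Peptide(300, 355, 8.0),
    ]
    print(antibody_breadth(sample, threshold=5.0))
```

Sorting by end position and accepting each peptide that starts after the last accepted one is the standard greedy solution for intervals on a line; a graph-based approach would be needed where overlap is defined by sequence similarity rather than position.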
In both groups (with and without ART initiation), antibody breadth increased during the first 6 months of infection. In the group that did not start ART, a relatively stable value for antibody breadth (termed “antibody breadth set point”) was established in most individuals approximately nine months to one year after infection; the antibody breadth set point varied considerably among study participants. In contrast, in the group that ultimately started ART, a decline in antibody breadth was observed approximately one year after infection. After participants started ART, antibody breadth appeared to stabilize at levels similar to those seen in early HIV infection. The decline in antibody breadth prior to ART initiation did not appear to be related to HIV viral load or CD4 cell count (
The inventors next evaluated the relationship between HIV infection and the antibody response to a different, chronic infection that was expected to have a high prevalence in the study setting (EBV) (
Factors associated with changes in antibody breadth over time.
To explore the relationship between the decline in HIV antibody breadth and subsequent ART initiation, the inventors calculated the rate of change of antibody breadth over the period ~9 months to ~2 years after HIV infection (based on sample availability); none of the participants included in the analysis were on ART during this time window. For this time-to-event analysis (the outcome being time to ART initiation), participants were divided into two groups: those with declining breadth and those with stable or increasing breadth. The inventors found that participants who had stable or increasing antibody breadth ~9 months to ~2 years after infection were less likely to start ART earlier in infection (log-rank test p=0.009, hazard ratio: 0.29, 95% CI: 0.11, 0.78, p=0.014,
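One way to compute a rate of change of antibody breadth and assign a participant to one of the two groups is sketched below. It assumes an ordinary least-squares slope over the available time points and a cut-off of zero; neither choice is stated in this description, and the values shown are hypothetical. The group labels follow the abbreviations Decr and Non-Decr used herein.

```python
from typing import Sequence


def breadth_slope(times_yr: Sequence[float], breadths: Sequence[float]) -> float:
    """Least-squares slope of antibody breadth against time (per year)."""
    n = len(times_yr)
    if n < 2 or n != len(breadths):
        raise ValueError("need at least two paired observations")
    mean_t = sum(times_yr) / n
    mean_b = sum(breadths) / n
    sxx = sum((t - mean_t) ** 2 for t in times_yr)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (b - mean_b) for t, b in zip(times_yr, breadths))
    return sxy / sxx


def breadth_group(slope: float) -> str:
    """Decr: decreasing breadth; Non-Decr: stable or increasing breadth."""
    return "Decr" if slope < 0 else "Non-Decr"


if __name__ == "__main__":
    # Hypothetical participant sampled between ~9 months and ~2 years.
    times = [0.75, 1.0, 1.5, 2.0]
    breadths = [20, 18, 15, 11]
    slope = breadth_slope(times, breadths)
    print(f"{slope:.2f} peptides/yr -> {breadth_group(slope)}")
```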
The inventors next evaluated the relationship between the rate of decline in antibody breadth and other factors, including age at infection, baseline CD4 cell count, rate of decline in CD4 cell count, and viral load set point (
Dynamic Changes in Antibody Binding
The inventors next explored the relationship between HIV antibody specificity and the duration of HIV infection. First, the inventors used a linear model to quantify the association between antibody binding and the duration of infection for the 3,384 HIV peptides in the VirScan library. This analysis was performed using all 403 samples in the discovery sample set. The model identified 309 peptides that had a significant association between these two factors (p-value<0.05 after adjusting for multiple comparisons using the Bonferroni method,
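The Bonferroni adjustment referred to above can be illustrated as follows. Requiring an adjusted p-value below 0.05 is equivalent to requiring the unadjusted p-value to fall below 0.05 divided by the number of tests, here the 3,384 HIV peptides. The function names and the example p-values are hypothetical.

```python
from typing import List, Sequence


def bonferroni_cutoff(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance cut-off under the Bonferroni method."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_significant(p_values: Sequence[float],
                           alpha: float = 0.05) -> List[bool]:
    """Flag each unadjusted p-value that remains significant after
    correcting for the number of p-values supplied."""
    if not p_values:
        return []
    cutoff = bonferroni_cutoff(len(p_values), alpha)
    return [p < cutoff for p in p_values]


if __name__ == "__main__":
    # Cut-off for 3,384 peptide-level tests at an overall alpha of 0.05.
    print(f"{bonferroni_cutoff(3384):.3e}")
    # Hypothetical p-values for three tests.
    print(bonferroni_significant([0.001, 0.03, 0.2]))
```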
The inventors then selected the four peptides that had the strongest independent association between antibody binding and the duration of HIV infection (
The inventors next evaluated the performance of the 4-peptide model using an independent validation sample set (
Epitope Engineering
Next, the inventors explored whether peptide epitopes could be modified to improve the association between antibody binding and the duration of HIV infection. The inventors first selected 11 non-overlapping peptides that were targeted by the majority of HIV-infected individuals (“public epitope peptides”). The inventors then generated variant peptides by substituting each set of three consecutive amino acids with alanine residues.
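The triple-alanine substitution described above can be sketched as a simple string operation. The example assumes non-overlapping windows of three residues by default; whether the windows overlapped is not specified here, so a `step` parameter is provided to cover a sliding-window interpretation. The peptide sequence shown is hypothetical and is not taken from the Sequence Listing.

```python
from typing import List


def triple_alanine_variants(peptide: str, window: int = 3,
                            step: int = 3) -> List[str]:
    """Return variants of the peptide in which each window of consecutive
    residues is replaced with alanine (A). With step == window the windows
    do not overlap; with step == 1 they slide one residue at a time."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    variants = []
    for i in range(0, len(peptide) - window + 1, step):
        variants.append(peptide[:i] + "A" * window + peptide[i + window:])
    return variants


if __name__ == "__main__":
    for variant in triple_alanine_variants("KLMNPQRST"):
        print(variant)
```

When the peptide length is not a multiple of the window size, the trailing residues are left unmodified under the non-overlapping setting.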
The present invention provides the most comprehensive analysis of HIV antibody specificities to date, including their characterization from early to late stage infection. The inventors found that changes in antibody diversity early in infection were associated with differences in clinical outcome (measured as time to ART initiation). This study also provides proof-of-principle that an “HIV serosignature”, reactivity to a panel of HIV peptides, is useful for cross-sectional HIV incidence estimation.
The inventors used a novel definition of “antibody breadth” to quantify HIV antibody diversity, and found that this measure reaches a plateau (“antibody breadth set point”) early in infection. In the GS Study cohort, a decline in antibody breadth between 9 months and 2 years after infection was associated with a shorter time to ART initiation, which was prompted in the GS Study cohort by a decline in CD4 cell count to <250 cells/mm³. The decline in antibody breadth among those who subsequently started ART likely reflected declining B cell support due to loss of T helper cells. HIV antibody breadth appeared to stabilize at a low level after ART initiation. In contrast, the breadth of the EBV antibody response increased sharply after ART initiation, which may have reflected immune reconstitution.
Previous studies have identified several factors associated with HIV disease progression, including virologic factors [e.g., HIV viral load, replication capacity, and subtype], immunologic factors [e.g., inversion of the CD4/CD8 ratio, polyclonality of the anti-HIV T cell response, degree of early immune activation] and host factors [e.g., human leukocyte antigen (HLA) type B57, CCR5 delta 32 mutations]. It is not clear if the decline in antibody breadth that we observed caused disease progression leading to ART initiation, or if it was a surrogate for other changes, such as a decline in T cell number or function. If the decline in antibody breadth has a causative role in disease progression, then use of therapeutic vaccines to boost antibody diversity may in theory provide clinical benefit.
Generalized antibody responses to HIV infection, such as antibody titer and avidity, tend to plateau approximately one year after HIV infection. These characteristics of the antibody response are impacted by a variety of factors, including natural and drug-induced viral suppression, disease progression, and HIV subtype. Previous studies evaluating the banding pattern in Western blots demonstrate that HIV antibody specificity evolves early in infection. Recent studies have explored whether assays that include a small number of protein or peptide targets could be used to identify recent HIV infections. Using the VirScan assay to analyze 403 plasma samples, the inventors were able to quantify antibody binding to >3,300 HIV peptides from early to late-stage HIV infection. These data were used to generate a simple, unweighted, 4-peptide model that predicted duration of HIV infection. The peptides included in this prototype model were from four different HIV proteins (gp41, gp120, gag and pol). Two of these peptides had increasing antibody reactivity over time, and two had decreasing antibody reactivity over time. It is noteworthy that the gp41 peptide, which showed the strongest association with duration of infection, included a sequence shared by the HIV subtype B target peptide in the Limiting Antigen Avidity (LAg) assay that is in wide use for cross-sectional HIV incidence estimation. Our analysis also demonstrated that epitope engineering can be used to enhance the capacity of individual peptides to discriminate between early and late HIV infection.
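Purely as an illustration of what an unweighted model of this kind might look like, the sketch below combines binding values for two peptides whose reactivity increases over time and two whose reactivity decreases. The scoring rule (sum of the increasing peptides minus sum of the decreasing peptides) is an assumption made for this example and is not asserted to be the inventors' model; the binding values are hypothetical.

```python
from typing import Sequence


def unweighted_score(increasing: Sequence[float],
                     decreasing: Sequence[float]) -> float:
    """Hypothetical unweighted score: every peptide contributes with a
    coefficient of +1 (reactivity rises over time) or -1 (reactivity
    falls over time). Higher scores suggest longer duration of infection."""
    return sum(increasing) - sum(decreasing)


if __name__ == "__main__":
    # Hypothetical binding values for an early and a late sample.
    early = unweighted_score(increasing=[1.0, 0.5], decreasing=[3.0, 2.5])
    late = unweighted_score(increasing=[4.0, 3.5], decreasing=[0.5, 1.0])
    print(early, late)
```

A cut-off on such a score, chosen from samples with known duration of infection, could then be used to classify samples as early or late.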
Data obtained with the 4-peptide model described above demonstrate that the VirScan assay can be used to identify peptides for applications such as cross-sectional HIV incidence estimation. The inventors are currently investigating more sophisticated statistical and machine-learning models to identify peptide combinations with greater accuracy for predicting the duration of HIV infection, and are generating larger data sets for model building and assessment. We are also exploring whether alternate serosignatures provide more accurate prediction of the duration of infection among people with longer term infections. On-going studies will also provide more information about the possible impact of ART, viral load, and CD4 cell count on antibody binding profiles. Considerable work will be needed to translate findings from this study into a laboratory test that can be used for improved cross-sectional HIV incidence testing. For example, peptides of interest could be incorporated into high-resolution, quantitative, multi-peptide enzyme immunoassays (EIAs) for high-throughput testing. Antibody binding data obtained from the EIA testing platform could then be used to compare the performance of serosignatures for HIV incidence estimation that include different sets of peptides, weighting for individual peptides, and different cut-offs for antibody binding to each peptide in the model. In previous work, we have used this approach to identify multi-assay algorithms that provide accurate cross-sectional HIV incidence estimates.
The VirScan assay has several unique advantages over alternative multiplex serological assays for peptide discovery. These include: quantitative assessment of antibody binding to peptides that span all open reading frames in the HIV genome, including both structural and regulatory proteins; representation of a wide range of HIV subtypes and strains, including groups M, N, and O and HIV-2; and fine resolution for epitope identification, which can be further refined with alanine scanning mutagenesis. The assay also provides information about antibody binding to >200 other human viruses. In this report, data from other viral peptides were used to normalize peptide binding measures, and allowed us to compare the impact of ART on the antibody response to a prevalent non-HIV viral infection (EBV). Data from the same assay runs could be used to examine the evolution and fine specificity of antibodies to other viruses, and the impact of viral co-infections on the anti-HIV antibody response. Future studies could also explore use of the VirScan assay to identify serosignatures for estimating incidence of other viral infections, such as hepatitis C virus. Finally, future phage libraries composed of additional protein products, such as those from the gut microbiome, may be used to explore the impact of immune system pre-conditioning on the response to HIV infection.
The present invention reveals novel features of the humoral response to HIV infection, and demonstrates the utility of the VirScan assay for identifying peptide biomarkers for applications such as cross-sectional HIV incidence estimation. This technology could also be used to evaluate serologic responses to other infectious diseases, as well as the impact of viral co-infections on immune responses. This may improve understanding of the complex relationships between viral infections and the immune responses that they elicit.
The following Examples have been included to provide guidance to one of ordinary skill in the art for practicing representative embodiments of the presently disclosed subject matter. In light of the present disclosure and the general level of skill in the art, those of skill can appreciate that the following Examples are intended to be exemplary only and that numerous changes, modifications, and alterations can be employed without departing from the scope of the presently disclosed subject matter. The following Examples are offered by way of illustration and not by way of limitation.
Samples Used for Analysis
Plasma samples were obtained from the GS Study (Uganda and Zimbabwe; 2001-2009), which evaluated the relationship between hormonal contraceptive use, genital shedding of HIV, and HIV disease progression among women with known dates of HIV seroconversion (18). ART was recommended for study participants with CD4 cell counts below 250 cells/mm³, consistent with local treatment guidelines at the time the GS Study was performed. Data for CD4 cell count and viral load were collected in the GS Study (18); data on the timing of ART initiation were obtained by review of clinic records.
The inventors analyzed samples from participants who acquired HIV infection, where the maximum time between collection of the last HIV-negative sample and the first HIV-positive sample was four months. For each individual, the estimated date of infection was defined as either the midpoint between visits with the last negative HIV antibody test and the first positive HIV antibody test, or fifteen days before documentation of acute infection (HIV RNA positive/HIV antibody negative status). Two sets of samples were analyzed in this report: a discovery sample set and a validation sample set (
Phage Library Used for Analysis
The VirScan library includes 3,384 HIV peptides spanning all HIV proteins (8). The protein sequences used to design peptide tiles were selected from the UniProtKB database, balancing sequence diversity and library size (8). The peptides are 56-amino acids long with 28-amino acid overlaps and represent diverse HIV subtypes and strains (
The VirScan library also includes 2,263 Epstein Barr virus (EBV) peptides, 718 Ebola virus peptides, and 518 rabies virus peptides; the public epitope library includes an additional 227 Ebola virus peptides. In this report, EBV data were used to evaluate the impact of antiretroviral therapy for HIV infection on the breadth of the anti-EBV antibody response. Ebola and rabies virus data were used to normalize antibody binding data to account for differences in sequencing depth between samples.
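The tiling scheme described above for the library (56-amino acid peptides with 28-amino acid overlaps) can be illustrated with a minimal sketch. The function name `tile_protein` is hypothetical, and the handling of the C-terminal end (anchoring a final tile to the last residues) is an assumption made for illustration; the source does not state how the ends of proteins were tiled.

```python
def tile_protein(sequence: str, tile_len: int = 56, overlap: int = 28) -> list[str]:
    """Split a protein sequence into overlapping peptide tiles.

    Defaults follow the tile length and overlap stated in the text.
    """
    if not 0 <= overlap < tile_len:
        raise ValueError("overlap must be non-negative and smaller than tile_len")
    if len(sequence) <= tile_len:
        return [sequence]

    step = tile_len - overlap
    last_start = len(sequence) - tile_len
    tiles = [sequence[i:i + tile_len] for i in range(0, last_start + 1, step)]

    # Assumption: if the regular grid does not reach the C-terminus,
    # add one final tile anchored to the end of the protein.
    if last_start % step != 0:
        tiles.append(sequence[-tile_len:])
    return tiles
```

With these defaults, a 112-residue protein yields three tiles starting at positions 0, 28, and 56, and each tile shares its last 28 residues with the first 28 residues of the next tile.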
Phage Immunoprecipitation and DNA Sequencing
Detailed procedures for the VirScan assay were described previously (8, 22). In this study, the concentration of IgG in plasma samples was determined using an in-house enzyme-linked immunosorbent assay (capture and detection antibodies 2040-01 and 2042-05, respectively; Southern Biotech, Birmingham, Ala.). Approximately 2 μg of IgG from each sample was added to the combined T7 bacteriophage VirScan and public epitope libraries (1×10⁵ plaque forming units for each phage clone in each library), diluted in phosphate-buffered saline to a final reaction volume of 1 mL in a deep 96-well plate, and incubated overnight at 4° C. Eight mock immunoprecipitation reactions (no plasma) were included on each plate; these reactions served as negative controls for data normalization. After rotating the plates overnight at 4° C., 20 μL of protein A-coated magnetic beads and 20 μL of protein G-coated beads (catalog numbers 10002D and 10004D, Invitrogen, Carlsbad, Calif.) were added to each reaction; the plates were rotated for another 4 hours at 4° C. Immunoprecipitation reactions were processed using the Agilent Bravo liquid handling system (Agilent Technologies, Santa Clara, Calif.). Beads were washed twice with Tris-buffered saline (50 mM Tris-HCl with 150 mM NaCl, pH 7.5) containing 0.1% NP-40 and then resuspended in 20 μL of a polymerase chain reaction (PCR) mix containing Herculase II Polymerase (catalog number 600679, Agilent Technologies). After 20 cycles of PCR, 2 μL of the PCR product was added to a second 20-cycle PCR reaction, which added sample-specific barcodes and P5/P7 Illumina sequencing adapters to the amplified DNA. DNA sequencing of the pooled PCR products was performed using an Illumina HiSeq 2500 instrument (Illumina, San Diego, Calif.) in rapid mode (50 cycles, single end reads).
Analysis of DNA Sequencing Data
Fastq files from DNA sequencing were demultiplexed using exact matching of 8-nucleotide sample-specific i5 and i7 DNA barcodes (Illumina). For each sample, a read count (the number of times each sequence was detected) was obtained for each peptide using Bowtie alignment (23), without allowing any mismatches. The level of antibody-dependent enrichment of each peptide in each sample was determined by comparing the read count for the sample to the read counts obtained for 40 mock immunoprecipitation reactions (8 mock reactions per plate). Two different measures were used to quantify the degree of antibody binding: “z-scores” were used to reduce false positivity in cases of low sequencing depth (this approach was used to generate data for
Determination of Antibody Breadth
The term “antibody breadth” was used to indicate the number of unique non-overlapping epitopes that had high levels of antibody binding (z-scores > 10). Antibody breadth was determined for HIV and EBV peptides using network graphs as follows. The amino acid sequences of all peptides in the VirScan library (HIV or EBV) were first analyzed to identify sequence overlaps (linkages, defined as two peptides sharing an identical sequence at least 7 amino acids long). The linkages were used to construct an undirected network graph, where each node represented a peptide with high-level antibody binding, and each linkage between two nodes represented a sequence overlap between the two peptides. The number of linkages for each peptide defined its degree of connectivity. Peptides were then removed from the graph one at a time using the following approach. At each iteration, the peptide with maximum connectivity was removed, and the degree of connectivity was recalculated for each of the remaining peptides. If multiple linked peptides had equivalent connectivity, the peptide with the lowest z-score was removed first. This process was repeated until the only remaining structures in the network were simple paths and cycles. For cycles (simple paths without end peptides), the peptide with the lowest z-score was removed first; this resulted in a simple path. Peptides were iteratively removed from simple paths in order to retain the greatest number of unlinked peptides. The number of remaining unlinked peptides was defined as the antibody breadth (25).
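The graph-pruning procedure described above can be sketched as follows. This is one possible reading of the text, not the inventors' implementation: the function names, the dictionary-based inputs, and the tie-breaking on peptide identifier are assumptions. The sketch also relies on two graph facts rather than tracking individual removals in the final step: a simple path of n peptides retains at most ceil(n/2) unlinked peptides, and removing one peptide from a cycle of n peptides leaves a simple path of n-1 peptides.

```python
from itertools import combinations


def share_kmer(a: str, b: str, k: int = 7) -> bool:
    """True if the two sequences share an identical stretch of at least k residues."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def antibody_breadth(peptides: dict, zscores: dict, threshold: float = 10.0, k: int = 7) -> int:
    """Count unlinked, highly bound peptides after pruning the overlap graph.

    peptides: hypothetical mapping of peptide id -> amino acid sequence
    zscores:  hypothetical mapping of peptide id -> z-score for one sample
    """
    # Nodes are peptides with high-level binding; edges are sequence overlaps.
    nodes = sorted(p for p in peptides if zscores[p] > threshold)
    adj = {p: set() for p in nodes}
    for a, b in combinations(nodes, 2):
        if share_kmer(peptides[a], peptides[b], k):
            adj[a].add(b)
            adj[b].add(a)

    # Remove the most connected peptide (lowest z-score on ties) until every
    # remaining peptide has at most two linkages (only paths and cycles remain).
    while adj:
        max_deg = max(len(neighbors) for neighbors in adj.values())
        if max_deg <= 2:
            break
        tied = [p for p in adj if len(adj[p]) == max_deg]
        victim = min(tied, key=lambda p: (zscores[p], p))
        for q in adj.pop(victim):
            adj[q].discard(victim)

    # Walk each connected component and count the peptides it can retain.
    breadth, seen = 0, set()
    for start in adj:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        size = len(component)
        if all(len(adj[p]) == 2 for p in component):
            size -= 1  # cycle: dropping one peptide leaves a simple path
        breadth += (size + 1) // 2  # simple path: keep every other peptide
    return breadth
```

Because the final count depends only on the size of each path or cycle, the sketch returns the breadth without recording which peptides are retained; a version that reports the retained peptides would need to perform the removals explicitly.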
Rate of Change in Antibody Breadth
For each participant, we estimated the rate of change in antibody breadth over the time period from 9 months to 2 years after HIV infection. This was calculated by determining the difference in antibody breadth for samples collected closest to time points 9 months and 2 years after HIV infection, and dividing this value by the length of time between the two visits. The rate of change in CD4 cell count was derived in the same way, using samples that had associated CD4 cell count data. The relationship between the rate of change in antibody breadth (and other factors) and time to ART initiation was determined using Cox proportional hazards models. The following factors were included in the analysis: age at seroconversion, CD4 cell count at the first visit after seroconversion, viral load set point, the rate of change in CD4 cell count, and time between HIV seroconversion and ART initiation. Viral load set point was defined as the median log10 viral load, excluding viral load results from the first HIV-positive visit, the visit prior to ART initiation, and any visits after ART initiation. Pearson correlation coefficients and their respective p-values and 95% confidence intervals were used to describe the relationships between the factors analyzed. We also compared the time to ART initiation among individuals who experienced a decline in antibody breadth between 9 months and 2 years, and those who had stable or increasing antibody breadth in this period. Statistical significance between the breadth measures and time to ART initiation was assessed using a non-parametric log-rank test and the semi-parametric Cox proportional-hazards model with a dichotomized variable for change in breadth rate (decreasing vs. stable/increasing). Individuals who did not initiate ART were treated as right-censored. Survival curves were plotted based on the resulting hazard functions for the two groups.
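The rate calculation and the dichotomization described above can be sketched as follows. This is an illustrative sketch only: the function names, the representation of each sample as a (months since infection, breadth) pair, and the use of months as the time unit are assumptions, and the survival analysis itself (log-rank test and Cox models) is not shown.

```python
def closest_sample(samples, target_months):
    """Return the (months, value) sample collected closest to the target time."""
    return min(samples, key=lambda s: abs(s[0] - target_months))


def rate_of_change(samples, start_months=9, end_months=24):
    """Change in value per month between the visits closest to the two time points.

    samples: hypothetical list of (months_since_infection, value) tuples for one
    participant, where value is antibody breadth or CD4 cell count.
    """
    t0, v0 = closest_sample(samples, start_months)
    t1, v1 = closest_sample(samples, end_months)
    if t1 <= t0:
        raise ValueError("need two distinct visits, with the later one nearer the end point")
    return (v1 - v0) / (t1 - t0)


def breadth_group(rate):
    """Dichotomize the rate as in the text: decreasing vs. stable/increasing."""
    return "decreasing" if rate < 0 else "stable/increasing"
```

For example, a participant with breadth 12 at month 10 and breadth 24 at month 22 has a rate of 1.0 epitope per month and falls in the stable/increasing group.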
Identification of Peptides for Estimating Duration of HIV Infection
The observed duration of infection (log10 transformed) was regressed on the normalized read count for each peptide in turn, and the peptide with the strongest association was selected. To select additional peptides with independent information about duration of infection, we correlated the “residuals” (i.e., the differences between the observed and fitted values) from the above linear model against each of the remaining peptides, selected the peptide with the strongest association, and repeated this step twice more to generate a list of four peptides. Two of the four peptides had increasing antibody binding over time since infection (positively associated with duration of infection), and two had decreasing antibody binding over time (negatively associated with duration of infection). A simple predictor for duration of infection was calculated as the sum of the normalized read counts for the positively-associated peptides, minus the sum of the normalized read counts for the negatively-associated peptides; read counts were log transformed for this analysis. For the analysis of predicted duration of infection, generalized estimating equations (GEE) were used to account for the auto-regressive correlation structure of samples from the same individual.
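The stepwise peptide selection and the unweighted predictor described above can be sketched with NumPy. Several details here are assumptions rather than statements from the source: the function names, the use of the Pearson correlation as the measure of association, refitting the linear model with all selected peptides before each new residual calculation, and assigning each peptide's sign from its correlation with the log10 duration. The GEE step is not shown.

```python
import numpy as np


def select_peptides(X, y, n_select=4):
    """Greedy selection of peptides associated with duration of infection.

    X: samples x peptides array of log-transformed normalized read counts
    y: log10 duration of infection for each sample
    Returns the selected column indices and a +1/-1 sign for each.
    """
    selected = []
    residual = y - y.mean()  # first pass: association with duration itself
    for _ in range(n_select):
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        corr = {j: np.nan_to_num(np.corrcoef(X[:, j], residual)[0, 1]) for j in remaining}
        selected.append(max(remaining, key=lambda j: abs(corr[j])))

        # Assumption: refit on all selected peptides, then take new residuals.
        design = np.column_stack([np.ones(len(y)), X[:, selected]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef

    signs = [1 if np.corrcoef(X[:, j], y)[0, 1] > 0 else -1 for j in selected]
    return selected, signs


def duration_score(X, selected, signs):
    """Unweighted predictor: positive-peptide sum minus negative-peptide sum."""
    return sum(s * X[:, j] for j, s in zip(selected, signs))
```

Keeping the predictor unweighted, as in the text, means the regression coefficients are used only to compute residuals during selection and play no role in the final score.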
All references, including publications, patent applications, and patents, cited herein are hereby incorporated by reference to the same extent as if each reference were individually and specifically indicated to be incorporated by reference and were set forth in its entirety herein.
This application claims the benefit of U.S. Provisional Patent application 62/778,342, filed Dec. 12, 2018, which is hereby incorporated by reference for all purposes as if fully set forth herein.
This invention was made with government support under grant nos. AI118633, AI068613, and AI095068 awarded by the National Institutes of Health. The government has certain rights in the invention.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2019/065132 | 12/9/2019 | WO | 00 |
| Number | Date | Country |
|---|---|---|
| 62778347 | Dec 2018 | US |