The present invention relates to methods of detecting Borrelia species in a sample (e.g., a sample from a patient suspected of being infected). In particular, the present invention provides compositions and methods for detecting the presence of Borrelia proteins, nucleic acid sequences encoding these proteins, and patient antibodies to these proteins, where the proteins are selected from those listed in Table 3, including: BB0279 (FLiL), BBK19, BBK07, BB0286 (FlbB), BBG33, BBL27, BBN34, BBP34, BBQ42, BBQ34, BBM34, BBN27, and BBH13.
Lyme disease is the most frequently reported arthropod-borne disease in the United States and Europe (reviewed in Steere, 2004). Serological assays are the most common laboratory tests used to confirm or support a diagnosis based on clinical features and epidemiologic circumstances (reviewed in Bunikis, 2002; Aguero-Rosenfeld, 2005). Direct detection of the organism by cultivation, histology of a biopsy, or by an approved and validated polymerase chain reaction assay is generally preferable to serological assays for definitive confirmation of a clinical diagnosis, but these procedures are uncommon in practice and not likely to become widely used for the foreseeable future.
A clinical diagnosis of Lyme disease based on the observation of a characteristic skin rash and suitable epidemiologic features (e.g., exposure to ticks in an endemic area during the season of transmission) can have high accuracy (Steere, 2004). But in the absence of a skin rash (~20-30% of cases) diagnosis of early Lyme disease solely based on clinical and epidemiologic features is more difficult. Accurate diagnosis of early infection without the typical skin rash is important, because oral antibiotic treatment at this point is usually successful and will prevent the more serious manifestations of disseminated disease and late disease. Serologic assays for late disseminated Lyme disease are also important to help confirm a clinical diagnosis of potentially-treatable chronic infection. But a commonly used, if not recommended, practice is to use a serologic assay to “rule out” B. burgdorferi infection as an explanation of what may be long-standing symptoms, such as chronic joint pain, headache, cognitive problems, and fatigue. For diagnosis of early infection, a sensitive test is desirable to identify the infection at the earliest and most easily treatable point of the infection. For diagnosis of late disease, high sensitivity is also desirable but improved specificity is especially important because the test in practice is often applied in circumstances in which the a priori likelihood of B. burgdorferi infection is low (Bunikis, 2002).
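The importance of specificity when the a priori likelihood of infection is low can be illustrated with Bayes' rule. The following is a minimal sketch only; the function name and all numerical values (sensitivity, specificity, and prior likelihood) are hypothetical figures chosen for illustration and are not data from any assay described herein.

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Probability that a positive test reflects true infection (Bayes' rule)."""
    true_positive = sensitivity * prevalence
    false_positive = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive / (true_positive + false_positive)


# Hypothetical, illustrative values only (not measured assay performance).
ASSUMED_SENSITIVITY = 0.90
for specificity in (0.95, 0.99):
    for prevalence in (0.30, 0.01):
        ppv = positive_predictive_value(ASSUMED_SENSITIVITY, specificity, prevalence)
        print(f"specificity={specificity:.2f} prior={prevalence:.2f} PPV={ppv:.2f}")
```

Under these assumed values, the same test yields a much lower predictive value for a positive result when the prior likelihood is low, and the gain from higher specificity is correspondingly larger.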
Currently available commercial assays in the United States are either based on whole bacteria cell extracts, such as the enzyme-linked immunosorbent assay (ELISA) and Western blot assays, or on a single antigen ELISA such as the C6 peptide of the VlsE protein (Aguero-Rosenfeld, 2005). The whole cell assays are usually used as a 2-tiered test. First, a more sensitive assay, typically a whole cell ELISA, is used. This is followed by the more specific Western blot, if the ELISA is positive or equivocal (Control, 1997). Together these assays have served for years as the standard for serodiagnosis, but there remain trade-offs between sensitivity and specificity to minimize false-positive results. One drawback of the 2-tiered, sequential test procedure is the time it takes and the greater expense for two assays. Another problem with whole cell assays is a lack of standardization between tests of different manufacturers. The variables include different strains of B. burgdorferi that are used, different conditions for cultivating the organisms, and different methods for identifying the key antigens on blots.
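The sequential logic of the 2-tiered procedure described above can be sketched as follows. This is an illustrative sketch only: the names `Result` and `two_tier_result` are hypothetical, and the mapping of a negative second-tier Western blot to an overall negative result is an assumption made for the example.

```python
from enum import Enum
from typing import Optional


class Result(Enum):
    NEGATIVE = "negative"
    EQUIVOCAL = "equivocal"
    POSITIVE = "positive"


def two_tier_result(elisa: Result, western_blot_positive: Optional[bool] = None) -> Result:
    """Combine a first-tier ELISA with a second-tier Western blot.

    The Western blot is consulted only when the ELISA is positive or equivocal.
    """
    if elisa is Result.NEGATIVE:
        # First tier negative: no second-tier assay is run.
        return Result.NEGATIVE
    if western_blot_positive is None:
        raise ValueError("a Western blot result is required after a positive or equivocal ELISA")
    # Assumption for this sketch: the second tier decides the overall result.
    return Result.POSITIVE if western_blot_positive else Result.NEGATIVE
```

The sketch also makes the stated drawback visible: every sample that is positive or equivocal in the first tier requires a second assay before a result can be reported.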
Assays based on single proteins, such as the flagellin protein FlaB, or combinations of recombinant proteins are available in Europe (Hansen, 1988; Kaiser, 1999; Heikkila, 2003). In general, these have shown sensitivities and specificities approximately equivalent to the 2-tiered procedure. The recombinant antigens used singly or in combination are those that had been previously identified in whole cell Western blot assays using in-vitro cultivated cells. In the United States the most common subunit assays use a single peptide (called C6) of the VlsE protein or the full-length recombinant VlsE protein (Bacon, 2003). In some test formulations these single antigen assays had sensitivity for different stages of infection that was as good as the 2-tier procedure and better specificity (Lawrenz, 1999; Liang, 1999). But in other, more recent studies, including some from Europe, either the specificity or sensitivity of single antigen assays was not as good as tests based on two or more antigens or a 2-tiered procedure (Peltomaa, 2004; Marangoni, 2005; Goettner, 2005).
Perhaps the most important problem with currently available whole cell-based assays is that they utilize for their substrates bacteria that have been grown in vitro. The accumulated evidence unequivocally shows that cells grown in vitro differ with respect to the expression of several proteins from cells recovered from infected animals (Fikrig, 1997; Gilmore, 2001; Salazar, 2005). While certain in vivo conditions can be duplicated to some extent in vitro by altering growth conditions, such as pH or cell density, there remain many proteins that appear to be only expressed in an infected animal or untreated patient.
The present invention provides methods of detecting Borrelia species in a sample (e.g., a sample from a patient suspected of being infected). In particular, the present invention provides compositions and methods for detecting the presence of Borrelia proteins, nucleic acid sequences encoding these proteins, and patient antibodies to these proteins, where the proteins are selected from those listed in Table 3, including: BB0279 (FLiL), BBK19, BBK07, BB0286 (FlbB), BBG33, BBL27, BBN34, BBP34, BBQ42, BBQ34, BBM34, BBN27, and BBH13.
In some embodiments, the present invention provides methods of detecting Borrelia in a patient sample comprising: contacting a sample with an antibody or other agent configured to bind a molecule selected from the group consisting of: BB0279 (FLiL), a BB0279 patient antibody, BBK19, a BBK19 patient antibody, BBK07, a BBK07 patient antibody, BB0286 (FlbB), a BB0286 patient antibody, BBG33, a BBG33 patient antibody, BBL27, a BBL27 patient antibody, BBN34, a BBN34 patient antibody, BBP34, a BBP34 patient antibody, BBQ42, a BBQ42 patient antibody, BBQ34, a BBQ34 patient antibody, BBM34, a BBM34 patient antibody, BBN27, a BBN27 patient antibody, BBH13, and a BBH13 patient antibody.
In certain embodiments, the contacting is performed with the antibody or a fragment of the antibody. In further embodiments, the other agent is one of the molecules that is not an antibody (e.g., BB0279, BBK19, etc.), and the presence or absence of one or more of the patient antibodies is detected (e.g., a BB0279 patient antibody or BBK19 patient antibody).
In some embodiments, the molecule is a protein that has an amino acid sequence found at an accession number selected from the group consisting of: NC_001318; NC_001852; NC_001853; NC_001855; NC_000953; NC_000951; NC_000954; NC_000948; and AE001584 (each of which is herein incorporated by reference as if fully set forth herein).
In particular embodiments, the Borrelia bacteria detected is Borrelia burgdorferi. In other embodiments, the Borrelia is Borrelia afzelii or Borrelia garinii. In certain embodiments, the Borrelia bacteria detected is selected from: Borrelia afzelii; Borrelia anserina; Borrelia burgdorferi; Borrelia garinii; Borrelia hermsii; Borrelia recurrentis; and Borrelia valaisiana.
In some embodiments, the present invention provides methods of detecting Borrelia in a sample comprising: contacting a sample with a nucleic acid sequence or nucleic acid sequences configured to detect a target nucleic acid sequence selected from the group consisting of: BB0279 (FLiL), BBK19, BBK07, BB0286 (FlbB), BBG33, BBL27, BBN34, BBP34, BBQ42, BBQ34, BBM34, BBN27, and BBH13.
In certain embodiments, the nucleic acid sequence is a probe that comprises a nucleotide sequence selected from the group consisting of: SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NO:115, SEQ ID NO:116, SEQ ID NO:119, SEQ ID NO:120, SEQ ID NO:125, SEQ ID NO:126, SEQ ID NO:131, SEQ ID NO:132, SEQ ID NO:143, SEQ ID NO:144, SEQ ID NO:151, SEQ ID NO:152, SEQ ID NO:157, SEQ ID NO:158, SEQ ID NO:161, SEQ ID NO:162, SEQ ID NO:173, SEQ ID NO:174, SEQ ID NO:185, SEQ ID NO:186, SEQ ID NO:191, SEQ ID NO:192 or any of the nucleic acid sequences (or portions thereof) shown in the accession numbers in Table 3.
In other embodiments, the nucleic acid sequences are a primer pair selected from the group consisting of: SEQ ID NO:15 and SEQ ID NO:16; SEQ ID NO:19 and SEQ ID NO:20; SEQ ID NO:115 and SEQ ID NO:116; SEQ ID NO:119 and SEQ ID NO:120; SEQ ID NO:125 and SEQ ID NO:126; SEQ ID NO:131 and SEQ ID NO:132; SEQ ID NO:143 and SEQ ID NO:144; SEQ ID NO:151 and SEQ ID NO:152; SEQ ID NO:157 and SEQ ID NO:158; SEQ ID NO:161 and SEQ ID NO:162; SEQ ID NO:173 and SEQ ID NO:174; SEQ ID NO:185 and SEQ ID NO:186; and SEQ ID NO:191 and SEQ ID NO:192.
In particular embodiments, the present invention provides methods of vaccinating a person against Borrelia infection, comprising: administering a composition to a patient comprising an isolated protein selected from the group consisting of: BB0279 (FLiL), BBK19, BBK07, BB0286 (FlbB), BBG33, BBL27, BBN34, BBP34, BBQ42, BBQ34, BBM34, BBN27, and BBH13. In particular embodiments, the isolated protein has an amino acid sequence, or at least part of an amino acid sequence, found at an accession number selected from the group consisting of: NC_001318; NC_001852; NC_001853; NC_001855; NC_000953; NC_000951; NC_000954; NC_000948; and AE001584.
In certain embodiments, the present invention provides compositions suitable for injection to a human (or domesticated animal) comprising: i) an adjuvant and/or physiologically tolerable buffer, and ii) an isolated protein selected from the group consisting of: BB0279 (FLiL), BBK19, BBK07, BB0286 (FlbB), BBG33, BBL27, BBN34, BBP34, BBQ42, BBQ34, BBM34, BBN27, and BBH13. In further embodiments, the isolated protein has an amino acid sequence found at an accession number selected from the group consisting of: NC_001318; NC_001852; NC_001853; NC_001855; NC_000953; NC_000951; NC_000954; NC_000948; and AE001584.
In other embodiments, the present invention provides methods of detecting Borrelia in a sample comprising: contacting a sample with an antibody or other agent configured to bind a molecule selected from an antigen recited in Table 3 or an antibody to an antigen in Table 3.
In some embodiments, the present invention provides methods of detecting Borrelia in a sample comprising: contacting a sample with an antibody or other agent configured to bind a molecule selected from the group consisting of: BBG33 or antibody thereof; BB0279 or antibody thereof; BBL27 or antibody thereof; BBN34 or antibody thereof; BBP34 or antibody thereof; BBQ42 or antibody thereof; BBQ34 or antibody thereof; BBM34 or antibody thereof; BBN27 or antibody thereof; BBH13 or antibody thereof; BBO34 or antibody thereof; BBQ03 or antibody thereof; BBN11 or antibody thereof; OspC_A or antibody thereof; BBO39 or antibody thereof; BBF03 or antibody thereof; BBK19 or antibody thereof; BBI42 or antibody thereof; BBB14 or antibody thereof; BB0348 or antibody thereof; BBH06 or antibody thereof; BBN38 or antibody thereof; BB0215 or antibody thereof; OspC_K or antibody thereof; BBA36 or antibody thereof; BBL40 or antibody thereof; BB0359 or antibody thereof; BBR42 or antibody thereof; BBJ24 or antibody thereof; BB0543 or antibody thereof; BB0774 or antibody thereof; BB0844 or antibody thereof; BBN39 or antibody thereof; BBK12 or antibody thereof; BBA07 or antibody thereof; BBK07 or antibody thereof; BBA57 or antibody thereof; BB0323 or antibody thereof; BB0681 or antibody thereof; BBA03 or antibody thereof; BBB09 or antibody thereof; BB0238 or antibody thereof; BBA48 or antibody thereof; BB0408 or antibody thereof; BBK53 or antibody thereof; BBR35 or antibody thereof; BBS41 or antibody thereof; BB0286 or antibody thereof; BB0385 or antibody thereof; and BBG18 or antibody thereof.
In particular embodiments, the present invention provides methods of detecting Borrelia in a sample comprising: contacting a sample with a nucleic acid sequence or nucleic acid sequences configured to detect at least one target nucleic acid sequence of an antigen recited in Table 3. In some embodiments, the at least one target nucleic acid sequence is selected from the group consisting of: BBG33; BB0279; BBL27; BBN34; BBP34; BBQ42; BBQ34; BBM34; BBN27; BBH13; BBO34; BBQ03; BBN11; OspC_A; BBO39; BBF03; BBK19; BBI42; BBB14; BB0348; BBH06; BBN38; BB0215; OspC_K; BBA36; BBL40; BB0359; BBR42; BBJ24; BB0543; BB0774; BB0844; BBN39; BBK12; BBA07; BBK07; BBA57; BB0323; BB0681; BBA03; BBB09; BB0238; BBA48; BB0408; BBK53; BBR35; BBS41; BB0286; BB0385; and BBG18. In further embodiments, the nucleic acid sequences comprise at least one nucleic acid sequence selected from SEQ ID NOs:1-202.
In some embodiments, the present invention provides methods of vaccinating a person against Borrelia, comprising: administering a composition to a patient comprising at least one isolated protein from Table 3. In particular embodiments, the at least one isolated protein is selected from the group consisting of: BBG33; BB0279; BBL27; BBN34; BBP34; BBQ42; BBQ34; BBM34; BBN27; BBH13; BBO34; BBQ03; BBN11; OspC_A; BBO39; BBF03; BBK19; BBI42; BBB14; BB0348; BBH06; BBN38; BB0215; OspC_K; BBA36; BBL40; BB0359; BBR42; BBJ24; BB0543; BB0774; BB0844; BBN39; BBK12; BBA07; BBK07; BBA57; BB0323; BB0681; BBA03; BBB09; BB0238; BBA48; BB0408; BBK53; BBR35; BBS41; BB0286; BB0385; and BBG18.
In additional embodiments, the present invention provides compositions suitable for injection to a human, or domesticated animal, comprising: i) an adjuvant and/or physiologically tolerable buffer, and ii) at least one isolated protein from Table 3. In particular embodiments, the at least one isolated protein is selected from the group consisting of: BBG33; BB0279; BBL27; BBN34; BBP34; BBQ42; BBQ34; BBM34; BBN27; BBH13; BBO34; BBQ03; BBN11; OspC_A; BBO39; BBF03; BBK19; BBI42; BBB14; BB0348; BBH06; BBN38; BB0215; OspC_K; BBA36; BBL40; BB0359; BBR42; BBJ24; BB0543; BB0774; BB0844; BBN39; BBK12; BBA07; BBK07; BBA57; BB0323; BB0681; BBA03; BBB09; BB0238; BBA48; BB0408; BBK53; BBR35; BBS41; BB0286; BB0385; and BBG18.
In some embodiments, the present invention provides methods of detecting Borrelia in a sample comprising: contacting a sample with an antibody or other agent configured to bind a molecule selected from the group consisting of: BBK07, a BBK07 ortholog, a BBK07 antibody, BBK12, a BBK12 ortholog, a BBK12 antibody, BBK19, a BBK19 ortholog, a BBK19 antibody, FLiL, a FLiL ortholog, a FLiL antibody, FlbB, a FlbB ortholog, or a FlbB antibody.
In other embodiments, the present invention provides methods of detecting Borrelia in a sample comprising: contacting a sample with a nucleic acid sequence or nucleic acid sequences configured to detect a target nucleic acid sequence selected from the group consisting of: bbk07, a bbk07 ortholog, bbk12, a bbk12 ortholog, bbk19, a bbk19 ortholog, flil, a flil ortholog, flbb, or a flbb ortholog.
In certain embodiments, the present invention provides methods for vaccinating a subject (e.g., a person) against Borrelia, comprising: administering a composition to a patient comprising an isolated protein selected from the group consisting of: BBK07, a BBK07 ortholog, BBK12, a BBK12 ortholog, BBK19, a BBK19 ortholog, FLiL, a FLiL ortholog, FlbB, or a FlbB ortholog.
In further embodiments, the present invention provides compositions suitable for injection to an animal (e.g., human) comprising: i) an adjuvant and/or physiologically tolerable buffer, and ii) an isolated protein selected from the group consisting of: BBK07, a BBK07 ortholog, BBK12, a BBK12 ortholog, BBK19, a BBK19 ortholog, FLiL, a FLiL ortholog, FlbB, or a FlbB ortholog.
The term “epitope” as used herein refers to that portion of an antigen that makes contact with a particular antibody. When a protein or fragment of a protein (e.g., those described by accession number in Table 3) is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies which bind specifically to a given region or three-dimensional structure on the protein; these regions or structures are referred to as “antigenic determinants”. An antigenic determinant may compete with the intact antigen (i.e., the “immunogen” used to elicit the immune response) for binding to an antibody.
The terms “specific binding” or “specifically binding” when used in reference to the interaction of an antibody and a protein or peptide means that the interaction is dependent upon the presence of a particular structure (i.e., the antigenic determinant or epitope) on the protein; in other words the antibody is recognizing and binding to a specific protein structure rather than to proteins in general. For example, if an antibody is specific for epitope “A,” the presence of a protein containing epitope A (or free, unlabelled A) in a reaction containing labeled “A” and the antibody will reduce the amount of labeled A bound to the antibody.
As used herein, the terms “non-specific binding” and “background binding” when used in reference to the interaction of an antibody and a protein or peptide refer to an interaction that is not dependent on the presence of a particular structure (i.e., the antibody is binding to proteins in general rather than a particular structure such as an epitope).
As used herein, the term “subject” refers to any animal (e.g., a mammal), including, but not limited to, humans, non-human primates, rodents, and the like, which is to be the recipient of a particular treatment. Typically, the terms “subject” and “patient” are used interchangeably herein in reference to a human subject.
As used herein, the term “subject suspected of being infected with a Borrelia species” refers to a subject that presents one or more symptoms indicative of such infection (see, e.g., NIH guidelines for such infections). A subject suspected of being infected with Borrelia species (e.g., burgdorferi) may also have one or more risk factors (e.g., exposure to ticks). A subject suspected of infection has generally not been tested for such infection.
A “patient antibody,” as used herein, is an antibody generated in a patient (e.g., human) as a result of infection with a Borrelia bacteria. In other words, it is the patient's own antibodies generated as a result of infection. Such antibodies provide evidence of infection and are therefore useful to detect in order to provide a diagnosis of Borrelia infection.
As used herein, the term “instructions for using said kit for detecting Borrelia infection in said subject” includes instructions for using the reagents contained in the kit for the detection and characterization of Borrelia infection in a sample from a subject. In some embodiments, the instructions further comprise the statement of intended use required by the U.S. Food and Drug Administration (FDA) in labeling in vitro diagnostic products. The present invention contemplates kits with reagents for detecting Borrelia infection, including antibodies to the antigens recited in Table 3, and nucleic acids sequences (e.g., primer pairs from Table 4).
As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method. Exemplary primers for detecting the Borrelia target nucleic acids of the present invention are provided in Table 4, which contains 101 primer pairs (SEQ ID NOs:1-202). One of skill in the art could design similar primers given that the nucleic acid sequences are known in the art for the Borrelia antigens (Table 3 provides useful nucleic acid sequence accession numbers).
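As one illustration of how primer length and composition relate to temperature, the following sketch computes the reverse complement of a candidate primer and a rough melting temperature using the Wallace rule (2 °C per A or T, 4 °C per G or C), a common approximation for short oligonucleotides. The function names and the example sequence are hypothetical; the sequence is not one of the sequences of Table 4.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C by the Wallace rule.

    This is a rough estimate intended only for short oligonucleotides.
    """
    s = seq.upper()
    at_count = s.count("A") + s.count("T")
    gc_count = s.count("G") + s.count("C")
    return 2 * at_count + 4 * gc_count


# Hypothetical 18-nucleotide example sequence (not taken from Table 4).
example_primer = "ATGCATGCATGCATGCAT"
print(reverse_complement(example_primer), wallace_tm(example_primer))
```

Both members of a primer pair would typically be evaluated this way so that their estimated melting temperatures are reasonably matched.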
As used herein, the term “probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to at least a portion of another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labeled with any “reporter molecule,” so that it is detectable in any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label. The primers listed in Table 4 could also be used as probes (e.g., by labeling these sequences) to detect nucleic acids encoding Borrelia antigens.
As used herein the term “portion” when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence. The fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (e.g., 10 nucleotides, 20, 30, 40, 50, 100, 200, etc.).
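The size bounds in the definition of “portion” can be expressed as a simple check. This is a minimal sketch; the function name `is_portion` is hypothetical, and treating a fragment as a contiguous subsequence is an assumption made for the example.

```python
def is_portion(fragment: str, sequence: str) -> bool:
    """Check whether `fragment` is a portion of `sequence`.

    A portion is at least four nucleotides long and at most one nucleotide
    shorter than the full sequence. Contiguity is assumed in this sketch.
    """
    fragment = fragment.upper()
    sequence = sequence.upper()
    within_bounds = 4 <= len(fragment) <= len(sequence) - 1
    return within_bounds and fragment in sequence
```

For example, under these assumptions a three-nucleotide fragment and the full-length sequence itself are both rejected, while any contiguous stretch between those bounds is accepted.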
The present invention provides methods of detecting Borrelia species in a sample (e.g., a sample from a patient suspected of being infected). In particular, the present invention provides compositions and methods for detecting the presence of Borrelia proteins, nucleic acid sequences encoding these proteins, and patient antibodies to these proteins, where the proteins are selected from those listed in Table 3, including: BB0279 (FLiL), BBK19, BBK07, BB0286 (FlbB), BBG33, BBL27, BBN34, BBP34, BBQ42, BBQ34, BBM34, BBN27, and BBH13.
The present invention provides numerous proteins and nucleic acid targets that can be detected in order to diagnose Borrelia infection. Table 3 below lists the ORFs that were found to be antigenic during the development of the present invention. Table 3 also lists the accession numbers where the protein and nucleic acid sequences for these antigens can be found. These accession numbers allow one skilled in the art to easily design probes and primers to the corresponding nucleic acid sequences. These accession numbers (herein incorporated by reference as if fully set forth herein) also allow one of skill in the art to express these proteins in order to generate antibodies and antibody fragments useful for detecting Borrelia infection.
In some embodiments, the present invention provides methods for detection of the Borrelia antigens listed in Table 3. In some embodiments, expression is detected in bodily fluids (e.g., including but not limited to, plasma, serum, whole blood, mucus, and urine). In certain embodiments, multiple antigens are detected (e.g., two or more antigens from Table 3 or one antigen from Table 3 and one antigen presently known in the art). In certain embodiments, at least 2 . . . 5 . . . 10 . . . 20 . . . 35 . . . 50 . . . or 100 antigens are detected from a single patient sample.
In some embodiments, the presence of a Table 3 Borrelia antigen is used to provide a prognosis to a subject. The information provided is also used to direct the course of treatment.
1. Detection of Nucleic Acid
In some embodiments, Table 3 Borrelia antigens are detected by measuring the presence of nucleic acid encoding such antigens in a patient sample. Table 3 lists the accession numbers for each of the antigens, which allows one of skill in the art to design primers and probes to such sequences. Exemplary primers for each of these antigens are shown in Table 4.
In some embodiments, nucleic acid is detected by Northern blot analysis. Northern blot analysis involves the separation of nucleic acid and hybridization of a complementary labeled probe.
In still further embodiments, nucleic acid is detected by hybridization to an oligonucleotide probe. A variety of hybridization assays using a variety of technologies for hybridization and detection are available. For example, in some embodiments, the TaqMan assay (PE Biosystems, Foster City, Calif.; See e.g., U.S. Pat. Nos. 5,962,233 and 5,538,848, each of which is herein incorporated by reference) is utilized. The assay is performed during a PCR reaction. The TaqMan assay exploits the 5′-3′ exonuclease activity of the AMPLITAQ GOLD DNA polymerase. A probe consisting of an oligonucleotide with a 5′-reporter dye (e.g., a fluorescent dye) and a 3′-quencher dye is included in the PCR reaction. During PCR, if the probe is bound to its target, the 5′-3′ nucleolytic activity of the AMPLITAQ GOLD polymerase cleaves the probe between the reporter and the quencher dye. The separation of the reporter dye from the quencher dye results in an increase of fluorescence. The signal accumulates with each cycle of PCR and can be monitored with a fluorimeter.
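Because the fluorescent signal accumulates with each cycle, one simple way to read out such an assay is to record the first cycle at which the signal reaches a chosen threshold. The following is an illustrative sketch only; the function name, the threshold, and the readings are hypothetical and do not represent the output of any particular instrument.

```python
from typing import Optional, Sequence


def threshold_cycle(fluorescence: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based PCR cycle at which the signal first reaches the
    threshold, or None if the threshold is never reached."""
    for cycle, signal in enumerate(fluorescence, start=1):
        if signal >= threshold:
            return cycle
    return None


# Hypothetical per-cycle fluorescence readings (arbitrary units).
readings = [0.1, 0.1, 0.2, 0.5, 1.2, 2.5]
print(threshold_cycle(readings, threshold=1.0))
```

A sample in which the target is absent would be expected never to cross the threshold, which the sketch reports as `None`.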
In other embodiments, nucleic acid is detected using a detection assay including, but not limited to, enzyme mismatch cleavage methods (e.g., Variagenics, U.S. Pat. Nos. 6,110,684, 5,958,692, 5,851,770, herein incorporated by reference in their entireties); polymerase chain reaction; branched hybridization methods (e.g., Chiron, U.S. Pat. Nos. 5,849,481, 5,710,264, 5,124,246, and 5,624,802, herein incorporated by reference in their entireties); rolling circle replication (e.g., U.S. Pat. Nos. 6,210,884, 6,183,960 and 6,235,502, herein incorporated by reference in their entireties); NASBA (e.g., U.S. Pat. No. 5,409,818, herein incorporated by reference in its entirety); molecular beacon technology (e.g., U.S. Pat. No. 6,150,097, herein incorporated by reference in its entirety); E-sensor technology (Motorola, U.S. Pat. Nos. 6,248,229, 6,221,583, 6,013,170, and 6,063,573, herein incorporated by reference in their entireties); cycling probe technology (e.g., U.S. Pat. Nos. 5,403,711, 5,011,769, and 5,660,988, herein incorporated by reference in their entireties); Dade Behring signal amplification methods (e.g., U.S. Pat. Nos. 6,121,001, 6,110,677, 5,914,230, 5,882,867, and 5,792,614, herein incorporated by reference in their entireties); ligase chain reaction (Barany, Proc. Natl. Acad. Sci. USA 88, 189-93 (1991)); FULL-VELOCITY assays; and sandwich hybridization methods (e.g., U.S. Pat. No. 5,288,609, herein incorporated by reference in its entirety). In other embodiments, the detection assay employed is the INVADER assay (Third Wave Technologies), which is described in U.S. Pat. Nos. 5,846,717, 5,985,557, 5,994,069, 6,001,567, and 6,090,543, WO 97/27214, WO 98/42873, Lyamichev et al., Nat. Biotech., 17:292 (1999), and Hall et al., PNAS, USA, 97:8272 (2000), each of which is herein incorporated by reference in its entirety for all purposes.
2. Detection of Protein
In some embodiments, the proteins expressed by the ORFs listed in Table 3 are detected. Protein expression can be detected by any suitable method. In some embodiments, proteins are detected by immunohistochemistry. In other embodiments, proteins are detected by their binding to an antibody raised against the protein. The generation of antibodies is described below.
Antibody binding is detected by techniques known in the art (e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels), Western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.).
In certain embodiments, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many methods are known in the art for detecting binding in an immunoassay and are within the scope of the present invention.
In some embodiments, an automated detection assay is utilized. Methods for the automation of immunoassays include those described in U.S. Pat. Nos. 5,885,530, 4,981,785, 6,159,750, and 5,358,691, each of which is herein incorporated by reference. In some embodiments, the analysis and presentation of results is also automated. In other embodiments, the immunoassay described in U.S. Pat. Nos. 5,599,677 and 5,672,480 (each of which is herein incorporated by reference) is utilized.
3. Antibodies and Antibody Fragments
The present invention provides isolated antibodies and antibody fragments against the Borrelia proteins recited in Table 3. Such antibodies and antibody fragments can be used, for example, in diagnostic and therapeutic methods. The antibody, or antibody fragment, can be any monoclonal or polyclonal antibody that specifically recognizes Borrelia antigens recited in Table 3. In some embodiments, the present invention provides monoclonal antibodies, or fragments thereof, that specifically bind to Borrelia antigens recited in Table 3. In some embodiments, the monoclonal antibodies, or fragments thereof, are chimeric or humanized antibodies. In other embodiments, the monoclonal antibodies, or fragments thereof, are human antibodies.
The antibodies of the present invention find use in experimental, diagnostic and therapeutic methods. In certain embodiments, the antibodies of the present invention are used to detect the presence or absence of Borrelia proteins in a sample from a patient.
Polyclonal antibodies can be prepared by any known method. Polyclonal antibodies can be raised by immunizing an animal (e.g. a rabbit, rat, mouse, donkey, etc) by multiple subcutaneous or intraperitoneal injections of the relevant antigen (a purified peptide fragment, full-length recombinant protein, fusion protein, etc., from Table 3) optionally conjugated to keyhole limpet hemocyanin (KLH), serum albumin, etc. diluted in sterile saline and combined with an adjuvant (e.g. Complete or Incomplete Freund's Adjuvant) to form a stable emulsion. The polyclonal antibody is then recovered from blood, ascites and the like, of an animal so immunized. Collected blood is clotted, and the serum decanted, clarified by centrifugation, and assayed for antibody titer. The polyclonal antibodies can be purified from serum or ascites according to standard methods in the art including affinity chromatography, ion-exchange chromatography, gel electrophoresis, dialysis, etc.
Monoclonal antibodies can be prepared using hybridoma methods, such as those described by Kohler and Milstein (1975) Nature 256:495. Using the hybridoma method, a mouse, hamster, or other appropriate host animal, is immunized as described above to elicit the production by lymphocytes of antibodies that will specifically bind to an immunizing antigen. Alternatively, lymphocytes can be immunized in vitro. Following immunization, the lymphocytes are isolated and fused with a suitable myeloma cell line using, for example, polyethylene glycol, to form hybridoma cells that can then be selected away from unfused lymphocytes and myeloma cells. Hybridomas that produce monoclonal antibodies directed specifically against a chosen antigen as determined by immunoprecipitation, immunoblotting, or by an in vitro binding assay such as radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA) can then be propagated either in in vitro culture using standard methods (Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, 1986) or in vivo as ascites tumors in an animal. The monoclonal antibodies can then be purified from the culture medium or ascites fluid as described for polyclonal antibodies above.
Alternatively, monoclonal antibodies can also be made using recombinant DNA methods as described in U.S. Pat. No. 4,816,567. The polynucleotides encoding a monoclonal antibody are isolated, such as from mature B-cells or hybridoma cells, such as by RT-PCR using oligonucleotide primers that specifically amplify the genes encoding the heavy and light chains of the antibody, and their sequence is determined using conventional procedures. The isolated polynucleotides encoding the heavy and light chains are then cloned into suitable expression vectors; when these vectors are transfected into host cells such as E. coli cells, simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not otherwise produce immunoglobulin protein, monoclonal antibodies are generated by the host cells. Also, recombinant monoclonal antibodies or fragments thereof of the desired species can be isolated from phage display libraries as described (McCafferty et al., 1990, Nature, 348:552-554; Clackson et al., 1991, Nature, 352:624-628; and Marks et al., 1991, J. Mol. Biol., 222:581-597).
The polynucleotide(s) encoding a monoclonal antibody can further be modified in a number of different manners using recombinant DNA technology to generate alternative antibodies. In one embodiment, the constant domains of the light and heavy chains of, for example, a mouse monoclonal antibody can be substituted 1) for those regions of, for example, a human antibody to generate a chimeric antibody or 2) for a non-immunoglobulin polypeptide to generate a fusion antibody. In other embodiments, the constant regions are truncated or removed to generate the desired antibody fragment of a monoclonal antibody. Furthermore, site-directed or high-density mutagenesis of the variable region can be used to optimize specificity, affinity, etc. of a monoclonal antibody.
In some embodiments of the present invention, the monoclonal antibody against a Borrelia antigen from Table 3 is a humanized antibody. Humanized antibodies are antibodies that contain minimal sequences from non-human (e.g., murine) antibodies within the variable regions. Such antibodies are used therapeutically to reduce antigenicity and HAMA (human anti-mouse antibody) responses when administered to a human subject. In practice, humanized antibodies are typically human antibodies with minimum to no non-human sequences. A human antibody is an antibody produced by a human or an antibody having an amino acid sequence corresponding to an antibody produced by a human.
Humanized antibodies can be produced using various techniques known in the art. An antibody can be humanized by substituting the CDR of a human antibody with that of a non-human antibody (e.g. mouse, rat, rabbit, hamster, etc.) having the desired specificity, affinity, and capability (Jones et al., 1986, Nature, 321:522-525; Riechmann et al., 1988, Nature, 332:323-327; Verhoeyen et al., 1988, Science, 239:1534-1536). The humanized antibody can be further modified by the substitution of additional residues either in the Fv framework region and/or within the replaced non-human residues to refine and optimize antibody specificity, affinity, and/or capability.
Human antibodies can be directly prepared using various techniques known in the art. Immortalized human B lymphocytes immunized in vitro or isolated from an immunized individual that produce an antibody directed against a target antigen can be generated (See, for example, Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985); Boerner et al., 1991, J. Immunol., 147 (1):86-95; and U.S. Pat. No. 5,750,373). Also, the human antibody can be selected from a phage library, where that phage library expresses human antibodies (Vaughan et al., 1996, Nature Biotechnology, 14:309-314; Sheets et al., 1998, PNAS, 95:6157-6162; Hoogenboom and Winter, 1991, J. Mol. Biol., 227:381; Marks et al., 1991, J. Mol. Biol., 222:581). Humanized antibodies can also be made in transgenic mice containing human immunoglobulin loci that are capable upon immunization of producing the full repertoire of human antibodies in the absence of endogenous immunoglobulin production. This approach is described in U.S. Pat. Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; and 5,661,016.
This invention also encompasses bispecific antibodies. Bispecific antibodies are antibodies that are capable of specifically recognizing and binding at least two different epitopes.
Bispecific antibodies can be intact antibodies or antibody fragments. Techniques for making bispecific antibodies are common in the art (Milstein et al., 1983, Nature 305:537-539; Brennan et al., 1985, Science 229:81; Suresh et al, 1986, Methods in Enzymol. 121:120; Traunecker et al., 1991, EMBO J. 10:3655-3659; Shalaby et al., 1992, J. Exp. Med. 175:217-225; Kostelny et al., 1992, J. Immunol. 148:1547-1553; Gruber et al., 1994, J. Immunol. 152:5368; and U.S. Pat. No. 5,731,168).
In certain embodiments of the invention, it may be desirable to use an antibody fragment, rather than an intact antibody, to increase tumor penetration, for example. Various techniques are known for the production of antibody fragments. Traditionally, these fragments are derived via proteolytic digestion of intact antibodies (for example Morimoto et al., 1993, Journal of Biochemical and Biophysical Methods 24:107-117 and Brennan et al., 1985, Science, 229:81). However, these fragments are now typically produced directly by recombinant host cells as described above. Thus Fab, Fv, and scFv antibody fragments can all be expressed in and secreted from E. coli or other host cells, thus allowing the production of large amounts of these fragments. Alternatively, such antibody fragments can be isolated from the antibody phage libraries discussed above. The antibody fragment can also be linear antibodies as described in U.S. Pat. No. 5,641,870, for example, and can be monospecific or bispecific. Other techniques for the production of antibody fragments will be apparent to the skilled practitioner.
It may further be desirable, especially in the case of antibody fragments, to modify an antibody in order to increase its serum half-life. This can be achieved, for example, by incorporation of a salvage receptor binding epitope into the antibody fragment by mutation of the appropriate region in the antibody fragment or by incorporating the epitope into a peptide tag that is then fused to the antibody fragment at either end or in the middle (e.g., by DNA or peptide synthesis).
The present invention further embraces variants and equivalents which are substantially homologous to the chimeric, humanized and human antibodies, or antibody fragments thereof, set forth herein. These can contain, for example, conservative substitution mutations, i.e. the substitution of one or more amino acids by similar amino acids. For example, conservative substitution refers to the substitution of an amino acid with another within the same general class such as, for example, one acidic amino acid with another acidic amino acid, one basic amino acid with another basic amino acid or one neutral amino acid by another neutral amino acid. What is intended by a conservative amino acid substitution is well known in the art.
In certain embodiments, after a patient has been diagnosed with Borrelia infection, that patient is administered appropriate antibiotics. However, certain patients may be refractory to antibiotic treatment. In such situations, other treatments are employed, such as using antibodies to one or more of the antigens described in Table 3.
In some embodiments, the present invention provides antibodies that bind proteins from Table 3. Any suitable antibody (e.g., monoclonal, polyclonal, or synthetic) can be utilized in the therapeutic methods disclosed herein. In some embodiments, the antibodies used for therapy are humanized antibodies. Methods for humanizing antibodies are well known in the art (See e.g., U.S. Pat. Nos. 6,180,370, 5,585,089, 6,054,297, and 5,565,332; each of which is herein incorporated by reference).
In some embodiments, the antibody is conjugated to a cytotoxic agent. For certain applications, it is envisioned that the therapeutic agents will be pharmacologic agents that will serve as useful agents for attachment to antibodies, particularly cytotoxic or otherwise anticellular agents having the ability to kill Borrelia bacteria. The present invention contemplates the use of any pharmacologic agent that can be conjugated to an antibody, and delivered in active form.
The present invention provides compositions comprising protein sequences, as well as the DNA sequences encoding them, of three proteins, BBK07, BBK12, and BBK19 (“the proteins”), of the Lyme disease agent Borrelia burgdorferi and deduced proteins of other pathogenic Borrelia species that are orthologous to these three proteins. In certain embodiments, the present invention provides diagnostic tests for antibodies to Borrelia burgdorferi or other pathogenic Borrelia species and vaccines for inducing an immune response to Borrelia burgdorferi or other pathogenic Borrelia species. The proteins had not previously been identified or known to be antigens to which an immune response during infection is directed in humans or other animals. It is believed that the immunogenicity of recombinant forms of these proteins has not been previously determined. An improved diagnostic test for Lyme disease is needed and one or more of the proteins, by themselves or in combination with other recombinant proteins, should provide for better sensitivity and specificity than currently available assays. These proteins have also not previously been investigated as sub-unit vaccines, either by themselves or in combination with other recombinant proteins, for protection against infection by Borrelia burgdorferi or other pathogenic Borrelia species. As such, in certain embodiments, the present invention provides vaccines using these proteins.
The encoding DNA sequences and the deduced proteins for BBK07, BBK12, and BBK19 were originally identified when the chromosome sequence and most of the plasmid sequences for the B31 strain of Borrelia burgdorferi (Bb) were determined (Fraser et al., Nature, 1997; Casjens et al., Molecular Microbiology, 2000). They are located on the lp36 linear plasmid of Bb. We have identified an orthologous DNA sequence to BBK07 in another Borrelia species, Borrelia turicatae (Bt), a cause of relapsing fever. This DNA sequence and the deduced protein have not been published or deposited in a public database.
The evidence of an orthologous gene in a distantly related species of Borrelia as well as Bb indicates the genes for these proteins may occur in the other Borrelia species that cause Lyme disease. These include, but are not limited to, Borrelia afzelii (Ba) and Borrelia garinii (Bg). The chromosomes and the partial plasmid sequences for a single strain each of these species have been published but the deposited sequences do not show evidence of an ortholog of BBK07 and possibly not the other genes as well. We would expect to identify orthologs of BBK07, BBK12, and/or BBK19 in Ba, Bg, and other agents of Lyme disease. In the case of BBK07, this could be done by making an alignment of the Bb and Bt ortholog sequences and designing polymerase chain reaction primers on the basis of conserved sequence between the two genes. These primers would then be used to amplify a part of the sought-after gene in these other species. Once the sequence of the resultant cloned DNA was confirmed and characterized, we would use inverse PCR to amplify the 5′ and 3′ ends of the genes and thereby have the complete gene sequence. Alternatively we could use the cloned partial gene fragment as a probe for a DNA library of Ba or Bg in a plasmid, bacteriophage, or other cloning vector. For these methods, one could use low passage isolates of Ba and Bg obtained directly from infected animals or field-collected ticks that have been documented to contain either Ba or Bg. By the same approach, one could also identify and isolate orthologs of BBK07, BBK12, and BBK19 in other relapsing fever species, including Borrelia hermsii.
On the other hand, if the existence of the putative orthologs in Ba and/or Bg cannot be established, it indicates that one or more of these genes and their products are unique to Bb. In this case, a diagnostic test for Lyme disease that was based on detection of antibodies to one or more of the proteins would be specific for Bb. Such a test would be very useful in Europe and in Asia where the three species co-occur. Differentiating between infection with Bb or with one of the other species is clinically important because infection with Bb is much more likely to be associated with a chronic form of arthritis.
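The primer-design step described above, locating stretches that are conserved between the aligned Bb and Bt BBK07 sequences, might be sketched as follows. This is only a minimal illustration: the function name, the window length, the mismatch tolerance, and the toy sequences are assumptions and are not part of this disclosure, and practical primer design would also consider melting temperature, GC content, and secondary structure.

```python
def conserved_windows(aligned_a, aligned_b, length=20, max_mismatch=0):
    """Return (start, sequence) for gap-free windows where two aligned
    sequences differ at no more than max_mismatch positions.

    aligned_a, aligned_b: equal-length strings from a pairwise alignment,
    with '-' marking gaps. All names and defaults here are hypothetical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    hits = []
    for start in range(len(aligned_a) - length + 1):
        window_a = aligned_a[start:start + length]
        window_b = aligned_b[start:start + length]
        if "-" in window_a or "-" in window_b:
            continue  # skip windows that span an alignment gap
        mismatches = sum(x != y for x, y in zip(window_a, window_b))
        if mismatches <= max_mismatch:
            hits.append((start, window_a))
    return hits


# Toy aligned sequences (invented for illustration; they differ at one position).
seq_bb = "ATGAAACCCGGGTTTACG"
seq_bt = "ATGAAACCCGGGTTAACG"
candidates = conserved_windows(seq_bb, seq_bt, length=8)
```

Each returned window is a candidate primer site; allowing `max_mismatch` above zero would correspond to designing degenerate primers.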
In certain embodiments, the present invention comprises recombinant proteins of Lyme disease Borrelia species flagella-associated proteins FliL and FlbB. In some embodiments, the methods are a diagnostic test for antibodies to either or both FliL and FlbB in a variety of different formats, in which the FliL and/or FlbB are alone or in combination with one or more other recombinant proteins. The diagnostic assay is for antibodies to Borrelia burgdorferi or another Lyme disease Borrelia species, such as B. afzelii and B. garinii. This assay may be used for laboratory support of the diagnosis of Lyme disease, for staging the infection, and for assessing the outcome of antibiotic therapy. Related proteins of relapsing fever Borrelia species, the syphilis agent Treponema pallidum, and the leptospirosis agent Leptospira interrogans could also be used as the basis for diagnostic assays for antibodies against these respective etiologic agents.
We have experimental evidence of the immunogenicity and antigenicity of FliL and FlbB in natural infections of Borrelia burgdorferi of humans and the wild mouse Peromyscus leucopus. These studies demonstrated that assays based on one or both proteins were specific as well as sensitive. These data were obtained using an array of approximately 80% of the open reading frames of the Borrelia burgdorferi genome and sera from Lyme disease patients and controls and from infected and uninfected mice.
Two proteins of the flagellar apparatus of Borrelia burgdorferi and related Lyme disease (LD) agents, B. garinii and B. afzelii, have been identified as important antigens for the serologic (i.e. antibody-based) diagnosis of LD. These are the FlaB protein, which is the major flagellin of flagella and encoded by the flaB (open reading frame BB0147 of the B. burgdorferi genome) gene, and FlgE, which is the hook protein of the flagella apparatus and encoded by the flgE (BB0283) gene. There have been several papers demonstrating the importance of the FlaB (formerly known as the “41 kDa” or “p41” protein) for serodiagnosis.
Purified flagella have also been reported as an antigen preparation for a serologic assay for Lyme disease and are the basis of at least one commercial assay (Dako) for antibodies to LD Borrelia sp., and a flagella-based assay was used by the Centers for Disease Control for a period as a reference assay for LD diagnosis. These purified flagella would contain FlaB and possibly FlgE but not the components of the export mechanism, such as FliL.
Since 1983 there have been several papers and other works that have identified antigens according to apparent molecular weight on SDS polyacrylamide gels and Western blots. Included in this group is the FlaB (41 kDa) protein. Examples of other proteins that were first revealed as antigens through Western blots of native proteins were the OspA protein (“31 kDa”), BmpA protein (“39 kDa”), and Decorin-binding Protein A (“18 kDa”). There have also been several other proteins that have been identified as antigens of diagnostic importance when they were expressed as recombinant proteins in E. coli and then reacted with sera from humans and other animals with infection with a LD Borrelia sp. Included in this group is the FlgE protein. While the fliL (BB0279) and flbB (BB0286) genes of B. burgdorferi and related species had been identified in sequence analysis of parts or all of the genome and the polypeptides encoded by the open reading frames deduced, we know of no evidence that they were designated as informative antigens of diagnostic importance previous to this disclosure. Neither is there evidence that homologous FliL and FlbB proteins of other pathogenic spirochetes, including the Borrelia species that cause relapsing fever, Treponema pallidum, the agent of syphilis, and Leptospira interrogans, the agent of leptospirosis, had been previously identified as informative antigens of diagnostic importance for their respective diseases. All three groups of organisms have sequences that are homologous to the fliL gene of B. burgdorferi and other LD species. The calculated molecular weight of FliL is 19929. There are multiple proteins migrating with this apparent size in SDS PAGE gels, and they cannot be distinguished.
Although there is experimental evidence that the FliL protein is expressed in vitro as well as in vivo, and thus would be expected to be present in the whole cell lysates in gels and Western blots, the FliL protein may not previously have been recognized as an important antigen because it is present in small amounts and also because it would be predicted to migrate in the gel in an area with many other proteins, which could not be discriminated.
In yet other embodiments, the present invention provides kits for the detection and characterization of Borrelia infection. In some embodiments, the kits contain antibodies specific for one or more of the antigens in Table 3, in addition to detection reagents and buffers. In other embodiments, the kits contain reagents specific for the detection of nucleic acid (e.g., oligonucleotide probes or primers). In some embodiments, the kits contain all of the components necessary and/or sufficient to perform a detection assay, including all controls, directions for performing assays, and any necessary software for analysis and presentation of results.
Another embodiment of the present invention comprises a kit to test for the presence of the polynucleotides or proteins. The kit can comprise, for example, an antibody for detection of a polypeptide or a probe for detection of a polynucleotide. In addition, the kit can comprise a reference or control sample; instructions for processing samples, performing the test and interpreting the results; and buffers and other reagents necessary for performing the test. In other embodiments the kit comprises pairs of primers (e.g., as shown in Table 4) for detecting expression of one or more of the antigens in Table 3.
The following examples are provided in order to demonstrate and further illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.
In the experimental disclosure which follows, the following abbreviations apply: N (normal); M (molar); mM (millimolar); μM (micromolar); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); pmol (picomoles); g (grams); mg (milligrams); μg (micrograms); ng (nanograms); l or L (liters); ml (milliliters); μl (microliters); ×g (times gravity); and C (degrees Centigrade).
This example describes genome-wide screening using arrays such that all or most open reading frames will be represented and screened without bias toward those expressed in greatest amounts in culture medium.
Bacterial strain, genome sequences, and primer design. Strain B31 of B. burgdorferi had undergone three passages since its isolation (6, 19). This organism was cultivated in BSK II broth medium (6). A high-passage isolate of strain B31 had been cloned by limiting dilution and had been serially passed in culture medium at least 50 times. Whole-genome DNA was extracted from the low-passage isolate as described previously (62). Primers were based on the sequences and annotations of the chromosome and 21 plasmids of strain B31 (accession numbers NC_001318, NC_000948 to NC_000957, NC_001903, NC_001904, and NC_001849 to NC_001857, all of which are herein incorporated by reference) (http://www., followed by “blackwellpublishing.com/products/journals/suppmat/mole/casjens.htm”) (21, 39). Forward and reverse primers were 20 nucleotides long and were complementary to the 5′ and 3′ ends of each ORF; peripherally they also included 33-nucleotide adapter sequences specific for plasmid pXT7 for recombination cloning, as described previously (29). The forward and reverse primers for about 200 of the ORFs (that were identified as immunogenic) are provided in Table 4 below:
Also included was the type K OspC protein gene of strain 297, in addition to the type A OspC gene of B31 (12, 15). ORFs were named according to the designations assigned to strain B31's genome (21, 39); “BB” followed by a four digit number (e.g., BB0279) indicates a chromosome ORF, while “BB” followed by a third letter and a two-digit number (e.g., BBA25) indicates a linear or circular plasmid ORF, and each replicon is assigned a separate letter (e.g., “A” for linear plasmid lp54 or “B” for circular plasmid cp26). As needed, genome ORF designations were supplemented with names in common use or when polypeptide identity has been inferred from homology to proteins with known functions. The predictions of lipoproteins are those of Casjens et al. (http://www. followed by “blackwellpublishing.com/products/journals/suppmat/mole/casjens.htm”).
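The ORF naming convention just described can be expressed as a small parser. The sketch below is illustrative only: the function name and return layout are assumptions, and the replicon lookup lists only the letter assignments stated in this document (A for lp54, B for cp26, and K for lp36) rather than the full set.

```python
import re

# "BB" + four digits: chromosomal ORF. "BB" + letter + two digits: plasmid ORF.
_CHROMOSOME = re.compile(r"^BB(\d{4})$")
_PLASMID = re.compile(r"^BB([A-Z])(\d{2})$")

# Only the replicon letters named in this document; other letters are omitted.
KNOWN_REPLICONS = {"A": "lp54", "B": "cp26", "K": "lp36"}


def classify_orf(name):
    """Return (location, replicon_letter, number) for a B31 ORF designation.

    Hypothetical helper for illustration; raises ValueError for names that
    do not follow either pattern.
    """
    match = _CHROMOSOME.match(name)
    if match:
        return ("chromosome", None, int(match.group(1)))
    match = _PLASMID.match(name)
    if match:
        return ("plasmid", match.group(1), int(match.group(2)))
    raise ValueError(f"not a recognized ORF designation: {name!r}")


location, letter, number = classify_orf("BBK07")
replicon = KNOWN_REPLICONS.get(letter)  # None for letters not listed above
```

Under this convention `classify_orf("BB0279")` is treated as chromosomal and `classify_orf("BBA25")` as a plasmid ORF on replicon A.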
Array production. PCR amplification, cloning of amplicons into the plasmid vector, and then transformation of E. coli DH5 were carried out as described previously (29, 86). Of the 1,640 ORFs that were identified in the B. burgdorferi strain B31 genome (21, 39), 1,513 (861 chromosomal genes and 652 plasmid genes) were subjected to PCR with the specific primers. The remaining 127 ORFs had sequences that were so similar to the sequence of at least one other ORF that PCR primers would not distinguish between them. Of the 861 chromosomal ORFs that were attempted to be amplified, 783 (91%) produced a product that was the correct size when PCR was performed, and 756 (88%) were successfully cloned into the vector. Of the 652 plasmid ORFs, 572 (88%) were amplified, and 536 (82%) were cloned into the plasmid vector. A sample consisting of 7% of 1,292 clones from strain B31 was randomly selected for sequencing, and the insert was confirmed in all cases. The coefficient of determination (R2) between the sizes of the ORFs and cloning success was only 0.05. The following 26 plasmid ORFs were randomly selected to be replicated on the array: BBA03, BBA04, BBA14, BBA25, BBA52, BBA59, BBA62, BBA69, BBB07, BBB19, BBC06, BBJ50, BBK50, BBL28, BBL39, BBM38, BBN37, BBO40, BBP28, BBQ35, BBQ60, BBQ80, BBR28, BBR42, BBS30, and BBT07. As a negative control, the arrays also contained 14 pairs of spots with the E. coli coupled transcription-translation reaction mixture without plasmid DNA.
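The amplification and cloning percentages reported above can be rechecked from the stated counts. The short script below is a bookkeeping sketch only; the variable names are illustrative, and the figures are taken directly from the preceding paragraph.

```python
# Counts as stated in the text for strain B31.
TOTAL_ORFS = 1640
attempted = {"chromosomal": 861, "plasmid": 652}
amplified = {"chromosomal": 783, "plasmid": 572}
cloned = {"chromosomal": 756, "plasmid": 536}


def percent(part, whole):
    """Percentage rounded to the nearest whole number, as reported in the text."""
    return round(100 * part / whole)


summary = {
    replicon: {
        "amplified_pct": percent(amplified[replicon], attempted[replicon]),
        "cloned_pct": percent(cloned[replicon], attempted[replicon]),
    }
    for replicon in attempted
}

total_attempted = sum(attempted.values())     # ORFs subjected to PCR
total_cloned = sum(cloned.values())           # clones available for the array
not_attempted = TOTAL_ORFS - total_attempted  # ORFs too similar to distinguish

for replicon, stats in summary.items():
    print(replicon, stats)
print(total_attempted, total_cloned, not_attempted)
```

The totals agree with the 1,513 ORFs subjected to PCR, the 1,292 clones, and the 127 ORFs that were not attempted.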
Plasmid DNA was extracted and isolated using QIAprep spin kits (Qiagen). In vitro coupled transcription-translation reactions were performed with RTS 100 E. coli HY kits (Roche) in 0.2-ml tubes that were incubated for 5 h at 30° C. The presence of the polyhistidine tag at the N terminus of the recombinant protein and the presence of the influenza A hemagglutinin at the protein's C terminus were detected with monoclonal antibodies His-1 (Sigma) and 3F10 (Roche), respectively, and confirmed expression in the in vitro reactions. Products of transcription-translation reactions were printed in duplicate on nitrocellulose-coated glass slides (FAST; Whatman) using an Omni Grid 100 apparatus (Genomic Solutions).
Protein purification. Plasmid DNA was extracted from selected clones and transformed in E. coli BL21 Star(DE3)/pLysS cells as described by the manufacturer (Invitrogen). The resultant transformants were cultivated in Terrific broth (Bio 101 Systems) to stationary phase and, after harvesting by centrifugation, were lysed with BugBuster buffer (Novagen). The lysate was applied to a 5-ml HiTrap chelating HP affinity column (GE Healthcare). After the column was washed, bound proteins were eluted with an imidazole step gradient using an Amersham Biosciences AKTA fast protein liquid chromatography system operated with UNICORN 5.01 software. The average amount recovered from a 1.0-liter culture was 1 to 3 mg of protein with a purity of 80 to 90%, as estimated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Recovered proteins were printed on array slides, as described above, or subjected to polyacrylamide gel electrophoresis with a 4 to 15% acrylamide gradient and then transferred to nitrocellulose membranes for Western blot analysis (20). For printing on the array, the protein concentrations were 0.03, 0.1, 0.3, and 0.9 mg/ml.
Serum samples. All serum samples were originally collected for other studies for which informed consent had been obtained; patient identifier information had been removed. Serum panel 1 included samples from 48 adults collected between 1990 and 1994, including 24 patients with erythema migrans (early infection), 19 patients with dissemination to other organs or other evidence of persistent infection (later infection), and 5 healthy controls from a region where the organism is not endemic. These samples were provided by the Centers for Disease Control and Prevention, Fort Collins, Colo., which had performed flagellin-based ELISA and IgG and IgM Western blot assays, as described previously (17, 50). Panel 1 also included sera from 13 healthy adult volunteers residing in California.
Serum panel 2 included serum specimens from 20 healthy adult control subjects, 20 adult patients with culture-positive erythema migrans (early infection), and 20 individuals with persistent LB with oligoarticular arthritis (later infection). All 40 patients with Lyme disease met the criteria of the Centers for Disease Control and Prevention for diagnosis of Lyme disease (95). The 20 patients with erythema migrans were a random sample of 93 patients, seen in a study of early LB, from whom B. burgdorferi was cultured from erythema migrans skin lesions (84, 93). Only convalescent samples, which were obtained at the conclusion of 3 to 4 weeks of antibiotic therapy, were tested for these patients, because seropositivity is more frequent during convalescence than during acute infection. The 20 patients with Lyme arthritis were seen in 2006 and early 2007 in a study of susceptibility to Lyme arthritis (82). For 10 of the 20 patients with Lyme arthritis there was resolution of arthritis with antibiotic therapy (antibiotic-responsive arthritis), and for 10 there was not resolution (antibiotic-refractory arthritis). The samples were obtained when the patients had active arthritis. All serum samples had been kept frozen at −80° C. until use.
Sera from 10 white-footed mice (P. leucopus), which had been captured and then released after blood samples had been obtained at a field site in Connecticut under an approved animal use protocol, were seropositive as determined by a whole-cell ELISA and a Western blot assay, as described previously (18). These sera were compared with sera from four adult laboratory-reared P. leucopus mice that were obtained from the Peromyscus Stock Center, University of South Carolina, and were seronegative as determined by the same assays.
Mouse immunization. Female, 4-week-old BALB/c mice (Jackson) were inoculated intraperitoneally with 10 μg purified protein in phosphate-buffered saline (PBS) or PBS alone emulsified with Freund's complete adjuvant and were boosted twice at 2-week intervals with the antigen solution or PBS alone in incomplete Freund's adjuvant. Plasma samples were collected before each immunization and after the final boost using the Microvette CB 300 system for capillary blood collection (Sarstedt).
Antibody reactions and assays. For experiments with arrays, human sera were diluted 1:200 in protein array blocking buffer (Whatman) that was supplemented with a lysate of E. coli at a final protein concentration of 5 mg/ml and then were incubated at room temperature for 30 min with constant mixing (29). The arrays were rehydrated in blocking buffer for 30 min, incubated with the pretreated sera for 12 h at 4° C. with constant agitation, washed in 10 mM Tris (pH 8.0)-150 mM NaCl containing 0.05% Tween 20 buffer, and then incubated with biotin-conjugated goat anti-human IgG (Fc-fragment-specific) serum (Jackson ImmunoResearch) that was diluted 1:200 in blocking buffer. After the array slides were washed in 10 mM Tris (pH 8.0)-150 mM NaCl, bound antibodies were detected with streptavidin conjugated with the dye PBXL-3 (Martek). The washed and air-dried slides were scanned with a Perkin Elmer ScanArray Express HT apparatus at a wavelength of 670 nm and with an output of RGB format TIFF files that were quantitated using ProScanArray Express software (PerkinElmer) with correction for spot-specific background. When P. leucopus serum was used, it was diluted 1:200 in protein blocking buffer, and alkaline phosphatase-labeled goat anti-P. leucopus IgG antiserum (Kirkegaard and Perry) was used as the secondary antibody. Bound antibodies were detected using one-step nitroblue tetrazolium/5-bromo-4-chloro-3-indolylphosphate (NBT/BCIP) (Pierce). Arrays were scanned at 2,400 dpi (Hewlett-Packard ScanJet 8200 scanner), and after images were converted to gray scale format and inverted, they were quantitated as described above. Western blot analyses of whole lysates of B. burgdorferi using 10 μg protein per lane or 250 ng purified protein per lane were carried out as described previously (10).
Nitrocellulose membranes were incubated with human or mouse serum at a dilution of 1:250, and bound antibodies were detected by incubation with alkaline phosphatase-labeled goat anti-human IgG antiserum or anti-mouse IgG antiserum (Jackson ImmunoResearch) at a dilution of 1:1,000. The murine monoclonal IgG antibodies to OspA (BBA15) and FlaB (BB0147) used were H5332 and H9724, respectively (9, 10).
Data analysis. The raw values from array scans were the mean average intensities of all the pixels in a pair of printed spots for each Orf or negative control; these raw values were then log transformed. A preliminary analysis showed that there was no difference in interpretation whether or not the mean value for DNA controls was subtracted from each raw value before log transformation, and, consequently, this additional step was not included.
The following analyses were carried out. (i) The mean and standard deviation (SD) for each Orf with all control serum samples in each panel were determined. For each Orf and for each serum sample, the number of SDs above or below the mean for the control sera in the same serum panel for the Orf in question was determined. For each sample all the Orfs that had array spots with values that were 2 or 3 SDs above the mean for the negative controls in the experiment for the given Orf were tabulated and summed. The frequencies of each Orf that appeared in this cumulative list were then determined. (ii) Bayesian microarray expression analysis and discriminatory antigen selection were performed with software adapted from Cyber-T for protein arrays (4, 86, 87). The correction of Hochberg and Benjamini was applied to control for false discoveries under the multiple test conditions (47). (iii) Cluster analyses were performed and graphic displays of array results were generated using the MultiExperiment Viewer v 4.0 software available from The Institute for Genomic Research (78). The Euclidian distance criterion with average linkage was used, and 1,000 bootstrap analyses with replacement iterations were carried out. (iv) Receiver operating characteristic curves were generated for selected sets of Orfs using the packages “e1071” and “ROCK” in the R statistical environment, available at the http://www. followed by “R-project.org” website. (v) Standard asymptotic or exact statistical analyses of continuous data were carried out with the SYSTAT version 11 (SYSTAT Software, Inc.) software, the StatExact version 6 (Cytel Software Corporation) software, or Confidence Interval Analysis version 2.1.2 (2), available at the http://www. followed by “som.soton.ac.uk/cia” website. Unless noted otherwise, significance tests were two sided. For means, differences, and odds ratios (OR), 95% confidence intervals are indicated below.
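Analysis (i) and the false-discovery correction mentioned in (ii) can be illustrated with a short sketch. This is not the software used in the study (Cyber-T, R, and the other packages named above): it is a standard-library illustration in which the function names, the use of the sample standard deviation, the base-10 log transform, and the toy values are all assumptions.

```python
import math
from statistics import mean, stdev


def sd_units(sample_value, control_values):
    """Number of SDs by which a value lies above the control mean for one Orf.

    Uses the sample SD (n - 1 denominator); that choice is an assumption.
    """
    return (sample_value - mean(control_values)) / stdev(control_values)


def reactive_orfs(sample, controls, threshold=2.0):
    """Orfs whose log intensity is at least `threshold` SDs above the controls.

    sample: dict mapping Orf name to a log-transformed intensity.
    controls: dict mapping Orf name to a list of control-serum log intensities.
    """
    return sorted(
        orf for orf, value in sample.items()
        if sd_units(value, controls[orf]) >= threshold
    )


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy raw intensities (invented), log10-transformed before comparison.
controls = {
    "BB0279": [math.log10(v) for v in (10, 100, 1000)],
    "BBA25": [math.log10(v) for v in (10, 100, 1000)],
}
sample = {"BB0279": math.log10(31623), "BBA25": math.log10(316)}
hits = reactive_orfs(sample, controls, threshold=2.0)
```

Tabulating `hits` across all patient sera and counting how often each Orf appears gives the frequency list described in (i); a threshold of 3.0 corresponds to the stricter 3-SD cutoff.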
Proteome array and overall binding of antibodies. The array comprised in vitro products of 1,292 ORFs of strain B31 and an additional ospC allele from another strain for a total of 1,293 B. burgdorferi ORFs. In separate experiments array slides were incubated with samples of serum panels 1 and 2. A serum specimen from a patient with early LB in panel 1 was replicated in the same experiment. The Pearson and Spearman correlation coefficients were 0.94 and 0.87, respectively, for paired log-transformed raw intensity values; the mean log10 difference between the replicates of this serum was only 0.07 (95% confidence interval, 0.06 to 0.08) for the total set of 1,293 ORFs. For the replicates of the 26 Orfs on the array, the Pearson and Spearman correlation coefficients were 0.84 and 0.84, respectively, for panel 1 and 0.81 and 0.83, respectively, for panel 2. The corresponding mean log10 differences were 0.07 (95% confidence interval, 0.00 to 0.15) and 0.04 (95% confidence interval, −0.03 to 0.11) for the two panels.
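The replicate comparison described above can be sketched as follows. This is a minimal illustration with made-up log10 values; the function names are hypothetical, and the Spearman coefficient is computed as the Pearson coefficient of average ranks.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up log10 intensities for two replicates of one serum (not study data).
rep1 = [2.0, 2.5, 3.1, 3.9, 2.2]
rep2 = [2.1, 2.4, 3.0, 4.0, 2.3]
mean_diff = sum(b - a for a, b in zip(rep1, rep2)) / len(rep1)
print(pearson(rep1, rep2), spearman(rep1, rep2), mean_diff)
```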
The analysis described above first averaged ORF values across clinical groups and then examined correlations for each of the 1,293 ORFs. If, instead, average ORF intensities by serum sample for all 1,293 ORFs were calculated first and then the averages by serum were compared by clinical stage, heterogeneity of the results for individual sera was observed as overlapping distributions for controls and patient groups. For the first serum panel the mean differences in raw intensity values between patient sera and controls were 42 (95% confidence interval, −135 to 218) for early infection and 157 (95% confidence interval, −3 to 318) for later infection; the corresponding differences from controls for panel 2 were −36 (95% confidence interval, −220 to 149) for early infection and 8 (95% confidence interval, −185 to 202) for patients with Lyme arthritis. Thus, the total amount of antibody binding to the array, which is analogous to a whole-cell assay, could not be used to assign sera to infection and control bins with confidence. More promising for this purpose was the smaller number of ORFs populating the long tail in the distributions and the “outliers” in the plots shown in
By using pairwise comparisons of all sera for individual ORFs, it was estimated that the upper limit of the number of strain B31 Orfs that were informative as immunogens was 200 of the 1,292 Orfs on the array (see below). To identify these immunogens, two complementary approaches were used. The first approach was based on an often-used criterion for setting a “cutoff” between interpretations of positive and negative for serological assays, namely, values that were 3 SDs above the average for control sera in the same run. Before using this approach, it was first determined whether variances were disproportionately high when the mean values for control specimens increased. This would have been reflected in a significant increase in the coefficient of variation (CV) (i.e., SD divided by the mean) as the mean increased. For the 1,292 B31 Orfs and the panel 1 control sera, the mean CV was 0.115 (95% confidence interval, 0.113 to 0.117), and there was little correlation (R=0.06) between the mean and the CV over the range of means. Inasmuch as the SDs for the control sera were the same in both experiments (namely, 0.85 for the 18 control sera in panel 1 and 0.84 for the 20 control sera in panel 2), normalizing the data in units of SDs allowed the data sets to be combined for the two panels. Using a simulation procedure (see below), we found that for the combined set of the later LB panel 1 and 2 sera, ORF values that exceeded the cutoff at a frequency of 6 or more times, out of a possible 39, were unlikely by chance at a one-tailed level of confidence of 0.025.
The log-normalized data for later LB sera and controls of both panels were also examined using a Bayesian statistical procedure (4), using software originally developed for DNA microarray analysis and then modified for antibody binding to proteome arrays (86, 87). For each ORF, an analysis of variance (ANOVA) comparing control sera with later LB sera was performed. In this analysis, the empirical sample variances are replaced with Bayes-regularized variance estimates. The Bayes-regularized variance is obtained by incorporating both the empirical sample variance and the variance of proteins with similar intensity levels (3, 4). This analysis produced F scores and P values that were used to rank the ORFs. The log10 of the F score correlated with the frequencies based on a cutoff of 3 SDs (R²=0.73 as determined by linear regression).
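The idea of a Bayes-regularized variance can be sketched as a weighted combination of an Orf's own sample variance and a background variance taken from Orfs of similar intensity. The sketch below assumes the commonly cited Cyber-T form, (ν0·σ0² + (n−1)·s²)/(ν0 + n − 2); the weight nu0, the window size, and all numbers are hypothetical, and the software actually used in this Example may differ in detail.

```python
def background_variance(variances_by_intensity, index, half_window=2):
    """Mean sample variance of the Orfs neighbouring `index` in a list sorted by mean intensity."""
    lo = max(0, index - half_window)
    hi = min(len(variances_by_intensity), index + half_window + 1)
    window = variances_by_intensity[lo:hi]
    return sum(window) / len(window)

def regularized_variance(sample_var, n, background_var, nu0=10):
    """Shrink the sample variance toward the background variance.

    nu0 acts as the number of pseudo-observations given to the background
    estimate (an assumed value here); n is the number of sera in the group.
    """
    return (nu0 * background_var + (n - 1) * sample_var) / (nu0 + n - 2)

# Made-up variances for Orfs sorted by mean intensity (not study data).
variances = [0.10, 0.10, 0.50, 0.10, 0.10]
bg = background_variance(variances, index=2, half_window=1)
print(bg, regularized_variance(0.5, n=6, background_var=0.1))
```

With few sera per group, the background term stabilizes variance estimates that would otherwise be dominated by sampling noise.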
Identification of immunogens. The most informative of the 200 immunogenic Orfs were then identified. Table 1 lists in alphabetical and numerical order the 84 Orfs whose values were 3 SDs above the control values 6 or more times out of a possible 39, had F scores of >11, and had corrected P values of <0.001. The Orfs with the highest frequencies of values that were 3 SDs above the control values were BBG33, BB0279, BBL27, and BBA25. An additional 19 Orfs, for a total of 103 (8.0%) out of 1,292 Orfs, had P values of <5×10-4 as determined by the Bayes-regularized analysis. This additional group included VlsE (BBF33) and the chaperonin GroEL (BB0649). The mean numbers of amino acids were 293 (95% confidence interval, 266 to 320) for the 103 Orfs on the list, compared to 260 (95% confidence interval, 248 to 272) for the other 1,190 Orfs (P=0.11, t test). Thus, immunogenicity was not associated with length of the protein. Moreover, there was no difference between the two groups of Orfs in terms of the amount of protein on the array, as measured by the raw values for antibody binding to the hemagglutinin moiety of the recombinant proteins; the values were 2,816 (95% confidence interval, 2,369 to 3,349) for the 103 Orfs and 2,618 (95% confidence interval, 2,493 to 2,750) for the other Orfs (P=0.42).
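The corrected P values referred to above come from a false-discovery-rate correction. A minimal sketch of the Benjamini-Hochberg step-up adjustment is given here for illustration; the function name and the input values are hypothetical, and this is not the code used in the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up raw P values for four hypothetical Orfs.
adj = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
print(adj)
```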
Table 1 (partial): BB0181, BB0260, BB0286, BB0328, BB0337, BB0649, BBA19, BBA48, BBC03, BBE09, BBF33, BBK23, BBK52, BBL39, BBM36, BBQ13, BBQ35, BBS30
(a) Bold type indicates an ORF that had a P value of <0.005 but whose frequency for later Lyme disease sera was <6.
(b) The numbers are the numbers of serum samples whose values were ≥3 SDs above the mean of the controls for the panel. n is the number of individuals in the group for combined panel 1 and 2 sera.
(c) The F score is the Bayes-regularized variance (see the text).
(d) The P value is the corrected P value (0, P < 1.0E−14).
(e) The numbers are the numbers of LB patient serum samples whose values were ≥3 SDs or ≥2 SDs above the mean of the controls for panels 1 and 2. n is the number of individuals in the group for combined panel 1 and 2 sera.
(f) The numbers are the numbers of P. leucopus sera (out of 10) whose values are ≥3 SDs above the mean for four control P. leucopus mice.
(g) +, protein predicted to be a lipoprotein; −, protein not predicted to be a lipoprotein.
(h) Alternative protein designations are given in parentheses.
Several proteins that were known antigens and valuable for serodiagnosis were on the list. These proteins included FlaB (BB0147) (9, 45), the P66 outer membrane protein (BB0603) (5, 16), OspA and OspB (BBA15 and BBA16) (48), decorin-binding protein B (BBA25) (37, 46), OspC (BBB19) (68, 96), fibronectin-binding protein (BBK32) (71), and VlsE (BBF33) (54, 56). The other reactive Orfs that were previously reported to elicit antibodies during infections of humans or experimental animals were as follows: LA7 (BB0365) (53, 94), the chaperonins DnaK (BB0518) and GroEL (BB0649) (58), FlgE (BB0283) (51), some Erp proteins (59, 85), oligopeptide ABC transporters (OppA; BB0328, BB0329, BBA34, and BBB16) (25, 28, 65), “S2 antigen” (BBA04) (36), the paralogous BBA64 and BBA66 proteins (65), RevA proteins (BBC10 and BBM27) (41, 65), EppA/BapA (BBC06) (63), Mlp proteins (BBN28, BBQ35, and BBS30) (70), and some Bdr proteins (99).
There were several Orfs that previously either were not recognized as immunogens during infection or had received little attention. Notable among this group were the following: (i) the paralogous BBK07 and BBK12 lipoproteins; (ii) BBK19 and BBK53, two other lipoproteins encoded by plasmid lp36; (iii) several more flagellar apparatus proteins, including FliL (BB0279), FlaA (BB0668), and FlgG (BB0774); (iv) additional paralogous family (PF) 44 proteins (BBE09, BBK53, and BBQ04); (v) BB0260, BB0323, BB0543, and BB0751, hypothetical proteins encoded on the chromosome; (vi) BBA03, BBA07, BBA36, and BBA57, hypothetical proteins or lipoproteins uniquely encoded by lp54; and (vii) BBG18 and BBH06, unique hypothetical proteins encoded by other plasmids. On the list of new immunogens there were only a few chromosome-encoded Orfs that were homologous to proteins having established functions in other bacteria, such as the phosphate ABC transporter PstS (BB0215), pyruvate kinase (BB0348), a carboxy-terminal protease (BB0359), and a methyl-accepting chemotaxis protein (BB0681).
Whereas plasmid-encoded Orfs accounted for 536 (41%) of the 1,292 B31 Orfs on the array, 70 (69%) of the 102 immunogenic Orfs of strain B31 are plasmid encoded (OR, 3.1 (95% confidence interval, 2.0 to 4.9); exact P<10−6). Fifty-nine (58%) Orfs, all but two of which were plasmid encoded, belonged to 1 of 26 PFs. Of a possible 174 Orfs that belong to 1 of these 26 PFs, 114 (66%) were included as amplicons on the array. The greatest representation was that of PF 80, which comprises the Bdr proteins; 12 (92%) of a possible 13 Orfs were on the list of 83 Orfs. These Orfs included high-ranking BBG33 and BBL27 proteins. Other PFs with three or more representatives on the list were the PFs containing the Erp proteins (PFs 162 to 164), oligopeptide ABC transporters (PF 37), Mlp proteins (PF 113), the “S2 antigen” and related proteins (PF 44), and a set of hypothetical proteins with unknown functions (PF 52).
For tabulation of the plasmid locations of the ORFs shown in Table 1, pseudogenes and ORFs that were less than 300 nucleotides long were excluded (21). The sizes of linear plasmids lp38 (38,829 nucleotides) and lp36 (36,849 nucleotides) are similar. Only 1 of lp38's 17 ORFs, BBJ24, was among the ORFs encoding high-ranking antigens, but 8 of the 19 lp36 ORFs were (OR, 11.6 (95% confidence interval, 1.2 to 548); P=0.03). The presence of plasmid lp36 has been associated in one study with infectivity or virulence in a mammalian host (49), as has been the presence of lp25 in another study (72), but only BBE09 of the 10 ORFs of lp25 was among the ORFs encoding immunogens.
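The lp36 versus lp38 comparison can be checked with a 2×2 table. The sketch below computes the sample odds ratio, which reproduces the reported value of 11.6, and a one-sided exact (hypergeometric) tail probability. The source does not state how its P value or exact confidence interval was obtained; doubling the one-sided tail is only one common convention and is used here as an assumption. The function names are hypothetical.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Sample odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def fisher_upper_tail(a, b, c, d):
    """One-sided exact P: probability of `a` or more in the top-left cell with margins fixed."""
    row1, col1, total = a + b, a + c, a + b + c + d
    k_max = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, k_max + 1)
    )
    return numerator / comb(total, col1)

# lp36: 8 of 19 ORFs among the high-ranking antigens; lp38: 1 of 17.
a, b, c, d = 8, 11, 1, 16
or_lp36 = odds_ratio(a, b, c, d)
p_one_sided = fisher_upper_tail(a, b, c, d)
print(round(or_lp36, 1), round(2 * p_one_sided, 2))
```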
Forty-eight (48%) of the 102 immunogens of strain B31 are lipoproteins as determined by prediction or empirical documentation. Of the 756 chromosome-encoded Orfs included in the array, only 32 (4%) are lipoproteins, but, as shown in Table 1, 7 (21%) of the 33 chromosome-encoded Orfs among the immunogens are lipoproteins (OR, 6.1 (95% confidence interval, 2.1 to 15.8); P=0.001). Whereas 85 (16%) of the 536 plasmid-encoded Orfs on the array are predicted lipoproteins, 41 (59%) of the 70 plasmid-encoded proteins of strain B31 on the antigen list are predicted lipoproteins (OR, 7.6 (95% confidence interval, 4.3 to 13.4); P<10−12). In addition to five documented outer membrane proteins (OspA, OspB, OspC, VlsE, and P66), the following three hypothetical proteins among the immunogens were predicted to localize to the outer membrane by the PSORT algorithm for double-membrane bacteria (40): BB0260, BB0751, and BB0811.
Stage of infection. In general, sera from early in infection reacted with fewer antigens per serum sample and antigens from a narrower list of antigens. For 20 (83%) of the 24 panel 1 early LB cases there was at least one Orf in Table 1 whose value exceeded the 3-SD cutoff. Of the four cases for which there was not at least one Orf whose value exceeded the 3-SD cutoff, three (75%) were seronegative as determined by ELISA and IgG and IgM Western blotting. Of the 20 cases of early infection for serum panel 2, 17 (85%) had at least one Orf whose value was >3 SDs. For the 37 samples with one or more reactive Orfs, the number of Orfs whose values were above the threshold ranged from 1 to 37, and the median number was five Orfs per sample. For the 84 antigens identified by the first analysis, the values for 69 (82%) were above the threshold for at least one of the early-infection sera (Table 1). In most cases, the following 15 Orfs whose values fell below the cutoff with all early sera were also among the least prevalent Orfs for sera obtained later in disease: BB0408, BB0476, BB0751, BB0805, BB0844, BBA04, BBB09, BBB16, BBC10, BBJ24, BBK13, BBK53, BBN28, BBO40, and BBR12. The Orfs whose values exceeded the cutoff in at least 10 of the 37 samples were, in descending order, BBA25 (DbpB), BBB19 (OspC type A), BBB19 (OspC type K), BBK32 (fibronectin-binding protein), BBK12, BBG33 (BdrT), BBK07, and BB0279 (FliL).
Sera of the 10 patients with refractory Lyme arthritis were compared with sera of the 10 patients with arthritis that responded to antibiotic therapy. As determined by a t test and nonparametric rank test of log-transformed values, there was not a significant difference (P>0.05) between the two groups for any of the 1,293 Orfs, including both OspC proteins.
White-footed mouse antibodies. Sera from 10 P. leucopus mice were examined using the same batch of genome-wide arrays; the mice were captured in an area in which the level of B. burgdorferi infection of mice approached 100% by the end of the transmission season (18). All 10 mice were seropositive as determined by the whole-cell assay and Western blot analysis (18). These sera were compared with sera from four laboratory-reared P. leucopus mice. As described above, the number of SDs above or below the mean of the controls was calculated for each Orf and each mouse serum. Of the 103 Orfs shown in Table 1, only 30 (29%) were not represented at least once among the Orfs with values of ≥3 SDs with P. leucopus sera. The highest frequencies (≥7 of 10 sera) were those of the following Orfs, in alphabetical order: BB0279, BB0286, BB0844, BBA25, BBA36, BBA57, BBA64, BBB19 (OspC types A and K), BBF33, BBG33, BBK07, BBK12, BBK19, BBK32, BBL40, BBN39, BBO39, BBP39, BBQ34, and BBS41. Thirteen Orfs had frequencies of ≥5 among the 10 P. leucopus sera but were not among the high-ranking Orfs with human sera (Table 1). These Orfs included two hypothetical proteins (BB0039 and BB0428), two members of PF 143 (BBP26 and BBS26), and the BBK50 protein, another lp36-encoded protein. But also represented among the high-ranking Orfs with P. leucopus sera were members of PFs, at least one of which was frequently recognized by human antibodies, including two additional PF 113 proteins, MlpH (BBL28) and MlpA (BBP28); another PF 164 protein, ErpK (BBM38); and an additional PF 54 protein, BBA73. Overall, there was considerable overlap in the sets of immunogens for humans and P. leucopus infected with B. burgdorferi.
Second array. To confirm the results described above, we produced a second array with 66 recombinant proteins selected from the 103 Orfs shown in Table 1. The second array contained three additional proteins that were not cloned for the first array. Two of these, BB0383 (BmpA or P39 protein) and BB0744 (P83/100 protein), are among the 10 signal antigens for a commonly used criterion for Western blot interpretation (33). The third additional ORF was BBA24 or decorin-binding protein A (DbpA). The smaller arrays were incubated with 12 later LB sera and three control sera from panel 1.
Purified proteins. Five Orfs were selected for further investigation as purified recombinant proteins: BB0279 (FliL), BB0283 (FlgE), BBA25 (DbpB), BBG33 (BdrT), and BBK12. Western blot analyses were carried out with sera from 17 patients with later LB and five panel 1 controls (
While detection of antibody to an Orf was evidence of expression of the predicted polypeptide, this evidence was indirect. One of the purified proteins, BBK12, was used to immunize mice and thereby provide a reagent for more direct documentation of expression. It is noted that such immunization could be performed with any of the proteins found to be immunogenic (e.g., in Table 1 or Table 3) in order to generate an antibody reagent for diagnostic or other applications. This Orf was chosen because it and the product of a paralogous gene, BBK07, had not been previously reported to be immunogenic. In fact, there was little previous comment on either of these proteins beyond their annotation as hypothetical lipoproteins with unknown functions.
How many antigens are sufficient? This Example permitted an estimation of the minimum number of antigens that would be needed to achieve a highly specific B. burgdorferi diagnostic assay. For this, the discriminatory power of different sets of ORFs was studied using receiver operating characteristic (ROC) curves, where the false-positive rate (1 − specificity) is the x axis and the true-positive rate (sensitivity) is the y axis for different thresholds of the underlying classifier. The area under the curve (AUC) summarizes the results. An AUC of 1.0 indicates a perfect classifier, while an AUC of 0.51 (95% confidence interval, 0.38 to 0.64) is the expected value for a classifier that works by chance for the data set, as inferred by the method of Truchon and Bayly (89). The log-transformed data for controls and later LB sera from both panels were used for this analysis. First, ROC curves were generated for single antigens to assess the ability to separate the control and disease. The Orf number is the rank based on the Bayes-regularized ANOVA F score (see Table 5).
The top Orfs discriminate very well. The first nine Orfs all have an AUC of >0.95, and further down the rank, the ability diminishes. The 25th immunogen has an AUC of 0.90, the 50th immunogen has an AUC of 0.85, the 100th immunogen has an AUC of 0.74, and the 165th Orf has an AUC of 0.65, which still exceeds the upper limit of the 95% confidence interval for random expectations for the AUC.
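For a single antigen, the AUC can be computed directly from its rank interpretation: it is the probability that a randomly chosen patient serum has a higher value than a randomly chosen control serum, with ties counted as one half. The following minimal sketch uses made-up values and a hypothetical function name; the study itself used R packages for this step.

```python
def auc(positives, negatives):
    """Area under the ROC curve for one antigen, via pairwise comparison of scores."""
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in positives for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Made-up log-transformed intensities (not study data).
later_lb = [3, 4, 5]
controls = [1, 2, 3]
print(auc(later_lb, controls))
```

An antigen whose patient and control values do not overlap gives an AUC of 1.0, and an uninformative antigen gives a value near 0.5.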
To extend the analysis to combinations of antigens, kernel methods and support vector machines were used, as described by Vapnik (92), to build linear and nonlinear classifiers. Different kernels, including linear, polynomial, and radial basis function, were evaluated. Only the radial basis function kernel showed an increase in the AUC when noise was added, and accordingly, this kernel was chosen for subsequent simulations in which noise was introduced. For each data set, the support vector machines were tuned using a wide parameter sweep to achieve the best gamma and cost values. Results were validated with 10 runs of threefold cross-validation. As input to the classifier, the highest-ranking 2, 5, 25, and 45 Orfs were used on the basis of either Bayes-regularized ANOVA F scores or frequencies of later LB sera exceeding a 3-SD cutoff. The results of two ranking schemes were similar, and only the frequency ranking results are shown in
For the present data set, there were negligible differences in the ROC curves obtained using 2, 5, 25, or 45 antigens. The mean AUC values over the 10 validation runs were >0.98 for two antigens and a perfect 1.0 for five or more antigens. The unsurpassable performance in this experiment with relatively few antigens may be attributed to the high discrimination provided by the first several antigens on the list by themselves. In a realistic diagnostic setting with sera coming from various sources and backgrounds and with interoperator variances, one might expect some addition of noise in the data. To further examine how combinations of antigens increase the discriminatory power, two different noise models and their effects on the classifiers were explored. The noise model involves the addition of uniform Gaussian noise. Each point (u) in the data set has some noise added such that u′=u+N(μ=0, σ²=s), where s is constant across the whole data set. Noise levels are generated by scaling s. In general, using more antigens in the classifier increases resistance of the simulated assay to noise. All of the classifiers discriminate very well with low noise levels. For the two-antigen classifier, the AUC dropped to the value expected by chance by the time noise was at a scale of 75. The five-antigen classifier value dropped to 0.6 with a noise level of 150. The 25- and 45-antigen classifiers still performed relatively well, with mean AUC values of 0.74 and 0.71, respectively. Hence, based on the criteria of high predictive value and robustness in the face of increasing noise, 25 antigens were as informative as 45 antigens.
The genome-wide protein array for B. burgdorferi allowed comparison of far more proteins than could be compared previously with one-dimensional Western blots (8, 24, 33). While comparable numbers of proteins for analysis might theoretically be obtained with two-dimensional electrophoresis (66), scarce immunogens in the lysates would be overlooked. Moreover, unless the microbe's cells were taken directly from an infected animal, informative antigens that were expressed only in vivo would not be included from samples subjected to electrophoresis. This Example of natural infections of humans and white-footed mice with B. burgdorferi followed genome-wide array analyses of antibody responses to poxvirus infections in humans immunized with a smallpox vaccine and to F. tularensis infections in experimental animals (29, 35, 87) and ELISA format studies of T. pallidum ORFs (11, 61). The major emphasis of the previous studies was identification of immunogens after immunization with whole microbes or during infection. That same goal was pursued in this Example in the study of natural infections that occurred in two very different ecological settings: (i) patients with different stages of Lyme disease, including the arthritis in later disease, and (ii) white-footed mice, which are a major reservoir host of B. burgdorferi in the United States and in which infection is nearly universal in enzootic areas. As discussed below, the goal of discovery of new antigens was met: many new immunogens were identified among the Orfs of B. burgdorferi.
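The effect of the additive noise model u′ = u + N(0, σ² = s) on discrimination can be illustrated with a small simulation. The sketch below is a simplification: it scores a single made-up antigen with a rank-based AUC rather than the support vector machine classifiers used in this Example, and the values, noise level, number of runs, and function names are all hypothetical.

```python
import random
from math import sqrt

def auc(positives, negatives):
    """Rank-based AUC; ties count as one half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def add_noise(values, s, rng):
    """Apply u' = u + N(0, sigma^2 = s), i.e. Gaussian noise with variance s."""
    return [u + rng.gauss(0.0, sqrt(s)) for u in values]

def mean_auc_under_noise(pos, neg, s, runs=200, seed=0):
    """Average AUC over repeated noisy copies of the same data set."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        total += auc(add_noise(pos, s, rng), add_noise(neg, s, rng))
    return total / runs

# Made-up values for one antigen: patients and controls are well separated.
pos = [3.0, 3.2, 2.9, 3.5, 3.1, 3.3]
neg = [2.0, 2.1, 1.9, 2.2, 2.05, 1.95]
clean = mean_auc_under_noise(pos, neg, s=0.0)
noisy = mean_auc_under_noise(pos, neg, s=25.0)
print(clean, noisy)
```

Scaling s upward moves the mean AUC from 1.0 toward the chance value, which is the behavior against which classifiers built on larger antigen sets were compared.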
Of equal interest was a second question: how many of the predicted proteins of this pathogen elicit an antibody response during natural infection? For this, the concern was the set of proteins that were not demonstrably immunogenic. Only by including most of a genome's ORFs in the experiment could one address this question, which as a general principle is relevant to many other infectious diseases. Important for hypothesis testing for this second goal was the likelihood of false negatives or type II errors. If minimizing false positives or type I errors (i.e., inaccurately identifying an Orf as an immunogen) was the experimental design challenge for the first goal, then minimizing false negatives (i.e., overlooking Orfs that were truly immunogenic) was the challenge for the second goal. In the present study, type II errors could happen for several reasons.
Indisputably, failure to amplify, clone, and then express a given ORF would lead to a miss of an Orf that was actually immunogenic. Of the ˜20% of the Orfs that were absent from the array, undoubtedly some elicit antibody responses during infection. But in many of these cases, the missing Orf was a member of a PF, at least one member of which was represented in the array. Other ORFs were not included because they had characteristics of pseudogenes. Taking these considerations into account, it was estimated that at least 90% of the nonredundant ORFs that were true genes were included in the array analysis. When called for, some missing ORFs were successfully amplified in reattempts using either the original primers or modifications of the primers. In these instances, addition of the antigens missing from the first array to a second array did not materially change the results. This suggests that returns diminish as further efforts to fully constitute the array consume greater resources.
Another basis for type II errors would be posttranslational modifications that are important for antibody recognition that occur in B. burgdorferi but not in E. coli. While one cannot rule out a limitation to the study for this reason, there is no evidence or only scant evidence that glycosylation or a similar posttranslational modification affects antigenicity in Borrelia spp. The most prevalent protein modification in Borrelia spp. appears to be the addition of a lipid moiety to the N terminus of the processed proteins in a fashion typical of many types of bacteria. While E. coli cells are capable of carrying out this lipidation function for recombinant Borrelia proteins, this activity did not occur in the acellular transcription-translation reactions used here. This indicates that the significantly greater representation of lipoproteins among immunogens than that expected based on a lipoprotein's size among all Orfs was not attributable to antibodies to the lipid moieties themselves. Instead, the comparatively greater immunogenicity of lipoproteins may be a consequence of the mitogenicity and adjuvant-like qualities of bacterial lipopeptides. For the 1,292 B31 ORFs that were successfully amplified, cloned, and expressed, some of the products may have been overlooked as immunogens because their epitopes are conformational and proper folding was not achieved in the in vitro reaction or subsequently when the polypeptide was printed. This possibility cannot be ruled out. But the correct calls for the well-established antigens included in the array, such as OspC, FlaB, P66, P83/100, BmpA, fibronectin-binding protein, and VlsE, among others, as “immunogens” indicate that there were few instances of type II errors on the basis of loss of conformational epitopes or some other artifact of the procedures.
Another limitation of the study, at least in the case of the human sera, was the restriction of secondary antibodies to antibodies that were specific for IgG. By failing to account for IgM antibody binding, the total number of instances in which the Orfs were recognized by antibodies during early infection may have been underestimated. However, it is not suspected that this effect was great if it occurred at all. There was no instance of an Orf that was recognized by antibodies in sera from early infection and not by antibodies obtained later in the disease. The rationale for limiting antibody detection to IgG was the generally poorer specificity of IgM-based assays for Lyme disease (34, 91). The importance of eventually evaluating antigens for their predictive value with IgM as well as IgG antibodies is recognized, but the focus here was on identification of immunogens with the greatest informative value (that is, with high specificity as well as sensitivity). Notwithstanding the actual and theoretical limitations of the study, we concluded that the array results were not confined to identification of new immunogens but could also be used to gauge the proportion of proteins that are not immunogens. As far as it is known, this perspective on immune responses during natural infection is unique among studies of proteomes of bacteria, fungi, or parasites. By taking this perspective, it was estimated that the number of Orfs that elicited antibodies in at least some individuals that were infected was about 200, or ˜15% of the 1,292 Orfs subjected to analysis with two panels of sera representing different stages of infection. Three types of data supported this conclusion: the magnitude of sign differences between pairs of LB patient sera with control sera (see Table 2), the number of Orfs with corrected, regularized P values <0.01, and the number of Orfs with areas under the ROC curve that exceeded the upper confidence limit for random expectations (see Table 5).
Of this larger set of immunogens, ˜100 were broadly enough reactive across several LB serum samples that they could be used to distinguish groups of infected individuals from groups of controls. This interpretation also seemed to hold true for white-footed mice, which generally recognized the same subset of proteins as humans. The absolute number of distinct (i.e., non-cross-reactive) antigens is probably less than the first accounting suggested, because of the heavy representation of proteins in PFs on the immunogen list (Table 1). The several Bdr proteins on the list could probably be replaced in an array by one or two Bdr proteins with no loss of sensitivity.
(a) The sera were panel 1 sera from controls and from patients with early and later LB. CI, confidence interval.
(b) Determined by the exact Wilcoxon signed-rank test.
The question of the minimal set of antigens necessary for discrimination between sera from patients and sera from controls was also addressed by the ROC curve analysis (
To sum up, it was determined that proteins that detectably elicit antibodies during natural infection constitute about 15% of the polypeptides that might be expressed. In certain embodiments, incorporation of 2% of the total Orfs in an assay appears to be sufficient to provide very high levels of sensitivity and specificity. The attention now turns to what the high-value immunogens are. In the course of this study, several protein antigens of B. burgdorferi were discovered that have promise for serodiagnosis of LB but were previously unappreciated as immunogens during infection. These previously unknown antigens appear to be as informative as other proteins, such as FlaB, OspC, P66, BmpA, and VlsE, that have established value for LB serodiagnosis. In addition, in this study we also rediscovered several other proteins that may have been observed in a limited number of studies to be immunogenic in either natural or experimental infections but whose value had not been confirmed or which had not been further developed. Among these are the Bdr proteins.
The list of immunogenic proteins identified by proteome array analysis was compared with lists of genes that were more highly expressed under various conditions simulating infection in the natural hosts and were reported by Revel et al. (74), Ojaimi et al. (67), Brooks et al. (13), and Tokarz et al. (88). The concurrence between the proteome list and the four DNA array lists was greatest for the study of Revel et al., and accordingly, this study was the study used for comparison. Revel et al. employed three experimental conditions: (i) 23° C. and pH 7.4 in broth medium, which represented the environment in the unfed tick; (ii) 37° C. and pH 6.8 in broth medium, which represented the environment in ticks as they are feeding on a host and transmitting B. burgdorferi; and (iii) a dialysis chamber in the peritoneum of rats. Of the 79 Orfs that showed a 2-fold increase in mRNA under fed-tick conditions in comparison to unfed-tick conditions, the following 23 (29%) were among the high-ranking immunogens: BB0323, BB0329, BB0365, BB0668, BB0681, BB0844, BBA03, BBA07, BBA25, BBA34, BBA36, BBA66, BBB19, BBI42, BBK07, BBK13, BBK32, BBK53, BBL40, BBM27, BBO40, BBP39, and BBQ03. Four of these Orfs are encoded by the lp36 plasmid. Among the 19 Orfs whose expression was found by Revel et al. to significantly increase in dialysis chambers in comparison to conditions mimicking unfed ticks, 5 (26%) were on the antigen list. The only three Orfs whose expression decreased under conditions associated with mammalian infection were BBA15 (OspA) and BBA16 (OspB), whose expression was known to decrease in the fed ticks and during early infections in mammals (32, 80, 81), and BB0385 (BmpD). Thus, there was an association between the upregulation of genes in the fed ticks and mammals and the immunogenicity of the gene products in infected humans.
Western blots of two-dimensional electrophoresis gels provide greater resolution than one-dimensional gels and allow detection of less abundant immunogens in lysates. Nowalk et al. performed such a proteomic analysis with the same samples that constituted serum panel 1 (66). Fifteen of the 21 proteins identified by Nowalk et al. as immunogens were also high-ranking Orfs in the present study. These proteins include four Erp proteins (BBL39, BBL40, BBN38, and BBP39), three oligopeptide ABC transporters belonging to PF 37 (BB0328, BB0329, and BBB16), two PF 54 proteins (BBA64 and BBA66), a RevA protein (BBM27), and the unique hypothetical protein BBA03, as well as the established antigens BB0147 (FlaB), BB0365 (LA7), BB0603 (P66), and BBA15 (OspA).
The large number of proteins newly identified as immunogenic precludes discussion of each of them in depth here. Instead, we limit our remarks to the Bdr proteins (PF 80), flagellar apparatus proteins, and BBK07 and BBK12, the two members of PF 59. Of all the PF proteins, the Bdr proteins were the most prevalent among the Orfs shown in Table 1. It was previously reported that LB patients, but not controls, had antibodies to some of the Bdr proteins (99), but in that study we did not include BdrT (BBG33), the highest-ranked Bdr protein here. While proteins in PFs tend in general to be more immunogenic than other non-PF Orfs, if only because of their multiple versions in a cell, the Bdr proteins may be doubly immunogenic because they have intramolecular repeats as well (98). The number of copies of the peptide TKIDWVEKNLQKD or a variation of this peptide in a Bdr sequence determines the size of the protein. The BBG33 protein, which is 266 amino acids long, is the largest Bdr protein encoded by the B31 genome. Most of the other Bdr proteins are less than 200 residues long. If the internal repeats are immunodominant epitopes, then BdrT would display more of these repeats for the binding of antibodies than other Bdr proteins and, consequently, generate higher spot intensities. The coefficient for BBL27 regressed on BBG33 is 0.86, and the y intercept is −0.29 (
This study revealed that several flagellar apparatus proteins besides FlaB flagellin (BB0147), the FlgE hook protein (BB0283), and the FlaA protein (BB0668) (42, 51, 69, 77) elicit antibody responses during infection. Brinkmann et al. found that FlgE of T. pallidum was frequently bound by antibodies in sera from patients with syphilis (11). FliL (BB0279) stood out among this larger group of flagellar antigens because of the frequency with which it was recognized by both human and white-footed mouse serum. Indeed, the field mice had antibody to FliL more frequently than they had antibody to FlaB, the long-standing flagellar antigen of choice for diagnosis. FliL has 178 residues and is the flagellar basal body-associated protein, which as an inner membrane protein interacts with the cytoplasmic ring of the basal body of the flagellar apparatus. Among all organisms, the most similar proteins outside the genus Borrelia are the FliL proteins of T. pallidum, Treponema denticola, and Leptospira interrogans, but the sequence identities with the proteins of these other spirochete species are less than 35%. In comparison, the FlaB protein of B. burgdorferi is 40% identical to the homologous flagellin proteins of Treponema spp. As a component of an immunoassay, FliL may show less antigenic cross-reactivity with the homologous proteins of other bacteria than has been the case with FlaB (59, 73).
Of all the newly identified Orfs, the most attention was paid to BBK07 and BBK12. As determined by stringent criteria, these are predicted lipoproteins, and although the amino acid sequences are 88% identical, the ORFs are located several ORFs apart in the left arm of the lp36 plasmid. Comparison of the BBK07 and BBK12 gene sequences of strain B31 with the sequences of two other strains, 297 and N40, revealed >98% sequence identity between the strains for these sequences, an indication that a single example of each could be used to detect antibodies to other strains of B. burgdorferi.
Although BBK12 and, by inference, BBK07 are expressed by cells cultivated in the laboratory (
Estimation of the number of immunogenic Orfs. The size of the set of proteins that were immunogenic in B. burgdorferi infections was assessed by examining the relative amounts of binding for antibodies in panel 1 serum specimens and each of the 1,292 strain B31 Orfs. To do this, the sign for each possible pair of sera in the panel was determined. The member of a pair that had the higher intensity value for a given Orf was assigned a value of “1,” and the pair member with lower reactivity was assigned a value of “0.” As a hypothetical example, if serum a had an intensity value of 3,246 for Orf x and serum b had a value of 1,711 for Orf x, then serum a was assigned a value of “1” and serum b was assigned a value of “0” for the pairwise comparison in the matrix.
Under the null hypothesis, a given serum sample would have the higher value of the pair in one-half of the comparisons, or 646 comparisons in this case. This was observed when controls were compared to controls, early-infection sera were compared to early-infection sera, and later-infection sera were compared to later-infection sera; the observed mean values were 646 (95% confidence interval, 482 to 810), 646 (95% confidence interval, 524 to 768), and 646 (95% confidence interval, 505 to 787), respectively. In contrast to these results for within-group pairs were the results for between-group pairs, e.g., a control serum and an LB serum. Table 2 summarizes the intergroup comparisons. Excess binding in the range from 100 to 200 Orfs was also observed with later-infection sera compared to early-infection sera. From these results, it was estimated that the upper limit for immunogenic Orfs during human infection was 200, or 15% of the 1,292 strain B31 Orfs on the array.
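The pairwise sign comparison described above can be illustrated with a minimal sketch. The function names, the dictionary layout, and the toy intensity values (other than the hypothetical 3,246 versus 1,711 example from the text) are assumptions made for illustration only, and the handling of ties is likewise an assumption; the sketch is not the analysis code used in the study.

```python
from itertools import combinations


def pairwise_wins(intensities):
    """Count, for every pair of sera, how many Orfs each member 'wins'.

    intensities: dict mapping a serum name to a list of intensity values,
    one per Orf, with all lists in the same Orf order (assumed layout).
    Returns a dict mapping (serum_a, serum_b) -> (wins_a, wins_b).
    Ties are scored for neither serum (an assumption; the text does not
    say how ties were handled).
    """
    results = {}
    for a, b in combinations(sorted(intensities), 2):
        pairs = list(zip(intensities[a], intensities[b]))
        wins_a = sum(1 for x, y in pairs if x > y)  # serum a scored "1"
        wins_b = sum(1 for x, y in pairs if y > x)  # serum b scored "1"
        results[(a, b)] = (wins_a, wins_b)
    return results


def null_expectation(n_orfs):
    """Expected wins per serum if neither serum binds systematically more."""
    return n_orfs / 2


# Toy data: the first Orf reproduces the hypothetical example in the text.
toy = {
    "serum_a": [3246, 500, 900, 1200],
    "serum_b": [1711, 650, 400, 1100],
}
wins = pairwise_wins(toy)
expected_for_array = null_expectation(1292)
```

With 1,292 Orfs on the array, `null_expectation(1292)` gives the 646 comparisons cited above. A between-group pair in which one serum wins well over 646 comparisons would indicate excess binding of the kind summarized in Table 2.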
Simulation to establish a cutoff frequency. The mean and SD for each Orf with all control serum samples in each panel were determined. Then, for each Orf and for every serum sample in each panel, the value was normalized by determining the number of SDs above or below the mean for the control sera in the same serum panel for the Orf in question. For each sample, all the Orfs that had array spots with values that were 3 SDs above the mean for the negative controls in the experiment for the given Orf were tabulated and summed. The frequencies of each Orf that appeared in this cumulative list were then determined. To provide an exact test of the significance of the counts that were obtained, the linkages between a given normalized value and an Orf were randomized, and the number of times that an Orf was associated by chance with a value 3 SDs above the controls was likewise counted. This gave an estimate of the distribution under random conditions. Four replicates were performed, and the means were used to provide a distribution under the null hypothesis of random association between SD and Orf.
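The normalization and randomization steps can be sketched as follows. This is a hedged illustration, not the procedure as implemented in the study: the function names, the data layout, the use of "at least 3 SDs" as the threshold, and the choice to randomize by shuffling normalized values across Orfs within each serum sample are all assumptions.

```python
import random
from statistics import mean, stdev


def z_scores(sample_values, controls_per_orf):
    """Express each Orf value as the number of control SDs from the control mean.

    sample_values: one intensity per Orf for a single serum sample.
    controls_per_orf: for each Orf, the list of control-serum intensities
    from the same panel. A zero control SD is assumed not to occur.
    """
    scores = []
    for value, controls in zip(sample_values, controls_per_orf):
        scores.append((value - mean(controls)) / stdev(controls))
    return scores


def count_hits(z_by_sample, threshold=3.0):
    """For each Orf, count the samples whose normalized value meets the threshold."""
    counts = [0] * len(z_by_sample[0])
    for scores in z_by_sample:
        for i, z in enumerate(scores):
            if z >= threshold:
                counts[i] += 1
    return counts


def null_counts(z_by_sample, threshold=3.0, replicates=4, seed=0):
    """Mean per-Orf hit counts after breaking the value-to-Orf linkage.

    Each replicate shuffles the normalized values across Orfs within every
    sample (assumed randomization scheme), then recounts the hits.
    """
    rng = random.Random(seed)
    totals = [0.0] * len(z_by_sample[0])
    for _ in range(replicates):
        shuffled = []
        for scores in z_by_sample:
            permuted = list(scores)
            rng.shuffle(permuted)
            shuffled.append(permuted)
        for i, c in enumerate(count_hits(shuffled, threshold)):
            totals[i] += c
    return [t / replicates for t in totals]
```

An Orf whose observed count from `count_hits` clearly exceeds the counts typically produced by `null_counts` would be a candidate for the cutoff frequency. The default of four replicates follows the text; the seed is included only to make the sketch reproducible.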
All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in chemistry and molecular biology or related fields are intended to be within the scope of the following claims.
The present application claims priority to U.S. Provisional Application Ser. No. 60/970,837, filed Sep. 7, 2007, which is herein incorporated by reference.
The invention was made with government support under grant numbers AI24424, AI065359, AI072872, LM007743, and AR20358 awarded by the NIH, and grant number MRI EIA-0321390 awarded by the NSF. The government has certain rights in the invention.
Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/US2008/075613 | 9/8/2008 | WO | 00 | 7/15/2010 |

Number | Date | Country |
---|---|---|
60970837 | Sep 2007 | US |