The present invention relates to the diagnosis of cancer e.g. prostate cancer. In particular, it relates to a subgroup of human endogenous retroviruses (HERVs) which show up-regulated expression in prostate tumors, and to the polypeptides encoded by spliced mRNAs expressed by these viruses.
The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: 223002123310SeqList.txt, date recorded: Jan. 14, 2014, size: 215 KB).
References 1 and 2 disclose that human endogenous retroviruses (HERVs) of the HML-2 subgroup of the HERV-K family show up-regulated expression in prostate tumors. The contents of references 1 and 2 are incorporated herein by reference.
It is an object of the invention to provide further materials that can be used in the prevention, treatment and diagnosis of cancer, e.g., prostate cancer. It is a further object to provide improvements in the prevention, treatment and diagnosis of cancer e.g. prostate cancer and breast cancer.
HERVs have been known for many years, and genomic sequence for the HERV-K family has been known since 1986 {ref. 187}. The usual gag, prt, pol and env retroviral proteins have been identified for HERV-K, as has an analogue of HIV Rev or HTLV Rex, known as cORF or Rec {3}, but analogues of other regulatory proteins (e.g. HIV Tat or HTLV Tax proteins) have not been identified.
The Rev/Rex analog ‘cORF’ is encoded by an ORF which shares the same 5′ region and start codon as env, but in which a splicing event removes env-coding sequences and shifts to a reading frame +1 relative to that of env {4, 5}. Within the final exon in the env region of PCAV, therefore, reading frames 1 and 2 encode env and cORF, respectively, but no protein encoded by the third reading frame has previously been reported, and this +2 reading frame has no known function in HERV-K.
The inventors have now found a series of proteins generated by splicing in the env region of HERV-K genomes, including several which utilize the +2 reading frame. The proteins show activity typical of transcriptional regulators, and they also have oncogenic potential. These proteins can be used in cancer diagnosis and therapy, and are also drug targets e.g. for adjuvant therapy.
The identification of these new polypeptide products is remarkable because full sequence information has been available for HERV-K viruses for over 15 years.
The invention provides a method for diagnosing cancer, the method comprising the step of detecting the presence or absence in a patient sample of a HML-2 expression product produced by a splicing event in which the 5′ region and start codon of the env coding region are joined to a downstream coding region in the reading frame +2 relative to that of env in the genome. Higher levels of expression product relative to normal tissue indicate that the patient from whom the sample was taken has cancer (e.g. prostate cancer). The expression product may or may not be functional in a viral life cycle.
The expression product which is detected is either a mRNA transcript or a polypeptide translated from such a transcript. These expression products may be detected directly or indirectly. A direct test uses an assay which detects HML-2 RNA or polypeptide in a patient sample. An indirect test uses an assay which detects biomolecules which are not directly expressed in vivo from HML-2 e.g. an assay to detect cDNA which has been reverse-transcribed from a HML-2 mRNA, or an assay to detect an antibody which has been raised in response to a HML-2 polypeptide.
Where the diagnostic method of the invention is based on mRNA for diagnosis of cancer, the patient sample will generally comprise cells from the tissue of interest e.g. prostate cells for prostate cancer, breast cells for breast cancer, etc. These cells may be present in a sample of tissue taken from the relevant organ, or may be cells which have escaped into circulation (e.g. during metastasis). Instead of or as well as comprising cells, the sample may comprise virions which contain mRNA from HML-2, or bodily fluids.
Where the diagnostic method of the invention is based on polypeptide, the patient sample may comprise cells and/or virions (as described above for mRNA), or may comprise antibodies which recognize the polypeptide. Such antibodies will typically be present in circulation.
In general, therefore, the patient sample for males is a prostate sample (e.g. a biopsy) or a blood sample, and for females it is a breast sample (e.g. a biopsy) or a blood sample.
The patient is generally a human, and preferably an adult human.
Expression products may be detected in the patient sample itself, or may be detected in material derived from the sample (e.g. the supernatant of a cell lysate, or a RNA extract, or cDNA generated from a RNA extract, or polypeptides translated from a RNA extract, or cells derived from culture of cells extracted from a patient, etc.). These are still considered to be “patient samples” within the meaning of the invention.
Methods of the invention can be conducted in vitro or in vivo.
Other possible sources of patient samples include isolated cells, whole tissues, or bodily fluids (e.g. blood, plasma, serum, urine, pleural effusions, cerebro-spinal fluid, breast milk, colostrum, other fluids secreted by the breast, semen, seminal fluid, etc.).
Where the diagnostic method of the invention is based on mRNA detection, it typically involves detecting a RNA which encodes a polypeptide of the invention. The RNA will comprise the ATG codon of the Env ORF which, through splicing as shown in
Preferred RNAs comprise a sequence which has at least s % sequence identity to SEQ ID 52. SEQ ID 52 is the 50 nucleotides of the HERV-K(C7) virus {ref. 6} immediately downstream of ‘Potential splice site B’ in
Other preferred RNAs comprise a sequence which has at least s % sequence identity to one or more of SEQ IDs 19, 20, 21, 24, 25, 26, 38, 40 and/or 42. Particularly preferred RNAs comprise a sequence which has at least s % sequence identity to one or more of SEQ IDs 38, 40 and/or 42.
Preferred RNAs comprise a sequence which encodes a polypeptide having at least s % sequence identity to one or more of SEQ IDs 7, 8, 9, 10, 11, 21, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69. Particularly preferred RNAs comprise a sequence which encodes a polypeptide having at least s % sequence identity to one or more of SEQ IDs 7, 8 and/or 9.
The value of s is preferably at least 50 (e.g. at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.5, 99.9, etc.).
Preferred RNAs encode a polypeptide which may bind to RNA comprising SEQ ID 49.
The RNA will usually also comprise one, two, three, four or five of the following:
The percent identity of the sequences described above is determined by the Smith-Waterman algorithm using the default parameters: open gap penalty=−20 and extension penalty=−5.
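Purely by way of illustration, a percent identity of this kind could be computed along the following lines. The sketch is a minimal Smith-Waterman local alignment with affine gap penalties, written in Python. The gap penalties (−20 to open, −5 to extend) are those stated above. The match and mismatch scores (+5 and −4), the convention that the first gapped position costs the open penalty alone, the use of the number of aligned columns as the denominator, the toy sequences, and all function names are assumptions made for the example and are not taken from the text.

```python
def smith_waterman(a, b, match=5, mismatch=-4, gap_open=-20, gap_extend=-5):
    """Local alignment with affine gaps; returns (score, aligned_a, aligned_b).

    match/mismatch scores are assumed values; gap penalties follow the text.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]    # best local score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # best score ending with a gap in b

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    best, pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            H[i][j] = max(0, H[i - 1][j - 1] + sub(i, j), E[i][j], F[i][j])
            if H[i][j] > best:
                best, pos = H[i][j], (i, j)

    # Trace back from the best-scoring cell, tracking the current matrix.
    i, j = pos
    state, out_a, out_b = "H", [], []
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break
            if H[i][j] == H[i - 1][j - 1] + sub(i, j):
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            out_a.append("-")
            out_b.append(b[j - 1])
            if E[i][j] == H[i][j - 1] + gap_open:
                state = "H"
            j -= 1
        else:
            out_a.append(a[i - 1])
            out_b.append("-")
            if F[i][j] == H[i - 1][j] + gap_open:
                state = "H"
            i -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical columns as a percentage of all aligned columns (assumed denominator)."""
    if not aligned_a:
        return 0.0
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * same / len(aligned_a)


# Toy sequences (hypothetical, not SEQ IDs from the sequence listing).
score, x, y = smith_waterman("ACGTACGTGACCTGA", "ACGTACGAGACCTGA")
print(score, x, y, round(percent_identity(x, y), 1))
```

A candidate sequence would then satisfy the "at least s %" criterion when the value returned by `percent_identity` is not less than the chosen value of s. Other choices of substitution scores or of denominator would give different values.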
These mRNA molecules are referred to below as “PCA-mRNA” molecules (“prostate cancer associated mRNA”), and endogenous viruses which express these PCA-mRNAs are referred to as PCAVs (“prostate cancer associated viruses”). Nevertheless, said PCAVs may also be associated with other types of cancer and, in particular, breast cancer.
In general, therefore, the mRNA to be detected has formula N1-N2-N3-N4-N5-polyA, wherein:
N1 is preferably at the 5′ end of the mRNA (i.e. 5′-N1- . . . ). Although N1 is defined above by reference to SEQ ID 49, up to 100 nucleotides (e.g. 10, 20, 30, 40, 50, 60, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90 or 100) from the 5′ end of SEQ ID 49 may be omitted, depending on the start site of transcription e.g. N1 may have at least 75% sequence identity to SEQ ID 478.
Where N5 is present, it is preferably immediately before a 3′ polyA tail (i.e. . . . N5-polyA-3′).
The RNA will generally have a 5′ cap.
Where diagnosis is based on mRNA detection, the method of the invention preferably comprises an initial step of: (a) extracting RNA (e.g. mRNA) from a patient sample; (b) removing DNA from a patient sample without removing mRNA; and/or (c) removing or disrupting DNA comprising SEQ ID 4, but not RNA comprising SEQ ID 4, from a patient sample. This is necessary because the genomes of both normal and cancerous cells contain multiple PCAV DNA templates, whereas increased PCA-mRNA levels are only found in cancerous cells. As an alternative, a RNA-specific assay can be used which is not affected by the presence of homologous DNA.
Methods for extracting RNA from biological samples are well known {e.g. refs. 8 & 17} and include methods based on guanidinium buffers, lithium chloride, SDS/potassium acetate etc. After total cellular RNA has been extracted, mRNA may be enriched e.g. using oligo-dT techniques.
Methods for removing DNA from biological samples without removing mRNA are well known {e.g. appendix C of ref. 8} and include DNase digestion.
Methods for removing DNA, but not RNA, comprising PCA-mRNA sequences will use a reagent which is specific to a sequence within a PCA-mRNA e.g. a restriction enzyme which recognizes a DNA sequence within SEQ ID 4, but which does not cleave the corresponding RNA sequence.
Methods for specifically purifying PCA-mRNAs from a sample may also be used. One such method uses an affinity support which binds to PCA-mRNAs. The affinity support may include a polypeptide sequence which binds to the LTR of PCAV e.g. the tat polypeptide described below.
Various techniques are available for detecting the presence or absence of a particular RNA sequence in a sample {e.g. refs. 8 & 17}. If a sample contains genomic PCAV DNA, the detection technique will generally be RNA-specific; if the sample contains no PCAV DNA, the detection technique may or may not be RNA-specific.
Hybridization-based detection techniques may be used, in which a polynucleotide probe complementary to a region of PCA-mRNA is contacted with a RNA-containing sample under hybridizing conditions. Detection of hybridization indicates that nucleic acid complementary to the probe is present. Hybridization techniques for use with RNA include Northern blots, in situ hybridization and arrays.
Sequencing may also be used, in which the sequence(s) of RNA molecules in a sample are obtained. These techniques reveal directly whether a sequence of interest is present in a sample. Sequence determination of the 5′ end of a RNA corresponding to N1 will generally be adequate.
Amplification-based techniques may also be used. These include PCR, SDA, SSSR, LCR, TMA, NASBA, T7 amplification etc. The technique preferably gives exponential amplification. A preferred technique for use with RNA is RT-PCR {e.g. see chapter 15 of ref. 8}. RT-PCR of mRNA from prostate cells is reported in references 9, 10, 11, 12, etc., and RT-PCR of mRNA from breast cells is reported in references 13, 14, 15, 16, etc.
Rather than detect RNA directly, it may be preferred to detect molecules which are derived from RNA (i.e. indirect detection of RNA). A typical indirect method of detecting mRNA is to prepare cDNA by reverse transcription and then to directly detect the cDNA. Direct detection of cDNA will generally use the same techniques as described above for direct detection of RNA (but it will be appreciated that methods such as RT-PCR are not suitable for DNA detection and that cDNA is double-stranded, so detection techniques can be based on a sequence, on its complement, or on the double-stranded molecule).
The invention provides polynucleotide materials e.g. for use in the detection of PCAV nucleic acids.
The invention provides an isolated polynucleotide comprising: (a) the nucleotide sequence N1-N2-N3-N4-N5-polyA as defined above; (b) a fragment of at least x nucleotides of nucleotide sequence N1-N2-N3-N4-N5 as defined above; (c) a nucleotide sequence having at least s % identity to nucleotide sequence N1-N2-N3-N4-N5 as defined above; or (d) the complement of (a), (b) or (c). These polynucleotides include variants of nucleotide sequence N1-N2-N3-N4-N5-polyA (e.g. degenerate variants, allelic variants, homologs, orthologs, mutants, etc.).
Fragment (b) preferably comprises a fragment of N4.
The value of x is at least 7 (e.g. at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 75, 80, 90, 100 etc.). The value of x may be less than 2000 (e.g. less than 1000, 500, 100, or 50).
The value of s is preferably at least 50 (e.g. at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.5, 99.9 etc.).
The invention also provides an isolated polynucleotide having formula 5′-A-B-C-3′, wherein: -A- is a nucleotide sequence consisting of a nucleotides; -C- is a nucleotide sequence consisting of c nucleotides; -B- is a nucleotide sequence consisting of either (a) a fragment of b nucleotides of nucleotide sequence N1-N2-N3-N4-N5 as defined above or (b) the complement of a fragment of b nucleotides of nucleotide sequence N1-N2-N3-N4-N5 as defined above; and said polynucleotide is neither (a) a fragment of nucleotide sequence N1-N2-N3-N4-N5 or (b) the complement of a fragment of nucleotide sequence N1-N2-N3-N4-N5.
The -B- region is preferably a fragment of N4. The -A- and/or -C- portions may comprise a promoter sequence (or its complement) e.g. for use in TMA.
The value of a+c is at least 1 (e.g. at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 etc.). The value of b is at least 7 (e.g. at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 etc.). It is preferred that the value of a+b+c is at least 9 (e.g. at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 etc.). It is preferred that the value of a+b+c is at most 500 (e.g. at most 450, 400, 350, 300, 250, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9).
Where -B- is a fragment of N1-N2-N3-N4-N5, the nucleotide sequence of -A- typically shares less than n % sequence identity to the a nucleotides which are 5′ of sequence -B- in N1-N2-N3-N4-N5 and/or the nucleotide sequence of -C- typically shares less than n % sequence identity to the c nucleotides which are 3′ of sequence -B- in N1-N2-N3-N4-N5. Similarly, where -B- is the complement of a fragment of N1-N2-N3-N4-N5, the nucleotide sequence of -A- typically shares less than n % sequence identity to the complement of the a nucleotides which are 5′ of the complement of sequence -B- in N1-N2-N3-N4-N5 and/or the nucleotide sequence of -C- typically shares less than n % sequence identity to the complement of the c nucleotides which are 3′ of the complement of sequence -B- in N1-N2-N3-N4-N5. The value of n is generally 60 or less (e.g. 50, 40, 30, 20, 10 or less).
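The conditions placed on a polynucleotide of formula 5′-A-B-C-3′ can be expressed as a simple check, sketched below in Python for illustration only. The reference sequence stands in for N1-N2-N3-N4-N5 and is a hypothetical toy sequence, as are the example -A- and -B- portions. The function names are assumptions, and "complement" is treated here as the reverse complement read 5′ to 3′, which is also an assumption of the sketch. The identity limit n is not checked.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complementary strand written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_fragment(seq, reference):
    """True if seq occurs in the reference or in its reverse complement."""
    return seq in reference or seq in reverse_complement(reference)


def check_abc(seq_a, seq_b, seq_c, reference, min_b=7, min_total=9, max_total=500):
    """Report which of the stated A-B-C conditions a candidate satisfies."""
    a, b, c = len(seq_a), len(seq_b), len(seq_c)
    whole = seq_a + seq_b + seq_c
    return {
        "a_plus_c_at_least_1": a + c >= 1,
        "b_at_least_minimum": b >= min_b,
        "b_is_fragment_or_complement": is_fragment(seq_b, reference),
        "total_in_preferred_range": min_total <= a + b + c <= max_total,
        "whole_is_not_a_fragment": not is_fragment(whole, reference),
    }


# Hypothetical toy reference standing in for N1-N2-N3-N4-N5.
reference = "ATGGCTAAACCCGGGTGA"
# -A- is a stand-in for e.g. a promoter tail; -C- is empty in this example.
result = check_abc("TTTTT", "AAACCCG", "", reference)
print(result)
```

In this toy case every condition is met: -B- is a 7-nucleotide fragment of the reference, and the added -A- portion ensures that the whole molecule is not itself a fragment of the reference or of its complement.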
The invention also provides an isolated polynucleotide which selectively hybridizes to a nucleic acid having nucleotide sequence N1-N2-N3-N4-N5 as defined above or to a nucleic acid having the complement of nucleotide sequence N1-N2-N3-N4-N5 as defined above. The polynucleotide preferably hybridizes at least to N4.
Hybridization reactions can be performed under conditions of different “stringency”. Conditions that increase stringency of a hybridization reaction are widely known and published in the art {e.g. page 7.52 of reference 17}. Examples of relevant conditions include (in order of increasing stringency): incubation temperatures of 25° C., 37° C., 50° C., 55° C. and 68° C.; buffer concentrations of 10×SSC, 6×SSC, 1×SSC, 0.1×SSC (where SSC is 0.15 M NaCl and 15 mM citrate buffer) and their equivalents using other buffer systems; formamide concentrations of 0%, 25%, 50%, and 75%; incubation times from 5 minutes to 24 hours; 1, 2, or more washing steps; wash incubation times of 1, 2, or 15 minutes; and wash solutions of 6×SSC, 1×SSC, 0.1×SSC, or de-ionized water. Hybridization techniques are well known in the art {e.g. see references 8, 17, 18, 19, 20 etc.}. Depending upon the particular polynucleotide sequence and the particular domain encoded by that polynucleotide sequence, hybridization conditions upon which to compare a polynucleotide of the invention to a known polynucleotide may differ, as will be understood by the skilled artisan.
In some embodiments, the isolated polynucleotide of the invention selectively hybridizes under low stringency conditions; in other embodiments it selectively hybridizes under intermediate stringency conditions; in other embodiments, it selectively hybridizes under high stringency conditions. An exemplary set of low stringency hybridization conditions is 50° C. and 10×SSC. An exemplary set of intermediate stringency hybridization conditions is 55° C. and 1×SSC. An exemplary set of high stringency hybridization conditions is 68° C. and 0.1×SSC.
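For convenience, the three exemplary sets of conditions can be tabulated and converted to salt molarities, as in the short Python sketch below. The temperatures, SSC concentrations and the composition of 1×SSC (0.15 M NaCl, 15 mM citrate) are those given above; the variable and function names are assumptions made for the example.

```python
# 1x SSC as defined in the text: 0.15 M NaCl and 15 mM citrate.
SSC_1X_MOLAR = {"NaCl": 0.15, "citrate": 0.015}

# Exemplary conditions from the text: (temperature in deg C, SSC fold-concentration).
EXEMPLARY_CONDITIONS = {
    "low": (50, 10.0),
    "intermediate": (55, 1.0),
    "high": (68, 0.1),
}


def buffer_molarity(ssc_fold):
    """Molar concentration of each SSC component at the given fold-concentration."""
    return {salt: conc * ssc_fold for salt, conc in SSC_1X_MOLAR.items()}


def describe(stringency):
    """Temperature and buffer composition for one exemplary stringency level."""
    temp_c, ssc_fold = EXEMPLARY_CONDITIONS[stringency]
    return {"temperature_C": temp_c, "SSC_fold": ssc_fold, **buffer_molarity(ssc_fold)}


for level in EXEMPLARY_CONDITIONS:
    print(level, describe(level))
```

The tabulation shows the two trends stated above: stringency rises as temperature increases and as salt concentration falls.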
Particularly preferred polynucleotides of the invention encode a polypeptide as defined below. By “encode”, it is not necessarily implied that the polynucleotide (e.g. RNA) is translated, but it will include a series of codons which encode the amino acids of the polypeptides defined below.
The invention also provides a polynucleotide comprising: (a) a nucleotide sequence selected from the group consisting of SEQ IDs 278 to 477; (b) a fragment of at least x nucleotides of (a); (c) a nucleotide sequence having at least s % identity to (a); or (d) the complement of (a), (b) or (c).
The invention also provides a polynucleotide comprising: (a) a nucleotide sequence selected from the group consisting of SEQ IDs 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 38, 40, 42, 51 and 52; (b) a fragment of at least x nucleotides of (a); (c) a nucleotide sequence having at least s % identity to (a); or (d) the complement of (a), (b) or (c).
The polynucleotides of the invention are particularly useful as probes and/or as primers for use in hybridization and/or amplification reactions.
More than one polynucleotide of the invention can hybridize to the same nucleic acid target (e.g. more than one can hybridize to a single RNA).
References to a percentage sequence identity between two nucleic acid sequences mean that, when aligned, that percentage of bases are the same in comparing the two sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in section 7.7.18 of reference 20. A preferred alignment program is GCG Gap (Genetics Computer Group, Wisconsin, Suite Version 10.1), preferably using default parameters, which are as follows: open gap=3; extend gap=1.
Polynucleotides of the invention may take various forms e.g. single-stranded, double-stranded, linear, circular, vectors, primers, probes etc.
Polynucleotides of the invention can be prepared in many ways e.g. by chemical synthesis (at least in part), by digesting longer polynucleotides using restriction enzymes, from genomic or cDNA libraries, from the organism itself etc.
Polynucleotides of the invention may be attached to a solid support (e.g. a bead, plate, filter, film, slide, resin, etc.).
Polynucleotides of the invention may include a detectable label (e.g. a radioactive or fluorescent label, or a biotin label). This is particularly useful where the polynucleotide is to be used in nucleic acid detection techniques e.g. where the nucleic acid is used as a primer or as a probe in techniques such as PCR, LCR, TMA, NASBA, bDNA, etc.
The term “polynucleotide” in general means a polymeric form of nucleotides of any length, which contain deoxyribonucleotides, ribonucleotides, and/or their analogs. It includes DNA, RNA, DNA/RNA hybrids, and DNA or RNA analogs, such as those containing modified backbones or bases, and also peptide nucleic acids (PNA) etc. The term “polynucleotide” is not intended to be limiting as to the length or structure of a nucleic acid unless specifically indicated, and the following are non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, any isolated DNA from any source, any isolated RNA from any sequence, nucleic acid probes, and primers. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown. Unless otherwise specified or required, any embodiment of the invention that includes a polynucleotide encompasses both the double-stranded form and each of two complementary single-stranded forms known or predicted to make up the double stranded form.
Polynucleotides of the invention may be isolated and obtained in substantial purity, generally as other than an intact chromosome. Usually, the polynucleotides will be obtained substantially free of other naturally-occurring nucleic acid sequences, generally being at least about 50% (by weight) pure, usually at least about 90% pure.
Polynucleotides of the invention (particularly DNA) are typically “recombinant” e.g. flanked by one or more nucleotides with which they are not normally associated on a naturally-occurring chromosome.
The polynucleotides can be used, for example: to produce polypeptides; as probes for the detection of nucleic acid in biological samples; to generate additional copies of the polynucleotides; to generate ribozymes or antisense oligonucleotides; and as single-stranded DNA probes or as triple-strand forming oligonucleotides. The polynucleotides are preferably used to detect PCA-mRNAs.
A “vector” is a polynucleotide construct designed for transduction/transfection of one or more cell types. Vectors may be, for example, “cloning vectors” which are designed for isolation, propagation and replication of inserted nucleotides, “expression vectors” which are designed for expression of a nucleotide sequence in a host cell, “viral vectors” which are designed to result in the production of a recombinant virus or virus-like particle, or “shuttle vectors”, which comprise the attributes of more than one type of vector.
A “host cell” includes an individual cell or cell culture which can be or has been a recipient of exogenous polynucleotides. Host cells include progeny of a single host cell, and the progeny may not necessarily be completely identical (in morphology or in total DNA complement) to the original parent cell due to natural, accidental, or deliberate mutation and/or change. A host cell includes cells transfected or infected in vivo or in vitro with a polynucleotide of this invention.
The invention provides a kit comprising primers (e.g. PCR primers) for amplifying a template sequence contained within a PCAV nucleic acid, the kit comprising a first primer and a second primer, wherein the first primer is substantially complementary to said template sequence and the second primer is substantially complementary to a complement of said template sequence, wherein the parts of said primers which have substantial complementarity define the termini of the template sequence to be amplified. The first primer and/or the second primer may include a detectable label.
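The way in which a primer pair defines the termini of the amplified template can be illustrated with a short in-silico sketch in Python. It is a simplification under stated assumptions: "substantially complementary" is reduced to perfect complementarity, only one strand of the target is searched, and the target and primer sequences are hypothetical toy sequences rather than PCAV sequences. The function names are likewise assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complementary strand written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def amplified_region(target, first_primer, second_primer):
    """Return the template bounded by the two primers, or None if either fails to bind.

    first_primer  : complementary to the template (anneals at its 3' end)
    second_primer : complementary to the template's complement, i.e. identical
                    in sequence to the 5' end of the template
    """
    start = target.find(second_primer)
    if start == -1:
        return None
    site = reverse_complement(first_primer)  # where the first primer anneals
    end = target.find(site, start)
    if end == -1:
        return None
    return target[start:end + len(site)]


# Hypothetical toy target, written 5'->3'.
target = "GGATCCATGGCTAAACCCGGGTGAATTCGG"
second_primer = "ATGGCT"
first_primer = reverse_complement("GGGTGA")

product = amplified_region(target, first_primer, second_primer)
print(product, len(product))
```

The returned sequence begins with the second primer and ends with the complement of the first primer, so the two primer-binding sites form the termini of the amplified template.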
The invention also provides a kit comprising first and second single-stranded oligonucleotides which allow amplification of a PCAV template nucleic acid sequence contained in a single- or double-stranded nucleic acid (or mixture thereof), wherein: (a) the first oligonucleotide comprises a primer sequence which is substantially complementary to said template nucleic acid sequence; (b) the second oligonucleotide comprises a primer sequence which is substantially complementary to the complement of said template nucleic acid sequence; (c) the first oligonucleotide and/or the second oligonucleotide comprise(s) sequence which is not complementary to said template nucleic acid; and (d) said primer sequences define the termini of the template sequence to be amplified. The non-complementary sequence(s) of feature (c) are preferably upstream of (i.e. 5′ to) the primer sequences. One or both of the (c) sequences may comprise a restriction site {21} or promoter sequence {22}. The first and/or the second oligonucleotide may include a detectable label.
The kit of the invention may also comprise a labeled polynucleotide which comprises a fragment of the template sequence (or its complement). This can be used in a hybridization technique to detect amplified template.
The primers and probes used in these kits are preferably polynucleotides as described in section B.4.
The target is preferably a polynucleotide sequence as defined in section B.1.
Where the method is based on polypeptide detection, it will involve detecting expression of a polypeptide which is encoded by a transcript produced by a splicing event in which the 5′ region and start codon of the env coding region are joined to a downstream coding region in the reading frame +2 relative to that of env in the genome. The polypeptide may or may not be functional in a viral life cycle.
Transcripts which encode HML-2 polypeptides are generated by alternative splicing of the full-length mRNA copy of the endogenous genome {e.g.
The polypeptides of the invention are encoded by ORFs which share the same 5′ region (and start codon) as env. A splicing event removes env-coding sequences, but the coding sequence continues in the reading frame +2 relative to that of env. Examples of spliced nucleotide sequences are: SEQ IDs 18-27, 38, 40 & 42. Examples of encoded polypeptide sequences are: SEQ IDs 7-12 and SEQ IDs 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69. Some of these (e.g. SEQ IDs 10-12) inhibit the function of PCAP4 in a transdominant fashion.
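The effect of a splicing event on the downstream reading frame can be illustrated with a toy example in Python. The sketch assumes the standard genetic code and uses a hypothetical 26-nucleotide sequence with hypothetical splice coordinates; it does not use any sequence from the sequence listing, and the function names are assumptions. The downstream exon is read in frame +k relative to the unspliced frame, where k is the length of the removed segment modulo 3.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def splice(genome, donor_end, acceptor_start):
    """Join genome[:donor_end] to genome[acceptor_start:] (0-based coordinates)."""
    return genome[:donor_end] + genome[acceptor_start:]


def relative_frame(donor_end, acceptor_start):
    """Frame (0, 1 or 2) of the downstream exon relative to the unspliced frame."""
    return (acceptor_start - donor_end) % 3


# Hypothetical toy sequence: 6 nt first exon, 8 nt removed, 12 nt downstream exon.
genome = "ATGGCT" + "GTAAGTAG" + "AAACCCGGGTGA"
donor_end, acceptor_start = 6, 14

print("unspliced:", translate(genome))
print("spliced:  ", translate(splice(genome, donor_end, acceptor_start)))
print("frame:    +%d" % relative_frame(donor_end, acceptor_start))
```

Both products share the same start codon and 5′ residues, but because the removed segment is 8 nucleotides long the downstream exon is translated in the +2 frame and yields a different amino acid sequence.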
Various techniques are available for detecting the presence or absence of a particular polypeptide in a sample. These are generally immunoassay techniques which are based on the specific interaction between an antibody and an antigenic amino acid sequence in the polypeptide. Suitable techniques include standard immunohistological methods, immunoprecipitation, ELISA, RIA, FIA, immunofluorescence etc.
In general, therefore, the invention provides a method for detecting the presence of and/or measuring a level of Tat polypeptide of the invention in a biological sample, wherein the method uses an antibody specific for the polypeptide. The method generally comprises the steps of: a) contacting the sample with an antibody specific for the polypeptide; and b) detecting binding between the antibody and polypeptides in the sample.
Polypeptides of the invention can also be detected by functional assays e.g. assays to detect binding activity or enzymatic activity. For instance, transcriptionally-active polypeptides of the invention can be assayed by detecting expression of a reporter gene driven by the PCAV LTR, as described in the examples herein.
Another way of detecting polypeptides of the invention is to use standard proteomics techniques e.g. purify or separate polypeptides and then use peptide sequencing. For example, polypeptides can be separated using 2D-PAGE and polypeptide spots can be sequenced (e.g. by mass spectrometry) in order to identify whether a sequence is present in a target polypeptide.
Detection methods may be adapted for use in vivo (e.g. to locate or identify sites where cancer cells are present). In these embodiments, an antibody specific for a target polypeptide is administered to an individual (e.g. by injection) and the antibody is located using standard imaging techniques (e.g. magnetic resonance imaging, computed tomography scanning, etc.). Appropriate labels (e.g. spin labels etc.) will be used. Using these techniques, cancer cells are differentially labeled.
An immunofluorescence assay can be easily performed on cells without the need for purification of the target polypeptide. The cells are first fixed onto a solid support, such as a microscope slide or microtiter well. The membranes of the cells are then permeabilized in order to permit entry of polypeptide-specific antibody (NB: fixing and permeabilization can be achieved together). Next, the fixed cells are exposed to an antibody which is specific for the encoded polypeptide and which is fluorescently labeled. The presence of this label (e.g. visualized under a microscope) identifies cells which express the target PCAV polypeptide. To increase the sensitivity of the assay, it is possible to use a second antibody to bind to the anti-PCAV antibody, with the label being carried by the second antibody. {23}
Rather than detect polypeptides directly, it may be preferred to detect molecules which are produced by the body in response to them (i.e. indirect detection of a polypeptide). This will typically involve the detection of antibodies, so the patient sample will generally be a blood sample. Antibodies can be detected by conventional immunoassay techniques e.g. using PCAV polypeptides of the invention, which will typically be immobilized.
Antibodies against HERV-K polypeptides have been detected in humans {195}.
The invention provides an isolated polypeptide comprising: (a) an amino acid sequence selected from the group consisting of SEQ IDs 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69; (b) a fragment of at least x amino acids of (a); or (c) a polypeptide sequence having at least s % identity to (a). These polypeptides include variants (e.g. allelic variants, homologs, orthologs, functional and non-functional mutants etc.).
The value of x is at least 5 (e.g. at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 75, 80, 90, 100 etc.). The value of x may be less than 2000 (e.g. less than 1000, 500, 100, or 50).
The value of s is preferably at least 50 (e.g. at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.5, 99.9 etc.).
The invention also provides an isolated polypeptide having formula NH2-A-B-C-COOH, wherein: A is a polypeptide sequence consisting of a amino acids; C is a polypeptide sequence consisting of c amino acids; B is a polypeptide sequence consisting of a fragment of b amino acids of an amino acid sequence selected from the group consisting of SEQ IDs 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69; and said polypeptide is not a fragment of polypeptide sequence SEQ ID 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 or 69.
The value of a+c is at least 1 (e.g. at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 etc.). The value of b is at least 7 (e.g. at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 etc.). It is preferred that the value of a+b+c is at least 9 (e.g. at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100 etc.). It is preferred that the value of a+b+c is at most 500 (e.g. at most 450, 400, 350, 300, 250, 200, 190, 180, 170, 160, 150, 140, 130, 120, 110, 100, 90, 80, 70, 60, 50, 40, 30, 25, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9).
The amino acid sequence of -A- typically shares less than n % sequence identity to the a amino acids which are N-terminal of sequence -B- in SEQ IDs 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69 and the amino acid sequence of -C- typically shares less than n % sequence identity to the c amino acids which are C-terminal of sequence -B- in SEQ IDs 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69. The value of n is generally 60 or less (e.g. 50, 40, 30, 20, 10 or less).
The fragment of (b) or -B- may comprise a T-cell or, preferably, a B-cell epitope of SEQ IDs 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69. T- and B-cell epitopes can be identified empirically (e.g. using the PEPSCAN method {24, 25} or similar methods), or they can be predicted (e.g. using the Jameson-Wolf antigenic index {26}, matrix-based approaches {27}, TEPITOPE {28}, neural networks {29}, OptiMer & EpiMer {30, 31}, ADEPT {32}, Tsites {33}, hydrophilicity {34}, antigenic index {35} or the methods disclosed in reference 36 etc.). These methods have proved successful in identifying B-cell and T-cell epitopes for HIV tat and HTLV tax {e.g. 31, 37, 38, 39, 40, 41, 42, 43, 44, etc.}.
Preferred fragments of (b) or -B- are located downstream of the splice site i.e. within exon 3. Examples of such fragments are SEQ IDs 61 to 68 (or sub-fragments thereof). A polypeptide may include one or more of these sequences. For instance, it may include two or more (e.g. 2, 3, 4) of SEQ IDs 62 to 65, preferably in that order (e.g. NH2—O1-62-O2-63-O3-64-O4-65-O5—COOH, where O1 to O5 are optional sequences of one or more amino acids), and optionally SEQ ID 61 as well (preferably upstream of SEQ ID 62). Other polypeptides may include SEQ ID 66 and/or SEQ ID 67.
Thus the invention provides a polypeptide comprising: (a) an amino acid sequence selected from the group consisting of SEQ IDs 61 to 68; (b) a fragment of at least x amino acids of (a); or (c) a polypeptide sequence having at least s % identity to (a).
The invention also provides a polypeptide comprising: (a) an amino acid sequence selected from the group consisting of SEQ IDs 78 to 277; (b) a fragment of at least x amino acids of (a); or (c) a polypeptide sequence having at least s % identity to (a).
Within the group consisting of SEQ IDs 7, 8, 9, 10, 11, 12, 28, 29, 30, 31, 34, 35, 36, 39, 41, 43, 67, 68 and 69, a preferred subset is SEQ IDs 7, 8, 11 and 12 (PCAP2, PCAP3, PCAP4 and PCAP4a).
Preferred polypeptides may bind to RNA comprising SEQ ID 49.
References to a percentage sequence identity between two amino acid sequences mean that, when aligned, that percentage of amino acids are the same in comparing the two sequences. This alignment and the percent homology or sequence identity can be determined using software programs known in the art, for example those described in section 7.7.18 of reference 20. A preferred alignment is determined by the Smith-Waterman homology search algorithm using an affine gap search with a gap open penalty of 12 and a gap extension penalty of 2, BLOSUM matrix of 62. The Smith-Waterman homology search algorithm is taught in reference 45.
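The alignment and identity calculation described above can be sketched as follows. This is an illustrative local alignment with affine gaps, not the software of the cited references. Several points are assumptions: a simple match/mismatch score (+5/-4) stands in for the BLOSUM62 matrix, which is not reproduced here; the first residue of a gap costs the gap open penalty and each further residue costs the gap extension penalty; and percent identity is taken over all aligned columns, including gap columns. All function names are hypothetical.

```python
NEG_INF = float("-inf")


def simple_score(x, y, match=5, mismatch=-4):
    """Stand-in substitution score (assumption; BLOSUM62 would be used in practice)."""
    return match if x == y else mismatch


def local_align(seq1, seq2, score=simple_score, gap_open=12, gap_extend=2):
    """Smith-Waterman local alignment with affine gaps; returns (score, aligned pairs)."""
    n, m = len(seq1), len(seq2)
    # M: ends in an aligned pair; X: ends in a gap in seq2; Y: ends in a gap in seq1.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, None
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = max(0, M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            M[i][j] = diag + score(seq1[i - 1], seq2[j - 1])
            X[i][j] = max(M[i - 1][j] - gap_open, X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open, Y[i][j - 1] - gap_extend)
            if M[i][j] > best:
                best, best_pos = M[i][j], (i, j)
    pairs = []
    if best_pos is None:
        return 0, pairs
    # Trace back from the best cell until the alignment would start afresh.
    (i, j), state = best_pos, "M"
    while True:
        if state == "M":
            pairs.append((seq1[i - 1], seq2[j - 1]))
            prev = {"M": M[i - 1][j - 1], "X": X[i - 1][j - 1], "Y": Y[i - 1][j - 1]}
            i, j = i - 1, j - 1
            state = max(prev, key=prev.get)
            if prev[state] <= 0:
                break
        elif state == "X":
            pairs.append((seq1[i - 1], "-"))
            state = "M" if X[i][j] == M[i - 1][j] - gap_open else "X"
            i -= 1
        else:
            pairs.append(("-", seq2[j - 1]))
            state = "M" if Y[i][j] == M[i][j - 1] - gap_open else "Y"
            j -= 1
    pairs.reverse()
    return best, pairs


def percent_identity(pairs):
    """Percentage of aligned columns in which both residues are the same."""
    if not pairs:
        return 0.0
    return 100.0 * sum(1 for x, y in pairs if x == y) / len(pairs)
```

Passing a BLOSUM62 lookup as the `score` argument would bring the sketch closer to the preferred parameters. Other conventions for the denominator of percent identity exist and would give different values for gapped alignments.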
Polypeptides of the invention can be prepared in many ways e.g. by chemical synthesis (at least in part), by digesting longer polypeptides using proteases, by translation from RNA, by purification from cell culture (e.g. from recombinant expression), from the organism itself (e.g. isolation from prostate or breast tissue), from a cell line source etc.
Polypeptides of the invention can be prepared in various forms (e.g. native, fusions, glycosylated, non-glycosylated etc.).
Polypeptides of the invention may be attached to a solid support.
Polypeptides of the invention may comprise a detectable label (e.g. a radioactive or fluorescent label, or a biotin label).
In general, the polypeptides of the subject invention are provided in a non-naturally occurring environment e.g. they are separated from their naturally-occurring environment. In certain embodiments, the subject polypeptide is present in a composition that is enriched for the polypeptide as compared to a control. As such, purified polypeptide is provided, where by purified is meant that the polypeptide is present in a composition that is substantially free of other expressed polypeptides, and where by substantially free is meant that less than 90%, usually less than 60% and more usually less than 50% of the composition is made up of other expressed polypeptides.
The term “polypeptide” refers to amino acid polymers of any length. The polymer may be linear or branched, it may comprise modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an amino acid polymer that has been modified naturally or by intervention; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component. Also included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (including, for example, unnatural amino acids, etc.), as well as other modifications known in the art. Polypeptides can occur as single chains or associated chains. Polypeptides of the invention can be naturally or non-naturally glycosylated (i.e. the polypeptide has a glycosylation pattern that differs from the glycosylation pattern found in the corresponding naturally occurring polypeptide).
Mutants can include amino acid substitutions, additions or deletions. The amino acid substitutions can be conservative amino acid substitutions or substitutions to eliminate non-essential amino acids, such as to alter a glycosylation site, a phosphorylation site or an acetylation site, or to minimize misfolding by substitution or deletion of one or more cysteine residues that are not necessary for function. Conservative amino acid substitutions are those that preserve the general charge, hydrophobicity/hydrophilicity, and/or steric bulk of the amino acid substituted. Variants can be designed so as to retain or have enhanced biological activity of a particular region of the polypeptide (e.g. a functional domain and/or, where the polypeptide is a member of a polypeptide family, a region associated with a consensus sequence). Selection of amino acid alterations for production of variants can be based upon the accessibility (interior vs. exterior) of the amino acid (e.g. ref. 46), the thermostability of the variant polypeptide (e.g. ref. 47), desired glycosylation sites (e.g. ref. 48), desired disulfide bridges (e.g. refs. 49 & 50), desired metal binding sites (e.g. refs. 51 & 52), and desired substitutions within proline loops (e.g. ref. 53). Cysteine-depleted muteins can be produced as disclosed in reference 54.
The invention also provides isolated antibodies, or antigen-binding fragments thereof, that bind to a polypeptide of the invention. The invention also provides isolated antibodies, or antigen-binding fragments thereof, that bind to a polypeptide encoded by a polynucleotide of the invention.
Antibodies of the invention may be polyclonal or monoclonal and may be produced by any suitable means (e.g. by recombinant expression).
Antibodies of the invention may include a label. The label may be detectable directly, such as a radioactive or fluorescent label. Alternatively, the label may be detectable indirectly, such as an enzyme whose products are detectable (e.g. luciferase, β-galactosidase, peroxidase etc.).
Antibodies of the invention may be attached to a solid support.
Antibodies of the invention may be prepared by administering (e.g. injecting) a polypeptide of the invention to an appropriate animal (e.g. a rabbit, hamster, mouse or other rodent).
Antigen-binding fragments of antibodies include Fv, scFv, Fc, Fab, F(ab′)2 etc.
To increase compatibility with the human immune system, the antibodies may be chimeric or humanized {e.g. refs. 55 & 56}, or fully human antibodies may be used. Because humanized antibodies are far less immunogenic in humans than the original non-human monoclonal antibodies, they can be used for the treatment of humans with far less risk of anaphylaxis. Thus, these antibodies may be preferred in therapeutic applications that involve in vivo administration to a human, such as use as radiation sensitizers for the treatment of neoplastic disease or use in methods to reduce the side effects of cancer therapy.
Humanized antibodies may be achieved by a variety of methods including, for example: (1) grafting non-human complementarity determining regions (CDRs) onto a human framework and constant region (“humanizing”), with the optional transfer of one or more framework residues from the non-human antibody; (2) transplanting entire non-human variable domains, but “cloaking” them with a human-like surface by replacement of surface residues (“veneering”). In the present invention, humanized antibodies will include both “humanized” and “veneered” antibodies. {57, 58, 59, 60, 61, 62, 63}.
CDRs are amino acid sequences which together define the binding affinity and specificity of a Fv region of a native immunoglobulin binding site {e.g. refs. 64 & 65}.
The phrase “constant region” refers to the portion of the antibody molecule that confers effector functions. In chimeric antibodies, mouse constant regions are substituted by human constant regions. The constant regions of humanized antibodies are derived from human immunoglobulins. The heavy chain constant region can be selected from any of the 5 isotypes: alpha, delta, epsilon, gamma or mu.
One method of humanizing antibodies comprises aligning the heavy and light chain sequences of a non-human antibody to human heavy and light chain sequences, replacing the non-human framework residues with human framework residues based on such alignment, molecular modeling of the conformation of the humanized sequence in comparison to the conformation of the non-human parent antibody, and repeated back mutation of residues in the framework region which disturb the structure of the non-human CDRs until the predicted conformation of the CDRs in the humanized sequence model closely approximates the conformation of the non-human CDRs of the parent non-human antibody. Such humanized antibodies may be further derivatized to facilitate uptake and clearance e.g. via Ashwell receptors {refs. 66 & 67}.
Humanized or fully-human antibodies can also be produced using transgenic animals that are engineered to contain human immunoglobulin loci. For example, ref. 68 discloses transgenic animals having a human Ig locus wherein the animals do not produce functional endogenous immunoglobulins due to the inactivation of endogenous heavy and light chain loci. Ref. 69 also discloses transgenic non-primate mammalian hosts capable of mounting an immune response to an immunogen, wherein the antibodies have primate constant and/or variable regions, and wherein the endogenous immunoglobulin-encoding loci are substituted or inactivated. Ref. 70 discloses the use of the Cre/Lox system to modify the immunoglobulin locus in a mammal, such as to replace all or a portion of the constant or variable region to form a modified antibody molecule. Ref. 71 discloses non-human mammalian hosts having inactivated endogenous Ig loci and functional human Ig loci. Ref. 72 discloses methods of making transgenic mice in which the mice lack endogenous heavy chains, and express an exogenous immunoglobulin locus comprising one or more xenogeneic constant regions.
Using a transgenic animal described above, an immune response can be produced to a PCAV polypeptide, and antibody-producing cells can be removed from the animal and used to produce hybridomas that secrete human monoclonal antibodies. Immunization protocols, adjuvants, and the like are known in the art, and are used in immunization of, for example, a transgenic mouse as described in ref. 73. The monoclonal antibodies can be tested for the ability to inhibit or neutralize the biological activity or physiological effect of the corresponding polypeptide.
HML-2 transcripts are up-regulated in tumors. To detect such up-regulation, a reference point is needed i.e. a control. Analysis of the control sample gives a standard level of RNA and/or protein expression against which a patient sample can be compared.
A negative control gives a background or basal level of expression against which a patient sample can be compared. Higher levels of expression product relative to a negative control, such as a lifetime baseline or pooled normal samples, indicate that the patient from whom the sample was taken has a tumor. Conversely, equivalent levels of expression product indicate that the patient does not have a HML-2-related tumor.
A positive control gives a level of expression against which a patient sample can be compared. Equivalent or higher levels of expression product relative to a positive control indicate that the patient from whom the sample was taken has a tumor. Conversely, lower levels of expression product indicate that the patient does not have a HML-2 related tumor.
For direct or indirect RNA measurement, or for direct polypeptide measurement, a negative control will generally comprise cells which are not from a tumor (e.g. a breast tumor or a prostate tumor). For indirect polypeptide measurement, a negative control will generally be a blood sample from a patient who does not have a tumor. The negative control could be a sample from the same patient as the patient sample, but from a tissue in which HML-2 expression is not up-regulated e.g. a non-tumor non-prostate cell for a male, or a non-tumor non-breast cell for a female. The negative control could be a prostate or breast cell from the same patient as the patient sample, but taken at an earlier stage in the patient's life. The negative control could be a cell from a patient without a tumor. This cell may or may not be a prostate/breast cell. The negative control cell could be a prostate cell from a patient with BPH. The negative control could be normal semen, seminal fluid, colostrum, breast milk, etc.
For direct or indirect RNA measurement, or for direct polypeptide measurement, a positive control will generally comprise cells from the type of tumor in question. For indirect polypeptide measurement, a positive control will generally be a blood sample from a patient who has a prostate tumor or breast tumor. The positive control could be a prostate or breast tumor cell from the same patient as the patient sample, but taken at an earlier stage in the patient's life (e.g. to monitor remission). The positive control could be a cell from another patient with a prostate or breast tumor. The positive control could be a prostate cell line or a breast cell line.
Other suitable positive and negative controls will be apparent to the skilled person.
HML-2 expression in the control can be assessed at the same time as expression in the patient sample. Alternatively, HML-2 expression in the control can be assessed separately (earlier or later).
Rather than actually compare two samples, however, the control may be an absolute control i.e. a level of expression which has been empirically determined from samples taken from tumor patients (e.g. under standard conditions).
The up-regulation relative to the control (100%) will usually be at least 150% (e.g. 200%, 250%, 300%, 400%, 500%, 600% or more).
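The comparison against a negative control can be expressed as simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the expression levels are assumed to be positive numbers in any consistent unit (e.g. normalized transcript abundance), and the 150% default is the minimum figure stated above rather than a validated clinical cut-off.

```python
def relative_expression(sample_level, control_level):
    """Sample expression as a percentage of the control, with the control set to 100%."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return 100.0 * sample_level / control_level


def is_up_regulated(sample_level, control_level, threshold_pct=150.0):
    """True if the sample reaches the chosen percentage of the control level."""
    return relative_expression(sample_level, control_level) >= threshold_pct
```

For example, a sample level of 30 against a control level of 10 is 300% of control and counts as up-regulated at the 150% threshold, whereas a sample level of 12 (120% of control) does not.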
The invention provides a method for diagnosing cancer. It will be appreciated that “diagnosis” according to the invention can range from a definite clinical diagnosis of disease to an indication that the patient should undergo further testing which may lead to a definite diagnosis. For example, the method of the invention can be used as part of a screening process, with positive samples being subjected to further analysis.
Furthermore, diagnosis includes monitoring the progress of cancer in a patient already known to have the cancer. Cancer can also be staged by the methods of the invention.
The efficacy of a treatment regimen (therametrics) of a cancer can also be monitored by the method of the invention.
Susceptibility to cancer can also be detected e.g. where up-regulation of expression has occurred, but before cancer has developed. Prognostic methods are also encompassed.
Of the various types of cancer, the invention is particularly suited to prostate cancer (including prostatic intraepithelial neoplasia) and breast cancer (including mammary carcinoma).
All of these techniques fall within the general meaning of “diagnosis” in the present invention.
HIV Tat acts as a transcription factor and its RNA target is the TAR. SEQ IDs 14 and 49 are examples of 150 nucleotide RNAs comprising a putative HML-2 TAR. As for HIV, the minimal tat-binding motif in the TAR may be shorter than these two molecules.
The invention provides an isolated polynucleotide comprising: (a) the nucleotide sequence of SEQ ID 14 or 49; (b) a fragment of at least x nucleotides of (a); (c) a nucleotide sequence having at least s % identity to (a); or (d) the complement of (a), (b) or (c).
The isolated polynucleotide is preferably shorter than 250 nucleotides (e.g. shorter than 240, 230, 220, 210, 200, 190, 180, 170, 160, or 150 nucleotides).
The value of x is at least 7 (e.g. at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 60, 70, 75, 80, 90, 100 etc.). The value of x may be less than 2000 (e.g. less than 1000, 500, 100, or 50).
The value of s is preferably at least 50 (e.g. at least 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 99.5, 99.9 etc.).
The isolated polynucleotide can preferably bind to a protein comprising the amino acid sequence SEQ ID 7, 8 and/or 9 (putative tat analogs).
Inhibiting the Tat/TAR interaction has been used for HIV therapy, and inhibition of Tax function has been used for HTLV therapy. By analogy, inhibiting the equivalent functions in PCAV offers ways of treating cancer, and also of treating other diseases linked to HERV-K viruses (e.g. testicular cancer {194}, multiple sclerosis {74}, insulin-dependent diabetes mellitus (IDDM) {75} etc.).
Various methods have been proposed for inhibiting the Tat/TAR interaction e.g.:
The use of RNA decoys comprising multiple TARs to sequester Tat {76}
Antisense Tat {77}
Dominant negative Tat mutants {78}
Tat Ribozymes {79}
Anti-TAR hammerhead ribozymes {80}
The use of small molecule inhibitors of the Tat/TAR interaction {81, 82, 83}
Use of aptamers {84}
Use of inhibitory RNAs (siRNAs) for RNA interference {85}
Similar approaches have been used for inhibiting Tax function {86, 87, 88}, although a significant difference between Tax and Tat is that Tat binds nucleic acid directly. All of these methods can be applied to the putative PCAV tat/TAR interaction or Tax function.
The invention therefore provides the following, together with their use as pharmaceuticals and their use in the manufacture of a medicament for treating prostate cancer, testicular cancer, multiple sclerosis and/or insulin-dependent diabetes mellitus:
A polynucleotide encoding or comprising two or more copies of the putative HML-2 TAR;
A polynucleotide complementary to a putative tat-coding sequence;
A polypeptide which can bind to a functional putative tat and act in a transdominant way;
A ribozyme which can attack tat and/or tar sequences;
Small molecule inhibitors of the putative Tat/TAR interaction;
Antibodies or oligobodies {89,90} which specifically bind to putative tat;
Aptamer inhibitors of the putative Tat/TAR interaction; and
Small inhibitory RNAs {e.g. refs. 91 to 96} complementary to the putative TAR sequence.
In relation to transdominant inhibitors of putative tat function, the invention provides a protein as defined in section C.3 above, comprising: (a) an amino acid sequence selected from the group consisting of SEQ IDs 10, 11, 12 and 13; (b) a fragment of at least x amino acids of (a); or (c) a polypeptide sequence having at least s % identity to (a). Proteins having amino acid sequences SEQ IDs 10, 11, 12 and 13 have all been found to suppress the activity of putative tat, with SEQ ID 13 (cORF) being the strongest dominant negative.
The invention also provides methods of screening for compounds with activity against cancer, comprising: contacting a test compound with a putative Tat polynucleotide or polypeptide, or with a putative TAR polynucleotide; and detecting a binding interaction between the test compound and the polynucleotide/polypeptide. A binding interaction indicates potential anti-cancer efficacy of the test compound.
The invention also provides methods of screening for compounds with activity against prostate cancer, comprising: contacting a test compound with a putative Tat polypeptide of the invention; and assaying the function of the polypeptide. Inhibition of the polypeptide's function (e.g. loss of expression of a reporter gene driven by the PCAV LTR, as described in the examples herein) indicates potential anti-cancer efficacy of the test compound.
Typical test compounds include, but are not restricted to peptides (including cyclic peptides {82}), peptoids, proteins, lipids, metals, nucleotides, nucleosides, small organic molecules {97}, antibiotics, polyamines, and combinations and derivatives thereof. Small organic molecules have a molecular weight of more than 50 and less than about 2,500 daltons, and most preferably between about 300 and about 800 daltons. Complex mixtures of substances, such as extracts containing natural products, or the products of mixed combinatorial syntheses, can also be tested and the component that binds to the target RNA can be purified from the mixture in a subsequent step.
Test compounds may be derived from large libraries of synthetic or natural compounds {98}. For instance, synthetic compound libraries are commercially available from Maybridge Chemical Co. (Trevillet, Cornwall, UK) or Aldrich (Milwaukee, Wis.). Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts may be used. Additionally, test compounds may be synthetically produced using combinatorial chemistry either as individual compounds or as mixtures.
Agonists or antagonists of the polypeptides of the invention can be screened using any available method known in the art, such as signal transduction, antibody binding, receptor binding, mitogenic assays, chemotaxis assays, etc. The assay conditions ideally should resemble the conditions under which the native activity is exhibited in vivo, that is, under physiologic pH, temperature, and ionic strength. Suitable agonists or antagonists will exhibit strong inhibition or enhancement of the native activity at concentrations that do not cause toxic side effects in the subject. Agonists or antagonists that compete for binding to the native polypeptide can require concentrations equal to or greater than the native concentration, while inhibitors capable of binding irreversibly to the polypeptide can be added in concentrations on the order of the native concentration.
Such screening and experimentation can lead to identification of an agonist or antagonist of a HML-2 polypeptide. Such agonists and antagonists can be used to modulate, enhance, or inhibit HML-2 expression and/or function. {99}
The present invention relates to methods of using the polypeptides of the invention (e.g. recombinantly produced HML-2 polypeptides) to screen compounds for their ability to bind or otherwise modulate (e.g. inhibit) the activity of HML-2 polypeptides, and thus to identify compounds that can serve, for example, as agonists or antagonists of the HML-2 polypeptides. In one screening assay, the HML-2 polypeptide is incubated with cells susceptible to the growth stimulatory activity of HML-2, in the presence and absence of a test compound. The HML-2 activity altering or binding potential of the test compound is measured. Growth of the cells is then determined. A reduction in cell growth in the test sample indicates that the test compound binds to and thereby inactivates the HML-2 polypeptide, or otherwise inhibits the HML-2 polypeptide activity.
Transgenic animals (e.g. rodents) that have been transformed to over-express HML-2 genes can be used to screen compounds in vivo for the ability to inhibit development of tumors resulting from HML-2 over-expression or to treat such tumors once developed. Transgenic animals that have prostate tumors of increased invasive or malignant potential can be used to screen compounds, including antibodies or peptides, for their ability to inhibit the effect of HML-2 polypeptides. Such animals can be produced, for example, as described in the examples herein.
Screening procedures such as those described above are useful for identifying agents for their potential use in pharmacological intervention strategies in prostate cancer treatment. Additionally, polynucleotide sequences corresponding to HML-2, including LTRs, may be used to assay for inhibitors of elevated gene expression.
Potent inhibitors of HERV-K protease are already known {100}. Inhibition of HERV-K protease by HIV-1 protease inhibitors has also been reported {101}. These compounds can be studied for use in prostate cancer therapy, and are also useful lead compounds for drug design.
Transdominant negative mutants of cORF have also been reported {102,103}. Transdominant cORF mutants can be studied for use in prostate cancer therapy.
Antisense oligonucleotides complementary to HML-2 mRNA can be used to selectively diminish or ablate the expression of the polypeptide. More specifically, antisense constructs or antisense oligonucleotides can be used to inhibit the production of HML-2 polypeptide(s) in prostate tumor cells. Antisense mRNA can be produced by transfecting into target cancer cells an expression vector with a HML-2 polynucleotide of the invention oriented in an antisense direction relative to the direction of PCAV-mRNA transcription. Appropriate vectors include viral vectors, including retroviral vectors, as well as non-viral vectors. Alternatively, antisense oligonucleotides can be introduced directly into target cells to achieve the same goal. Oligonucleotides can be selected/designed to achieve the highest level of specificity and, for example, to bind to a PCAV-mRNA at the initiator ATG.
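The design of an antisense oligonucleotide spanning the initiator ATG can be sketched as taking the reverse complement of a window of the mRNA around that codon. The code below is a minimal illustration under stated assumptions: the first ATG in the supplied sequence is treated as the initiator (which need not hold for a real transcript), the window parameters `upstream` and `length` are arbitrary, the function names are hypothetical, and the example input is an invented sequence rather than any SEQ ID of the invention.

```python
# Complement table for DNA letters; U is also accepted so RNA input can be used.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence as DNA letters."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_at_start_codon(mrna, upstream=5, length=18):
    """Return a DNA oligonucleotide antisense to a window spanning the first ATG.

    The window starts `upstream` nucleotides before the ATG (or at the start of
    the sequence if fewer are available) and is at most `length` nucleotides long.
    """
    sense = mrna.upper().replace("U", "T")
    start = sense.find("ATG")
    if start < 0:
        raise ValueError("no ATG found in the sequence")
    begin = max(0, start - upstream)
    target = sense[begin:begin + length]
    return reverse_complement(target)
```

For the invented input `CCGGAATGAACCCATCGGAGATGC` with `upstream=2` and `length=10`, the targeted window is `GAATGAACCC` and the returned antisense oligonucleotide is `GGGTTCATTC`. Specificity against other transcripts would still need to be assessed separately.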
Monoclonal antibodies to HML-2 polypeptides can be used to block the action of the polypeptides and thereby control growth of cancer cells. This can be accomplished by infusion of antibodies that bind to HML-2 polypeptides and block their action.
The invention also provides high-throughput screening methods for identifying compounds that bind to a Tat and/or TAR. Preferably, all the biochemical steps for this assay are performed in a single solution in, for instance, a test tube or microtitre plate, and the test compounds are analyzed initially at a single compound concentration. For the purposes of high-throughput screening, the experimental conditions are adjusted to achieve a proportion of test compounds identified as “positive” compounds from amongst the total compounds screened. The assay is preferably set to identify compounds with an appreciable affinity towards the target e.g. when 0.1% to 1% of the total test compounds from a large compound library are shown to bind to a given target with a Ki of 10 μM or less (e.g. 1 μM, 100 nM, 10 nM, or less).
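The proportion of "positive" compounds can be computed directly from measured Ki values. The following sketch is illustrative only; the function names and the dictionary-based input format are assumptions, Ki values are assumed to be expressed in molar units, and the default cut-off and window simply restate the figures given above (Ki of 10 μM or less, 0.1% to 1% of compounds).

```python
def screen_hits(ki_by_compound, ki_cutoff_molar=10e-6):
    """Return (hit names, hit rate in percent) for compounds at or below the Ki cut-off."""
    if not ki_by_compound:
        raise ValueError("no compounds screened")
    hits = [name for name, ki in ki_by_compound.items() if ki <= ki_cutoff_molar]
    return hits, 100.0 * len(hits) / len(ki_by_compound)


def hit_rate_in_window(hit_rate_pct, low_pct=0.1, high_pct=1.0):
    """True if the hit rate falls within the stated 0.1% to 1% window."""
    return low_pct <= hit_rate_pct <= high_pct
```

If the hit rate falls outside the window, the experimental conditions or the Ki cut-off could be adjusted and the library re-evaluated.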
The invention also provides structure-based drug design techniques which can be applied to structural representations of the putative Tat and/or putative TAR in order to identify compounds that can block their putative interaction. A variety of suitable techniques {e.g. ref. 104} are available to the skilled person.
Software packages for implementing molecular modelling techniques for use in structure-based drug design include SYBYL {105}, AMBER {106}, CERIUS2 {107}, INSIGHT II {107}, CATALYST {107}, QUANTA {107}, HYPERCHEM {108}, CHEMSITE {109}, etc. This software can be used to determine binding surfaces of the putative Tat and/or putative TAR in order to reveal features such as van der Waals contacts, electrostatic interactions, and/or hydrogen bonding opportunities.
The invention also provides in silico screening methods for identifying compounds that bind to putative Tat and/or TAR. Structural representations of potential ligands are saved in a computer readable format, such as SD or MDL formats. A 3D structure of the ligands is preferably generated from the 2D representation using a program such as CORINA, CONCORDE or InsightII. Once a ligand has been identified which interacts in silico with a receptor, this may be provided (synthesised, purified or purchased, for instance) and the interaction can be verified experimentally. The invention provides a ligand identified using the methods of the invention.
Structure-based in silico screening has been used to identify inhibitors of the Tat/TAR interaction of HIV {110}.
Efficacy of these various methods can be tested by monitoring expression of polynucleotides and/or polypeptides of the invention after administration of the composition of the invention. All of the methods previously successfully used in tat-based HIV immunization can be used.
Tat protein has been used as a vaccine antigen for HIV therapy, and Tax protein has been used as a vaccine antigen for HTLV therapy. Polypeptide vaccines {111,112,113,114,115} and DNA vaccines {116,117} have both been proposed. By analogy, the polypeptides of the invention can be used for immunizing against prostate or breast cancer, and also for treating other diseases linked to HERV-K viruses (e.g. testicular cancer, multiple sclerosis, IDDM etc.).
The invention therefore provides a composition comprising (a) a polypeptide as defined in section C.3 above and (b) a pharmaceutically acceptable carrier. The invention also provides a composition comprising (a) a polynucleotide encoding a polypeptide as defined above and (b) a pharmaceutically acceptable carrier.
The composition may additionally comprise an adjuvant. For example, the composition may comprise one or more of the following adjuvants: (1) oil-in-water emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides (see below) or bacterial cell wall components), such as for example (a) MF59™ {118; Chapter 10 in ref. 119}, containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing MTP-PE) formulated into submicron particles using a microfluidizer, (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, Mont.) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL+CWS (Detox™); (2) saponin adjuvants, such as QS21 or Stimulon™ (Cambridge Bioscience, Worcester, Mass.), or particles generated therefrom such as ISCOMs (immunostimulating complexes), which ISCOMs may be devoid of additional detergent {120}; (3) Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (4) cytokines, such as interleukins (e.g. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12 etc.), interferons (e.g. gamma interferon), macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc.; (5) monophosphoryl lipid A (MPL) or 3-O-deacylated MPL (3dMPL) {e.g. 121, 122}; (6) combinations of 3dMPL with, for example, QS21 and/or oil-in-water emulsions {e.g. 123, 124, 125}; (7) oligonucleotides comprising CpG motifs i.e. containing at least one CG dinucleotide, with 5-methylcytosine optionally being used in place of cytosine; (8) a polyoxyethylene ether or a polyoxyethylene ester {126}; (9) a polyoxyethylene sorbitan ester surfactant in combination with an octoxynol {127} or a polyoxyethylene alkyl ether or ester surfactant in combination with at least one additional non-ionic surfactant such as an octoxynol {128}; (10) an immunostimulatory oligonucleotide (e.g. a CpG oligonucleotide) and a saponin {129}; (11) an immunostimulant and a particle of metal salt {130}; (12) a saponin and an oil-in-water emulsion {131}; (13) a saponin (e.g. QS21) + 3dMPL + IL-12 (optionally + a sterol) {132}; (14) aluminium salts, preferably hydroxide or phosphate, but any other suitable salt may also be used (e.g. hydroxyphosphate, oxyhydroxide, orthophosphate, sulphate etc. {chapters 8 & 9 of ref. 119}). Mixtures of different aluminium salts may also be used. The salt may take any suitable form (e.g. gel, crystalline, amorphous etc.); (15) chitosan; (16) cholera toxin or E. coli heat labile toxin, or detoxified mutants thereof {133}; (17) microparticles of poly(α-hydroxy)acids, such as PLG; (18) other substances that act as immunostimulating agents to enhance the efficacy of the composition. Aluminium salts and/or MF59™ are preferred.
The composition is preferably sterile and/or pyrogen-free. It will typically be buffered around pH 7.
The composition is preferably an immunogenic composition and is more preferably a vaccine composition. The composition can be used to raise antibodies in a mammal (e.g. a human).
Vaccines of the invention may be prophylactic (i.e. to prevent disease) or therapeutic (i.e. to reduce or eliminate the symptoms of a disease).
Efficacy can be tested by monitoring expression of polynucleotides and/or polypeptides of the invention after administration of the composition of the invention. All of the methods previously used in tat-based HIV immunization can be used.
The invention provides a pharmaceutical composition comprising polynucleotide, polypeptide, or antibody as defined above. The invention also provides their use as medicaments, and their use in the manufacture of medicaments for treating cancer. The invention also provides a method for raising an immune response, comprising administering an immunogenic dose of polynucleotide or polypeptide of the invention to an animal.
Pharmaceutical compositions encompassed by the present invention include, as active agent, the polynucleotides, polypeptides, or antibodies of the invention disclosed herein in a therapeutically effective amount. An “effective amount” is an amount sufficient to effect beneficial or desired results, including clinical results. An effective amount can be administered in one or more administrations. For purposes of this invention, an effective amount is an amount that is sufficient to palliate, ameliorate, stabilize, reverse, slow or delay the symptoms and/or progression of cancer.
The compositions can be used to treat cancer as well as metastases of primary cancer. In addition, the pharmaceutical compositions can be used in conjunction with conventional methods of cancer treatment, e.g. to sensitize tumors to radiation or conventional chemotherapy. The terms “treatment”, “treating”, “treat” and the like are used herein to generally refer to obtaining a desired pharmacologic and/or physiologic effect. The effect may be prophylactic in terms of completely or partially preventing a disease or symptom thereof and/or may be therapeutic in terms of a partial or complete stabilization or cure for a disease and/or adverse effect attributable to the disease. “Treatment” as used herein covers any treatment of a disease in a mammal, particularly a human, and includes: (a) preventing the disease or symptom from occurring in a subject which may be predisposed to the disease or symptom but has not yet been diagnosed as having it; (b) inhibiting the disease symptom, i.e. arresting its development; or (c) relieving the disease symptom, i.e. causing regression of the disease or symptom.
Where the pharmaceutical composition comprises an antibody that specifically binds to a gene product encoded by a differentially expressed polynucleotide, the antibody can be coupled to a drug for delivery to a treatment site or coupled to a detectable label to facilitate imaging of a site comprising cancer cells, such as prostate cancer cells. Methods for coupling antibodies to drugs and detectable labels are well known in the art, as are methods for imaging using detectable labels.
The term “therapeutically effective amount” as used herein refers to an amount of a therapeutic agent to treat, ameliorate, or prevent a desired disease or condition, or to exhibit a detectable therapeutic or preventative effect. The effect can be detected by, for example, chemical markers or antigen levels. Therapeutic effects also include reduction in physical symptoms. The precise effective amount for a subject will depend upon the subject's size and health, the nature and extent of the condition, and the therapeutics or combination of therapeutics selected for administration. The effective amount for a given situation is determined by routine experimentation and is within the judgment of the clinician. For purposes of the present invention, an effective dose will generally be from about 0.01 mg/kg to about 5 mg/kg, or about 0.01 mg/kg to about 50 mg/kg, or about 0.05 mg/kg to about 10 mg/kg of the compositions of the present invention in the individual to which it is administered.
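By way of illustration only, the per-kilogram ranges recited above can be converted into absolute amounts for a subject of known body mass. The following is a minimal sketch in Python; the function name, the 70 kg example subject and the choice of language are assumptions made for this illustration and form no part of the disclosure. The dose actually used remains within the judgment of the clinician.

```python
def dose_range_mg(weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return (low, high) absolute dose in mg for a subject of weight_kg."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound exceeds high bound")
    return (weight_kg * low_mg_per_kg, weight_kg * high_mg_per_kg)


# The three ranges recited in the text, in mg/kg.
RANGES_MG_PER_KG = [(0.01, 5.0), (0.01, 50.0), (0.05, 10.0)]

if __name__ == "__main__":
    for low, high in RANGES_MG_PER_KG:
        # 70 kg is a hypothetical subject mass chosen only for the example.
        lo_mg, hi_mg = dose_range_mg(70.0, low, high)
        print(f"{low}-{high} mg/kg -> {lo_mg:.2f}-{hi_mg:.2f} mg")
```

Scaling linearly by body mass is the simplest reading of a mg/kg figure; the sketch takes no account of the other factors named above (health, nature and extent of the condition, combination therapy).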
A pharmaceutical composition can also contain a pharmaceutically acceptable carrier. The term “pharmaceutically acceptable carrier” refers to a carrier for administration of a therapeutic agent, such as antibodies or a polypeptide, genes, and other therapeutic agents. The term refers to any pharmaceutical carrier that does not itself induce the production of antibodies harmful to the individual receiving the composition, and which can be administered without undue toxicity. Suitable carriers can be large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, and inactive virus particles. Such carriers are well known to those of ordinary skill in the art. Pharmaceutically acceptable carriers in therapeutic compositions can include liquids such as water, saline, glycerol and ethanol. Auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, can also be present in such vehicles. Typically, the therapeutic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection can also be prepared. Liposomes are included within the definition of a pharmaceutically acceptable carrier. Pharmaceutically acceptable salts can also be present in the pharmaceutical composition, e.g. mineral acid salts such as hydrochlorides, hydrobromides, phosphates, sulfates, and the like; and the salts of organic acids such as acetates, propionates, malonates, benzoates, and the like. A thorough discussion of pharmaceutically acceptable excipients is available in Remington: The Science and Practice of Pharmacy (1995) Alfonso Gennaro, Lippincott, Williams, & Wilkins, or reference 134.
Once formulated, the compositions contemplated by the invention can be (1) administered directly to the subject (e.g. as polynucleotides, polypeptides, small molecule agonists or antagonists, and the like); or (2) delivered ex vivo, to cells derived from the subject (e.g. as in ex vivo gene therapy). Direct delivery of the compositions will generally be accomplished by parenteral injection, e.g. subcutaneously, intraperitoneally, intravenously, intramuscularly, intratumorally or to the interstitial space of a tissue. Other modes of administration include oral and pulmonary administration, suppositories, and transdermal applications, needles, and gene guns or hyposprays. Dosage treatment can be a single dose schedule or a multiple dose schedule.
Methods for the ex vivo delivery and reimplantation of transformed cells into a subject are known in the art {e.g. ref. 135}. Examples of cells useful in ex vivo applications include, for example, stem cells, particularly hematopoietic, lymph cells, macrophages, dendritic cells, or tumor cells. Generally, delivery of nucleic acids for both ex vivo and in vitro applications can be accomplished by, for example, dextran-mediated transfection, calcium phosphate precipitation, polybrene-mediated transfection, protoplast fusion, electroporation, encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA into nuclei, all well known in the art.
Differential expression of PCAV polynucleotides has been found to correlate with tumors. The tumor can be amenable to treatment by administration of a therapeutic agent based on the provided polynucleotide, corresponding polypeptide or other corresponding molecule (e.g. antisense, ribozyme, etc.). In other embodiments, the disorder can be amenable to treatment by administration of a small molecule drug that, for example, serves as an inhibitor (antagonist) of the function of the encoded gene product of a gene having increased expression in cancerous cells relative to normal cells or as an agonist for gene products that are decreased in expression in cancerous cells (e.g. to promote the activity of gene products that act as tumor suppressors).
The dose and the means of administration of the inventive pharmaceutical compositions are determined based on the specific qualities of the therapeutic composition, the condition, age, and weight of the patient, the progression of the disease, and other relevant factors. For example, administration of polynucleotide therapeutic composition agents includes local or systemic administration, including injection, oral administration, particle gun or catheterized administration, and topical administration. Preferably, the therapeutic polynucleotide composition contains an expression construct comprising a promoter operably linked to a polynucleotide of the invention. Various methods can be used to administer the therapeutic composition directly to a specific site in the body. For example, a small metastatic lesion is located and the therapeutic composition injected several times in several different locations within the body of the tumor. Alternatively, arteries which serve a tumor are identified, and the therapeutic composition injected into such an artery, in order to deliver the composition directly into the tumor. A tumor that has a necrotic center is aspirated and the composition injected directly into the now empty center of the tumor. An antisense composition is directly administered to the surface of the tumor, for example, by topical application of the composition. X-ray imaging is used to assist in certain of the above delivery methods.
Targeted delivery of therapeutic compositions containing an antisense polynucleotide, subgenomic polynucleotides, or antibodies to specific tissues can also be used. Receptor-mediated DNA delivery techniques are described in, for example, references 136 to 141. Therapeutic compositions containing a polynucleotide are administered in a range of about 100 ng to about 200 mg of DNA for local administration in a gene therapy protocol. Concentration ranges of about 500 ng to about 50 mg, about 1 μg to about 2 mg, about 5 μg to about 500 μg, and about 20 μg to about 100 μg of DNA can also be used during a gene therapy protocol. Factors such as method of action (e.g. for enhancing or inhibiting levels of the encoded gene product) and efficacy of transformation and expression are considerations which will affect the dosage required for ultimate efficacy of the antisense subgenomic polynucleotides. Where greater expression is desired over a larger area of tissue, larger amounts of antisense subgenomic polynucleotides or the same amounts re-administered in a successive protocol of administrations, or several administrations to different adjacent or close tissue portions of, e.g., a tumor site, may be required to effect a positive therapeutic outcome. In all cases, routine experimentation in clinical trials will determine specific ranges for optimal therapeutic effect.
The therapeutic polynucleotides and polypeptides of the present invention can be delivered using gene delivery vehicles. The gene delivery vehicle can be of viral or non-viral origin (see generally references 142, 143, 144 and 145). Expression of such coding sequences can be induced using endogenous mammalian or heterologous promoters. Expression of the coding sequence can be either constitutive or regulated.
Viral-based vectors for delivery of a desired polynucleotide and expression in a desired cell are well known in the art. Exemplary viral-based vehicles include, but are not limited to, recombinant retroviruses (e.g. references 146 to 156), alphavirus-based vectors (e.g. Sindbis virus vectors, Semliki forest virus (ATCC VR-67; ATCC VR-1247), Ross River virus (ATCC VR-373; ATCC VR-1246) and Venezuelan equine encephalitis virus (ATCC VR-923; ATCC VR-1250; ATCC VR 1249; ATCC VR-532)), adenovirus vectors and adeno-associated virus (AAV) vectors (e.g. see refs. 157 to 162). Administration of DNA linked to killed adenovirus {163} can also be employed.
Non-viral delivery vehicles and methods can also be employed, including, but not limited to, polycationic condensed DNA linked or unlinked to killed adenovirus alone {e.g. 163}, ligand-linked DNA {164}, eukaryotic cell delivery vehicle cells {e.g. refs. 165 to 169} and nucleic charge neutralization or fusion with cell membranes. Naked DNA can also be employed. Exemplary naked DNA introduction methods are described in refs. 170 and 171. Liposomes that can act as gene delivery vehicles are described in refs. 172 to 176. Additional approaches are described in refs. 177 & 178.
Further non-viral delivery suitable for use includes mechanical delivery systems such as the approach described in ref. 178. Moreover, the coding sequence and the product of expression of such can be delivered through deposition of photopolymerized hydrogel materials or use of ionizing radiation {e.g. refs. 179 & 180}. Other conventional methods for gene delivery that can be used for delivery of the coding sequence include, for example, use of hand-held gene transfer particle gun or use of ionizing radiation for activating transferred gene {179 & 182}.
Genomes of all eukaryotes contain multiple copies of sequences related to infectious retroviruses. These endogenous retroviruses have been well studied in mice where both true infectious forms and thousands of defective retrovirus-like elements (e.g. the IAP and Etn sequence families) exist. Some members of the IAP and Etn families are “active” retrotransposons since insertions of these elements have been documented which cause germ line mutations or oncogenic transformation.
Endogenous retroviruses were identified in human genomic DNA by their homology to retroviruses of other vertebrates {183, 184}. It is believed that the human genome probably contains numerous copies of endogenous proviral DNAs, but little is known about their function. Most HERV families have relatively few members (1-50) but one family (HERV-H) consists of ~1000 copies per haploid genome distributed on all chromosomes. The large numbers and general transcriptional activity of HERVs in embryonic and tumor cell lines suggest that they could act as disease-causing insertional mutagens or affect adjacent gene expression in a neutral or beneficial way.
The K family of human endogenous retroviruses (HERV-K) is well known {185}. It is related to the mouse mammary tumor virus (MMTV) and is present in the genomes of humans, apes and old world monkeys, but several human HERV-K proviruses are unique to humans {186}. The HERV-K family is present at 30-50 full-length copies per haploid human genome and possesses long open reading frames that potentially are translated into viral proteins {187, 188}. Two types of proviral genomes are known, which differ by the presence (type 2) or absence (type 1) of a stretch of 292 nucleotides in the overlapping boundary of the pol and env genes {189}. Some members of the HERV-K family are known to code for the gag protein and retroviral particles, which are both detectable in germ cell tumors and derived cell lines {190}. Analysis of the RNA expression pattern of full-length HERV-K has also identified a doubly-spliced RNA that encodes a 105 amino acid protein termed central ORF (‘cORF’) which is a sequence-specific nuclear RNA export factor that is functionally equivalent to the Rev protein of HIV {191}. HERV-K10 has been shown to encode a full-length gag homologous 73 kDa protein and a functional protease {192}.
Patients suffering from germ cell tumors show high antibody titers against HERV-K gag and env proteins at the time of tumor detection {193}. In normal testis and testicular tumors the HERV-K transmembrane envelope protein has been detected both in germ cells and tumor cells, but not in the surrounding tissue. In the case of testicular tumor, correlations between the expression of the env-specific mRNA, the presence of the transmembrane env, cORF and gag proteins and antibodies against HERV-K specific peptides in the serum of the patients, have been reported. Reference 194 reports that HERV-K10 gag and/or env proteins are synthesized in seminoma cells and that patients with those tumors exhibit relatively high antibody titers against gag and/or env.
Gag proteins released in the form of particles from HERV-K have been identified in the cell culture supernatant of the teratocarcinoma-derived cell line Tera 1. These retrovirus-like particles (termed “human teratocarcinoma derived virus” or HTDV) have been shown to have a 90% sequence homology to the HERV-K10 genome {190, 195}.
While the HERV-K family is present in the genome of every human cell, high level expression of mRNAs, proteins and particles is observed only in human teratocarcinoma cell lines {196}. In other tissues and cell lines, only a basal level of expression of mRNA has been demonstrated even using very sensitive methods {197}. The expression of retroviral proviruses is generally regulated by elements of the 5′ long terminal repeat (LTR). The activity of HERV-K LTRs is known to be up-regulated by transcriptional factors. Furthermore, the activation of expression of an endogenous retrovirus may trigger the expression of a downstream gene that triggers a neoplastic effect.
The sequence of HERV-K(II), which locates to chromosome 3, has been disclosed {198}.
HML-2 is a subgroup of the HERV-K family {199}. HERV isolates which are members of the HML-2 subgroup include HERV-K10 {189,194}, the 27 HML-2 viruses shown in
Because HML-2 is a well-recognized family, the skilled person will be able to determine without difficulty whether any particular endogenous retrovirus is or is not an HML-2. Preferred members of the HML-2 family for use in accordance with the present invention are those whose proviral genome has an LTR which has at least 75% sequence identity to SEQ ID 44 (the LTR sequence from HML-2.HOM {7}). Example LTRs include SEQ IDs 45-48.
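The 75% identity criterion above can be illustrated with a short calculation. The text does not specify an alignment algorithm or an identity convention, so the sketch below assumes that the two sequences have already been aligned to equal length (gaps written as '-'), and that identity is the number of matching columns divided by the number of aligned columns, with a gap opposite a residue counted as a mismatch. The function names are hypothetical, and the sequences in the usage note are toy strings, not SEQ ID 44 or any real LTR.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned, equal-length sequences.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a gap opposite a residue counts as a mismatch. This is one
    common convention, assumed here for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def meets_ltr_threshold(aligned_candidate, aligned_reference, threshold=75.0):
    """True if the candidate is at least `threshold` percent identical."""
    return percent_identity(aligned_candidate, aligned_reference) >= threshold
```

For example, `percent_identity("ACGTACGT", "ACGTTCGT")` compares eight columns with seven matches, and `meets_ltr_threshold` applies the "at least 75%" test to the result. In practice the alignment itself would be produced by a dedicated alignment tool, and different tools use different identity denominators.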
In some embodiments, the invention may not encompass polypeptides having one of amino acid sequences SEQ IDs 69 to 76, or polypeptides comprising SEQ IDs 69 to 76 {204}.
In some embodiments, the invention may not encompass: (i) nucleic acid comprising a nucleotide sequence disclosed in reference 1; (ii) nucleic acid comprising a nucleotide sequence within SEQ IDs 1 to 225 in reference 1; (iii) a known nucleic acid; (iv) a polypeptide comprising an amino acid sequence disclosed in reference 1; (v) a polypeptide comprising an amino acid sequence within SEQ IDs 1 to 225 in reference 1; (vi) a known polypeptide; (vii) a nucleic acid or polypeptide known as of 7 Dec. 2001 (e.g. whose sequence is available in a public database such as GenBank or GeneSeq before 7 Dec. 2001); or (viii) a polypeptide or nucleic acid known as of 10 Jun. 2002 (e.g. whose sequence is available in a public database such as GenBank or GeneSeq before 10 Jun. 2002).
The term “comprising” means “including” as well as “consisting” e.g. a composition “comprising” X may consist exclusively of X or may include something additional e.g. X + Y.
The term “about” in relation to a numerical value x means, for example, x±10%.
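As a small worked example of this definition, the sketch below computes the interval implied by “about x” when read as x±10%. The function names and the default tolerance parameter are assumptions introduced for illustration; the definition above gives ±10% only as an example.

```python
def about_bounds(x, tolerance=0.10):
    """Return (lower, upper) for 'about x' read as x +/- tolerance * |x|."""
    delta = abs(x) * tolerance
    return (x - delta, x + delta)


def is_about(value, x, tolerance=0.10):
    """True if value lies within the 'about x' interval (bounds included)."""
    lower, upper = about_bounds(x, tolerance)
    return lower <= value <= upper
```

Under this reading, “about 5 mg/kg” spans 4.5 to 5.5 mg/kg.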
The terms “neoplastic cells”, “neoplasia”, “tumor”, “tumor cells”, “cancer” and “cancer cells” (used interchangeably) refer to cells which exhibit relatively autonomous growth, so that they exhibit an aberrant growth phenotype characterized by a significant loss of control of cell proliferation (i.e. de-regulated cell division). Neoplastic cells can be malignant or benign and include tissue derived from prostate or breast cancer.
The word “substantially” does not exclude “completely” e.g. a composition which is “substantially free” from Y may be completely free from Y. Where necessary, the word “substantially” may be omitted from the definition of the invention.
Certain aspects of the present invention are described in greater detail in the non-limiting examples that follow. The examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all and only experiments performed. Efforts have been made to ensure accuracy with respect to numbers used (e.g. amounts, temperature, etc.) but some experimental errors and deviations should be accounted for. Unless indicated otherwise, parts are parts by weight, molecular weight is weight average molecular weight, temperature is in degrees Celsius, and pressure is at or near atmospheric.
Reference 1 describes the association of prostate cancer with the up-regulation of expression of the HML-2 subgroup of the HERV-K endogenous retroviruses.
Northern blotting of prostate cancer cell lines indicates that they express PCAV transcripts of several sizes, corresponding to both full-length viral genomic sequences and to sub-genomic spliced transcripts (
DNA fragments corresponding to the transcripts of env and other ORFs could be detected in these experiments only when reverse-transcriptase was included in the RT-PCR reactions (lane 1, 3, 5, 7 and 9 in
To determine the precise splicing patterns, the RT-PCR products obtained from cell lines and patient tissues were cloned and sequenced. In addition to env spliced mRNA, many other splice variants were seen (named Splice A to J in
Exon 1 comprises sequences from the transcription start site in the LTR to Splice Site I, as indicated schematically in
Exon 1.5 is very small and was only detected in the Splice D mRNA (
Exon 2 is very heterogeneous, containing two different 3′ splice junctions at the 5′ end of the exon (Splice Sites IV and V—see
The size of Exon 2 in each splice variant depends on which splice sites are used in each independent splice event.
Exon 3 sequences begin about 90 nucleotides before the second LTR (at position 8817 of the prototype sequence Y17832, see
This pattern of splicing, and the potential to encode multiple products depending on which splice sites are utilized, resembles in general the splicing pattern of HIV and HTLV. This suggests that PCAV belongs to the lentivirus type of retroviruses. All possible splice variants are identified as SEQ IDs 18 to 43 (consensus sequences) and are described in Table 1.
A defining characteristic of lentiviruses is that they encode a polypeptide that can activate transcription from the viral LTR promoter. HIV's tat polypeptide is the best understood example of these activators. The tat gene physically overlaps the rev and env genes in HIV and is made through alternative splicing of HIV mRNA spanning the env region. Tat polypeptide binds to the 5′ end of HIV mRNA at a specific site called TAR and provides HIV-specific activation.
Full-length HERV-K mRNAs can be spliced twice—once to remove gag-prt-pol and once to remove the bulk of the env gene (
Multiple alternative splice sites in PCAV-mRNAs have been identified (
A functional expression assay was designed to determine if the third reading frame in the final env exon encodes a polypeptide with the ability to activate transcription of PCAV-mRNA. The first component of the assay is an adenovirus vector with a PCAV LTR (SEQ ID 45) driving GFP expression (
GFP expression from this LTR was minimal in ovarian, breast, colon and liver cancer cells. It was also minimal in 293 cells, an immortalized kidney cell line, and in primary prostate epithelium cells. GFP was easily detected in various prostate cancer cell lines (PC3, LNCaP, MDA PCa 2b, DU145). Representative data are shown in
As GFP expression from the LTR appeared to be silent in primary prostate cells and active in prostate cancer, polypeptides from the env region were tested for their ability to activate expression in primary prostate cells. The coding sequences shown in
Vectors encoding cORF or the five PCAP products (
The interactions of PCAP4 and the non-activating PCAP products were tested by infecting cells with the GFP vector, the PCAP4 vector, and an excess of the vector encoding the non-activating product. PCAP 1, 2 & 3 and cORF could all suppress the activity of PCAP4, with cORF being the strongest dominant negative.
These data suggested that PCAV-mRNAs encode a tat homolog which contains an RNA-binding domain (NLS), a polypeptide dimerization region and the third reading frame. The nucleotide sequences that make up this polypeptide product have been known since 1986, but their functional connection via alternative splicing has not previously been reported.
The RNA ligand of tat polypeptide in HIV is the TAR. Potential TAR sites in the LTR of PCAV-mRNAs have been investigated (
However, other work suggests that the 5′ end of PCAV-mRNA is further downstream.
This result was confirmed using RNase protection assays (
These two experiments suggest that the deletions used to generate the earlier data may have resulted in deletion of promoter sequences as well as transcribed sequences.
To resolve the discrepancy, stem and loop sequences of the predicted TAR structure (
These data therefore indicate that the stem and loop regions are not involved in HERV-K LTR-driven expression, suggesting that PCAV is not controlled using a lentiviral-like tat/TAR system. Another mechanism used by complex retroviruses to activate infected cells for viral expression is the tax type, employed by HTLV I and II. Tax acts at multiple levels in infected T-cells {202}. It up-regulates HTLV transcription by binding to several transcription factors and coactivators, and deregulates the cell cycle by binding to inhibitors of CDK4/6. This combination leads to aberrant differentiation of infected cells in which the virus is activated, and is thought to be instrumental in eventually inducing adult T-cell leukemia in infected individuals. One of the hallmarks of tax-type activation is that multiple promoters respond to tax, as opposed to the high specificity of tat for the HIV TAR.
PCAP4 activates HERV-K LTR (LTR60), but not murine leukemia virus (MLV) LTR (
In a separate experiment, high passage PrECs (approaching senescence) were co-infected with an adenovirus vector expressing GFP from an old-type HERV-K LTR (‘MDALTR’: SEQ ID 77), and a second vector expressing PCAP3 or PCAP4 at moi of about 20. After 3 days, the fluorescent intensity was measured by FACS and activation by PCAP3 and PCAP4 was seen (
The PCAP proteins of the invention therefore seem more akin to tax than to tat, although the precise mechanism of their action is not important to the basic practice of the invention.
Within the final exon in the env region of PCAV, reading frames 1 and 2 encode env and cORF, respectively (
The majority of the PCAP2 coding sequence is thus located after the splice, within the exon which contains the 3′ LTR. Although the +2 reading frame has no known function in HERV-K, cDNA prepared from prostate tumors included PCAP2-encoding transcripts.
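The relationship between the three reading frames discussed above can be made concrete with a short sketch that translates one nucleotide sequence at offsets 0, 1 and 2. Here offset 0 stands for the reference frame (env in the text), offset 1 for the +1 frame (cORF) and offset 2 for the +2 frame (the third reading frame). The function names are hypothetical, the standard genetic code is assumed, and the sequence in the usage note is a toy string rather than any HERV-K or PCAV sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate dna starting at `offset`; '*' marks stop, 'X' unknown codons.

    Translation runs to the end of the sequence and does not halt at stops,
    so that the content of the whole frame can be inspected.
    """
    dna = dna.upper().replace("U", "T")
    protein = []
    for pos in range(offset, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[pos:pos + 3], "X"))
    return "".join(protein)


def three_frames(dna):
    """Return {offset: translation} for the three forward reading frames."""
    return {offset: translate(dna, offset) for offset in range(3)}
```

For the toy input `"ATGGCCTAA"`, offset 0 reads ATG GCC TAA, offset 1 reads TGG CCT and offset 2 reads GGC CTA, so the same nucleotides yield three unrelated peptides. This is why a splice that shifts the frame by one or two bases relative to env produces a different polypeptide from the same exon.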
Inspection of various aligned HERV-K genomes suggests that PCAP2 is a mutated form of an original protein. The protein is thus unlikely to be functioning in its original capacity, but oncogenic activity could arise through retention of a functional domain. Retention of activity by fragments is another property which matches tax rather than tat.
To study the subcellular localization of PCAP2, in order to better understand its role, an adenovirus expressing PCAP2 with a C-terminal V5 tag (SEQ ID 60) was used to infect primary prostate epithelial cells. The protein was not highly expressed, but was visible in the nucleoli using anti-V5 and, more diffusely, throughout the whole cell (
These results are consistent with the presence of NLS motifs in PCAP2.
RWPE1 cells were created by immortalizing normal prostate epithelial cells with human papillomavirus 18 {203}. The cells are non-tumorigenic in nude mice and possess markers and growth characteristics of normal prostate epithelial cells.
A plasmid expressing PCAP2 from an EF1A cassette was co-transfected into RWPE1 with a puromycin selection marker. Individual resistant colonies were expanded, total RNA was prepared and positive clones were picked based on RT-PCR analysis. To assess growth characteristics, parental cells, DU145 prostate cancer cells, or selected clones were plated into matrigel plus complete keratinocyte serum-free media (complete KSFM is media with bovine pituitary extract and EGF supplements). The plated cells are shown in
Normal prostate epithelial cells and RWPE1 cells migrated toward each other upon plating in matrigel, and over a week these aggregates formed hollow structures reminiscent of a gland. In contrast, DU145 cancer cells seeded solid-cored colonies without apparent migration or differentiation. In the cell lines tested, both GFP lines resembled the parent RWPE1, indicating that the introduction of the vector, the selection process and the culture conditions did not change the cells. The cells expressing PCAP1 also behaved similarly to RWPE1. A clone expressing cORF initially aggregated like RWPE1, but then the structure dissolved and the cells took on more of a colony morphology. Three independent PCAP2 colonies failed to aggregate but instead seeded colonies like DU145 cancer cells. These data suggest that PCAP2 interferes with normal prostate cell growth and differentiation.
Using the same cell lines, the effect of PCAP2 on anchorage-independent growth of RWPE1 was tested. RWPE1 cells do not grow in 0.35% soft agar, but they do grow at lower agar concentrations (e.g. 0.3%). 1,000 cells of each type were plated in complete KSFM plus soft agar (0.35%). As shown in
PCAP2 expression has been found to be associated with various tumor tissues and transformed cell lines, but not with normal non-transformed cells {204}. In particular, expression has been seen in mammary carcinoma cell lines and patient tissues.
RNA extracted from tissues or cell lines as described in reference 204 has been analyzed by RT-PCR on a panel of established cell lines, tumor biopsies, lymphocytes from leukemic and normal individuals, and normal non-transformed cells.
The RT-PCR results in
SEQ IDs 12 & 36 are PCAP3, which shares the same 5′ region and start codon as env, but in which a splicing event removes env-coding sequences and shifts to a reading frame +2 relative to that of env:
PCAP3 is thus similar to PCAP2, but the shift into +2 reading frame for PCAP3 is caused by small deletions in a type 2 genome rather than the large deletion seen in type 1 genomes for PCAP2.
cDNA prepared from prostate cancer cell line MDA PCa 2b included PCAP3 transcripts, as did prostate cancer mRNA, with up-regulation of e.g. more than 2-fold in 79% of patient samples and more than 5-fold in 53%. These figures support the view that PCAP3 is involved in many prostate cancers. Furthermore, the figures do not reflect the whole relationship between cancer and PCAP3 expression: if patients are grouped according to Gleason grades, grade 3 tumors show high up-regulation of PCAP3 whereas more developed grade 4 tumors seem to show PCAP3 suppression (
The subcellular localization of PCAP3 was studied in the same way as described above for PCAP2. The protein was relatively stable and was seen in the nucleoplasm. The concentration of this small protein in this cellular location shows that it is specifically interacting with a target in the nucleus.
As mentioned above, PCAP4 activates expression from the PCAV LTR and also from the HIV LTR. PCAP4 is generated following splicing involving a 5′ splice site 52 bases upstream of the normal cORF splice site. This splicing event causes a shift into the third reading frame in the last exon.
Staining of PCAP4 as described above for PCAP2 and PCAP3 shows nucleolar location (
To explore this finding further, stable NIH3T3 cell lines expressing either no extra gene, PCAP4 or cORF were made by inserting the genes in pCEP4, a plasmid with a hygromycin marker (
Like PCAP2, PCAP4 was able to make RWPE1 cells behave like DU145 cancer cells (
The above data show that PCAP2, PCAP3 and PCAP4, all of which use the third reading frame of exon 3, have a strong effect on the growth properties of immortal cell lines, including on approximately-normal human prostate epithelial cells. This oncogenic potential, combined with their expression in tumor tissue but not normal tissue, suggests a clear link with cancer.
Prostate cancer is believed to arise in the luminal epithelial layer, but normal luminal epithelial cells are capable of very few cell divisions. In contrast, NIH3T3 and RWPE1 cells are immortal. Because PCAV seems to be involved in early stages of cancer (see above), the effects of PCAP polypeptides on primary prostate epithelial cells (PrEC), which normally senesce rapidly, were tested.
Primary human epithelial cells have a very limited division potential. After a certain number of divisions the cells will enter senescence. Senescence is distinct from quiescence (immortal or pre-senescent cells enter quiescence when a positive growth signal is withdrawn, or when an inhibitory signal such as cell-cell contact is received, but can be induced to divide again by adding growth factors or by re-plating the cells at lower density) and is a permanent arrest in division, although senescent cells can live for many months without dividing if growth medium is regularly renewed.
Certain genes, particularly viral oncogenes (e.g. SV40 T-antigen) force cells to ignore senescence signals. T-antigen stimulates cells to continue division up to a further expansion barrier termed ‘replicative crisis’. Two processes occur in crisis: cells continue to divide, but cells die in parallel at a very high rate from accumulated genetic damage. When cell death exceeds division then virtually all cells die in a short period. The rare cells which grow out after crisis have become immortal and yield cell lines. Cell lines typically have obvious genetic rearrangements: they are frequently close to tetraploid, there are frequent non-reciprocal chromosomal translocations, and many chromosomes have deletions and amplifications of multiple loci {205, 206, 207}.
Gene products that lead to crisis are particularly interesting because prostate cancers exhibit high genomic instability, which could be caused by post-senescence replication. Current theory holds that prostate cancer arises from lesions termed prostatic intraepithelial neoplasia (PIN) {208}. Genetic analyses of PIN show that many of the genetic rearrangements characteristic of prostate cancer have already occurred at this stage {209}. PIN cells were thus tested for PCAV expression to determine if the virus could play a role in the earliest stages of prostate cancer. PCAV gag was found to be abundantly expressed, indicating that PCAV expression is high at the time when the genetic changes associated with prostate cancer occur. As PCAP2 and PCAP3 were seen to be expressed in prostate tumors, their roles were investigated by testing whether they are capable of inducing cell division in PrEC after senescence.
Initial attempts to select drug-resistant PrECs after transfection with PCAP expression plasmids failed. Analysis of PrEC after infection with adenovirus vectors expressing GFP, PCAP2 or PCAP3 revealed abundant cell death on day 4 post-infection in the PCAP cells. A dose-dependent increase in terminal deoxynucleotidyl transferase dUTP nick end labeling (TUNEL), which marks nuclei with nicked DNA, confirmed that the cells were undergoing apoptosis (
These results suggested that apoptosis would have to be blocked before the effect of PCAP expression in PrECs could be assessed. Plasmids encoding PCAPs 2, 3 and 4 plus neomycin markers were thus co-transfected with expression plasmids encoding either bcl-2 or bcl-XL to block apoptosis. As controls, cells were transfected with plasmids expressing single proteins. After two weeks under selection, the bcl-2 and bcl-XL dishes all had numerous resistant cells that grew to fill in a fraction of the dish. When these cells were split they failed to divide further, but were viable and resembled senescent parental cells. In contrast, the cells which expressed PCAP2, PCAP3 or PCAP4 plus an anti-apoptosis protein yielded some colonies made up of small cells which divided to fill the initial plate and continued to divide when split.
In parallel to the above drug selections, the growth potential of cells was assessed. The parental PrECs went through seven population doublings before reaching senescence. In contrast, drug-resistant cells co-transfected with an anti-apoptotic gene plus a PCAP expanded well beyond the senescence point before ceasing to grow:
Cells transfected with PCAP4 grew rapidly for around two weeks. Expansion of the cells then slowed and finally ceased. Concomitantly, the number of floating and dead cells increased and the appearance of the cells changed—they no longer had the regular “cobblestone” appearance of epithelial cells, but instead had several morphologies, and there were many multinucleate cells. Cells died 2 weeks later, while the cells transfected with lacZ or lacZ+bcl-2 were still alive 1 month later.
The PCAP2 and PCAP3 cells behaved similarly.
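For orientation, the population doublings mentioned above relate to cell numbers through a base-2 logarithm. The sketch below is illustrative only: the source does not say how doublings were counted, the function names are hypothetical, and the cell counts are invented for the example rather than taken from the experiments.

```python
import math


def population_doublings(n_start: float, n_end: float) -> float:
    """Number of doublings needed to expand n_start cells to n_end cells."""
    return math.log2(n_end / n_start)


def fold_expansion(doublings: float) -> float:
    """Fold increase in cell number after the given number of doublings."""
    return 2.0 ** doublings


# Seven doublings correspond to a 128-fold expansion.
print(fold_expansion(7))  # 128.0

# Hypothetical counts: 100,000 cells seeded, 12,800,000 cells at senescence.
print(population_doublings(100_000, 12_800_000))
```

On this reckoning, a culture that senesces after seven doublings has expanded 128-fold, so any expansion well beyond that factor indicates growth past the normal senescence point.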
Neither senescent cells nor cells approaching crisis expand in number. One difference between them, however, is that cells approaching crisis are dividing and dying at an appreciable rate, and so cell division can distinguish between the two states. After labeling with bromo-deoxyuridine, 30% of pre-senescent PrECs were labeled, as were 10% of PrEC transfected with either PCAP2 or PCAP3 (plus anti-apoptosis proteins), but none of the senescent lacZ or cORF+bcl-2 controls were labeled (
These results show that PCAP proteins are capable of inducing growth in prostate epithelial cells, and this growth could be an underlying cause of prostate cancer. The ability to drive cells past senescence is another property which matches Tax rather than Tat.
PCAP Products from Other HERV-K Viruses
The amino acid sequences encoded by the third reading frame of exon 3 for various HERV-Ks found in the human genome are given as SEQ IDs 78 to 277. Nucleotide sequences which encode these 200 amino acid sequences are given as SEQ IDs 278 to 477 although other nucleotide sequences, either found naturally in the human genome or designed artificially, can encode the same amino acid sequences due to codon degeneracy. The amino acid sequences are aligned below:
All publications and patent applications mentioned in this specification are incorporated herein by reference to the same extent as if each individual publication or patent application were specifically and individually indicated to be incorporated by reference.
The foregoing description of preferred embodiments of the invention has been presented by way of illustration and example for purposes of clarity and understanding. It is not intended to be exhaustive or to limit the invention to the precise forms disclosed. It will be readily apparent to those of ordinary skill in the art in light of the teachings of this invention that many changes and modifications may be made thereto without departing from the spirit of the invention. It is intended that the scope of the invention be defined by the appended claims and their equivalents.
This Application is a Divisional of U.S. patent application Ser. No. 10/497,786, filed Jun. 7, 2004, which is a U.S. National Phase of International Patent Application No. PCT/US2002/039344, filed Dec. 9, 2002, which is a Continuation-in-Part of U.S. patent application Ser. No. 10/016,604, filed Dec. 7, 2001, now U.S. Pat. No. 7,776,523, issued Aug. 17, 2010, and claims the benefit of U.S. Provisional Patent Application No. 60/340,064, filed Dec. 7, 2001, and U.S. Provisional Patent Application No. 60/388,046, filed Jun. 12, 2002, all of which are incorporated herein by reference in their entirety.
Number | Date | Country
---|---|---
60340064 | Dec 2001 | US
60388046 | Jun 2002 | US

Relation | Number | Date | Country
---|---|---|---
Parent | 10497786 | Dec 2005 | US
Child | 14156167 | | US

Relation | Number | Date | Country
---|---|---|---
Parent | 10016604 | Dec 2001 | US
Child | 10497786 | | US