Method of identifying potential inhibitors of human papillomavirus protein E2 using x-ray atomic coordinates

FIELD OF THE INVENTION

The invention relates to the papillomavirus E2 protein, particularly the crystalline structure of the human papillomavirus 11 (HPV-11) E2 protein transactivation domain complexed with an inhibitor. In particular, the invention provides crystal structure coordinates that define an inhibitor-binding pocket and a three-dimensional structural model for identifying potential inhibitors that would fit in this pocket. Also disclosed are methods for enabling the design and selection of inhibitors of E2 protein activity involved in papillomavirus DNA replication, particularly that of human papillomavirus.

BACKGROUND OF THE INVENTION

Papillomaviruses (PV) are non-enveloped DNA viruses that induce hyperproliferative lesions of the epithelia. The papillomaviruses are widespread in nature and have been recognized in higher vertebrates. Viruses have been characterized, amongst others, from humans, cattle, rabbits, horses, and dogs. The first papillomavirus was described in 1933 as cottontail rabbit papillomavirus (CRPV). Since then, the cottontail rabbit papillomavirus as well as bovine papillomavirus type 1 (BPV-1) have served as experimental prototypes for studies on papillomaviruses. Most animal papillomaviruses are associated with purely epithelial proliferative lesions, and most lesions in animals are cutaneous. In humans, more than 75 types of papillomavirus (HPV) have been identified, and they have been catalogued by site of infection: cutaneous epithelium and mucosal epithelium (oral and genital mucosa). The cutaneous-related diseases include flat warts, plantar warts, etc. The mucosal-related diseases include laryngeal papillomas and anogenital diseases comprising cervical carcinomas (Fields, 1996, Virology, 3rd ed. Lippincott-Raven Pub., Philadelphia, Pa.).

There are more than 25 HPV types that are implicated in anogenital diseases; these are grouped into “low risk” and “high risk” types. The low risk types include HPV types 6 and 11, which induce mostly benign lesions such as condyloma acuminata (genital warts) and low grade squamous intraepithelial lesions (SIL). In the United States, there are approximately 5 million people with genital warts, of which 90% are attributed to HPV-6 and HPV-11.

The high-risk types are associated with high grade SIL and cervical cancer and include most frequently HPV types 16, 18, 31, 33, 35, 45, and 52. The progression from low-grade SIL to high-grade SIL is much more frequent for lesions that contain high risk HPV-16 and 18 as compared to those that contain low risk HPV types. In addition, only four HPV types are detected frequently in cervical cancer (types 16, 18, 31 and 45). About 500,000 new cases of invasive cancer of the cervix are diagnosed annually worldwide (Fields, 1996, supra).

Treatments for genital warts include physical removal such as cryotherapy, CO₂ laser, electrosurgery, or surgical excision. Cytotoxic agents may also be used, such as trichloroacetic acid (TCA), podophyllin or podofilox. Immunomodulatory agents are also available, such as interferon and imiquimod (Aldara®, 3M Pharmaceuticals). These treatments are not completely effective in eliminating all viral particles, and they carry either a high cost or uncomfortable side effects. Recurrent warts are also common (Beutner & Ferenczy, 1997, Amer. J. Med., 102(5A):28–37).

The ineffectiveness of the current methods to treat HPV infections has demonstrated the need to identify new means to control or eliminate such infections. In recent years, efforts have been directed towards finding antiviral compounds, and especially compounds capable of interfering with viral replication (Hughes and Romanos, 1993, Nucleic Acids Res. 21:5817–5823; Clark et al., Antiviral Res., 1998, 37(2):97–106; Hajduk et al., 1997, J. Med. Chem., 49(20):3144–3150 and Cowsert et al., 1993, Antimicrob. Agents. Chemother., 37(2):171–177). To that end, it has therefore become important to study the genetics of HPVs in order to identify potential chemotherapeutic targets to contain and possibly eliminate any diseases caused by HPV infections.

The life cycle of PV is closely coupled to keratinocyte differentiation. Infection is believed to occur at a site of tissue disruption in the basal epithelium. Unlike in normal cells, cellular division continues as the infected cell undergoes vertical differentiation. As the infected cells undergo progressive differentiation, the cellular replication machinery is maintained, which allows viral DNA replication to increase, with eventual late gene expression and virion assembly in terminally differentiated keratinocytes and the release of viral particles (Fields, supra).

The coding strand of each papillomavirus genome contains approximately ten designated translational open reading frames (ORFs) that have been classified as either early ORFs or late ORFs. The E1 to E8 genes are expressed early in the viral replication cycle. The two late genes (L1 and L2) code for the major and minor capsid proteins, respectively. The E1 and E2 gene products function in viral DNA replication, whereas E5, E6 and E7 modulate host cell proliferation. The functions of the E3, E4 and E8 gene products are uncertain at present.

Studies of HPV have shown that proteins E1 and E2 are the only viral proteins required for viral DNA replication (Kuo et al., 1994, J. Biol. Chem. 30: 24058–24065). This requirement is similar to that of bovine papillomavirus type 1 (BPV-1). Indeed, there is a high degree of similarity between E1 and E2 proteins and the ori-sequences of all papillomaviruses (PV) regardless of the viral species and type (Kuo et al., 1994, supra).

When viral DNA replication proceeds in vitro, where E1 protein is present in excess, replication can proceed in the absence of E2. In vivo, in the presence of a vast amount of cellular DNA, replication requires the presence of both E1 and E2. The mechanism for initiating replication in vivo is believed to involve the cooperative binding of E1 and E2 to the origin, leading to the assembly of a ternary protein-DNA complex (Mohr et al., 1990, Science 250:1694–1699). The E2 protein is a transcriptional activator that binds to the E1 protein and, by doing so, enhances binding of E1 to the BPV origin of replication (Seo et al., 1993b, Proc. Natl. Acad. Sci., 90:2865–2869). Hence, E2 acts as a specificity factor in directing E1 to the origin of replication (Sedman and Stenlund, 1995, EMBO J. 14:6218–6228). In HPV, Liu et al. suggested that E2 stabilizes binding of E1 to the ori (1995, J. Biol. Chem. 270(45): 27283–27291 and McBride et al., 1991, J. Biol. Chem 266:18411–18414). These DNA-protein and protein-protein interactions occur at the origin of DNA replication (Sverdrup and Myers, supra).

The ~45 kD E2 proteins characterized from numerous human and animal serotypes share a common organization of two domains. The N-terminal transactivation domain (TAD) is about 220 amino acids and the C-terminal DNA-binding domain (DBD) is 100 amino acids in length. Both domains are joined by a flexible linker region.

E2 activates viral replication through cooperative binding with the viral initiator protein E1 to the origin of DNA replication, ultimately resulting in functional E1 hexamers. E2 is also a central regulator of viral transcription. It interacts with basal transcription factors, including TATA-binding protein, TFIIB, and human TAFII70; proximal promoter binding proteins such as Sp1; and other cellular factors such as AMF-1, which positively affect E2's transcriptional activation.

Which of these many interactions are sufficient or necessary to achieve transcriptional activation is more ambiguous. These details are consistent with the idea that enhancer binding proteins function as transcriptional activators by using specific protein-protein contacts to link components of the general transcription machinery to a promoter, with the goal of recruiting RNA polymerase II. A third function of E2 is to aid in the faithful segregation of viral DNA. The bovine papillomavirus (BPV) genome and E2 protein co-localize with host cell chromosomes during mitosis, dependent on an intact E2 TAD.

The E2 DBD dimerizes to form a β-barrel with flanking recognition helices positioned in the major grooves of the DNA binding site. In contrast, the structure of the E2 TAD remained elusive until Harris and Botchan (1999, Science, 284(5420):1673) provided a first model of a proteolytic fragment of the HPV-18 E2 TAD by X-ray crystallography. The model suggests a cashew-shaped protein of 55 Å×40 Å×30 Å with a concave cleft on one side of the protein and ridges on the opposite surface. Harris and Botchan studied whether discrete surfaces correlated with known E2 activities and, in particular, identified a prominent cluster of residues constituting the inner edge of the main cavity, encompassing E175, L178, Y179, and I73, that defines a distinctive surface important for transcription.

Antson et al. (2000, Nature, 403:805–809) disclose the crystal structure of the complete E2 TAD from HPV-16, including a second, newly identified putative E2–E2 TAD interface comprising a cluster of 7 conserved residues (R37, A69, I73, E76, L77, T81, and Q80). Antson et al. suggested that Q12 and E39 may be involved in interaction with E1.

The E2 protein is considered a potential target for antiviral agents. However, drug discovery efforts directed towards E2 have been hampered by the lack of structural information of an E2 complexed with an inhibitor. Neither the model of Harris, nor that of Antson provides any information as to the localization and/or characterization of a potential inhibitor binding pocket. Structural information of the apo-E2 TAD has provided some valuable knowledge of the surface on the apo-protein but it now appears clear that this is not representative of the changes in conformation induced upon binding with an inhibitor.

The lack of specific E2 inhibitors, which are necessary for obtaining co-crystals of E2 and inhibitors, has hampered the search for the inhibitor-binding pocket in E2. Thus, X-ray crystallographic analysis of such a protein-inhibitor complex has not been possible.

The present invention refers to a number of documents, the contents of which are herein incorporated by reference.

SUMMARY OF THE INVENTION

The present invention provides a novel composition comprising a human papillomavirus E2 protein transactivation domain complexed with a small molecule inhibitor of E2 and methods for making such composition. Advantageously, the present invention further provides an E2-inhibitor complex that is capable of being crystallized and analyzed by X-ray diffraction, thereby providing important information on the inhibitor-binding pocket of the transactivation domain of the HPV E2 protein. The inhibitor provides an invaluable tool to produce a co-crystal allowing characterization of a previously unknown inhibitor-binding pocket that may be involved in interaction with E1 during the replication cycle of HPV.

The invention also provides a method for determining at least a portion of the three-dimensional structure of molecules or molecular complexes that contain at least some features structurally similar to an HPV E2 inhibitor-binding pocket.

The invention also provides a 3-D model for analyzing and predicting binding of potential inhibitors to aid in the search for further inhibitors binding to the identified pocket. Localization and characterization of this pocket, as described in the present invention provides a potential new therapeutic target in the treatment of PV infections.

The invention also provides a screening method for identifying agents capable of modulating this new target and a system to select at least one such agent capable of interfering with PV DNA replication.

The invention also provides a method for producing a drug that inhibits the E1–E2 interaction, comprising identifying a drug, or designing a drug, that fits into the pocket as described herein.

According to a first aspect of the invention, there is provided a crystallizable composition, comprising a PV E2 TAD-like polypeptide of SEQ ID NO. 2 complexed with an inhibitor L:

[Chemical structure of inhibitor L]

According to a second aspect of the invention, there is provided a crystal comprising a PV E2 TAD-like polypeptide of SEQ ID NO. 2 complexed with said inhibitor L, as defined above.

According to a third aspect of the invention, there is provided a method for producing a crystallized PV E2 TAD-inhibitor complex (PV E2 TAD-L), as defined above, comprising:

- a) mixing purified PV E2 TAD, contained in a purification buffer, with solubilized inhibitor L to generate a complex solution containing said PV E2 TAD-L complex; and
- b) crystallizing said complex from a) in a crystallization buffer.

According to a fourth aspect of the invention, there is provided a method for producing crystallized apo PV E2 TAD, comprising:

- a) mixing apo PV E2 TAD, contained in a purification buffer, with a crystallization buffer.

According to a fifth aspect of the invention, there is provided a method for producing a crystallized PV E2 TAD-inhibitor complex (PV E2 TAD-L), as defined above, comprising:

- a) solubilizing inhibitor L in a crystallization buffer; and
- b) soaking crystallized apo PV E2 TAD, as defined above, into a).

According to a sixth aspect of the invention, there is provided X-ray crystal structure coordinates of PV E2 TAD-inhibitor complex (PV E2 TAD-L), as defined above.

According to a seventh aspect of the invention, there is provided a computer-readable data storage medium comprising a data storage material encoded with the X-ray crystal structure coordinates, or at least a portion of the structure coordinates, set forth in FIG. 9.

According to an eighth aspect of the present invention, there is provided a computer for generating a three-dimensional representation of said PV E2 TAD-L complex, as defined herein, comprising:

- a) a computer readable data storage medium having a data storage material encoded with said structure coordinates set forth in FIG. 9;
- b) a memory for storing instructions for processing said computer readable data;
- c) a central processing unit coupled to said computer readable data storage medium for processing said computer readable data into said three dimensional representation; and
- d) a display unit coupled to said central processing unit for displaying said three dimensional representation.

According to a ninth aspect of the invention, there is provided a method for producing an E2 protein, said protein being useful for identifying or characterizing E2 TAD inhibitors, comprising:

- a) using the HPV E2 TAD-L crystal structure, as defined herein, to identify HPV inhibitor binding pocket residues;
- b) comparing said HPV inhibitor binding pocket residues with analogous residues in another PV E2;
- c) mutating said other PV residues to said HPV residues, to produce a hybrid; and
- d) testing said hybrid for inhibition by an inhibitor.

BRIEF DESCRIPTION OF THE DRAWINGS

Having thus generally described the invention, reference will now be made to the accompanying drawings, showing by way of illustration a preferred embodiment thereof, and in which:

FIG. 1A depicts the amino acid sequence of the HPV-11 E2 transactivation domain (SEQ ID NO. 1) as obtained by Sakai et al., 1996, J. Virol. 70:1602–11;

FIG. 1B depicts the amino acid sequence of the HPV-11 E2 transactivation domain (SEQ ID NO. 2) as obtained according to the procedure of Example 1;

FIG. 2 depicts stereo ribbon diagrams of the apo-E2 from HPV-16 as described in Antson et al. (supra);

FIG. 3 depicts stereo ribbon diagrams of the apo-E2 from HPV-11 as produced by the Applicant;

FIG. 4 depicts stereo ribbon diagrams of the E2 from HPV-11 complexed with compound L as described herein;

FIG. 5 depicts a solvent accessible surface representation of the inhibitor-binding pocket of the apo-E2 TAD from HPV-16 (Antson et al., supra);

FIG. 6A depicts a solvent accessible surface representation of the inhibitor-binding pocket of the apo-E2 from HPV-11 as produced by the Applicant;

FIG. 6B depicts a solvent accessible surface representation of the inhibitor-binding pocket of the co-crystal;

FIG. 7 depicts a schematic representation of the movement of Y19 and H32 occurring in the pocket upon binding with an inhibitor;

FIG. 8 depicts a solvent accessible surface top view of the pocket showing particularly a deep cavity and a shallow cavity;

FIG. 9 lists the atomic structure coordinates for the E2 TAD (SEQ ID NO 2) complexed with compound L as derived by X-ray diffraction from co-crystals of that complex (hereinafter referred to as HPV TAD E2-L). The preparation of the complex is described in Example 3. The following terms have these meanings: the term “A.A.” refers to the amino acid identified by each coordinate; in this column, the term “CPR” means cis-proline, “BLHA” means the first molecule of inhibitor L, and “BLHB” means the second molecule of inhibitor L. Information on amino acids 197 to 201 from chain A is lacking because the high flexibility of those residues renders them invisible to X-rays. For the same reason, the following amino acids are modeled as alanine: E2, K107, K173, S180, M182, H183 and P196. “X, Y, Z” crystallographically define the atomic position determined for each atom in a Cartesian coordinate space. “Occ” is an occupancy factor that refers to the fraction of the molecules in which each atom occupies the position specified by the coordinates. A value of “1” indicates that each atom has the same conformation, i.e., the same position, in all molecules of the crystal. “B” is a thermal factor that measures movement of the atom around its atomic center. The coordinates of the residues that form the deep cavity are shown in bold; and

FIG. 10 depicts the alignment of the amino acid sequence clusters that define generally the inhibitor-binding pocket region of the E2 transactivation domain from HPV-6A, HPV-11, HPV-16 and HPV-18. The residues in bold indicate that they define the deep cavity of the inhibitor binding pocket. The single underline defines the residues of the bottom of the deep pocket. The double underline indicates the shallow pocket residues. Y19 is indicated in italics.

DETAILED DESCRIPTION OF THE INVENTION

Definitions

The following terms and abbreviations are used throughout the specification.

The term “associating with” or “binding” refers to a condition of proximity between chemical entities or compounds, or portions thereof. The association may be non-covalent—wherein the juxtaposition is energetically favored by hydrogen bonding or van der Waals or electrostatic interactions—or it may be covalent.

The term “binding pocket”, as used herein, refers to a region of a molecule or molecular complex, that, as a result of its shape, favorably associates with another molecule, molecular complex, chemical entity or compound. As used herein, the pocket comprises at least a deep cavity and, optionally a shallow cavity.

As used herein the term “complex” refers to the combination of a molecule or a protein, conservative analogs or truncations thereof associated with a chemical entity.

The abbreviations for the α-amino acids used in this application are set forth as follows:

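| Amino acid | Symbol | Single letter code |
|---|---|---|
| Alanine | Ala | A |
| Arginine | Arg | R |
| Aspartic acid | Asp | D |
| Asparagine | Asn | N |
| Cysteine | Cys | C |
| Glutamic acid | Glu | E |
| Glutamine | Gln | Q |
| Glycine | Gly | G |
| Histidine | His | H |
| Isoleucine | Ile | I |
| Leucine | Leu | L |
| Lysine | Lys | K |
| Methionine | Met | M |
| Phenylalanine | Phe | F |
| Proline | Pro | P |
| Serine | Ser | S |
| Threonine | Thr | T |
| Tryptophan | Trp | W |
| Tyrosine | Tyr | Y |
| Valine | Val | V |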

The term “analog” as used herein denotes, in the context of this invention, a sequence of amino acids that retains a biological activity (either functional or structural) that is substantially similar to that of the original sequence. This analog may be from the same or different species and may be a natural analog or be prepared synthetically. Such analogs include amino acid sequences having substitutions, deletions, or additions of one or more amino acids, provided that the biological activity of the protein is conserved. Particularly, the term “conservative analog” denotes an analog having an amino acid substituted by another amino acid having strong or weak similarity (see, for example, Dayhoff, M. O., (1978), Atlas of Protein Sequence and Structure, 5, suppl. 3, National Biomedical Research Foundation, Washington, D.C.) as defined according to the following Table:

Table of amino acid similarity

| Amino acid | Strong | Weak |
|---|---|---|
| A | G, S | C, T, V |
| C | | A, S |
| D | E | G, H, K, N, Q, R, S |
| E | D | H, K, N, Q, R, S |
| F | W, Y | H, I, L, M |
| G | A | D, N, S |
| H | Y | D, E, F, K, N, Q, R |
| I | L, M, V | F |
| K | R | D, E, H, N, Q, S, T |
| L | I, M, V | F |
| M | I, L, V | F |
| N | Q | D, E, G, H, K, R, S, T |
| P | | S, T |
| Q | N | D, E, H, K, R, S |
| R | K | D, E, H, N, Q |
| S | A, T | C, D, E, G, K, N, P, Q |
| T | S | A, K, N, P, V |
| V | I, L, M | A, T |
| W | F, Y | |
| Y | F, H, W | |
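The table of amino acid similarity above can be used mechanically to decide whether a given substitution would count as conservative. The following is a minimal Python sketch of such a lookup, not part of the disclosed method; the dictionary layout and the function name `classify_substitution` are illustrative assumptions, and the entries are transcribed from the table as printed (including its asymmetries, for example H lists only Y as strong while Y lists F, H and W).

```python
# Strong and weak similarity groups transcribed from the table above.
# Keys are the original residue; values are the allowed replacements.
STRONG = {
    "A": "GS", "C": "", "D": "E", "E": "D", "F": "WY", "G": "A",
    "H": "Y", "I": "LMV", "K": "R", "L": "IMV", "M": "ILV", "N": "Q",
    "P": "", "Q": "N", "R": "K", "S": "AT", "T": "S", "V": "ILM",
    "W": "FY", "Y": "FHW",
}
WEAK = {
    "A": "CTV", "C": "AS", "D": "GHKNQRS", "E": "HKNQRS", "F": "HILM",
    "G": "DNS", "H": "DEFKNQR", "I": "F", "K": "DEHNQST", "L": "F",
    "M": "F", "N": "DEGHKRST", "P": "ST", "Q": "DEHKRS", "R": "DEHNQ",
    "S": "CDEGKNPQ", "T": "AKNPV", "V": "AT", "W": "", "Y": "",
}


def classify_substitution(original: str, replacement: str) -> str:
    """Return 'identical', 'strong', 'weak' or 'none' for a substitution.

    Both arguments are single-letter amino acid codes. The lookup is
    directional: it reads the row of the original residue only.
    """
    if original not in STRONG or replacement not in STRONG:
        raise ValueError("unknown single-letter amino acid code")
    if original == replacement:
        return "identical"
    if replacement in STRONG[original]:
        return "strong"
    if replacement in WEAK[original]:
        return "weak"
    return "none"


if __name__ == "__main__":
    for pair in [("I", "L"), ("A", "V"), ("W", "A")]:
        print(pair, classify_substitution(*pair))
```

Under this reading, a substitution classified as "strong" or "weak" would fall within the definition of a conservative analog, while "none" would not.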

The term “side chain” with reference to an amino acid or amino acid residue means a group attached to the α-carbon atom of the α-amino acid. For example, the R-group side chain for glycine is hydrogen, for alanine it is methyl, for valine it is isopropyl. For the specific R-groups or side chains of the α-amino acids reference is made to A. L. Lehninger's text on Biochemistry (see chapter 4).

The term “truncation” refers to any segment of the E2 TAD amino acid sequence and/or any segment of any of the analogs described herein above that comprises the amino acids sufficient to define the deep cavity of the inhibitor-binding pocket of the present invention in the same spatial relationship as the one defined by the coordinates of FIG. 9.

The term “root mean square deviation” or “rms deviation” or “rmsd” means the square root of the arithmetic mean of the square of the deviations from the mean. In the context of atomic objects, the numbers are given in angstroms (Å). It is a way to express the deviation or variation from a trend or object. For the purpose of the present invention, all rmsd comparisons were obtained by comparing structures that had been superimposed using the main chain atoms of H32, W33 and L94 only, to the minimum overlap rms, by rigid body movement only. The main chain atom rmsd for this superposition between our apo structure and the complex disclosed herein is 0.078 Å.
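As an illustration of the rmsd calculation described above, the following Python sketch computes the rmsd between two sets of paired atomic positions. It assumes the two structures have already been superimposed by rigid body movement (the superposition step itself is not shown), and the coordinates in the example are made-up placeholders, not values from FIG. 9.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation, in the units of the input (e.g. Å).

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples
    for paired atoms of two structures that are ALREADY superimposed.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Hypothetical main chain atom positions (placeholders only).
    apo = [(1.0, 2.0, 3.0), (2.5, 2.0, 3.0), (3.1, 3.3, 3.0)]
    complex_ = [(1.1, 2.0, 3.0), (2.5, 2.1, 3.0), (3.1, 3.3, 2.9)]
    print(f"rmsd = {rmsd(apo, complex_):.3f} Å")
```

Restricting the input lists to the main chain atoms of H32, W33 and L94 would correspond to the comparison basis stated above.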

PREFERRED EMBODIMENTS

1. Composition

According to a first embodiment, there is provided a crystallizable composition, comprising an HPV E2 TAD-like polypeptide of SEQ ID NO. 2 complexed with an inhibitor L:

[Chemical structure of inhibitor L]

Preferably, the composition comprises amino acids 1–220 of the HPV E2 protein (SEQ ID NO. 1) as defined according to the numbering of Swiss Prot: locus VE2_HPV11 accession P04015; unique ID: g137671, conservative analogs or truncations thereof. More preferably, the trans-activation domain (TAD) of E2 comprises amino acids 1–218, particularly 1–215 and even more preferably 1–201. Still, most preferably, the E2 TAD used for the present invention comprises amino acids 2–201 and still most particularly 2–196. Even most preferably, the composition comprises amino acids 15–104 of the E2 TAD.

In another aspect of the first embodiment, the HPV E2 TAD used for the present invention is obtained from the HPV-11 strain and is complexed with the small molecule inhibitor L. Other types of papillomavirus (PV) are also contemplated by the present invention, including BPV (bovine papillomavirus) or CRPV (cottontail rabbit papillomavirus).

According to a second embodiment, there is provided a crystal comprising an HPV E2 TAD-like polypeptide of SEQ ID NO. 2 complexed with the inhibitor L.

2. Method of Crystallizing

According to a third embodiment of the invention, there is provided a method for producing a crystallized HPV E2 TAD-inhibitor complex (HPV E2 TAD-L), as defined above, comprising:

- a) mixing purified HPV E2 TAD, contained in a purification buffer, with solubilized inhibitor L to generate a complex solution containing said HPV E2 TAD-L complex; and
- b) crystallizing said complex from a) in a crystallization buffer.

In a preferred aspect of the third embodiment step a), the inhibitor L is solubilized in 100% DMSO at a concentration of 60 mM.

In a preferred aspect of the third embodiment step a), the purification buffer contains a reducing agent that may be selected from TCEP or DTT. More preferably the reducing agent is TCEP. Preferably, the reducing agent is TCEP at a concentration of about 1 mM to about 10 mM. More preferably, the reducing agent is TCEP at a concentration of 5 mM.

Preferably, the purification buffer is used at a pH of between 7 and 9. More preferably, the purification buffer is used at pH of 8.

Further to the reducing agent, a salt can be added to aid stability of the HPV E2 TAD. Preferably, the salt may be selected from NaCl, (NH₄)₂SO₄, or KCl. More preferably, the salt is NaCl at a concentration of about 200 mM to about 800 mM. Most preferably, the salt is NaCl at a concentration of 500 mM.

Further to the reducing agent, a buffer can be added to further aid the stability of the HPV E2 TAD. Preferably, the buffer may be selected from Tris-HCl, HEPES or bis-Tris. More preferably, the buffer is Tris-HCl at a concentration of between 0 mM and 50 mM. Most preferably, the buffer is Tris-HCl at a concentration of 25 mM.

Further to the reducing agent, a chelating agent may be added to reduce degradation of HPV E2 TAD by proteases. Preferably, the chelating agent may be EDTA or EGTA. More preferably, the chelating agent is EDTA at a concentration of between 0 mM and 1 mM. Even more preferably, the chelating agent is EDTA at a concentration of between 0 mM and 0.5 mM. Most preferably, the chelating agent is EDTA at a concentration of 0.1 mM.

In a preferred aspect of the third embodiment step a), preferably the HPV E2 TAD protein solution is used at a concentration of about 5 mg/ml to about 15 mg/ml in the purification buffer. More preferably, the HPV E2 TAD is used at a concentration of about 10 mg/ml HPV E2 TAD in the purification buffer.

In a preferred aspect of the third embodiment step b), preferably the crystallization buffer may be selected from MES, sodium phosphate, potassium phosphate, sodium acetate or sodium succinate. More preferably, the crystallization buffer is MES at a concentration of about 50 mM to about 0.2M. Most preferably, the crystallization buffer is MES at a concentration of 0.1M.

Preferably, the crystallization buffer further contains a precipitating agent, which aids crystallization of the HPV E2 TAD. Preferably, the precipitating agent may be selected from MPD, isopropanol, ethanol, or tertiary butanol. More preferably, the precipitating agent is MPD at a concentration of 30% to about 40%. Most preferably, the precipitating agent is MPD at a concentration of 35%.

Preferably, the crystallization buffer is used at a pH of between 4.5 and 6.5. Most preferably, the crystallization buffer is used at a pH of 5.5.

In a preferred aspect of the third embodiment step b), the crystallization is carried out at between 0° C. and 10° C. More preferably, the crystallization is carried out at 4° C.
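The preferred crystallization conditions for the complex stated above (MES concentration, MPD concentration, pH and temperature) can be collected into a small lookup for checking a planned condition against the stated ranges. The Python sketch below is only an illustration; the key names and the helper `out_of_range` are hypothetical, and only the numerical ranges come from the text.

```python
# Preferred ranges for crystallizing the E2 TAD-L complex, as stated above.
# Key names are illustrative; values are (minimum, maximum).
PREFERRED_RANGES = {
    "mes_molar": (0.05, 0.2),       # MES, about 50 mM to about 0.2 M
    "mpd_percent": (30.0, 40.0),    # MPD, 30% to about 40%
    "ph": (4.5, 6.5),               # crystallization buffer pH
    "temperature_c": (0.0, 10.0),   # crystallization temperature, °C
}

# Most preferred values, as stated above.
MOST_PREFERRED = {
    "mes_molar": 0.1,
    "mpd_percent": 35.0,
    "ph": 5.5,
    "temperature_c": 4.0,
}


def out_of_range(condition):
    """Return the names of parameters that fall outside the preferred ranges."""
    return [
        name
        for name, (low, high) in PREFERRED_RANGES.items()
        if not low <= condition[name] <= high
    ]


if __name__ == "__main__":
    print(out_of_range(MOST_PREFERRED))                 # nothing flagged
    print(out_of_range({**MOST_PREFERRED, "ph": 7.0}))  # pH flagged
```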

In a preferred aspect of the third embodiment, crystallization of the HPV E2 TAD-L complex was carried out using the hanging drop vapor diffusion technique.

In an important aspect of the third embodiment, the crystallized HPV E2 TAD-L complex of the invention is amenable to X-ray crystallography. X-ray crystallographic analysis showed that the HPV E2 TAD-inhibitor complex crystals obtained belong to space group P4(1) with unit cell dimensions of a=b=60.7 Å and c=82.5 Å and contain one molecule per asymmetric unit. Initial diffraction data were measured using a home source X-ray generator (Rigaku, Japan) equipped with an R-axis II image plate area detector (Molecular Structure Corp, Texas). Preferably, data to a resolution of 3.15 Å were collected on a single crystal of the complex cooled at 100 K.
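The unit cell parameters above allow a quick consistency check of the crystal packing. The Python sketch below computes the tetragonal cell volume (about 3.04 × 10⁵ Å³ for a=b=60.7 Å, c=82.5 Å) and a Matthews coefficient. Two inputs are assumptions that do not come from the text: the figure of four asymmetric units per cell for space group P4(1), and the placeholder molecular weight of 22,000 Da (roughly 200 residues at a rule-of-thumb 110 Da each). The function names are likewise illustrative.

```python
def tetragonal_cell_volume(a, c):
    """Volume in Å^3 of a tetragonal unit cell (a = b, all angles 90°)."""
    return a * a * c


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_weight_da):
    """Matthews coefficient V_M in Å^3 per dalton."""
    return cell_volume / (molecules_per_cell * molecular_weight_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M (standard 1.23 Å^3/Da protein term)."""
    return 1.0 - 1.23 / v_m


# Values from the text.
A_ANGSTROM = 60.7
C_ANGSTROM = 82.5
MOLECULES_PER_ASYMMETRIC_UNIT = 1

# Assumptions, not from the text.
ASYMMETRIC_UNITS_PER_CELL = 4   # P4(1) has four symmetry-equivalent positions
ASSUMED_MW_DA = 22000.0         # placeholder molecular weight for the E2 TAD

volume = tetragonal_cell_volume(A_ANGSTROM, C_ANGSTROM)
v_m = matthews_coefficient(
    volume,
    ASYMMETRIC_UNITS_PER_CELL * MOLECULES_PER_ASYMMETRIC_UNIT,
    ASSUMED_MW_DA,
)

if __name__ == "__main__":
    print(f"cell volume: {volume:.0f} Å^3")
    print(f"V_M (assumed MW): {v_m:.2f} Å^3/Da")
    print(f"solvent fraction (assumed MW): {solvent_fraction(v_m):.2f}")
```

Because the molecular weight is a placeholder, the resulting coefficient and solvent fraction are indicative only; substituting the actual mass of the crystallized construct would give the meaningful figure.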

According to a fourth embodiment of the invention, there is provided a method for producing crystallized apo HPV E2 TAD, comprising:

- a) mixing apo HPV E2 TAD, contained in a purification buffer, with a crystallization buffer.

In a preferred aspect of the fourth embodiment, the apo HPV E2 TAD is apo HPV-11 E2 TAD. More preferably, the apo HPV E2 TAD is apo Se-HPV-11 E2 TAD.

In a preferred aspect of the fourth embodiment, the purification buffer is as described herein. Preferably, the apo HPV E2 TAD protein solution is used at a concentration of about 1 mg/ml to about 15 mg/ml in the purification buffer. More preferably, the apo HPV E2 TAD is used at a concentration of about 1 mg/ml to about 10 mg/ml E2 TAD in the purification buffer. Most preferably, the apo HPV E2 TAD is used at a concentration of 5 mg/ml in the purification buffer.

In a preferred aspect of the fourth embodiment, the crystallization buffer may be selected from MES, sodium phosphate, potassium phosphate, sodium acetate or sodium succinate. More preferably, the crystallization buffer is sodium succinate at a concentration of about 50 mM to about 0.2M. Most preferably, the crystallization buffer is sodium succinate at a concentration of 0.1M.

Preferably, the crystallization buffer further contains PEG8K, PEG4K or PEG5K mono methyl ether. More preferably, the crystallization buffer further contains PEG5K mono methyl ether at a concentration of about 10% to about 25%. Most preferably, the crystallization buffer further contains PEG5K mono methyl ether at a concentration of 18%.

Preferably, the crystallization buffer is used at a pH of between 4.5 and 6.5. Most preferably, the crystallization buffer is used at a pH of 5.0.

Preferably, the crystallization buffer further contains ammonium sulfate at a concentration of about 0.1M to about 0.4M. Most preferably, the crystallization buffer further contains ammonium sulfate at a concentration of 0.2M.

In a preferred aspect of the fourth embodiment, the crystallization is carried out at between 0° C. and 10° C. More preferably, the crystallization is carried out at 4° C.

The apo HPV-11 E2 TAD crystals belong to space group C222 with unit cell dimensions of a=54.9 Å, b=169.9 Å and c=46.1 Å and contain one molecule per asymmetric unit. Diffraction data were collected on beamline X4a (NSLS, Brookhaven National Laboratory, New York). Four data sets were collected from a single crystal cooled at 100 K, at four different X-ray wavelengths near the selenium absorption edge (0.9790 Å, 0.9794 Å, 0.9743 Å, and 0.9879 Å). Images were collected on an ADSC Q4 CCD. Preferably, the maximum resolution was 2.4 Å.
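To relate the four data collection wavelengths above to photon energies, the standard relation E = hc/λ can be applied. The Python sketch below performs this conversion; the function name is illustrative, and the constant hc ≈ 12.3984 keV·Å is a physical constant rather than a value from the text. The resulting energies fall between roughly 12.55 and 12.73 keV, which brackets the tabulated selenium K absorption edge near 12.66 keV.

```python
# Planck constant times speed of light, expressed in keV·Å.
HC_KEV_ANGSTROM = 12.398419843

# The four wavelengths (in Å) stated in the text.
WAVELENGTHS_ANGSTROM = [0.9790, 0.9794, 0.9743, 0.9879]


def photon_energy_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in Å to a photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    for wavelength in WAVELENGTHS_ANGSTROM:
        energy = photon_energy_kev(wavelength)
        print(f"{wavelength:.4f} Å -> {energy:.3f} keV")
```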

According to a fifth embodiment of the invention, there is provided a method for producing a crystallized HPV E2 TAD-inhibitor complex (HPV E2 TAD-L), as defined above, comprising:

- a) solubilizing inhibitor L in a crystallization buffer; and
- b) soaking crystallized apo HPV E2 TAD, as defined above, into a).

In an alternative aspect of the fifth embodiment of the invention, there is provided a method for producing a crystallized HPV E2 TAD-inhibitor complex (HPV E2 TAD-L), as defined above, comprising:

- a) adding inhibitor L into a crystallization buffer containing crystallized HPV E2 TAD.
  
3. X-ray Coordinates

According to a sixth embodiment, there is provided X-ray crystal structure coordinates of the HPV E2 TAD-inhibitor complex (HPV E2 TAD-L), as defined above. More preferably, the coordinates are of the inhibitor-binding pocket. Even more preferably, the set of coordinates for the HPV E2 TAD-inhibitor complex are defined according to FIG. 9.

Preferably, the inhibitor-binding pocket comprises a deep cavity which is delimited by the side chains of amino acids H32, W33 and L94, wherein the side chain of Y19 of the HPV E2 TAD is moved away from its native position to form a deep cavity of such dimensions as to allow entry of a small molecule inhibitor. More preferably, the deep cavity is lined at its bottom by amino acids H29 and T97. Most preferably, the pocket further comprises a shallow cavity that is delimited by one or more of amino acids L15, I36, E39, K68, N71 and A72.

Preferably, the inhibitor-binding pocket is defined according to the coordinates assigned to the following clusters of amino acids:

- Residues 15 to 21: LLELYEE (SEQ ID NO.9)
- Residues 28 to 39: KHIMHWKCIRLE (SEQ ID NO.10)
- Residues 68 to 72: KGHNA (SEQ ID NO.11)
- Residues 90 to 104: EPWTLQDTSYEMLT (SEQ ID NO.18)

More preferably, the inhibitor-binding pocket, and particularly its deep cavity, is defined by the coordinates of H32, W33 and L94 according to FIG. 9. Even more preferably, the deep cavity is defined by the coordinates of the side chains of H32, W33 and L94.

Alternatively, one may consider changing the side chain of Y19 to obtain a protein construct that reproduces a similar deep cavity without the hindrance of the Y19 side chain.

Even more preferably, the bottom of the deep pocket is defined by the coordinates of amino acids H29 and T97. Most preferably, the shallow cavity of the inhibitor-binding pocket is defined by the coordinates of one or more of amino acids L15, I36, E39, K68, N71 and A72.

The three-dimensional structure of the HPV E2 TAD-L complex of this invention is defined by a set of structure coordinates as set forth in FIG. 9. The term “structure coordinates” refers to Cartesian coordinates derived from mathematical operations related to the patterns obtained on diffraction of a monochromatic beam of X-rays by the atoms (scattering centers) of an E2-L complex in crystal form. The diffraction data are used to calculate an electron density map of the repeating unit of the crystal. The electron density maps are then used to establish the positions of the individual atoms of the E2 TAD inhibitor pocket.

Those of skill in the art will understand that a set of structure coordinates for a protein or protein-inhibitor complex or a portion thereof, is a relative set of points that define a shape in three dimensions. Thus, it is possible that an entirely different set of coordinates could define a similar or identical shape.

The variations in coordinates may be generated by mathematical manipulations of the structure coordinates. For example, the structure coordinates set forth in FIG. 9 could be manipulated by crystallographic permutations of the structure coordinates, fractionalization or matrix operations to sets of the structure coordinates or any combination of the above.
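By way of illustration only, the following minimal Python sketch shows why a rotated and translated coordinate set defines the same shape: every interatomic distance is unchanged. The four points, the rotation angle and the helper names (`rotate_z`, `translate`, `pairwise_distances`) are hypothetical placeholders and are not taken from FIG. 9.

```python
import math
from itertools import combinations

def rotate_z(point, angle_rad):
    """Rotate a 3-D point about the z axis by angle_rad."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y, z)

def translate(point, shift):
    """Shift a 3-D point by the vector shift."""
    return tuple(p + d for p, d in zip(point, shift))

def pairwise_distances(points):
    """Distances between every pair of points, in a fixed order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]

# Placeholder coordinates in angstroms (NOT values from FIG. 9).
original = [(1.0, 2.0, 3.0), (4.5, -1.0, 0.5), (-2.0, 0.0, 7.25), (3.0, 3.0, 3.0)]

# A different coordinate set obtained by a matrix operation plus a shift.
moved = [translate(rotate_z(p, math.radians(37.0)), (10.0, -5.0, 2.5))
         for p in original]

# The absolute coordinates differ, but the shape they define does not.
same_shape = all(
    math.isclose(d0, d1, abs_tol=1e-9)
    for d0, d1 in zip(pairwise_distances(original), pairwise_distances(moved))
)
print(same_shape)
```

Comparing interatomic distances rather than raw coordinates is one simple way to test whether two coordinate sets describe the same relative arrangement of atoms.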

Various computational analyses are necessary to determine whether a molecule or molecular complex or a portion thereof is sufficiently similar to all or parts of the HPV E2 protein or HPV E2 TAD described above as to be considered the same. Such analyses may be carried out in current software applications, such as the Molecular Similarity application of QUANTA (Molecular Simulations Inc., San Diego, Calif.) version 4.1.

The Molecular Similarity application permits comparisons between different structures, different conformations of the same structure, and different parts of the same structure. The procedure used in Molecular Similarity to compare structures is divided into four steps: 1) load the structures to be compared; 2) define the atom equivalence in these structures; 3) perform a fitting (superposition) operation; and 4) analyze the results.

Each structure is identified by a name. One structure is then identified as the target (i.e., the fixed structure); all remaining structures are working structures (i.e., moving structures). Since atom equivalency within QUANTA is defined by user input, for the purpose of this invention rmsd values were determined using main chain atoms for amino acids H32, W33 and L94 between the two structures being compared.

When a rigid fitting method is used, the working structure is translated and rotated to obtain an optimum fit with the target structure. The fitting operation uses an algorithm that computes the optimum translation and rotation to be applied to the moving structure, such that the root mean square difference of the fit over the specified pairs of equivalent atoms is an absolute minimum. After superposition of the two structures, an rmsd value can be calculated for specific sets of equivalent atoms.
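The rigid fitting operation described above may be sketched as follows. This is not the algorithm of the QUANTA Molecular Similarity application, whose internals are not described here; it is a substitute illustration using Horn's quaternion method, in which the largest eigenvalue of a 4x4 matrix built from the paired coordinates gives the minimum rmsd directly. The eigenvalue is found by simple power iteration. The function names and the example coordinates are hypothetical.

```python
import math

def centered(points):
    """Shift (x, y, z) points so that their centroid sits at the origin."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    return [tuple(p[i] - c[i] for i in range(3)) for p in points]

def plain_rmsd(a, b):
    """RMSD over paired atoms without any fitting."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))

def fitted_rmsd(moving, target, iterations=2000):
    """RMSD over paired atoms after the optimal translation and rotation."""
    p, q = centered(moving), centered(target)
    # Cross-covariance sums s[i][j] = sum over atoms of p[i] * q[j].
    s = [[sum(a[i] * b[j] for a, b in zip(p, q)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric matrix; its largest eigenvalue corresponds to the best fit.
    n_mat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Shift the spectrum so all eigenvalues are positive, then power-iterate.
    shift = 1.0 + sum(abs(v) for row in n_mat for v in row)
    for i in range(4):
        n_mat[i][i] += shift
    vec = [1.0, 0.3, 0.2, 0.1]
    for _ in range(iterations):
        new = [sum(n_mat[i][j] * vec[j] for j in range(4)) for i in range(4)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    lam = sum(vec[i] * n_mat[i][j] * vec[j]
              for i in range(4) for j in range(4)) - shift
    total = sum(x * x for pt in p + q for x in pt)
    return math.sqrt(max(0.0, (total - 2.0 * lam) / len(p)))

# Placeholder "target" atoms and a rotated, translated copy as "working" atoms.
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0),
          (0.0, 0.0, 2.5), (1.0, 1.0, 1.0)]
ang = math.radians(40.0)
working = [(math.cos(ang) * x - math.sin(ang) * y + 3.0,
            math.sin(ang) * x + math.cos(ang) * y - 2.0,
            z + 5.0) for x, y, z in target]

print(plain_rmsd(working, target), fitted_rmsd(working, target))
```

In this example the unfitted rmsd is large because of the applied shift, whereas the fitted rmsd is numerically close to zero, as expected for two copies of the same shape.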

4. Coordinates Stored on Machine Readable Medium

In a seventh embodiment, there is provided a computer-readable data storage medium comprising a data storage material encoded with the structure coordinates, or at least a portion of the structure coordinates set forth in FIG. 9. Examples of such computer readable data storage media are well known to those skilled in the art and include, for example CD-ROM and diskette (“floppy disks”).

Thus, in accordance with the present invention, the structure coordinates of a HPV E2-inhibitor complex, and in particular a HPV E2 TAD-L complex, and portions thereof can be stored in a machine-readable storage medium. Such data may be used for a variety of purposes, such as drug discovery and X-ray crystallographic analysis of protein crystals.

Accordingly, in an eighth embodiment, there is provided a computer for generating a three dimensional representation of the HPV E2 TAD-L complex, comprising:

- a) a computer readable data storage medium comprising a data storage material encoded with the structure coordinates set forth in FIG. 9;
- b) a memory for storing instructions for processing said computer readable data;
- c) a central processing unit coupled to said computer readable data storage medium for processing said computer readable data into said three dimensional representation; and
- d) a display unit coupled to said central processing unit for displaying said three dimensional representation.
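As a hedged illustration of how stored coordinates might be processed, the sketch below reads fixed-width records and extracts the atoms of the deep-cavity residues H32, W33 and L94. It assumes, without confirmation from the text, that the coordinates are stored in the widely used PDB ATOM record layout; the helper names and all coordinate values are invented placeholders, not data from FIG. 9.

```python
# Residue numbers of the deep cavity named in the text: H32, W33 and L94.
POCKET_RESIDUES = {32, 33, 94}

def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build one PDB-style ATOM record with fixed-width columns."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}")

def parse_atom(line):
    """Read atom name, residue and coordinates from a PDB-style ATOM record."""
    return {
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }

# Placeholder records (coordinates are made up for the example).
records = [
    format_atom(1, "CA", "TYR", "A", 19, 10.000, 11.000, 12.000),
    format_atom(2, "CA", "HIS", "A", 32, 13.500, 9.250, 14.125),
    format_atom(3, "CA", "TRP", "A", 33, 15.000, 8.000, 16.500),
    format_atom(4, "CA", "LEU", "A", 94, 12.250, 6.750, 18.000),
]

# Keep only the atoms belonging to the deep-cavity residues.
pocket_atoms = [atom for atom in map(parse_atom, records)
                if atom["res_seq"] in POCKET_RESIDUES]

for atom in pocket_atoms:
    print(atom["res_name"], atom["res_seq"], atom["xyz"])
```

The same selection could be widened to H29 and T97, or to the shallow-cavity residues, by adding their numbers to `POCKET_RESIDUES`.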
  
5. 3-Dimensional Structure of Pocket

The invention also provides a 3-dimensional structure of at least a portion of the molecular complex, which contains features structurally similar to a HPV E2 TAD inhibitor binding pocket.

The shape of the inhibitor binding pocket, according to the present invention, can be viewed as comprising a deep pocket and, optionally, a shallower pocket (see FIG. 7). The shape of the deep cavity is defined by the relative positions of the side chains of amino acids H32, W33 and L94 and not their absolute coordinates according to FIG. 9. Similar coordinates or a similar three-dimensional model may be obtained from different techniques (e.g. NMR, modeling, etc.) and are considered to fall within the scope of the present invention.

Thus, this invention also provides the three-dimensional structure of an HPV E2-inhibitor complex, specifically an HPV E2 TAD-L complex. Importantly, this has provided for the first time, information about the shape and structure of this HPV E2 TAD inhibitor-binding pocket.

6. Using the Three-dimensional Model for Screening

In a ninth embodiment, there is provided a method for evaluating the potential of a chemical entity to associate with a papillomavirus E2 transactivation domain comprising a binding pocket defined by the structure coordinates of an HPV-11 E2 protein transactivation domain comprising amino acids H32, W33 and L94, or a three-dimensional model thereof.

Optionally, the invention further provides for the same method where the binding pocket further comprises the structure coordinates of one or both of H29 and T97 that define the bottom of the deep pocket.

Optionally, the invention further provides for the same method where the binding pocket further comprises the structure coordinates of at least one amino acid selected from the group consisting of: L15, I36, E39, K68, N71 and A72.

For the first time, the present invention permits the use of structure-based or rational drug design techniques to design, select, and synthesize chemical entities, including inhibitory compounds that are capable of fitting and/or binding to HPV E2 TAD inhibitor binding pocket, or any portion thereof.

One particularly useful drug design technique enabled by this invention is iterative drug design. Iterative drug design is a method for optimizing associations between a protein and a compound by determining and evaluating the three-dimensional structures of successive sets of protein/compound complexes.

Those of skill in the art will realize that association of natural ligands or substrates with the binding pocket of their corresponding receptors or enzymes is the basis of many biological mechanisms of action. Similarly, many drugs exert their biological effects through association with the binding cavities of receptors and enzymes. Such associations may occur with all or any parts of the binding pocket. An understanding of such associations will help lead to the design of drugs having more favorable associations with their target receptor or enzyme, and thus, improved biological effects. Therefore, this information is valuable in designing potential ligands or inhibitors of receptors or enzymes, such as inhibitors of HPV E2-like polypeptides, and more importantly HPV E2 TAD.

In iterative drug design, crystals of a series of protein/compound complexes are obtained and then the three-dimensional structure of each complex is solved. Such an approach provides insight into the association between the proteins and compounds of each complex. This is accomplished by selecting compounds with inhibitory activity, obtaining crystals of this new protein/compound complex, solving the three-dimensional structure of the complex, and comparing the associations between the new protein/compound complex and previously solved protein/compound complexes. By observing how changes in the compound affected the protein/compound associations, these associations may be optimized.

In some cases, iterative drug design is carried out by forming successive protein-compound complexes and then crystallizing each new complex. Alternatively, a pre-formed protein crystal is soaked in the presence of an inhibitor, as described above, thereby forming a protein/compound complex and obviating the need to crystallize each individual protein/compound complex. Advantageously, the HPV E2 protein crystals, and in particular the E2 TAD crystals, provided by this invention may be soaked in the presence of an inhibitor or in particular an E2 inhibitor, such as compound L, to provide E2-inhibitor crystal complexes, as described above.

7. Using the Pocket for Screening

In certain instances, one may be able to engineer an E2 TAD lacking the side chain of Y19 to reproduce the inhibitor-binding pocket as defined herein. Such modifications of the primary sequence to achieve a similar binding pocket are intended to be within the scope of the present invention. Also covered is the use of such a modified E2 TAD for screening purposes (by NMR, MS, probe displacement assays, etc.) to screen for potential inhibitors of the newly defined pocket.

8. Alteration of Cottontail Rabbit Papillomavirus (CRPV) E2 for Efficient Binding of Inhibitors

In a tenth embodiment, there is provided a method for producing an E2 protein, said protein being useful for identifying or characterizing E2 TAD inhibitors, comprising:

- a) using the HPV E2 TAD-L crystal structure, as defined above, to identify HPV inhibitor binding pocket residues;
- b) comparing said HPV inhibitor binding pocket residues with Cottontail Rabbit Papilloma Virus (CRPV) protein residues;
- c) mutating said CRPV residues to said HPV residues, to produce a hybrid; and
- d) testing said hybrid for inhibition by an inhibitor.

Infection of laboratory rabbits with cottontail rabbit papillomavirus (CRPV) or introduction of the CRPV genome into the skin of these rabbits results in the growth of large warts. The CRPV model system has been used to evaluate potential anti-HPV treatments (Kreider, J. W., et al. (1992) “Preclinical system for evaluating topical podofilox treatment of papillomas: dose response and duration of growth prior to treatment” J. Invest. Dermatol. 99, 813–818.). One can envisage that this would constitute a convenient system for testing the in vivo efficacy of E2-binding HPV DNA replication inhibitors. However, the CRPV and HPV E2 proteins share only 39% sequence identity and inhibitors which bind to the HPV protein may not bind to CRPV E2.

The HPV E2 TAD-inhibitor crystal structure, as described herein, can be used to identify residues which are members of the HPV inhibitor binding pocket and which differ in the CRPV protein. The corresponding CRPV residues can then be mutated to the HPV counterpart. The resulting hybrid can be tested by in vitro translation of the hybrid gene to produce an E2 protein which could be tested in in vitro assays, such as the E2-dependent E1-DNA binding assay (see Example 6). If the hybrid protein is functional in the assay, and proves to be sensitive to HPV inhibitors, the corresponding gene can be used to induce the growth of warts on rabbits. Warts resulting from this procedure should be treatable by inhibitors originally targeted to HPV E2. Thus this hybrid model, generated by analysis of the HPV TAD inhibitor complex, could be used to test HPV compounds in an animal model. This technique may also be applicable to other papilloma viruses such as, but not limited to, bovine papilloma virus (BPV).
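Steps a) to c) of the method above can be sketched as a simple residue-by-residue comparison. In the following minimal Python example, the HPV-11 cluster sequences and the pocket residue numbers are those given in this description, whereas the second set of sequences is an invented placeholder standing in for an aligned sequence from another papillomavirus. It is NOT the CRPV E2 sequence, and a gap-free alignment with identical numbering is assumed. The function names are hypothetical.

```python
# HPV-11 E2 TAD clusters as listed above, keyed by first residue number.
HPV11_CLUSTERS = {15: "LLELYEE", 28: "KHIMHWKCIRLE", 68: "KGHNA",
                  90: "EPWTLQDTSYEMLT"}

# Invented placeholder clusters for a second virus (NOT real CRPV data).
PLACEHOLDER_CLUSTERS = {15: "ILELYEE", 28: "KHIMYWKCIRLE", 68: "KGHNA",
                        90: "EPWTLQDSSYEMLT"}

# Pocket residues named in the text: Y19, the deep cavity (H32, W33, L94),
# its bottom (H29, T97) and the shallow cavity (L15, I36, E39, K68, N71, A72).
POCKET_POSITIONS = [15, 19, 29, 32, 33, 36, 39, 68, 71, 72, 94, 97]

def residue_map(clusters):
    """Map residue number -> one-letter amino acid for all clusters."""
    return {start + offset: aa
            for start, seq in clusters.items()
            for offset, aa in enumerate(seq)}

def pocket_mutations(reference, other, positions):
    """List mutations (e.g. 'Y32H') that convert 'other' to 'reference'."""
    ref_map, other_map = residue_map(reference), residue_map(other)
    return [f"{other_map[pos]}{pos}{ref_map[pos]}"
            for pos in positions
            if other_map[pos] != ref_map[pos]]

mutations = pocket_mutations(HPV11_CLUSTERS, PLACEHOLDER_CLUSTERS, POCKET_POSITIONS)
print(mutations)
```

Each listed mutation names the residue to be replaced, its position, and the HPV residue to be introduced in order to produce the hybrid of step c).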

In order that this invention be more fully understood, the following examples are set forth. These examples are for illustrative purposes only and are not to be construed as limiting the scope of this invention in any way.

EXAMPLES
Example 1

Expression and Purification of HPV-11 E2 TAD

Expression of His-tagged HPV-11 E2 transactivation domain. Amino acids 2–201 of HPV-11 E2 (SEQ ID NO. 2) were amplified by PCR from plasmid pCR3-E2 (Titolo, 1999) using the primers 5′-CAA GAC GTG CGC TAG ACC ATG GGA CAT CAC CAT CAC CAT CAC GAA GCA ATA GCC AAG-3′ (sense) (SEQ ID NO. 3) and 5′-CAC CAA GTG GAT CCG CTA GCT TAG CTA GAT ACA GAT GCA GGA-3′ (antisense) (SEQ ID NO. 4). The PCR product was digested using NcoI and BamHI and ligated into plasmid pET-28b, which had been similarly digested. The ligation product was transformed into MAX Efficiency® competent DH5α E. coli (Life Technologies). Recombinant plasmid encoding His-tagged HPV11 E2 TAD (His-TAD) was isolated from a culture of the transformed DH5α, and the DNA sequence of the E2 TAD was verified to be correct. The isolated plasmid was then transformed into E. coli strain BL21(DE3)pLysS (Novagen).
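The design of the sense primer can be checked with a short script. The sketch below, offered only as an illustration, translates the primer of SEQ ID NO. 3 from its first ATG using the standard genetic code and confirms the presence of the NcoI recognition site (CCATGG). The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate_from_first_atg(dna):
    """Translate complete codons starting at the first ATG of a DNA string."""
    seq = dna.replace(" ", "").upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    codons = [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]
    return "".join(CODON_TABLE[c] for c in codons)

# Sense primer as given in the text (SEQ ID NO. 3), written 5' to 3'.
SENSE_PRIMER = ("CAA GAC GTG CGC TAG ACC ATG GGA CAT CAC CAT CAC CAT CAC "
                "GAA GCA ATA GCC AAG")

peptide = translate_from_first_atg(SENSE_PRIMER)
has_ncoi_site = "CCATGG" in SENSE_PRIMER.replace(" ", "")
print(peptide, has_ncoi_site)
```

The translated stretch begins with Met-Gly followed by six histidines, consistent with an N-terminal His tag placed ahead of the E2 coding sequence.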

A second construct encoding an additional four lysines placed at the C-terminus of the E2 transactivation domain (Lys-tailed TAD) was generated by PCR using the sense primer 5′-GGG CGC TAG ACC ATG GGA CAT CAC CAT CAC CAT CAC GAA GCA ATA GCC AAG CGT TTA G-3′ (SEQ ID NO. 5) and the antisense primer 5′-CCC CGG ATC CTC ATT ACT TTT TCT TTT TGC TAG ATA CAG ATG CAG GAG AAC-3′ (SEQ ID NO. 6). This PCR product was digested as above and ligated into plasmid pET15b. The DNA sequence encoding HPV11 E2 amino acids 2–201 was verified to be correct, and the plasmid was transformed into E. coli strain BL21(DE3)pLysS as described above.

For protein expression, CircleGrow medium (Bio101) containing 34 μg/mL chloramphenicol and 50 μg/mL kanamycin (His-TAD) or 100 μg/mL ampicillin (Lys-tailed TAD) was inoculated with one-twenty fifth volume of a fresh overnight culture and cells were grown at 37° C. until an O.D.(600 nm) of approximately 1.0 was reached. The culture was then shifted to 22° C. and protein expression was induced at O.D.(600 nm)=1.4 with 0.5 mM IPTG. After six hours, cells were harvested by centrifugation and frozen on dry ice, then stored at −80° C.

Purification of His-tagged HPV11 TAD proteins. The purification procedure was identical for the His-tagged TAD and Lys-tailed TAD proteins; all steps were performed at 4° C. Cells were resuspended at 5 mL per gram in purification buffer (25 mM Tris-HCl pH 8.0, 500 mM NaCl, 5 mM TCEP) plus protease inhibitors pepstatin, leupeptin, and antipain (each at 5 μg/ml), phenylmethylsulfonyl fluoride (1 mM), and Pefabloc® (Roche, 0.4 mM). The suspension was sonicated, and the crude lysate was centrifuged for 30 min at 26,000 g. The supernatant was injected onto a 5 mL Hi-Trap chelating column (APB) equilibrated with nickel sulfate. After washing with purification buffer plus 0 mM and 25 mM imidazole, TAD was eluted with purification buffer containing 100 mM imidazole. TAD-containing fractions were pooled and concentrated to less than 5 mL, then loaded onto a Superdex-75 gel filtration column (APB) equilibrated with purification buffer plus 0.1 mM EDTA. Fractions containing pure TAD were pooled and concentrated to approximately 5 mg/mL (His-tagged TAD) or 12 mg/mL (Lys-tailed TAD). Concentrated protein was aliquoted, frozen on dry ice, and stored at −80° C.

Expression and purification of His-TAD containing selenomethionine. The plasmid encoding His-TAD was transformed into E. coli strain B834 (auxotrophic for methionine). A single bacterial colony was used to inoculate an overnight culture in LB medium containing 34 μg/mL chloramphenicol and 50 μg/mL kanamycin. A portion of this culture was diluted 4000-fold in DL30 medium (D. M. LeMaster and R. M. Richards, Biochemistry (1985) v24, 7263–68), lacking methionine and supplemented with 2 μg/mL biotin and thiamin and 50 μg/mL D,L-selenomethionine and the same antibiotics. After 26 hours at 37° C., the culture had reached a density of 0.8 (O.D. 600 nm), and expression was induced at 23° C. with 0.5 mM IPTG. After approximately seven hours, cells were harvested and stored as described above. Purification was performed as described above for His-TAD, except that purification buffers were sparged with helium before use, and His-TAD was eluted with 200 mM imidazole after washes at 50 and 100 mM.

Example 2

Synthesis and Purification of Compound L

5-Methyl 1,3-indanedione (A)

To a suspension of 4-methyl phthalic anhydride (25.65 g, 158.2 mmol) in MeOH (79 mL) at room temperature, was added sodium methoxide (69 mL of 25% wt solution, 316 mmol). After 30 min. the reaction mixture was diluted with water and the aqueous layer was washed with Et₂O. The aqueous layer was acidified with HCl (4N) and extracted with Et₂O. The organic layer was rinsed with brine, dried (MgSO₄), filtered and concentrated under reduced pressure.

The crude residue was dissolved in acetonitrile (79 mL) and cooled to 0° C. To the resulting solution was added successively DBU (31.3 g, 206 mmol), and iodomethane (33.7 g, 237.3 mmol). After 1 hour at 0° C., iodomethane (33.7 g, 237.3 mmol) was added and the reaction was warmed to room temperature and stirred for a further hour. The reaction mixture was concentrated under reduced pressure, and the residue was diluted with Et₂O (300 mL). The ethereal solution was washed successively with aqueous HCl (4N, 100 mL), NaOH (10%) and brine, dried (MgSO₄), filtered and concentrated to dryness. The resulting residue was treated with an ethereal solution of diazomethane to complete the esterification, after which the mixture was concentrated to give the 4-methyl dimethyl phthalate (22.2 g, 67% yield) as a pale yellow oil.

To a solution of crude 4-methyl dimethyl phthalate (22.20 g, 106.6 mmol) in ethyl acetate (107 mL), was added sodium hydride (97%, 3.84 g, 160 mmol). The resulting suspension was heated to reflux for 4.5 hours followed by cooling to room temperature and Et₂O (100 mL) addition to give a yellow precipitate. The yellow solid was filtered and washed twice with a mixture of ethyl alcohol/diethyl ether (1/1).

This yellow solid was then dissolved in HCl (4N, 100 mL) and heated to reflux for 30 min. After cooling, EtOAc was added and the organic phase was separated, washed with brine, dried (MgSO₄), filtered and concentrated to give 5-methyl 1,3-indanedione as a yellow solid (3.7 g, 22% yield).

Step a:

To a solution of 5-methyl indan-1,3-dione (A) (410 mg, 2.6 mmol) in EtOH (13 mL) was added 3,4-dichlorobenzaldehyde (B) (493 mg, 2.8 mmol) followed by piperidine (1 drop). The reaction mixture was heated at reflux for 30 min. After cooling, aqueous hydrogen peroxide (30%, 0.87 mL, 7.7 mmol) and DBU (97 mg, 0.6 mmol) were added to the reaction mixture. Stirring was continued for 30 min., then hexane (5 mL) was added and the precipitate was filtered. The resulting solid was triturated twice with a mixture of propanol/hexane (1/1) and dried under high vacuum to give 3-(3,4-dichlorophenyl)-spiro (oxirane-2,2′-[5-Methyl-indan])-1′,3′-dione (D) (701 mg, 82% yield).

Step c:

A mixture of 3-(3,4-dichlorophenyl)-spiro (oxirane-2,2′-[5-Methyl-indan])-1′,3′-dione (D) (200 mg, 0.8 mmol) and 1-(4-[1,2,3]thiazol-4-yl-phenyl)-pyrrole-2,5-dione (E) (155 mg, 0.6 mmol) in toluene (4.6 mL) was heated to reflux for 16 h. After cooling and concentration, the residue was triturated with EtOAc to give a mixture of two compounds F/G (racemic cis/cis isomers, 228 mg, 60% yield).

Step d:

To a solution of compounds F/G (210 mg, 0.36 mmol) in CH₃CN (36 mL) was added NaOH (0.02N, 17.8 mL, 0.36 mmol) using a syringe pump over 1 h. After the addition was completed, the reaction mixture was stirred for an extra 1 h. The solution was then lyophilized to give a mixture of racemic compounds J/K (227 mg, quantitative yield). Pure enantiomer L was obtained via separation on preparative HPLC using a chiral column (Chiralcel OD, isocratic eluent 65% CH₃CN/H₂O containing 0.06% TFA; UV lamp at 205 nm; flow 7 mL/min.). The desired fractions were combined and lyophilized. The corresponding sodium salt was prepared by treatment with NaOH (0.02N, 1 equiv.) in acetonitrile followed by lyophilization to give the sodium salt (15 mg) as a white solid. L: ¹H-NMR (400 MHz, DMSO-d₆) δ 10.35 (s, 1H), 8.40 (d, J=8.6 Hz, 2H), 7.89–7.80 (m, 3H), 7.64 (m, 3H), 7.52 (d, J=8.3 Hz, 1H), 7.51–7.34 (m, 1H), 5.75 (s, 1H), 4.19 (m, 1H), 3.78 (m, 1H), 2.57 (s, 3H); ES MS m/z 606 (MH+).

The inhibitory activity of the compound was assessed according to the enzymatic assay described in Example 6, and the compound was determined to have an IC₅₀ of 180 nM. Selectivity of the inhibitor was verified by lack of activity (or lower potency) in the SV40 large T antigen assay as described in Example 7.

Example 3

E2 TAD-Inhibitor Complex Formation

Inhibitor L powder was solubilized in 100% DMSO at a concentration of 60 mM. The protein solution consisted of 10 mg/mL E2TAD in purification buffer (25 mM Tris-HCl pH 8.0, 500 mM NaCl, 5 mM TCEP, 0.1 mM EDTA). The E2TAD-L complex was made by mixing 1 μL of the inhibitor L solution with 74 μL of protein solution. The solution was kept at 4° C. for 2–3 hours before the crystallization experiments were performed.
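The final composition of the complex solution follows from simple dilution arithmetic, sketched below in Python. The function name `mix` and its parameter names are hypothetical; the volumes and concentrations are those stated above, and volumes are assumed to be additive.

```python
def mix(stock_mM, stock_uL, protein_mg_per_mL, protein_uL):
    """Final concentrations after combining inhibitor stock and protein solution."""
    total_uL = stock_uL + protein_uL
    return {
        "inhibitor_mM": stock_mM * stock_uL / total_uL,
        "protein_mg_per_mL": protein_mg_per_mL * protein_uL / total_uL,
        "dmso_percent_v_v": 100.0 * stock_uL / total_uL,
    }

# 1 uL of 60 mM inhibitor L in DMSO added to 74 uL of 10 mg/mL E2TAD.
final = mix(stock_mM=60.0, stock_uL=1.0, protein_mg_per_mL=10.0, protein_uL=74.0)

for key, value in final.items():
    print(f"{key}: {value:.3f}")
```

Under these assumptions the mixture contains 0.8 mM inhibitor, slightly under 10 mg/mL protein, and roughly 1.3% (v/v) DMSO.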

Example 4

Crystallization and Data Collection

Crystallization of the apo-E2 TAD and of the E2TAD-L complex was carried out using the hanging drop vapor diffusion technique (A. McPherson, Preparation and Analysis of Protein Crystals, Krieger Pub. 1989) in VDX crystallization plates (Hampton Research, Laguna Niguel, Calif.).

In particular for the apo HPV-11 E2 TAD: 1 μL of the Se-E2 TAD solution (5 mg/mL in purification buffer) was mixed with 1 μL of a solution made of 0.1M Na succinate pH 5.0, 18% PEG5000 mme and 0.2M ammonium sulfate. The resulting 2 μL drop was suspended above a 1 mL reservoir solution made of 0.1M Na succinate pH 5.0, 18% PEG5000 mme and 0.2M ammonium sulfate. The crystals obtained at 4° C. belong to space group C222 with unit cell dimensions of a=54.9 Å, b=169.9 Å and c=46.1 Å and contain one molecule per asymmetric unit.

Diffraction data were collected on beamline X4a (NSLS, Brookhaven National Laboratory, New York). Four data sets were collected from a single crystal cooled at 100 K, at four different x-ray wavelengths near the selenium absorption edge (0.9790 Å, 0.9794 Å, 0.9743 Å, and 0.9879 Å). Images were collected on an ADSC Q4 CCD; the maximum resolution was 2.4 Å.
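For orientation, the four wavelengths can be converted to photon energies with E = hc/λ. The following Python sketch performs the conversion; the function name is hypothetical, and the selenium K-edge energy of about 12.658 keV used for comparison is a commonly tabulated value added here for illustration, not a value stated in this description.

```python
# Planck constant times the speed of light, expressed in keV·angstrom.
HC_KEV_ANGSTROM = 12.398419843

# Commonly tabulated Se K absorption edge (added reference, approximate).
SE_K_EDGE_KEV = 12.658

def wavelength_to_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an x-ray wavelength given in angstroms."""
    return HC_KEV_ANGSTROM / wavelength_angstrom

wavelengths = [0.9790, 0.9794, 0.9743, 0.9879]  # angstroms, from the text
energies = [wavelength_to_energy_kev(w) for w in wavelengths]

for w, e in zip(wavelengths, energies):
    offset_ev = (e - SE_K_EDGE_KEV) * 1000.0
    print(f"{w:.4f} A -> {e:.4f} keV ({offset_ev:+.0f} eV relative to the edge)")
```

Two of the wavelengths lie within a few electron volts of the edge, while the other two lie well above and well below it, which is the usual pattern for a multiwavelength anomalous dispersion experiment.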

For crystallization of the complex: 1 μL of the complex solution, as described in Example 3, was mixed with 1 μL of a solution made of 0.1M MES pH 5.5 and 35% MPD (methyl pentane diol). The resulting drop was suspended above a 1 mL reservoir solution made of 0.1M MES pH 5.5, 35% MPD. Plates were then stored at 4° C. The crystals obtained belong to space group P4(1) with unit cell dimensions of a=b=60.7 Å and c=82.5 Å and contain one molecule per asymmetric unit.

Initial diffraction data were measured using a home source x-ray generator (Rigaku, Japan) equipped with an R-axis II image plate area detector (Molecular Structure Corp, Texas). Data to a resolution of 3.15 Å were collected on a single crystal of the complex cooled at 100 K.

High resolution diffraction data were then collected on beamline X25 (NSLS, Brookhaven National Laboratory, New York). Diffraction images were collected on a Brandeis B4 detector (Brandeis University) mounted on a kappa-axis goniometer (Enraf-Nonius, The Netherlands). A full data set to a resolution of 2.4 Å was collected on a single crystal of the complex cooled at 100 K (presented in FIG. 9).

Example 5

Phasing, Model Building and Refinement

Phasing of the apo crystal data was done by MAD (Multiwavelength Anomalous Dispersion) using the program MLPHARE (Collaborative Computational Project, Number 4, 1994, The CCP4 suite: programs for protein crystallography, Acta Cryst. D50, 760–763).

For the complex crystal, the Molecular Replacement (MR) method was used for initial estimation of the diffraction data phases. The apo structure of Se-E2TAD was used as a model. Rotation and translation searches were done using the program AMoRe (Collaborative Computational Project, Number 4, 1994, The CCP4 suite: programs for protein crystallography, Acta Cryst. D50, 760–763).

Model building into the electron density map was carried out with the software O (Alwyn Jones, Uppsala University, Sweden) and model refinement was done with the software CNX (Molecular Simulation Inc, San Diego, Calif.). The new model was then improved by a cycling procedure including electron-density map calculation, model rebuilding and model refinement steps. The final model included residues 2 to 196 of E2TAD and two inhibitor L molecules. The final crystallographic R factor was 24.6% and the R_free factor was 29.3%.
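The crystallographic R factor quoted above is conventionally defined as the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes; R_free is the same quantity computed over a subset of reflections set aside from refinement. The Python sketch below illustrates that conventional definition only. It is not the CNX implementation, scaling between observed and calculated amplitudes is assumed to have been applied already, and the amplitudes are placeholder numbers rather than data from this work.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Placeholder amplitudes (arbitrary units), for illustration only.
work_obs, work_calc = [100.0, 50.0, 80.0, 20.0], [90.0, 55.0, 70.0, 25.0]
free_obs, free_calc = [60.0, 40.0], [45.0, 50.0]

r_work = r_factor(work_obs, work_calc)   # reflections used in refinement
r_free = r_factor(free_obs, free_calc)   # reflections held out of refinement

print(f"R = {100 * r_work:.1f}%, R_free = {100 * r_free:.1f}%")
```

Because the held-out reflections are never used to adjust the model, R_free is normally somewhat higher than R, as is the case for the values reported above.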

Example 6

E2-Dependent E1 Origin-Binding Assay

This assay was modeled on a similar assay for SV40 T Antigen described by McKay (J. Mol. Biol., 1981, 145:471). A 400 bp radiolabeled DNA probe, containing the HPV-11 origin of replication (Chiang et al., 1992, Proc. Natl. Acad. Sci. USA 89:5799) was produced by PCR, using plasmid pBluescript™ SK encoding the origin (nucleotides 7886-61 of the HPV-11 genome in the unique BamHI site) as template and primers flanking the origin. Radiolabel was incorporated as [³³P]dCTP. Binding assay buffer consisted of: 20 mM Tris pH 7.6, 100 mM NaCl, 1 mM DTT, 1 mM EDTA.

Other reagents used were protein A-SPA beads (type II, Amersham) and K72 rabbit polyclonal antiserum, raised against a peptide corresponding to the C-terminal 14 amino acids of HPV-11 E1. Following the protocol from Amersham, one bottle of beads was mixed with 25 mL of binding assay buffer. For the assay, a saturating amount of K72 antiserum was added to the beads and the mixture was incubated for 1 h, washed with one volume of binding assay buffer, and then resuspended in the same volume of fresh binding assay buffer. Binding reactions contained 8 ng of E2, approximately 100–200 ng of E1-containing nuclear extract expressed from baculovirus-infected cells (as reported in WO 99/57283), and 0.4 ng of radiolabeled probe in a total of 80 μL of binding assay buffer. After 1 h at room temperature, 25 μL of K72 antibody-SPA bead suspension was added to the binding reaction and mixed. After an additional hour of incubation at room temperature, the reactions were centrifuged briefly to pellet the beads and the extent of complex formation was determined by scintillation counting on a Packard TopCount™. Typically, the signal for reactions containing E1 and E2 was 20–30 fold higher than the background observed when either E1, E2, or both was omitted.

Example 7

SV40 T Antigen-DNA Binding Assay

This assay measures the formation of an SV40 T Antigen (TAg)-origin complex. The assay was developed by R. D. G. McKay (J. Mol. Biol. (1981) 145, 471–488). In principle, it is very similar to the E2-dependent E1-DNA binding assay (Example 6), with TAg replacing E1 and E2, and a radiolabeled SV40 ori probe replacing the HPV ori probe. The assay is used as a counterscreen for the assay of Example 6, since TAg shares functional homology to E1 and E2, but has very low sequence similarity.

The radiolabeled ori-containing DNA probe was made by PCR using pCH110 plasmid (Pharmacia) as a template. This template encodes the SV40 minimal origin of replication at nucleotides 7098-7023. Primers were “sv40-6958sens”=5′-GCC CCT AAC TCC GCC CAT CCC GC (SEQ ID NO. 7), and “sv40-206anti”=5′-ACC AGA CCG CCA CGG CTT ACG GC (SEQ ID NO. 8). The PCR product was approximately 370 base pairs long and was radiolabeled using 50 μCi/100 μL PCR reaction of dCTP (α-³³P). Subsequent to the PCR reaction, the product was purified using either the Qiagen® PCR purification kit, or a phenol extraction/ethanol precipitation procedure. The purified product was diluted to 1.5 ng/μL (estimated by gel electrophoresis) in TE. Fresh preparations had approximately 150,000 cpm/μL.

Binding reactions were performed by mixing 30 μL of TAg solution (100 ng/well), 200 ng of a ³³P-radiolabeled DNA probe, and 7.5 μL of 10×DNA binding buffer (200 mM Tris-HCl pH 7.6, 100 mM NaCl, 1 mM EDTA, 10 mM DTT) in a final volume of 75 μL. Binding reactions were allowed to proceed at room temperature for 60 min. The large T antigen was purchased from Chimerx at 2.0 mg/mL.

The protein-DNA complexes were immunocaptured using an α-TAg monoclonal antibody (PAb 101, subtype IgG2a, hybridoma obtained from ATCC and antibody purified in-house) bound to protein A-SPA beads. Immunoprecipitation of protein-DNA complexes was carried out for 1 hr at room temperature. The plates were spun briefly and the precipitated radiolabeled DNA fragments were counted on a TopCount® counter.

Discussion

FIG. 2 shows a model of the crystal structure of E2 TAD from HPV-16 (Antson et al., 2000, Nature, (403) 805–809). A zoom view on the binding pocket region in this model, as shown in FIG. 5, reveals that amino acids Y32, W33 and L94 define a cavity that is too small to define a suitable pocket that will enable a small molecule inhibitor to bind therein, without comparable adjustments of the amino acid side chains to accommodate the inhibitor.

Even when the corresponding HPV-11 E2 TAD domain is crystallized and modeled, the corresponding amino acids again reveal a cavity too small to define any sort of pocket that could be viewed as a target suitable for inhibitor-binding (FIG. 6A). As shown in FIG. 6B, the present invention shows for the first time that the crystal structure of the new E2 TAD-inhibitor complex provides a novel and unexpected inhibitor-binding pocket that constitutes a unique tool for identifying potential inhibitors of the HPV DNA replication process.

Surprisingly, the structure of the E2 TAD-inhibitor complex reveals that binding of inhibitor L induces a movement of the side chain of tyrosine at position 19 (FIG. 7) where the aromatic ring rotates in a significant manner out of the small cavity seen in the apo-structure, resulting in the formation of a deep cavity. The movement of the tyrosine 19 side chain gives an rms deviation for all atoms of 1.959 Å. One skilled in the art will understand that this deviation constitutes a huge movement, which could not have been predicted to occur on its own or in the presence of a small molecule inhibitor.

In addition, the imidazole ring of histidine 32 rotates by 90 degrees to accommodate the inhibitor but still remains part of the deep cavity. The movement of the histidine 32 main chain gives an rms deviation for all atoms of 0.704 Å. Neither of these two rotational movements could have been predicted to occur and result in the formation of this deep cavity within the binding pocket.

As shown in FIG. 6A, the deep cavity is defined by amino acids histidine 32, tryptophan 33, and leucine 94. The “all atoms” rmsd displacement of these three amino acid residues is 0.515 Å. Such an rmsd cannot be accounted for by the native flexibility of these residues within the context of the binding pocket. Indeed, an rms deviation of 1.0 Å is considered within normal limits in the context of a whole protein of 200 amino acids. In the present case, the rms variation for all atoms of H32, W33 and L94 between the HPV-16 apo E2TAD of Antson supra and Applicant's HPV-11 apo E2TAD is 0.212 Å. This defines the predictable (upper) limit by which these 3 residues can move in concert. The present invention is outside that range of predictable movement for these three residues.

Serine 98 is not on the same plane as H32, W33 and L94 and forms part of a shallower portion that may also be used for generating models of a larger pocket comprising a deep cavity formed by H32, W33 and L94 and a shallow cavity defined by one or more amino acids selected from: L15, I36, E39, K68, N71, A72, S98 and Y99 (see FIG. 8).

FIG. 9 lists the X-ray coordinates of the protein-inhibitor complex, which can be used for modeling purposes. Apparent from these coordinates is the fact that the complex obtained by the Applicant contains two molecules of inhibitor; however, the model revealed that the second inhibitor resides outside the deep cavity and does not interact with the protein in a significant manner. Also, the following amino acids are modeled as alanine due to their high flexibility, which renders them invisible to x-ray: E2, K107, K173, S180, M182, H183 and P196.
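As an illustration of how such coordinates might be used for modeling, the following minimal sketch selects the atoms of the deep-cavity residues from a coordinate listing. It assumes the coordinates are available as fixed-column PDB-style ATOM records, which is an assumption about the format of FIG. 9. The helper names and the toy records are hypothetical, and the toy coordinate values are placeholders rather than data from FIG. 9.

```python
DEEP_CAVITY = {32, 33, 94}  # H32, W33 and L94, as named in the text


def make_atom_line(serial, atom, res_name, chain, res_seq, x, y, z):
    """Build a PDB-style ATOM record (hypothetical helper for the toy data)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, atom, res_name, chain, res_seq, x, y, z)


def parse_atom_line(line):
    """Parse one fixed-column ATOM record into a dict."""
    return {
        "atom": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21].strip(),
        "res_seq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def select_residues(lines, residue_numbers):
    """Return parsed atoms whose residue number is in residue_numbers."""
    atoms = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue  # skip headers, inhibitor (HETATM) and other records
        atom = parse_atom_line(line)
        if atom["res_seq"] in residue_numbers:
            atoms.append(atom)
    return atoms


# Toy records with placeholder coordinates (not from FIG. 9).
toy_lines = [
    "REMARK toy example",
    make_atom_line(1, "CA", "TYR", "A", 19, 1.0, 2.0, 3.0),
    make_atom_line(2, "CA", "HIS", "A", 32, 4.0, 5.0, 6.0),
    make_atom_line(3, "CA", "TRP", "A", 33, 7.0, 8.0, 9.0),
    make_atom_line(4, "CA", "LEU", "A", 94, 10.0, 11.0, 12.0),
]
pocket_atoms = select_residues(toy_lines, DEEP_CAVITY)
```

The same selection could be widened to the shallow-cavity residues by passing a different set of residue numbers.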

According to Harris & Botchan, 1999 (Science, 284 (5420); 1673), various E2 proteins average only 30% amino acid sequence identity. However, mutational analysis suggests that various E2 TADs share a common fold and mechanism of action. In keeping with this last statement, the amino acid clusters defining the inhibitor-binding pocket identified by the Applicant possess a surprising amount of identity/similarity, even between low-risk and high-risk HPVs (FIG. 10). The first cluster identified comprises the side chain of amino acid Y19, which moves away from the pocket region, thereby opening up the deep cavity. This amino acid is highly conserved among various types of HPV, with 100% identity among HPV-6, -11, -16 and -18. The second cluster comprises histidine 32 and tryptophan 33, which define the deep cavity of the pocket. Histidine 32 is identical between HPV-6 and -11 and has strong similarity between low-risk and high-risk HPV, whereas tryptophan 33 is 100% identical amongst the four types. Finally, the fourth cluster comprises leucine 94, which also defines the deep cavity of the pocket and is 100% conserved among the four HPV types.

Among the residues defining the bottom of the deep pocket, H29 is identical among HPV-6, -11 and -16 and is similar in HPV-18. Similarly, T97 is identical among HPV-6, -11 and -18 and is similar in HPV-16.

Among the residues defining the shallow cavity of the pocket, amino acid L15 is part of the first cluster identified and is highly similar between the low-risk and high-risk HPV. Within the second cluster, I36 is also highly similar, whereas E39 is highly conserved amongst all four types. A third cluster is identified that lines the shallow cavity of the binding pocket, wherein K68 and A72 are both highly conserved throughout the types. Finally, N71 is identical between HPV-6 and -11 and is similar in the high-risk types. The shallow cavity further comprises amino acids of the fourth cluster, such as S98 and Y99, that are also highly similar among the different types of HPV.
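The identity comparison discussed above can be sketched as a per-position calculation over an alignment such as that of FIG. 10. This is a minimal illustration only: the function name is hypothetical, the three conserved columns reflect positions the text states are 100% identical among the four types, and the last column uses made-up residues purely to show a partial score.

```python
TYPES = ("HPV-6", "HPV-11", "HPV-16", "HPV-18")


def column_identity(column, reference="HPV-11"):
    """Fraction of the other types whose residue matches the reference type."""
    ref = column[reference]
    others = [res for name, res in column.items() if name != reference]
    return sum(res == ref for res in others) / len(others)


# Positions stated in the text to be identical across all four types.
conserved_columns = {
    "Y19": {t: "Y" for t in TYPES},
    "W33": {t: "W" for t in TYPES},
    "L94": {t: "L" for t in TYPES},
}

# Hypothetical column with invented residues (not from FIG. 10).
toy_column = {"HPV-6": "A", "HPV-11": "A", "HPV-16": "G", "HPV-18": "S"}

scores = {pos: column_identity(col) for pos, col in conserved_columns.items()}
toy_score = column_identity(toy_column)
```

A similarity score, as opposed to strict identity, would additionally require a substitution matrix or residue grouping, which is not attempted here.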

The high degree of identity/similarity strongly indicates that this pocket, as defined according to the HPV-11 E2 TAD of the invention, will also be found in other types of HPV, whether low risk or high risk. Presumably, inhibitors binding to this pocket, particularly the deep cavity, as modeled using the data of FIG. 9, have a strong likelihood of binding/inhibiting the E2 protein from a wide range of papillomaviruses.

