The ST.26 XML Sequence listing named “10311US20240104SequenceListingST26”, created on Jun. 28, 2022, and having a size of 10,540 bytes, is hereby incorporated herein by this reference in its entirety.
The present invention belongs to the field of structural biology, more particularly to the field of cytokine biology. In particular, the invention provides co-crystals of the anaplastic lymphoma kinase (ALK) and its ligand ALKAL2, and of the related leukocyte tyrosine kinase (LTK) and its ligand ALKAL1. The invention also provides computer-assisted and other methods for selecting molecules able to modulate the interaction of ALK or LTK with their respective ligands.
The architectural hallmark of ALK family receptors is a membrane-proximal segment in their extracellular domain marked by multiple stretches of glycine residues coupled to an EGF-like (EGFL) module. The glycine-rich composition of this segment complicates detection of a globular fold motif but has led to its sequence-based classification as Glycine-rich PFAM domain PF12810 [4]. Faint, local similarity of a predicted beta-strand segment in ALK with part of the TNF superfamily beta-jellyroll scaffold led to a proposed delineation of a TNF-like (TNFL) module preceding the glycine-rich (GR) region of ALK and LTK [5]. Whereas this module together with EGFL constitutes the bulk of LTK, ALK builds a much more substantial extracellular segment comprising an MAM-LDLa-MAM domain cassette and an N-terminal heparin binding domain (HBD).
Remaining functionally enigmatic, ALK is best known for its involvement in cancer [10], such as non-small cell lung cancer and pediatric neuroblastoma [8]. Moreover, ALKAL1 has been linked to BRAF inhibitor resistance in melanoma [11] and ALKAL2 to poor survival in neuroblastoma patients [12]. The disease context of LTK is less clear and is currently situated in the autoimmune disorder lupus [13] and in acute myeloid leukemia [14]. Interestingly, the newly proposed role of ALK in metabolic control also appears to be present in Drosophila, indicating an evolutionary conservation of ALK function [15]. Although ALK family receptors and their cognate cytokines are gaining therapeutic importance [16,17], the field is limited by the stark paucity of structural and mechanistic insights, in contrast to most other RTKs.
The present invention satisfies this need and provides 3D-structural models of ALK and LTK respectively binding to their ligands ALKAL2 and ALKAL1.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
ALKTG-EGFL (pink trace), ALKTG-EGFL-ALKAL2R123E/R136E (green trace) and ALKTG-EGFL-ALKAL2F97E (blue trace). The ALKAL1 site 1 mutant is unable to form a complex with ALK, while the site 2 mutant still forms a binary complex.
ITC titration curves. The left panel shows the raw data, with the differential electrical power (DP) plotted against time. The right panel shows the binding isotherm obtained by integrating the raw data and fitting it to a one-site model. Standard deviations were calculated from 3 measurements.
The present invention will be described with respect to particular embodiments and with reference to certain drawings, but the invention is not limited thereto but only by the claims. Any reference signs in the claims shall not be construed as limiting the scope. The drawings described are only schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. Where the term “comprising” is used in the present description and claims, it does not exclude other elements or steps. Where an indefinite or definite article is used when referring to a singular noun, e.g. “a” or “an”, “the”, this includes a plural of that noun unless something else is specifically stated. Furthermore, the terms first, second, third and the like in the description and in the claims are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
The following terms or definitions are provided solely to aid in the understanding of the invention. Unless specifically defined herein, all terms used herein have the same meaning as they would to one skilled in the art of the present invention. Practitioners are particularly directed to Sambrook et al., Molecular Cloning: A Laboratory Manual, 4th ed., Cold Spring Harbor Press, Plainview, New York (2012); and Ausubel et al., Current Protocols in Molecular Biology (Supplement 100), John Wiley & Sons, New York (2012), for definitions and terms of the art. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art (e.g. in molecular biology, biochemistry, structural biology, and/or computational biology).
Anaplastic lymphoma kinase (ALK) and the related leukocyte tyrosine kinase (LTK) are recently deorphanized receptor tyrosine kinases (RTKs) involved in neural development, cancer, and autoimmune diseases [1,2]. Furthermore, ALK has emerged as a surprising key regulator of energy expenditure and weight gain through signaling in the hypothalamus [3]. Despite such pleiotropy in physiology and disease, structural insights into ALK and LTK and their complexes with cognate cytokines had remained elusive. Here, we show that the ALKAL cytokine-binding segments of ALK and LTK comprise an unprecedented architectural chimera of a permuted TNF-like module that braces a Glycine-rich subdomain featuring a hexagonal lattice of long poly-glycine-II helices. The cognate ALKAL cytokines are monomeric, asymmetric three-helix bundles with strikingly open structures. Yet, cytokine-mediated homodimerization of ALK and LTK leads to receptor-receptor contacts with twofold symmetry that fully tent a single cytokine molecule near the cell membrane via distinct cytokine-receptor interfaces. We show that the apparent cytokine preferences of ALK and LTK are dictated by their membrane-proximal EGF-like domains. Assisted by diverse structure-function findings, the present invention provides a structural and mechanistic blueprint for the extracellular complexes of ALK/LTK family receptors, thereby completing the repertoire of cytokine-driven dimerization mechanisms adopted by human RTKs. Accordingly, in a first embodiment the present invention provides compositions in crystalline form selected from the complex i) ALKTG (SEQ ID NO: 1) interacting with ALKAL2 (SEQ ID NO: 2) and the complex ii) LTKTG (SEQ ID NO: 3) interacting with ALKAL1 (SEQ ID NO: 4) and Nb3.16 (SEQ ID NO: 5), characterized in that the crystals are:
The wording “interacting with” is equivalent to “bound with”, “bound to”, “binding with”, “binding to” or “interacted to”. Thus, the complex i) ALKTG (SEQ ID NO: 1) interacting with ALKAL2 (SEQ ID NO: 2) can also be written as ALKTG (SEQ ID NO: 1)-ALKAL2 (SEQ ID NO: 2), and the complex ii) LTKTG (SEQ ID NO: 3) interacting with ALKAL1 (SEQ ID NO: 4) and Nb3.16 (SEQ ID NO: 5) can be written as LTKTG (SEQ ID NO: 3)-ALKAL1 (SEQ ID NO: 4)-Nb3.16 (SEQ ID NO: 5).
In a specific embodiment the compositions have a three-dimensional structure wherein the crystal i) comprises an atomic structure characterized by the coordinates depicted in 7NWZ and wherein the crystal ii) comprises an atomic structure characterized by the coordinates depicted in 7NX0. 7NWZ and 7NX0 are the ID numbers of the structures present in the PDB database, also known as the RCSB Protein Data Bank and available on the internet at https://www.rcsb.org. These ID numbers are also publicly available in De Munck S et al. (2021) Nature, Vol. 600, starting on page 143 (see Extended Data Table 1 in this article).
In yet another embodiment the invention provides a computer-assisted method of identifying, designing or screening for a compound that can potentially interact with a crystal selected from a crystal i) or ii) as defined herein before, comprising performing structure-based identification, design or screening of a compound based on the compound's interactions with a structure defined by the atomic coordinates as defined herein before.
In yet another embodiment the invention provides a method for identifying a compound that can bind to the complex i) ALKTG-ALKAL2 or to the complex ii) LTKTG-ALKAL1-Nb3.16, comprising dipping candidate small molecule compounds with the complex ALKTG-ALKAL2 or the complex LTKTG-ALKAL1-Nb3.16, allowing co-crystallization, and screening candidate agonists or antagonists by using a method for measuring intermolecular interaction and by comparing, designing and docking the 3D structures i) or ii) as defined herein and a candidate ligand by computer modeling.
In yet another embodiment the invention provides a method of identifying, designing or screening for a compound that can interact with the complex i) ALKTG-ALKAL2 or with the complex ii) LTKTG-ALKAL1-Nb3.16, including performing structure-based identification, design, or screening of a compound based on the compound's interactions with the complex i) ALKTG-ALKAL2 or with the complex ii) LTKTG-ALKAL1-Nb3.16.
In yet another embodiment the invention provides a method for identifying an agonist or antagonist compound interacting with the complex i) ALKTG-ALKAL2 or with the complex ii) LTKTG-ALKAL1-Nb3.16, the compound comprising an entity selected from the group consisting of an antibody, a peptide, a non-peptide molecule and a chemical compound, wherein said compound is capable of enhancing or disrupting the interaction of the bound entities (or molecules) of the complex i) ALKTG-ALKAL2 or the interaction of the bound entities (or molecules) of the complex ii) LTKTG-ALKAL1-Nb3.16, wherein said process includes:
As used herein the term “homologue” means a protein having at least 80% amino acid sequence identity with human ALK, LTK, ALKAL1 or ALKAL2. Preferably, the percentage identity is 85%, 90%, 95% or higher.
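By way of non-limiting illustration, the identity thresholds above can be checked on a pair of pre-aligned sequences as sketched below. The function names, the gap character "-" and the choice to count only columns without gaps are assumptions made for this example; the description does not prescribe a particular alignment method or identity convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_homologue(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the identity reaches the threshold (80% by default, as in the definition)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy peptides, not taken from the sequence listing.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
print(is_homologue("ACDEFGHIKL", "ACDEFGHIKV"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give slightly different percentages, so the convention should be stated whenever a threshold is applied.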
As used herein, the term “crystal” means a structure (such as a three-dimensional (3D) solid aggregate) in which the plane faces intersect at definite angles and in which there is a regular structure (such as an internal structure) of the constituent chemical species. The term “crystal” refers in particular to a solid physical crystal form such as an experimentally prepared crystal.
Details about the crystal structures (the complexes i) and ii) as described herein) are depicted in Table 1.
Crystals may be constructed with the wild-type ALKTG and ALKAL2 polypeptides or the wild-type LTKTG and ALKAL1 polypeptides, or with variants thereof, including allelic variants and naturally occurring mutations as well as genetically engineered variants. Typically, variants have at least 90% or at least 95% sequence identity with a corresponding wild-type polypeptide. In a preferred embodiment the polypeptides are human. In another preferred embodiment the polypeptides are from dog, cat, swine, horse or chicken.
SEQ ID NO: 6 depicts the amino acid sequence of the human ALK tyrosine kinase receptor (herein abbreviated as ALK).
SEQ ID NO: 1 (herein abbreviated as ALKTG) is the sequence of amino acids 648 to 985 of the sequence depicted in SEQ ID NO: 6.
SEQ ID NO: 7 depicts the amino acid sequence of the human leukocyte tyrosine kinase receptor (herein abbreviated as LTK).
SEQ ID NO: 3 (LTKTG) is the sequence of amino acids 63 to 379 of the sequence depicted in SEQ ID NO: 7.
SEQ ID NO: 2 depicts the amino acid sequence of human ALKAL2.
SEQ ID NO: 4 depicts the amino acid sequence of human ALKAL1.
SEQ ID NO: 5 depicts the amino acid sequence of Nb3.16.
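As a small illustration of the residue ranges given above, the following sketch extracts a fragment from a full-length sequence using 1-based, inclusive numbering. The function name and the placeholder sequence are hypothetical; the actual sequences are those of the sequence listing and are not reproduced here.

```python
def subsequence(full_sequence: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not 1 <= first <= last <= len(full_sequence):
        raise ValueError("residue range lies outside the sequence")
    return full_sequence[first - 1:last]


# Placeholder string standing in for a full-length receptor sequence (not real data).
placeholder = "A" * 1000

fragment_648_985 = subsequence(placeholder, 648, 985)  # range stated for ALKTG
fragment_63_379 = subsequence(placeholder, 63, 379)    # range stated for LTKTG
print(len(fragment_648_985), len(fragment_63_379))
```

The only point of the example is the off-by-one handling: a range from residue 648 to residue 985 contains 985 - 648 + 1 residues.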
As used herein, the term “atomic coordinates” or “set of coordinates” refers to a set of values which define the position of one or more atoms with reference to a system of axes. It will be understood by those skilled in the art that the atomic coordinates may be varied, without affecting significantly the accuracy of models derived therefrom. Thus, although the invention provides a very precise definition of a preferred atomic structure, it will be understood that minor variations are envisaged and the claims are intended to encompass such variations.
It will be understood that any reference herein to the atomic coordinates or subset of the atomic coordinates shown in 7NWZ and 7NX0 present in the on line RCSB protein database shall include, unless specified otherwise, atomic coordinates having a root mean square deviation of backbone atoms of not more than 1.5 Å, preferably not more than 1 Å, when superimposed on the corresponding backbone atoms described by the atomic coordinates shown in 7NWZ or 7NX0 present in the on line RCSB protein database.
The following defines what is intended by the term “root mean square deviation (‘RMSD’)” between two data sets. For each element in the first data set, its deviation from the corresponding item in the second data set is computed. The squared deviation is the square of that deviation, and the mean squared deviation is the mean of all these squared deviations. The root mean square deviation is the square root of the mean squared deviation.
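The definition above translates directly into a few lines of code. The sketch below assumes that the two coordinate sets are already superimposed and listed in the same atom order; the function name and the toy coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared deviation of one atom pair.
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    # Mean of the squared deviations, then the square root.
    return math.sqrt(squared / len(coords_a))


# Toy backbone positions in Angstroms (not taken from any deposited structure).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

value = rmsd(reference, candidate)
print(value, value <= 1.5)  # compare against the 1.5 Angstrom limit mentioned above
```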
In a preferred embodiment, the crystals have the atomic coordinates as shown in 7NWZ and 7NX0 present in the on line RCSB protein database.
Further, it will be appreciated that a set of atomic coordinates for one or more polypeptides is a relative set of points that define a shape in three dimensions. Thus, it is possible that an entirely different set of coordinates could define a similar or identical shape. Moreover, slight variations in the individual coordinates will have little effect on overall shape. The variations in coordinates may be generated due to mathematical manipulations of the atomic coordinates. For example, the atomic coordinates set forth in 7NWZ and 7NX0 present in the on line RCSB protein database could be manipulated by crystallographic permutations of the atomic coordinates, fractionalization of the atomic coordinates, integer additions or subtractions to sets of the structure coordinates, inversion of the atomic coordinates, special labeling of amino acids, polypeptide chains, heteroatoms, ligands, solvent molecules, or combinations thereof.
Alternatively, modification in the crystal structure due to mutations, additions, substitutions, and/or deletions of amino acids, or other changes in any of the components that make up the crystal could also account for variations in atomic coordinates.
Various computational analyses are used to determine whether a molecular complex or a portion thereof is sufficiently similar to all or parts of the structure of the complex i) ALKTG-ALKAL2 or of the complex ii) LTKTG-ALKAL1-Nb3.16. Such analyses may be carried out in available software applications which are known to the skilled person. For example, a molecular similarity program permits comparisons between different structures, different conformations of the same structure, and different parts of the same structure. Comparisons typically involve calculation of the optimum translations and rotations required such that the root mean square deviation of the fit over the specified pairs of equivalent atoms is an absolute minimum. This number is given in Angstroms (Å). Accordingly, atomic coordinates of the complex i) ALKTG-ALKAL2 or the complex ii) LTKTG-ALKAL1-Nb3.16 of the present invention include atomic coordinates related to the atomic coordinates listed in 7NWZ and 7NX0 present in the on line RCSB protein database by whole body translations and/or rotations. Accordingly, RMSD values listed above assume that at least the backbone atoms of the structures are optimally superimposed, which may require translation and/or rotation to achieve the required optimal fit from which to calculate the RMSD value.
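One possible way to obtain the minimum RMSD over all whole-body translations and rotations is sketched below. It is an assumption of this example, not a statement of the software actually used: both sets are centred on their centroids, and the optimal rotation is handled through the largest eigenvalue of the 4x4 quaternion matrix of Horn's closed-form method, found here with a plain Jacobi iteration so that only the standard library is needed. All function names are hypothetical.

```python
import math


def centre(points):
    """Translate points so that their centroid sits at the origin."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in points]


def largest_eigenvalue(symmetric, sweeps=30):
    """Largest eigenvalue of a small symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in symmetric]
    n = len(a)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))


def superposed_rmsd(coords_a, coords_b):
    """Minimum RMSD over all rigid-body translations and rotations (equal atom order)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    a, b = centre(coords_a), centre(coords_b)
    # Cross-covariance sums s[i][j] = sum over atom pairs of a_i * b_j.
    s = [[sum(pa[i] * pb[j] for pa, pb in zip(a, b)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    norm = sum(x * x + y * y + z * z for x, y, z in a + b)
    residual = max(0.0, norm - 2.0 * largest_eigenvalue(key))
    return math.sqrt(residual / len(a))


# Toy check: the second set is the first one rotated by 90 degrees about z and shifted.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
moved = [(-y + 5.0, x - 3.0, z + 2.0) for x, y, z in model]
print(superposed_rmsd(model, moved))  # close to zero, up to rounding
```

Because only the eigenvalue is needed for the RMSD, the rotation matrix itself is never built; a full superposition tool would also recover the rotation from the corresponding eigenvector.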
A three-dimensional structure of the complex i) ALKTG-ALKAL2 or the complex ii) LTKTG-ALKAL1-Nb3.16 or a region thereof which substantially conforms to a specified set of atomic coordinates can be modelled by a suitable modelling computer program, using information, for example, derived from the following data: (1) the amino acid sequence of the polypeptides of the complex i) ALKTG-ALKAL2 or the complex ii) LTKTG-ALKAL1-Nb3.16; (2) the amino acid sequence of the related portion(s) of the protein represented by the specified set of atomic coordinates having a three-dimensional configuration; and (3) the atomic coordinates of the specified three-dimensional configuration. A three-dimensional structure of the polypeptides of the complex which substantially conforms to a specified set of atomic coordinates can also be calculated by a method such as molecular replacement, which is described in detail below.
Atomic coordinates are typically loaded onto a machine-readable medium for subsequent computational manipulation. Thus, models and/or atomic coordinates are advantageously stored on machine-readable media, such as magnetic or optical media and random-access or read-only memory, including tapes, diskettes, hard disks, CD-ROMs and DVDs, flash memory cards or chips and servers. Typically, the machine is a computer. The atomic coordinates may be used in a computer to generate a representation, e.g. an image of the three-dimensional structure of polypeptides of the complex which can be displayed by the computer and/or represented in an electronic file. The atomic coordinates and models derived therefrom may be used for a variety of purposes such as drug discovery, biological reagent (binding protein) selection and X-ray crystallographic analysis of other protein crystals.
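As an illustration of loading coordinates from a machine-readable file, the sketch below reads ATOM records laid out in the fixed-column legacy PDB text format and optionally keeps only backbone atoms. The helper that builds demo lines and the coordinate values are made up for this example; entries may also be distributed in other formats (for example mmCIF), which this sketch does not handle.

```python
BACKBONE_ATOM_NAMES = {"N", "CA", "C", "O"}


def format_atom_line(serial, name, residue, chain, number, x, y, z):
    """Hypothetical helper: build one fixed-column ATOM record for the demo below."""
    padded = name if len(name) == 4 else f" {name:<3s}"
    return (f"ATOM  {serial:5d} {padded} {residue:>3s} {chain}{number:4d}"
            f"    {x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")


def parse_atom_records(pdb_text, backbone_only=False):
    """Extract atom name, residue, chain and (x, y, z) from ATOM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue
        name = line[12:16].strip()  # columns 13-16
        if backbone_only and name not in BACKBONE_ATOM_NAMES:
            continue
        atoms.append({
            "name": name,
            "residue": line[17:20].strip(),     # columns 18-20
            "chain": line[21],                  # column 22
            "residue_number": int(line[22:26]), # columns 23-26
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms


# Made-up records, not copied from any deposited entry.
sample_text = "\n".join([
    format_atom_line(1, "N", "ALA", "A", 648, 11.0, -2.5, 6.125),
    format_atom_line(2, "CA", "ALA", "A", 648, 12.5, -3.25, 7.0),
    format_atom_line(3, "CB", "ALA", "A", 648, 13.0, -4.0, 8.5),
])

for atom in parse_atom_records(sample_text, backbone_only=True):
    print(atom["chain"], atom["residue"], atom["residue_number"], atom["name"], atom["xyz"])
```

The resulting coordinate lists can then be passed to any downstream calculation, such as an RMSD comparison restricted to backbone atoms.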
The structure coordinates of the polypeptide such as those set forth in 7NWZ and 7NX0 present in the on line RCSB protein database or a subset thereof, can also be used for determining the three-dimensional structure of a distant crystallized polypeptide of the complex ALKTG-ALKAL2 or the complex ii) LTKTG-ALKAL1 (e.g. derived from another species such as a relevant veterinary species including cat, dog, swine, horse and chicken). This may be achieved by any of a number of well-known techniques, including molecular replacement. Methods of molecular replacement are generally known by those skilled in the art. Generally, molecular replacement involves the following steps: i) X-ray diffraction data are collected from the crystal of a crystallized target structure, then ii) the X-ray diffraction data are transformed to calculate a Patterson function, then iii) the Patterson function of the crystallized target structure is compared with a Patterson function calculated from a known structure (referred to herein as a search structure or search model), iv) the Patterson function of the search structure is rotated on the target structure Patterson function to determine the correct orientation of the search structure in the crystal to obtain a rotation function, v) a translation function is then calculated to determine the location of the search structure with respect to the crystal axes. Alternatively, likelihood-based molecular replacement methods can be used to determine the location of the search structure. Once the search structure has been correctly positioned in the unit cell, initial phases for the experimental data can be calculated. These phases are necessary for calculation of an electron density map from which structural features and differences can be observed to allow construction of a molecular model and refinement of the structure. 
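To make step ii) above more concrete, the following toy sketch computes a Patterson function in one dimension from intensities alone. It is a deliberately simplified assumption-laden example: two point atoms in a one-dimensional unit cell, fractional coordinates, unit scattering factors, and no rotation or translation search. It only shows that Patterson peaks appear at interatomic vectors without any phase information.

```python
import cmath
import math


def structure_factor(h, atoms):
    """F(h) for point atoms given as (fractional position, scattering factor) pairs."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)


def patterson_1d(intensities, grid_points):
    """P(u) = sum over h of |F(h)|^2 * cos(2*pi*h*u), sampled at u = k / grid_points."""
    values = []
    for k in range(grid_points):
        u = k / grid_points
        values.append(sum(i_h * math.cos(2.0 * math.pi * h * u)
                          for h, i_h in intensities.items()))
    return values


# Toy cell with two atoms at x = 0.10 and x = 0.40 (hypothetical values).
atoms = [(0.10, 1.0), (0.40, 1.0)]
max_index = 20

# Only intensities |F(h)|^2 are kept, as in a diffraction experiment: phases are lost.
intensities = {h: abs(structure_factor(h, atoms)) ** 2
               for h in range(-max_index, max_index + 1)}

patterson = patterson_1d(intensities, 100)

# Strongest non-origin peak in the first half of the cell; the interatomic
# vector is 0.40 - 0.10 = 0.30, so the peak is expected at grid index 30.
peak = max(range(10, 51), key=lambda k: patterson[k])
print(peak / 100)
```

In three dimensions the same idea underlies the rotation and translation functions, where a Patterson map computed from the search model is compared with the one computed from the measured intensities.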
Preferably, the structural features (e.g., amino acid sequence, conserved disulphide bonds, and beta-strands or beta-sheets) of the search molecule are related to the crystallized target structure. The electron density map can, in turn, be subjected to any well-known model building and structure refinement techniques to provide a final, accurate structure of the unknown (i.e. target) crystallized molecular structure. Obtaining accurate values for the phases, by methods other than molecular replacement, is often a time-consuming process that involves iterative cycles of approximations and refinements and greatly hinders the solution of crystal structures. However, when the crystal structure of a protein containing at least a homologous portion has been solved, the phases from the known structure provide a satisfactory starting estimate of the phases for the unknown structure. By using molecular replacement, all or part of the structure coordinates of the polypeptides of the complexes as described herein (and set forth in 7NWZ or 7NX0 present in the on line RCSB protein database) can be used to determine the structure of another crystallized complex (ALKTG-ALKAL2 or the complex ii) LTKTG-ALKAL1) whose structure is unknown, more rapidly and more efficiently than attempting to determine such information ab initio.
The structure of any portion of any crystallized complex i) ALKTG-ALKAL2 or complex ii) LTKTG-ALKAL1 that is sufficiently homologous to any portion of the human complex i) ALKTG-ALKAL2 or complex ii) LTKTG-ALKAL1 can be solved by this method.
Such structure coordinates are also particularly useful to solve the structure of crystals of the complex i) ALKTG-ALKAL2 or the complex ii) LTKTG-ALKAL1 co-complexed with a variety of molecules, such as chemical entities. For example, this approach enables the determination of the optimal sites for interaction with chemical entities, including candidate entities that disrupt the complexes or act as agonists of the complexes.
Design, Selection, Fitting and Assessment of Chemical Entities that Bind the Complex i) ALKTG-ALKAL2 or the Complex ii) LTKTG-ALKAL1
Using a variety of known modelling techniques, the crystal structures of the present invention can be used to produce models for evaluating the interaction of compounds with the complexes described herein. As used herein, the term “modelling” includes the quantitative and qualitative analysis of molecular structure and/or function based on atomic structural information and interaction models. The term “modelling” includes conventional numeric-based molecular dynamic and energy minimization models, interactive computer graphic models, modified molecular mechanics models, distance geometry and other structure-based constraint models. Molecular modelling techniques can be applied to the atomic coordinates of the complexes or parts thereof to derive a range of 3D models and to investigate the structure of binding sites, such as the binding sites with compounds. These techniques may also be used to screen for or design small and large chemical entities which are capable of binding the complexes and modulating the interaction of the elements of the complexes, in particular disrupting the complexes. Such a screen may employ a solid 3D screening system or a computational screening system. Such modelling methods are used to design or select chemical entities that possess stereochemical complementarity to identified binding sites between the individual elements in the complexes. By “stereochemical complementarity” it is meant that the compound makes a sufficient number of energetically favourable contacts with individual elements in the complexes as to have a net reduction of free energy on binding to these individual elements in the complexes. By “stereochemical similarity” we mean that the compound makes about the same number of energetically favourable contacts with the complexes set out by the coordinates shown in 7NWZ and 7NX0 present in the on line RCSB protein database.
In addition, modelling methods may also be used to design or select chemical entities that possess stereochemical complementarity to the complexes of the invention. By stereochemical complementarity it is meant that the compound makes energetically favourable contacts with the complexes as defined by coordinates shown in 7NWZ and 7NX0 present in the on line RCSB protein database. By “match” we mean that the identified portions interact with the surface residues, for example, via hydrogen bonding or by non-covalent van der Waals and Coulomb interactions (with surface or residue) which promote desolvation of the molecule within the site, in such a way that retention of the molecule at the binding site is favoured energetically. It is preferred that the stereochemical complementarity is such that the compound has a Kd for the binding site of less than 10⁻⁴ M, more preferably less than 10⁻⁵ M and even more preferably less than 10⁻⁶ M. In a most preferred embodiment, the Kd value is less than 10⁻⁸ M and more preferably less than 10⁻⁹ M.
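For orientation, the dissociation constants above can be related to a standard binding free energy through the usual relation ΔG° = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The temperature of 298.15 K and the function name in the sketch below are assumptions of this example and are not specified in the description.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_kelvin=298.15):
    """Standard binding free energy in kcal/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0.0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_kelvin * math.log(kd_molar)


# The Kd limits mentioned in the preceding paragraph.
for kd in (1e-4, 1e-5, 1e-6, 1e-8, 1e-9):
    print(f"Kd = {kd:.0e} M  ->  dG = {binding_free_energy(kd):6.2f} kcal/mol")
```

Each tenfold decrease in Kd corresponds to roughly 1.4 kcal/mol of additional binding free energy at this temperature, which gives a sense of scale for the preferred ranges.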
Chemical entities which are complementary to the shape and electrostatics or chemistry of the complexes characterized by amino acids positioned at atomic coordinates set out in 7NWZ and 7NX0 present in the on line RCSB protein database will be able to bind to the complexes, and when the binding is sufficiently strong, substantially inhibit the interaction of the individual elements in the complexes.
A number of methods may be used to identify chemical entities possessing stereochemical and structural complementarity to the structure or substructures of the complexes. For instance, the process may begin by visual inspection of a selected binding site in the complexes on the computer screen based on the coordinates in 7NWZ and 7NX0 present in the on line RCSB protein database generated from the machine-readable storage medium. Selected fragments or chemical entities may then be positioned in a variety of orientations, or docked, within the selected binding site. Modelling software is well known and available in the art. This modelling step may be followed by energy minimization with standard available molecular mechanics force fields. Once suitable chemical entities or fragments have been selected, they can be assembled into a single compound. In one embodiment, assembly may proceed by visual inspection of the relationship of the fragments to each other on the three-dimensional image displayed on a computer screen in relation to the structure coordinates of selected binding sites in the complexes. This is followed by manual model building, typically using available software. Alternatively, fragments may be joined to additional atoms using standard chemical geometry. The above-described evaluation process for chemical entities may be performed in a similar fashion for chemical compounds.
Databases of chemical structures are available from a number of sources including Cambridge Crystallographic Data Centre (Cambridge, U.K.), Molecular Design, Ltd., (San Leandro, Calif.), Tripos Associates, Inc. (St. Louis, Mo.), Chemical Abstracts Service (Columbus, Ohio), the Available Chemical Directory (Symyx Technologies, Inc.), the Derwent World Drug Index (WDI), BioByteMasterFile, the National Cancer Institute database (NCI), Medchem Database (BioByte Corp.), and the Maybridge catalogue. Once an entity or compound has been designed or selected by the above methods, the efficiency with which that entity or compound may bind to the complexes can be tested and optimised by computational evaluation. For example, a compound that has been designed or selected to function as binding compound to the complexes must also preferably traverse a volume not overlapping that occupied by the binding site when it is bound to the native complexes. An effective complex binding compound must preferably demonstrate a relatively small difference in energy between its bound and free states (i.e. a small deformation energy of binding). Thus, the most efficient complex binding compound should preferably be designed with a deformation energy of binding of not greater than about 10 kcal/mole, preferably, not greater than 7 kcal/mole. Complex binding compounds may interact with the complexes in more than one conformation that are similar in overall binding energy. In those cases, the deformation energy of binding is taken to be the difference between the energy of the free compound and the average energy of the conformations observed when the compound binds to the protein. Further, a compound designed or selected as binding to a complex may be further computationally optimised so that in its bound state it would preferably lack repulsive electrostatic interaction with the target protein.
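The deformation energy criterion described above can be expressed as a short calculation. In the sketch below the energies are hypothetical inputs in kcal/mole, as would be produced by whatever force field or program is used; the function names and numbers are illustrative and not taken from the description.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the energy of the free compound."""
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    average_bound = sum(bound_energies) / len(bound_energies)
    return average_bound - free_energy


def passes_deformation_limit(free_energy, bound_energies, limit=10.0):
    """True when the deformation energy does not exceed the limit (10 kcal/mole by default)."""
    return deformation_energy(free_energy, bound_energies) <= limit


# Hypothetical conformational energies (kcal/mole) for one candidate compound.
free = -50.0
bound = [-44.0, -46.0]

print(deformation_energy(free, bound))
print(passes_deformation_limit(free, bound), passes_deformation_limit(free, bound, limit=4.0))
```

Passing `limit=7.0` applies the stricter preferred limit mentioned in the paragraph above.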
Once a binding compound to the complexes has been optimally selected or designed, as described above, substitutions may then be made in some of its atoms or side groups to improve or modify its binding properties. Generally, initial substitutions are conservative, i.e. the replacement group will have approximately the same size, shape, hydrophobicity and charge as the original group. It should, of course, be understood that components known in the art to alter conformation should be avoided. Such substituted chemical compounds may then be analysed for efficiency of fit to the complexes by the same computer methods described in detail above.
Naturally, specific computer software is available in the art to evaluate compound deformation energy and electrostatic interaction. The screening/design methods may be implemented in hardware or software, or a combination of both. However, preferably, the methods are implemented in computer programs executing on programmable computers each comprising a processor, a data storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Program code is applied to input data to perform the functions described above and generate output information. The output information is applied to one or more output devices, in known fashion. The computer may be, for example, a personal computer, microcomputer, or workstation of conventional design. Each program is preferably implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. However, the programs can be implemented in assembly or machine language, if desired. In any case, the language may be compiled or interpreted language. Each such computer program is preferably stored on a storage medium or device (e.g., ROM or magnetic diskette) readable by a general or special purpose programmable computer, for configuring and operating the computer when the storage media or device is read by the computer to perform the procedures described herein. The system may also be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
Compounds of the present invention include both those designed or identified using a screening method of the invention and those which are capable of recognising and binding to the complexes as defined above. Compounds capable of recognising and binding to the complexes may be produced using a screening method based on the use of the atomic coordinates corresponding to the 3D structure of the complexes. The candidate compounds and/or compounds identified or designed using a method of the present invention may be any suitable compound, synthetic or naturally occurring, preferably synthetic. In one embodiment, a synthetic compound selected or designed by the methods of the invention preferably has a molecular weight equal to or less than about 5000, 4000, 3000, 2000, 1000 or 500 daltons. A compound of the present invention is preferably soluble under physiological conditions. The compounds may encompass numerous chemical classes, though typically they are organic molecules, preferably small organic compounds having a molecular weight of more than 50 and less than about 2,500 daltons, preferably less than 1,500, more preferably less than 1,000 and yet more preferably less than 500. Such compounds can comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. The compound may comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups.
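The molecular weight and functional group preferences above lend themselves to a simple pre-filter of a candidate list. The data class, the candidate names and their property values in the sketch below are hypothetical; in practice such properties would come from a compound database or a cheminformatics toolkit.

```python
from dataclasses import dataclass

# Functional groups named in the paragraph above.
INTERACTING_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str
    molecular_weight: float        # daltons
    functional_groups: frozenset   # group labels present in the compound


def passes_filter(candidate, min_weight=50.0, max_weight=500.0, min_groups=2):
    """Keep compounds inside the weight window that carry enough interacting groups."""
    groups = candidate.functional_groups & INTERACTING_GROUPS
    in_window = min_weight < candidate.molecular_weight < max_weight
    return in_window and len(groups) >= min_groups


# Hypothetical library entries, for illustration only.
library = [
    Candidate("cand-001", 342.4, frozenset({"amine", "hydroxyl"})),
    Candidate("cand-002", 1820.0, frozenset({"amine", "carboxyl", "hydroxyl"})),
    Candidate("cand-003", 210.2, frozenset({"carbonyl"})),
]

selected = [c.name for c in library if passes_filter(c)]
print(selected)
```

The default window uses the most preferred upper limit of 500 daltons; the broader limits stated above can be applied by changing `max_weight`.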
Compounds can also comprise biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogues, or combinations thereof. Compounds may include, for example: (1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; (2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries); (3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies, nanobodies as well as Fab, (Fab)2, Fab expression library and epitope-binding fragments of antibodies); (4) non-immunoglobulin binding proteins such as but not restricted to avimers, DARPins and lipocalins; (5) nucleic acid-based aptamers; and (6) small organic and inorganic molecules.
Synthetic compound libraries are commercially available from, for example, Maybridge Chemical Co. (Tintagel, Cornwall, UK), AMRI (Budapest, Hungary), ChemDiv (San Diego, Calif.) and Specs (Delft, The Netherlands). In addition, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts can be readily produced. In addition, natural or synthetic compound libraries and compounds can be readily modified through conventional chemical, physical and biochemical means and may be used to produce combinatorial libraries. In another approach, previously identified pharmacological agents can be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification or amidification, and the analogues can be screened for disrupting the complexes. In addition, numerous methods of producing combinatorial libraries are known in the art, including those involving biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the “one-bead one-compound” library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide or peptide libraries, while the other four approaches are applicable to polypeptide, peptide, nonpeptide oligomer, or small molecule libraries of compounds. Compounds also include those that may be synthesized from leads generated by fragment-based drug design, wherein the binding of such chemical fragments is assessed by soaking or co-crystallizing such screen fragments into crystals provided by the invention and then subjecting these to an X-ray beam and obtaining diffraction data.
Difference Fourier techniques are readily applied by those skilled in the art to determine the location within the complex structure at which these fragments bind, and such fragments can then be assembled by synthetic chemistry into larger compounds with increased affinity for a particular position in the complexes. Further, compounds identified or designed using the methods of the invention can be a peptide or a mimetic thereof. The isolated peptides or mimetics of the invention may be conformationally constrained molecules or alternatively molecules which are not conformationally constrained such as, for example, non-constrained peptide sequences. The term “conformationally constrained molecules” means conformationally constrained peptides and conformationally constrained peptide analogues and derivatives. In addition, the amino acids may be replaced with a variety of uncoded or modified amino acids such as the corresponding D-amino acid or N-methyl amino acid. Other modifications include substitution of hydroxyl, thiol, amino and carboxyl functional groups with chemically similar groups. With regard to peptides and mimetics thereof, still other examples of other unnatural amino acids or chemical amino acid analogues/derivatives can be introduced as a substitution or addition. Also, a peptidomimetic may be used. A peptidomimetic is a molecule that mimics the biological activity of a peptide but is no longer peptidic in chemical nature. By strict definition, a peptidomimetic is a molecule that no longer contains any peptide bonds (that is, amide bonds between amino acids). However, the term peptide mimetic is sometimes used to describe molecules that are no longer completely peptidic in nature, such as pseudo-peptides, semi-peptides and peptoids. 
Whether completely or partially non-peptide, peptidomimetics for use in the methods of the invention, and/or of the invention, provide a spatial arrangement of reactive chemical moieties that closely resembles the three-dimensional arrangement of active groups in the peptide on which the peptidomimetic is based. As a result of this similar active-site geometry, the peptidomimetic has effects on biological systems which are similar to the biological activity of the peptide. There are sometimes advantages for using a mimetic of a given peptide rather than the peptide itself, because peptides commonly exhibit two undesirable properties: (1) poor bioavailability; and (2) short duration of action. Peptide mimetics offer an obvious route around these two major obstacles, since the molecules concerned are small enough to be both orally active and have a long duration of action. There are also considerable cost savings and improved patient compliance associated with peptide mimetics, since they can be administered orally compared with parenteral administration for peptides. Furthermore, peptide mimetics are generally cheaper to produce than peptides. Naturally, those skilled in the art will recognize that the design of a peptidomimetic may require slight structural alteration or adjustment of a chemical structure designed or identified using the methods of the invention. In general, chemical compounds identified or designed using the methods of the invention can be synthesized chemically and then tested for ability to disrupt the complexes, using any of the methods described herein. The peptides or peptidomimetics of the present invention can be used in assays for screening for candidate compounds which bind to selected regions or selected conformations of the complexes. Binding can be either by covalent or non-covalent interactions, or both. 
Examples of non-covalent interactions include electrostatic interactions, van der Waals interactions, hydrophobic interactions and hydrophilic interactions.
Compounds of the present invention preferably have an affinity for the complexes sufficient to provide adequate binding for the intended purpose. Suitably, such compounds have an affinity (Kd) of from 10⁻⁵ to 10⁻¹⁵ M. For use as a therapeutic, the compound suitably has an affinity (Kd) of from 10⁻⁷ to 10⁻¹⁵ M, preferably from 10⁻⁸ to 10⁻¹² M and more preferably from 10⁻¹⁰ to 10⁻¹² M.
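For orientation only, the dissociation constants quoted above can be related to a standard binding free energy through the textbook relation ΔG° = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below assumes a temperature of 298.15 K, which is not specified in the text; the function name is likewise illustrative.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses delta_G = R * T * ln(Kd / 1 M). The default temperature is an
    assumption; it is not taken from the specification.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# Boundaries of the affinity ranges recited in the text.
for kd in (1e-5, 1e-7, 1e-8, 1e-10, 1e-12, 1e-15):
    print(f"Kd = {kd:.0e} M -> delta_G = {binding_free_energy(kd):.1f} kcal/mol")
```

Under these assumptions each tenfold decrease in Kd corresponds to roughly 1.4 kcal/mol of additional binding free energy.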
Compounds of the invention may be subjected to further confirmation of binding to the complexes and structural determination, as described herein. Compounds designed or selected according to the methods of the present invention are preferably assessed by a number of in vitro and in vivo assays of ALK and LTK function to confirm their ability to interact with and modulate the complexes, in particular to disrupt the complexes. Libraries may be screened in solution by methods generally known in the art for determining whether ligands competitively bind at a common binding site. Such methods may include screening libraries in solution, or on beads or chips. Where the screening assay is a binding assay, polypeptides of the complexes may be joined to a label, where the label can directly or indirectly provide a detectable signal. Various labels include radioisotopes, fluorescent molecules, chemiluminescent molecules, enzymes, specific binding molecules, particles, e.g., magnetic particles, and the like. Specific binding molecules include pairs, such as biotin and streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the complementary member would normally be labelled with a molecule that provides for detection, in accordance with known procedures. A variety of other reagents may be included in the screening assay. These include reagents like salts, neutral proteins, e.g., albumin, detergents, etc., which are used to facilitate optimal protein-protein binding and/or reduce non-specific or background interactions. Reagents that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, antimicrobial agents, etc., may be used. The components are added in any order that produces the requisite binding. Incubations are performed at any temperature that facilitates optimal activity, typically between 4° C. and 40° C.
Direct binding of compounds to the complexes can also be done for example by Surface Plasmon Resonance (BIAcore).
[Table: crystallographic data collection and refinement statistics. Cell contents were not recoverable; surviving fragments mention a saturated (NH4)2SO4 cryoprotectant and the SOLEIL (Proxima) beamline. Each data set was collected from a single crystal. Values in parentheses correspond to the highest resolution shell.]
[Table: interface contacts listed per chain (Chains A, B and C), including hydrogen bonds and salt bridges. Cell contents were missing or illegible when filed.]
It is to be understood that although particular embodiments, specific configurations as well as materials and/or molecules, have been discussed herein for cells and methods according to the present invention, various changes or modifications in form and detail may be made without departing from the scope and spirit of this invention. The following examples are provided to better illustrate particular embodiments, and they should not be considered limiting the application. The application is limited only by the claims.
ALK is an evolutionarily ancient RTK in C. elegans and D. melanogaster, where it is activated by HEN-118 and Jeb19,20, both featuring LDLa domains. The extracellular segment of invertebrate ALK resembles the one in vertebrates but lacks the N-terminal HBD. Whereas invertebrates encode a single ALK family receptor, gene duplication in vertebrates spawned LTK as a second ALK-like receptor21. During vertebrate evolution the ALK ectodomain remained constant and divergent evolution of LTK led to loss of the N-terminal HBD plus the MAM-LDLa-MAM cassette in mammals. Yet, the cytokine-binding segment in the ALK and LTK ectodomains bears no resemblance to any known protein-binding domain among cytokine receptors.
To shed light onto the enigmatic structure-function landscape of ALK and LTK, we pursued crystal structures for human ALK and LTK ectodomains comprising their TNFL, GR, and EGFL membrane-proximal segments (Table 1). We produced glycan-shaved ALKTG-EGFL (residues 648-1030), its complex with a non-neutralizing Fab fragment22, and LTKTG (residues 63-380) in mammalian cells (
Unexpectedly, the TNFL and GR regions in ALK and LTK do not form separate domains, but are intimately interwoven into a large, continuous, and fully globular TG supradomain23 (
The GR subdomain displays 14 long and tightly packed pGII-helices arranged in a honeycomb-like lattice (
A query in the DALI server30 using our structural models for ALKTNFL and LTKTNFL retrieved TNF/C1q-class folds (e.g. r.m.s.d.=2.8 Å against C1q and TNF, 72 Cα atoms). However, the ALKTNFL/LTKTNFL chain topology is radically different and unprecedented (
Since ALKAL1 has been reported as an LTK-specific cytokine and ALKAL2 activates both ALK and LTK6-8, we opted to pursue structures of ALK-ALKAL2 and LTK-ALKAL1 complexes. We could readily purify truncated versions of both ALKALs corresponding to the conserved C-terminal domains (termed ALKAL1 and ALKAL2) in HEK293T cells as well as full-length ALKAL1 (ALKAL1FL). All three purified ALKALs were monomeric (
Our structures revealed that the overarching assembly mode of ALK/LTK-cytokine complexes entails a 2:1 stoichiometric assembly where a single cytokine molecule is cradled proximal to the membrane by two copies of the receptor TG supradomains resulting in receptor dimerization (
ALKAL1 and ALKAL2 adopt highly similar structures (r.m.s.d.=0.54 Å, 56 Cα atoms) featuring a new type of disulfide-stabilized three-helix bundle, wherein αA connects via a conserved short loop to a helical hairpin constructed from αB and αC, which in turn are tethered by two conserved disulfides (
Despite their engulfment by the receptor dimers, both ALKAL1 and ALKAL2 display a solvent-exposed hydrophobic cavity defined by conserved residues: those lining the internal BC face (Leu117, Phe94, Tyr98) and the AB loop (Val84, Phe86) and those around the outside rim contributing hydroxyl groups (
Despite being monomeric and lacking symmetry, ALKAL1 and ALKAL2 remarkably dimerize their cognate receptors into highly similar, twofold-symmetric assemblies reminiscent of receptor complexes mediated by erythropoietin and growth hormone40,41. However, these hematopoietic cytokines display a certain degree of pseudo-symmetry through their four-helix bundle. ALKAL1 and ALKAL2 contact the same binding sites on LTK and ALK but display different interaction modalities. In site 1, αB and αC use a hydrophobic epicenter surrounded by polar residues to contact the TNFL subdomain via a patch contributed by its D, E, H″ and I strands and the H″-I loop. In each complex, a trio of arginine residues in ALKAL1 (Arg102, Arg112, Arg119) and ALKAL2 (Arg123, Arg133, Arg140) engages two conserved glutamate residues at the periphery of site 1 (
In site 2, which is predominantly hydrophobic, the short ALKAL αA pairs with the tip of the BC hairpin to engage the second receptor molecule (
Cytokine-mediated dimerization of the ALKTG and LTKTG supradomains induces receptor-receptor contacts (site 3) by locking α1 of the TNFL subdomain with α2 and α3 at the distal end of the GR subdomain (
To obtain insights into the contribution of the two distinct ALK/LTK-cytokine interfaces, we used mutants probing the polar interactions of site 1 and the central hydrophobic pocket in site 2. This is especially important to clarify for ALK since its extracellular domain, in contrast to LTK, does not readily proceed to 2:1 stoichiometric complexes with cytokines at the concentrations attainable in solution.
Site 1 contacts were interrogated by introducing charge-reversal mutations of two conserved polar interactions resulting in ALKAL1 mutants ALKAL1R102E, ALKAL1R115E, and ALKAL1R102E/R115E, and ALKAL2 mutants ALKAL2R123E, ALKAL2R136E and ALKAL2R123E/R136E. We first established that ALKAL2 is the high-affinity cytokine for ALK and ALKAL1 for LTK and using these binding benchmarks concluded that all site 1 mutants for ALKAL1 and ALKAL2 drastically reduced their affinity to both receptors (
To probe site 2 we used mutants ALKAL1F76E and ALKAL2F97E aimed at their respective hydrophobic interaction patches, and an ALKAL2H100A mutant. The interaction of ALKAL1F76E, ALKAL2F97E and ALKAL2H100A with LTK resulted in a biphasic binding profile with one interaction obeying faster off-rates (
We used purified ALKAL1/2 mutants to further investigate the role of sites 1 and 2 via our Ba/F3 cellular proliferation assay (Extended Data
Finally, we interrogated the functional importance of site 3 in LTK and ALK. For LTK, mutants LTKR241A and LTKR241A/N369G had a decreased propensity to form the native 2:1 stoichiometric complexes with ALKAL1 (
The presented structural landscapes of ALK/LTK-cytokine complexes revealed key differences in site 2 and site 3, prompting the question whether evolutionarily conserved determinants might be at play. Indeed, we noted that the absence of equivalents for Phe143 and His135 in ALK differentiates ALK from LTK across vertebrates. Together with an insert in the LTK e-f loop (Extended Data
Collectively, our structure-function data suggest that site 1 engagement is the driving force for establishing ALK/LTK-cytokine encounter complexes. Given the high sequence conservation of interfacing residues in both cytokines and receptors the observed ALKAL1/2 binding modes will likely apply to all vertebrate receptors consistent with the reciprocal species cross-reactivity of zebrafish ALKAL2a and human ALKALs43. Interestingly, surface amino acids on the opposite side of the ALK cytokine-binding interfaces are highly conserved, in contrast to LTK, suggesting that they might be relevant for interactions with the N-terminal domains, which are absent in LTK.
To expand structure-function insights on ALK and LTK activation we structurally mapped mutations in ALKTG and LTKTG combining documented oncogenic potential and frequently occurring missense single-nucleotide polymorphisms44,45 (
ALKF856S, a gain-of-function mutation linked to acute myeloid leukemia47, and ALKR753Q identified in histiocytic neoplasms48, are roughly equidistant from the start of αC of bound ALKAL2 (
Given the overall structural similarity of ALKTG/LTKTG-cytokine complexes and that EGFL modules can bind ligands49, we hypothesized that the membrane-proximal EGFL domain might underlie the apparent preference of ALK for ALKAL2 over ALKAL1. Benchmarking of the binding thermodynamics showed that even at micromolar concentrations, ALKTG-EGFL formed enthalpically driven binary complexes with either cytokine, characterized by a markedly higher affinity for ALKAL2 (KD=40 nM) than ALKAL1 (KD=600 nM) (
Sequence differences in ALKAL1 and ALKAL2 map to distinct patches proximal to the EGFL domain (
As ALK carries four additional N-terminal domains compared to LTK (
Whereas cytokine-mediated dimerization of ALK and LTK leads to structurally similar ternary complexes, the mechanistic requirements for their assembly appear to be distinct. LTK-cytokine complexes are fully cytokine-driven whereas ALK-cytokine complexes might synergize with glycosaminoglycans (
It is now clear that key differences in the cytokine-binding regions of ALK and LTK dictate cytokine specificity and that receptor-receptor contacts are also important differentiating factors. The resistance of isolated ALK ectodomains towards cytokine-induced dimerization suggests that the reduced dimensionality of their membrane-proximal engagement and additional interactions of their N-terminal domains with glycosaminoglycans or proteoglycans8 might be important for productive cytokine-receptor assemblies, much like bound heparin bridging receptors in FGF-FGFR complexes. In this context, the reported proteolytic shedding of the N-terminal segment of ALK's ectodomain presents a physiological conundrum59. Intriguingly, the EGFL module of ALK, but not LTK, appears to dictate cytokine specificity, such that the mode of its engagement in ALK-cytokine complexes may impact the orientation of the membrane-proximal domains (and their connected transmembrane helices) to fine-tune signaling assemblies.
We envisage application of our findings to further interrogate ALK/LTK signaling in physiology and disease, and in the therapeutic targeting of the ALK/LTK ectodomains17 and their cognate cytokines16.
No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.
Sequence-optimized DNA for full-length wild-type ALK (UniProt ID Q9UM73), LTK (UniProt ID P29376), ALKAL1 (UniProt ID Q6UXT8) and ALKAL2 (UniProt ID Q6UX46) was purchased from Genscript. DNA encoding different human ALK constructs comprising either amino acids 19-1025 (ALKFL), 648-1025 (ALKTG-EGFL) or 648-985 (ALKTG) and human LTK constructs comprising amino acids 63-420 (LTKTG-EGFL) or 63-379 (LTKTG) was cloned in the pHLsec vector60 in frame with an N-terminal chicken RTPμ-like signal peptide sequence and a C-terminal caspase-3-cleavable Fc-Hisx6 tag.
Sequences encoding ALKAL1FL (residues 28-129), ALKAL1 (residues 57-129) and ALKAL2 (residues 78-152) were cloned in the pHLsec vector in frame with an N-terminal chicken RTPμ-like signal peptide sequence followed by a caspase-3-cleavable Sumostar-tag at the C-terminus. Sequence-optimized DNA encoding the light and heavy chains of Fab324 was purchased from IDT as gBlocks. The N-terminal signal peptide sequences were exchanged for a chicken RTPμ-like signal peptide sequence. The heavy chain was cloned in frame with a C-terminal caspase-3 site followed by an AVI-His6× tag, while the light chain was cloned without a purification tag.
2. Protein Expression in HEK293 and Purification from Conditioned Media
Production of all ALKTG-EGFL and ALKTG constructs was performed in adherently grown HEK293 MGAT1−/− cells61 maintained in DMEM supplemented with 10% FCS. When cells reached 80% confluency they were transiently transfected using branched polyethylenimine 25 kDa as transfection reagent in DMEM with 3.6 mM valproic acid and without FCS.
Expression of the Fab fragment was achieved in adherent HEK293T cells using the same method. For the heterodimeric Fab fragment, plasmids encoding each chain were co-transfected in a 1:1 ratio. Protein production for ALKFL, ALKAL1/2 and LTK constructs was performed in HEK293S cells grown in suspension and maintained in a mixture of 50% Freestyle (Thermo Fisher)/50% Ex-Cell (Sigma-Aldrich) growth medium. Transient transfection was performed with linear polyethylenimine (Polysciences) 25 kDa as transfection reagent. One day after transfection, valproic acid was added to a final concentration of 1.5 mM62.
For expression in suspension cells, conditioned medium was harvested after five days and subsequently clarified by centrifugation for 12 minutes at 8000×g, while medium from adherently grown cells was harvested after 6 days and centrifuged for 15 minutes at 6000×g. After centrifugation, media were filtered through a 0.22 μm filter prior to chromatographic purification steps.
ALK and LTK constructs were captured via their Fc-tag on a protein A column (HiTrap Protein A HP, Cytiva), released by on-column digestion with caspase 3 for 1 hour at 37° C. followed by 2 hours at room temperature, and eluted with HBS (20 mM HEPES pH 7.4, 150 mM NaCl). The eluted proteins were then concentrated and injected on a HiLoad 16/600 SD200 (Cytiva) size-exclusion chromatography column pre-equilibrated with HBS. Purified proteins were stored at −80° C. until further use.
ALKAL-containing medium was diluted fourfold with 20 mM HEPES pH 7.4 before loading on a cation exchange column packed with SP Sepharose Fast Flow resin (Cytiva) equilibrated in 20 mM HEPES pH 7.4, 50 mM NaCl. ALKALs were eluted using a NaCl gradient from 50 mM to 750 mM over 20 minutes. Fractions containing ALKALs were immediately diluted with 20 mM HEPES pH 7.4 to a NaCl concentration of 200 mM and further supplemented with 0.1% (w/v) CHAPS. The C-terminal Sumo-tag was cleaved with caspase 3 overnight at 20° C. To remove undigested protein as well as the cleaved Sumo-tag, the digestion mixture was loaded on a MonoQ 5/50 GL column (Cytiva). Flowthrough containing the ALKALs was concentrated and injected on a Superdex 75 Increase 10/300 GL equilibrated in HBS supplemented with 0.1% CHAPS. Purified ALKALs were stored at −80° C. at a concentration of 1 mg ml−1 until further use. For ALKALs used in Ba/F3 assays, endotoxin levels were measured using an Endosafe PTS limulus amebocyte lysate assay (Charles River) and were below 5 EU mg−1.
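The dilution of eluted fractions to 200 mM NaCl described above follows from a simple mass balance. The sketch below is illustrative only: the function name and the example fraction values are hypothetical, and the diluent is assumed to contain no NaCl, consistent with the 20 mM HEPES buffer named in the text.

```python
def diluent_volume(sample_volume_ml, sample_nacl_mm,
                   target_nacl_mm=200.0, diluent_nacl_mm=0.0):
    """Volume of diluent (ml) needed to bring a fraction to the target NaCl.

    Mass balance: V_s * C_s + V_d * C_d = (V_s + V_d) * C_target.
    The default target of 200 mM is taken from the text; all other
    values passed in the example below are hypothetical.
    """
    if not diluent_nacl_mm < target_nacl_mm <= sample_nacl_mm:
        raise ValueError("target must lie above the diluent and at or below the sample")
    return (sample_volume_ml * (sample_nacl_mm - target_nacl_mm)
            / (target_nacl_mm - diluent_nacl_mm))


# Hypothetical fraction: 2 ml eluting at about 500 mM NaCl.
print(diluent_volume(2.0, 500.0))
```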
Fab fragments were captured using cOmplete His-tag purification resin (Roche) and eluted using HBS supplemented with 250 mM imidazole, followed by buffer exchange to HBS on a HiPrep 26/10 desalting column. Caspase 3 was added to the purified Fab fragment to remove the Avi-His tag of the heavy chain by overnight digestion at 20° C. The sample was loaded on an IMAC column to remove the enzyme and undigested protein. The flow-through containing tagless Fab fragments was concentrated and injected on a Superdex 200 Increase 10/300 GL (Cytiva) column pre-equilibrated in HBS.
Single domain camelid VHHs (Nanobodies) against LTK were raised by immunizing llamas with LTKTG-EGFL and were selected for specific binding to LTKTG-EGFL via ELISA and BLI. Epitope binning via BLI led to the identification of candidate Nanobodies with non-neutralizing behavior with respect to cytokine binding. The sequences of such non-neutralizing Nanobodies were cloned in a MoClo-derived yeast expression vector in frame with an N-terminal preproMF secretory leader sequence followed by an N-terminal Hisx6 tag and a caspase cleavage site. Komagataella phaffii cells were transformed by electroporation and grown on YPDS agar containing 500 μg/ml zeocin. One colony was picked to inoculate 500 ml of BMGY supplemented with 100 μg/ml zeocin and grown at 28° C. for 24 hours. Next, cells were pelleted by centrifugation at 500×g for 7 minutes, resuspended in 500 ml BMMY medium without zeocin and incubated overnight at 28° C. Expression was further induced by adding 2.5 ml of 50% methanol; the same volume of methanol was added again after 8 h and 24 h, after which cells were incubated for another 8 h at 28° C. Finally, conditioned medium was harvested by centrifugation for 10 minutes at 6000×g.
His-tagged camelid single domain VHHs were captured by addition of 2 ml cOmplete resin (Roche) to 500 ml conditioned medium followed by overnight incubation at 4° C. while shaking. Nanobodies were eluted in HBS supplemented with 250 mM imidazole and buffer exchanged to HBS on a HiPrep 26/10 desalting column (Cytiva). The N-terminal Hisx6 tag was removed by an overnight caspase 3 digest at 20° C. Undigested protein and the enzyme were removed via IMAC. The flowthrough containing tagless nanobody was concentrated and injected on an SD75 Increase 10/300 GL column (Cytiva) pre-equilibrated in HBS. Purified nanobody was concentrated to 4 mg ml−1, flash-frozen and stored at −80° C.
For ALKTG-EGFL-Fab324 crystals, a 1.5-fold molar excess of Fab324 was added to ALKTG-EGFL and the mixture was subjected to an overnight enzymatic digestion of N-linked glycans with EndoH. The complex was polished by SEC on a Superdex 200 Increase 10/300 GL (Cytiva) and concentrated to 13.5 mg ml−1. Commercial sparse matrix crystallization screens were set up using a Mosquito liquid handling robot (TTP Labtech) in sitting drop format using 100 nL protein mixed with 100 nL mother liquor in SwissSci 96-well triple drop plates incubated at 287 K. A first hit in the Morpheus II screen63 was further optimized to a condition consisting of 40 mM Polyamines, 0.1 M Gly-Gly/AMPD pH 8.5, 11% w/v PEG4000 and 19% w/v 1,2,6-Hexanetriol in sitting drop format with 100 nL protein mixed with 200 nL mother liquor. Crystals were cryoprotected in mother liquor containing 25% w/v 1,2,6-Hexanetriol prior to being cryocooled in liquid nitrogen. Diffraction data were collected at 100 K at the ID23-2 beam line at ESRF, Grenoble. The datasets were processed using XDS64. Initial phases were calculated by molecular replacement with PHASER65 using the coordinates of a Fab fragment exhibiting the highest sequence identity (PDB: 5nuz, chain A) followed by rigid body refinement in Buster66. A partial polyalanine model was built into the visible electron density in Coot67 followed by density modification via Resolve68. The density-modified map showed density for several aromatic side chains, allowing for assignment of the correct register and tracing of the ALK sequence. Additional refinement steps were carried out in PHENIX69 using individual B-factor refinement in combination with TLS, XYZ refinement, optimization of the X-ray/geometry weights, and local torsion angle non-crystallographic symmetry (NCS) restraints.
For crystals of ALKTG-EGFL, glycans were trimmed by overnight enzymatic digestion with EndoH in HBS. The protein was polished via SEC and concentrated to 10.5 mg ml−1. Commercial sparse matrix sitting drop crystallization screens were set up as described. One hit was obtained in the Morpheus II screen and optimized to 0.5 mM Manganese(II) chloride tetrahydrate, 0.5 mM Cobalt(II) chloride hexahydrate, 0.5 mM Nickel(II) chloride hexahydrate, 0.5 mM Zinc acetate dihydrate, 13% w/v PEG 3000, 28% v/v 1,2,4-Butanetriol, 1% w/v NDSB 256 in hanging drop format. Crystals were cryoprotected in mother liquor containing 25% 1,2,4-Butanetriol prior to being flash-frozen in liquid nitrogen. Diffraction data were collected at 100 K at the P14 microfocus beam line at PETRA III, Hamburg and integrated using XDS with standard parameters except for the “BEAM_DIVERGENCE” parameter, which was doubled. Initial phases were obtained using maximum likelihood molecular replacement in Phaser using the structure of the ALKTG domain. The structure was refined using Phenix.refine followed by manual building in Coot. The EGF-like domain was manually built into the electron density. Refinement in Phenix followed the same protocol as for the ALKTG-EGFL-Fab324 structure except for the absence of NCS restraints and implementation of additional reference restraints based on the structure of ALKTG in complex with Fab324.
For crystals of LTKTG, purified LTKTG was concentrated to 10 mg ml−1 and used to set up sparse matrix screens as previously described. Crystals appeared in a condition of the Morpheus II screen, with the final optimized condition containing MOPSO/Bis-Tris pH 6.3, 12% PEG 20000, 26% Trimethyl propane, 1% w/v NDSB 195, 5 mM Yttrium(III) chloride hexahydrate, 5 mM Erbium(III) chloride hexahydrate, 5 mM Terbium(III) chloride hexahydrate and 5 mM Ytterbium(III) chloride hexahydrate, set up in sitting drop format with 100 nL protein mixed with 200 nL mother liquor. Crystals were cryoprotected in mother liquor and cryocooled in liquid nitrogen. Diffraction data were collected at 100 K at the P13 microfocus beam line at PETRA III, Hamburg and processed using XDS as previously described. Phases were obtained by single wavelength anomalous dispersion making use of the anomalous signal from lanthanide atoms. Determination of the lanthanide substructure for four sites was performed by the hybrid substructure search as implemented in Phenix. Phases were calculated using Phaser-EP. The density was readily interpretable, and a model was manually built in Coot and further refined in Phenix implementing an anisotropic individual B-factor model.
For LTKTG-ALKAL1-Nb3.16 crystals, a 3-fold molar excess of Nb3.16 was added to the LTKTG-ALKAL1 complex, and the mixture was concentrated and injected into a Superdex 200 Increase 10/300 GL (Cytiva) equilibrated in HBS. Eluted fractions were concentrated to 13.5 mg ml−1 and used to set up sitting drop crystallization screens as described. Crystals were obtained in a condition consisting of MOPSO/BIS-TRIS pH 6.3, 13% PEG 8000, 22% w/v 1,5-pentanediol, 5 mM sodium chromate tetrahydrate, 5 mM sodium molybdate tetrahydrate, 5 mM sodium tungstate tetrahydrate, 5 mM sodium orthovanadate tetrahydrate. Crystals were cryoprotected in mother liquor containing 25% 1,2,4-Butanetriol prior to being flash-frozen in liquid nitrogen. Diffraction experiments were performed at 100 K at the Proxima 2 microfocus beam line at the Soleil synchrotron. Initial phases were obtained by maximum likelihood molecular replacement using Phaser with the previously obtained LTKTG structure. A model for Nb3.16 was automatically built using ARP/wARP70,71 followed by manual building of ALKAL1 in Coot. Refinement was performed in Phenix with individual anisotropic ADP parameters with a TLS model.
For ALKTG-ALKAL2 crystals, a 3-fold molar excess of ALKAL2 was added to ALKTG and the mixture was subjected to an overnight EndoH digest, concentrated and injected into a Superdex 200 Increase 10/300 GL (Cytiva) equilibrated in HBS. Eluted fractions were concentrated to 8 mg ml−1 and used to set up sitting drop crystallization screens as described. Initial hits were obtained in a condition consisting of 0.1 M MES pH 6.5, 15% w/v PEG 6000, 5% w/v MPD. A single crystal was used to prepare a seed stock72 using the PTFE seed bead (Hampton Research). The best diffracting crystals were obtained by seeding with a 1:1000 dilution of the seed stock into the optimization screen. Crystals were cryoprotected in 78% (v/v) mother liquor supplemented with 22% (v/v) ethylene glycol before flash freezing in liquid nitrogen. Diffraction data were collected at 100 K at the Proxima 2 microfocus beam line at the Soleil synchrotron. Data were processed as described above, except that anisotropy correction was performed with the UCLA diffraction anisotropy server73. Initial phases were obtained by molecular replacement in Phaser using the previously determined ALKTG and ALKAL1 structures. First refinement cycles were performed in Buster followed by iterative refinement using Coot and Phenix. B-factors were refined using two isotropic atomic displacement parameters complemented by TLS. During refinement, structures of ALKTG and ALKAL1 provided reference model restraints.
For Fab324 crystals, tag-free Fab324 was concentrated to 18.5 mg ml−1 and a pH versus (NH4)2SO4 concentration screen was set up in sitting drop format, resulting in crystals in a condition consisting of 1.5 M (NH4)2SO4, 40 mM glycine pH 9.5. The crystal was cryoprotected using a saturated (NH4)2SO4 solution. Diffraction experiments were performed at the P13 beamline at PETRA III, Hamburg, and data were processed using XDS. Phases were obtained by molecular replacement using our structure of Fab324 obtained from the ALKTG-EGFL-Fab324 complex. The structure was refined using Coot and Phenix.
All display items containing structures were generated using the PyMOL Molecular Graphics System, version 2.0.5 (Schrödinger).
Experiments were performed using a MicroCal PEAQ-ITC instrument at 310 K. Proteins used in ITC experiments were expressed in HEK293S cells grown in suspension. As a final purification step, all proteins were buffer exchanged to the same HBS buffer via size-exclusion chromatography. Titrations were preceded by an initial injection of 0.5 μL. The injection spacing was optimized per experiment to allow the signal to return to a stable baseline. Throughout the titration the sample was stirred at a speed of 750 r.p.m. Data were analyzed using the PEAQ-ITC analysis software (version 1.1.0.1262, Malvern) and fit using a “one set of sites” model.
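For orientation, the kind of isotherm a “one set of sites” fit is based on can be sketched as follows. This is a minimal illustration that assumes the textbook model of n identical, independent binding sites and ignores the displaced-volume correction; the exact implementation in the PEAQ-ITC analysis software is not described here. The function name and all parameter values are hypothetical.

```python
import math


def one_site_heat(x_total, m_total, n, k_a, delta_h, v0):
    """Cumulative heat for n identical, independent sites (textbook form).

    x_total : total ligand concentration in the cell (M)
    m_total : total macromolecule concentration in the cell (M)
    n       : number of sites per macromolecule
    k_a     : association constant (1/M)
    delta_h : molar binding enthalpy (J/mol)
    v0      : active cell volume (L)
    """
    # a collects the terms of the quadratic obtained from mass balance.
    a = 1.0 + x_total / (n * m_total) + 1.0 / (n * k_a * m_total)
    # theta is the fraction of sites occupied (physical root of the quadratic).
    theta = 0.5 * (a - math.sqrt(a * a - 4.0 * x_total / (n * m_total)))
    return n * m_total * delta_h * v0 * theta


# Hypothetical example values, chosen only to exercise the function.
q = one_site_heat(x_total=100e-6, m_total=10e-6, n=1.0,
                  k_a=1e7, delta_h=-50e3, v0=200e-6)
print(f"cumulative heat: {q:.3e} J")
```

In a real fit, the heat per injection is the difference between successive cumulative heats, and n, k_a and delta_h are adjusted to match the measured injection heats.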
Protein samples of 100 μl at approximately 1.0 mg ml−1 were injected onto a Superdex 200 Increase 10/300 GL column (Cytiva) connected in line to a UV detector (Shimadzu), a miniDAWN TREOS (Wyatt) multi-angle laser light scattering detector and an Optilab T-rEX (Wyatt) refractometer. The refractive index increment value (dn/dc) was 0.185 mL g−1. Band broadening was corrected for using reference measurements of BSA (Pierce). Data analysis was carried out using the Astra 6.1.6 software and standard deviations were calculated using Prism8 (Graphpad Software).
Screening of mutant ALKALs was performed by immobilizing wild-type and mutant ALKAL1/2 variants. To this end, residues 57-129 of wild-type ALKAL1 and interface mutants (R102E, R115E, R102E/R115E, F76E and F80E) as well as residues 78-152 of wild-type ALKAL2 and interface mutants (R123E, R136E, R123E/R136E, F97E and H100A) were cloned in the pHLsec vector in frame with a C-terminal Avi tag. All constructs were transiently cotransfected in suspension-grown HEK293S cells together with a BirA expression plasmid (pDisplayBirA-ER74) as previously described and supplemented with 100 μM biotin upon transfection. After 4 days of expression, excess biotin was removed by desalting the conditioned media to HBS on a HiPrep 26/10 desalting column (Cytiva).
All measurements of binding kinetics and dissociation constants were performed using an Octet Red 96 (FortéBio) in assay buffer (20 mM HEPES pH 7.4, 150 mM NaCl, 0.02% (w/v) BSA, 0.002% (v/v) Tween 20) at 298 K. ALKALs were immobilized to a level of 0.5 nm on streptavidin-coated biosensors (FortéBio). To verify that no nonspecific binding was present during the assay, non-functionalized biosensors were used as a control by measuring in parallel all ligand concentrations as well as running buffer. For all mutants, a two-fold dilution series from 6.4 μM to 400 nM was employed. Data analysis was performed using the Data Analysis software 9.0.0.14 (FortéBio) and binding curves were exported to Prism8 (Graphpad Software) for plotting.
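The concentrations implied by a two-fold dilution series from 6.4 μM to 400 nM can be enumerated with a short sketch. The helper name is hypothetical, and the assumption that every two-fold step between the two stated endpoints was measured is ours.

```python
def twofold_series(top_nm, bottom_nm):
    """Return concentrations (nM) obtained by halving from top_nm down to bottom_nm."""
    concentrations = []
    current = float(top_nm)
    while current >= bottom_nm:
        concentrations.append(current)
        current /= 2.0  # each step is a two-fold dilution
    return concentrations


# 6.4 uM = 6400 nM as the highest concentration, 400 nM as the lowest.
print(twofold_series(6400, 400))
# prints [6400.0, 3200.0, 1600.0, 800.0, 400.0]
```

Under that assumption the series comprises five analyte concentrations per mutant.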
SEC-SAXS data were collected at the SWING beamline at SOLEIL (France) using an integrated online HPLC set-up. Purified samples of ALKTG-EGFL-ALKAL2 (18.5 mg ml−1) and LTKTG-EGFL-ALKAL1 (19 mg ml−1) expressed in HEK293S MGAT−/− cells were injected on a Biosec-3 column (Agilent) with HBS as a running buffer. The scattering data were collected in continuous flow mode with a flow speed of 0.3 ml/min and a 1 s exposure time per frame. Buffer and sample frames were selected and subtracted using the program RAW75. Calculation of theoretical scattering curves and fitting to experimental scattering data were performed with AllosMod-FoXS.
Briefly, a model for the C-terminal EGFL domain of LTK was generated by homology modeling starting from the crystal structure of the ALK EGF-like domain using the SWISS-MODEL server76. This model was manually placed and connected to the C-terminus of the LTKTG domain in PyMOL based on the ALKTG-EGFL structure. Missing regions in the ALK and LTK structures were added using MODELLER77. The resulting models were subsequently energy minimized using Rosetta-Relax and used as an input for AllosMod to add N-linked glycans on positions Asn709, Asn808, Asn886 and Asn986 for ALK and Asn380 and Asn412 for LTK, and to calculate the resulting model energy landscapes. The output of AllosMod was then used in AllosMod-FoXS to calculate fits with theoretical scattering curves during fast AllosMod simulations at 300 K.
Ba/F3 cells (murine pro-B cell line) were cultured in RPMI/10% FCS supplemented with mouse interleukin-3 (IL-3, 1 ng ml−1). The Ba/F3 cell line is not listed in the database of commonly misidentified cell lines maintained by ICLAC and NCBI Biosample. Ba/F3 cells were transduced with viral supernatant MSCV-ALKWT/ALKR753Q/ALKF856S/EV-IRES-GFP (EV; empty vector) for 2 days in RPMI/10% FCS supplemented with mouse IL-3 as previously described78. GFP-sorted cells were used for the cell growth assays and western blot. After removal of IL-3 from the media, the cells were stringently washed three times with PBS. The cell growth curves and heatmaps were made using GraphPad Prism 9 software as mean values, with error bars representing standard deviation.
For Western blotting, the following antibodies were used: ALK (Cell Signaling Technologies; catalog no.: 3633; dilution: 1:1,000), Phospho-ALK (Tyr1278) (Cell Signaling Technologies; 6941; 1:250), Phospho-ALK (Tyr1604) (Cell Signaling Technologies; 3341; 1:250), Phospho-p44/42 MAPK (Erk1/2) (Thr202/Tyr204) (Cell Signaling Technologies; 4370; 1:2,000), p44/42 MAPK (Erk1/2) (Cell Signaling Technologies; 9102; 1:1,000), β-actin (Sigma-Aldrich; A-5441; 1:2,000), GAPDH (GENETEX; GTX100118; 1:2,000). For the cell growth assay, Crizotinib (Sigma-Aldrich; PZ0191-5MG), DMSO (Sigma-Aldrich; D8418-100ML), and IL-3 (Peprotech; AF-213-13) were used.
Statistical significance was determined by two-way ANOVA, followed by Tukey's multiple comparison test where adjustment for multiple comparisons was required. Data were plotted using GraphPad Prism 9 software as mean values, with error bars representing standard deviation. Heatmaps were also made using GraphPad Prism 9 software based on mean values. *P<0.05, **P<0.01 and ***P<0.001, respectively, unless otherwise specified.
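The asterisk convention stated above can be expressed as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and the "ns" label for values at or above 0.05 is an assumption that is not part of the convention given here.

```python
def significance_label(p_value):
    """Map an adjusted P value to the asterisk notation used in the figures."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return "ns"  # assumed label for non-significant comparisons


for p in (0.0004, 0.004, 0.04, 0.4):
    print(p, significance_label(p))
```

The thresholds are checked from the most to the least stringent so that each P value receives the highest level it qualifies for.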
Number | Date | Country | Kind |
---|---|---|---|
21184332.1 | Jul 2021 | EP | regional |
This application is a national phase entry under 35 U.S.C. § 371 of International Patent Application PCT/EP2022/068425, filed Jul. 4, 2022, designating the United States of America and published in English as International Patent Publication WO 2023/280766 on Jan. 12, 2023, which claims the benefit under Article 8 of the Patent Cooperation Treaty to European Patent Application Serial No. 21184332.1, filed Jul. 7, 2021, the entireties of which are hereby incorporated by reference.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2022/068425 | 7/4/2022 | WO |