This application incorporates by reference the Sequence Listing XML file submitted herewith via the patent office electronic filing system having the file name “211278US2-sequences.xml” and created on Mar. 11, 2024 with a file size of 15,813 bytes.
Single domain antibodies (sdAb) and other antibody products display their merit in a variety of areas such as therapeutics and diagnostics. Antibody engineering has dramatically evolved. Current antibody drugs, for example, have increasingly fewer adverse effects, such that therapeutic antibodies represent a significant fraction of the new drugs developed in recent years.
Engineering thermostability into antibodies or other protein products can extend the product's shelf life and reduce the replenishment costs for stockpiling. In the past, directed evolution has been applied to enhance binding and solubility for some of these heterologously expressed proteins. To date, antibody discovery has remained tied to expensive technologies with minimal yield.
A need exists for techniques for rapidly and inexpensively producing antibodies with improved stability, as well as such antibodies themselves.
Described herein are single domain antibodies (sdAbs) with improved thermostability and methods for the design thereof using a machine learning model (LIME). LIME-predicted thermostabilizing mutations were tested in four sets of sequence-related single domain antibodies (sdAb), 12 sdAb in total. The mutations were restricted to the constant regions. Two sets of interacting residues were consistently predicted. The first set was in the core of three variants, A7D, with W37R or W37H; these were predicted to form a salt bridge. The second predicted set of interacting residues was on the surface and created a network of hydrogen bonds involving P22T with S8T accompanied by Q6S. Six mutations were common to all four sets: Q6S, A7D, S8T, (R/Q/E)14R, S18R, and P22T. Notably, both sets of mutations were near the disulfide bond in the core of the v-set domain. In variant 1.2, a 12.7° C. increase in the Tm was obtained after five mutations. However, similar mutations in variant 18.2 led to a small decrease in Tm, suggesting that the variable regions of the sdAb may subtly affect the chi angles of the side chains. AlphaFold multimer models show the predicted epitopes on the antibody target.
In a first embodiment, a single domain antibody comprises a sequence of amino acid residues selected from the group consisting of SEQ ID NOS: 2, 3, 5, 6, 8, 9, 11, 12, 13, 14, and 15.
Further embodiments include single domain antibodies according to the first embodiment, wherein the sequence lacks some or all of the DDDDK tag, and/or has an additional terminal hexahistidine tag.
Another embodiment is a single domain antibody comprising the mutations Q6S, A7D, S8T, (R/Q/E)14R, and P22T. Optionally, the antibody further comprises a tag to improve solubility.
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
Before describing the present invention in detail, it is to be understood that the terminology used in the specification is for the purpose of describing particular embodiments, and is not necessarily intended to be limiting. Although many methods, structures and materials similar, modified, or equivalent to those described herein can be used in the practice of the present invention without undue experimentation, the preferred methods, structures and materials are described herein. In describing and claiming the present invention, the following terminology will be used in accordance with the definitions set out below.
As used herein, the singular forms “a”, “an,” and “the” do not preclude plural referents, unless the content clearly dictates otherwise.
As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.
As used herein, the term “about” when used in conjunction with a stated numerical value or range denotes somewhat more or somewhat less than the stated value or range, to within a range of ±10% of that stated.
As described herein, machine learning (ML) was used to predict potentially thermostabilizing mutations in single domain antibodies. Namely, the ML tool Local Interpretable Model-Agnostic Explanations (LIME), which finds a representative model through locally weighted approximations of the original model, was used. This technique provides a low-sample-size method for generating new antibody sequences by automatically directing mutations to sites that maintain local fidelity to the original sequence's characteristics and preserve the context of the mutations. Given an initial input sequence, the model then individually predicted thermostabilizing mutations through an XGBoost regressor.
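The local-surrogate idea behind LIME can be illustrated with a minimal sketch. It is not the model used in this work: the binary feature encoding, the toy scoring function `toy_score` (standing in for a trained regressor), the flip probability, and the kernel width are all hypothetical choices made for illustration. The sketch perturbs an input, weights each perturbed sample by its proximity to the original, and fits a weighted linear model whose coefficients approximate the local effect of each feature.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def lime_explain(black_box, x, n_samples=200, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear surrogate around the binary vector x.

    Returns [intercept, w_0, ..., w_{d-1}], where w_i is the local effect
    of feature i on the black-box prediction.
    """
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        # Perturb: flip each feature independently with probability 0.3.
        z = [1 - xi if rng.random() < 0.3 else xi for xi in x]
        dist = sum(zi != xi for zi, xi in zip(z, x))  # Hamming distance
        rows.append([1.0] + [float(zi) for zi in z])
        targets.append(black_box(z))
        weights.append(math.exp(-(dist ** 2) / kernel_width ** 2))
    p = len(x) + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y.
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets))
            for i in range(p)]
    return solve_linear(xtwx, xtwy)


def toy_score(z):
    """Hypothetical black box; a real workflow would call a trained model."""
    return 60.0 + 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[2]


coefficients = lime_explain(toy_score, [1, 0, 1])
print(coefficients)
```

Because the toy black box is itself linear, the surrogate recovers its coefficients almost exactly; with a nonlinear model the coefficients would describe only the neighborhood of the chosen input.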
A series of sdAb variants were generated via ML based on four initial input sequences, denoted: 0.1, 1.1, 17.1, and 18.1. Each of these four initial sequences represents a previously published sdAb having binding activity against Staphylococcal Enterotoxin B (SEB). These four sequences were described in two references. The first reference, K. B. Turner et al., “Next-Generation Sequencing of a Single Domain Antibody Repertoire Reveals Quality of Phage Display Selected Candidates,” PLOS One 11, e0149393 (2016), describes the sdAb sequences where 0.1 as used here is termed Ca in the reference and likewise 1.1 is termed Cd there. The second reference, K. B. Turner et al., “Isolation and epitope mapping of staphylococcal enterotoxin B single-domain antibodies,” Sensors (Basel) 14, 10846-10863 (2014), describes the other two original sequences, where 17.1 as used here is the same as S222-A2 in the reference and likewise 18.1 here is B9-B11 there.
Starting with representative sequence 1.1 as an input to LIME, the model was asked to produce a variant with improved thermostability. This resulted in the generation of variant 1.2 as seen below in Table 1. In practice, these were each produced with a hexahistidine tail at the C-terminal end for purification.
Some of the sequences in the table have a DDDDK appended for improved protein solubility, as discussed below. This feature, when present, was not introduced by the ML model. However, in candidates for improved temperature stability, the model did introduce mutations at S6, D7, T8, R14, and T22. The odds of these five mutations occurring at random are approximately 1 in 3,200,000 combinations. An additional round identified a sixth mutation, S18R.
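One way to arrive at the 1 in 3,200,000 figure is to assume that each of the five positions could independently take any of the 20 standard amino acids with equal probability. That assumption is made here for illustration and is not stated in the text; the short calculation below checks the arithmetic.

```python
from fractions import Fraction

AMINO_ACIDS = 20  # size of the standard amino acid alphabet (assumed uniform)
N_POSITIONS = 5   # positions 6, 7, 8, 14, and 22

# Number of distinct residue combinations across the five positions.
combinations = AMINO_ACIDS ** N_POSITIONS

# Chance of drawing one specific combination at random.
probability = Fraction(1, combinations)

print(combinations)  # 3200000
print(probability)   # 1/3200000
```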
Solubility was found to be a primary limitation for experimental testing of the sdAb sequences obtained from the LIME-based ML model, as the model used has no information available to predict solubility. In a competing methodology of sequence generation, namely directed evolution (DE) using libraries generated from random mutagenesis, a large fraction of the total proteins are inactive, and beneficial mutations occur only at a low frequency of 10⁻³. In order to obtain soluble ML-designed sdAbs, a range of properties were investigated, with the finding that the pI of the insoluble proteins was near 9, while the pI of the soluble proteins was <8.6. By adding an enterokinase cleavage site (DDDDK) to the C-terminus, the pI was lowered and the solubility of the ML-predicted sdAb sequences during periplasmic expression in E. coli was found to be enhanced. Thus, several of the ML-generated sequences were modified post generation with the addition of DDDDK.
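The effect of the DDDDK addition on pI can be illustrated with a rough sketch. The pKa values below are one approximate set assumed for illustration (published scales differ), the peptide `toy_sequence` is a made-up placeholder rather than any sequence from the Sequence Listing, and the text does not state which tool was used to compute pI. The sketch sums Henderson-Hasselbalch charges over the ionizable groups and finds the pH of zero net charge by bisection.

```python
# Approximate side-chain and terminal pKa values (assumed; scales differ).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Estimate the net charge of a peptide at a given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence:
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Find the pH of zero net charge by bisection.

    Net charge decreases monotonically with pH, so bisection converges.
    """
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


toy_sequence = "GSKRKAVLTRGHQSE"  # hypothetical placeholder peptide
tagged_sequence = toy_sequence + "DDDDK"

print(round(isoelectric_point(toy_sequence), 2))
print(round(isoelectric_point(tagged_sequence), 2))
```

The four added aspartates carry more negative charge than the single added lysine offsets near neutral and basic pH, so the tagged sequence reaches zero net charge at a lower pH.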
Thermal denaturation of the expressed proteins was measured while heating at 2° C. per minute from 10° C. to 95° C. using a Jasco 810 Circular Dichroism (CD) spectropolarimeter fitted with a Peltier temperature control unit. The melting temperature (Tm) was determined from a four-parameter fit of the ellipticity at 205 or 207 nm versus temperature. For Tm measurements, the buffer was exchanged to 1×PBS pH 7.4 using PD-10 columns. A protein concentration of 7-14 μM and a 1 mm cuvette were used for CD analysis. Dye melt experiments were used as another method to measure the Tm. Here SYPRO Orange Protein Gel Stain (Sigma) was diluted 1000-fold in a final volume of 20 μl PBS with a final protein concentration of 500 μg/ml. Fluorescence was measured using a StepOne Real-Time PCR system (Applied Biosystems) as the sdAbs were heated from 25° C. to 99° C. at a rate of 1% using the ROX channel. It was found that three of the mutation series produced sufficiently soluble protein, namely 1.1DDDDK through 1.2DDDDK, 17.1DDDDK through 17.2DDDDK, and 18.1DDDDK through 18.2DDDDK. The 1.2DDDDK and 17.2DDDDK variants showed increases in Tm of 12.7 and 2.9° C., respectively. No significant change was observed in the Tm of the 18.1 to 18.2 variants. The mutations and CD denaturation curves for 1.1DDDDK and 1.2DDDDK are shown in
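A four-parameter fit of a melt curve can be sketched as follows. The exact functional form used for the CD data is not given in the text; this sketch assumes a two-state sigmoid with flat folded and unfolded baselines, a midpoint Tm, and a width parameter k. The data are synthetic and noise-free, and the grid ranges are arbitrary illustrative choices. For each candidate (Tm, k) pair the two baselines are solved exactly by linear least squares, and the pair with the smallest squared error is kept.

```python
import math


def unfolded_fraction(temp, tm, k):
    """Two-state sigmoid: fraction unfolded at temp for midpoint tm, width k."""
    return 1.0 / (1.0 + math.exp((tm - temp) / k))


def fit_tm(temps, signal, tm_step=0.5, k_values=None):
    """Grid-search tm and k; solve both baselines by linear least squares.

    Returns (tm, k, folded_baseline, unfolded_baseline, sse).
    """
    if k_values is None:
        k_values = [0.5 * j for j in range(1, 21)]  # 0.5 to 10.0 degrees
    t_min, t_max = min(temps), max(temps)
    best = None
    for i in range(int((t_max - t_min) / tm_step) + 1):
        tm = t_min + i * tm_step
        for k in k_values:
            u = [unfolded_fraction(t, tm, k) for t in temps]
            f = [1.0 - ui for ui in u]
            # 2x2 normal equations for signal = y_f * f + y_u * u.
            sff = sum(fi * fi for fi in f)
            sfu = sum(fi * ui for fi, ui in zip(f, u))
            suu = sum(ui * ui for ui in u)
            sfy = sum(fi * y for fi, y in zip(f, signal))
            suy = sum(ui * y for ui, y in zip(u, signal))
            det = sff * suu - sfu * sfu
            if det < 1e-9:
                continue  # baselines not separable for this (tm, k)
            y_f = (sfy * suu - suy * sfu) / det
            y_u = (suy * sff - sfy * sfu) / det
            sse = sum((y - y_f * fi - y_u * ui) ** 2
                      for y, fi, ui in zip(signal, f, u))
            if best is None or sse < best[4]:
                best = (tm, k, y_f, y_u, sse)
    return best


# Synthetic, noise-free melt curve (hypothetical values, not measured data).
temps = [float(t) for t in range(10, 96)]
true_tm, true_k = 62.0, 2.5
signal = [-20.0 + 15.0 * unfolded_fraction(t, true_tm, true_k) for t in temps]

tm, k, y_folded, y_unfolded, sse = fit_tm(temps, signal)
print(f"fitted Tm = {tm:.1f}, k = {k:.1f}")
```

With real, noisy data a gradient-based nonlinear least-squares routine would normally replace the grid search; the grid is used here only to keep the example dependency-free.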
Next, to determine whether the ability to bind to SEB was retained following mutation, surface plasmon resonance (SPR) and the AlphaFold2 model (which predicts the three-dimensional structure of a protein based on its amino acid sequence) were employed. SPR was performed using a ProteOn XPR36 Protein Interaction Array System (Bio-Rad) on a GLC sensor chip. SEB was first immobilized at 20 μg/mL on all lanes of the chip. The binding kinetics of each sdAb were determined by rotating the chip and flowing various concentrations (300, 100, 33, 11, 3.7, 0 nM) at 100 μL/minute in the orthogonal direction for 90 s over the SEB-coated chip and then monitoring dissociation for 600 s. AlphaFold2 was also used to predict the epitope recognized by the sdAb. Here, models of the antibody-antigen complexes were generated from sequence with AlphaFold2. The predicted sdAb-SEB interfaces overlap predominantly with the binding surface of the TCR (PDB 4C56), but notably not with any unexpected regions of the SEB superantigen (
It is suspected that, besides the antibodies disclosed in the examples, sdAbs having one or more of the mutations Q6S, A7D, S8T, (R/Q/E)14R, S18R, and P22T would also exhibit improved thermal stability. To maximize thermal stability, most or all of these mutations should be used. The S18R mutation may be less important; thus, in one embodiment, an sdAb has the mutations Q6S, A7D, S8T, (R/Q/E)14R, and P22T.
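The mutation notation used above can be applied programmatically, as in the following minimal sketch. The helper `apply_mutations`, the 1-based position numbering, and the placeholder sequence `toy_framework` are assumptions made for illustration; the placeholder is not a sequence from the Sequence Listing. A notation such as (R/Q/E)14R is read as "position 14 may start as R, Q, or E and ends as R".

```python
import re

# Matches "Q6S" as well as "(R/Q/E)14R": allowed originals, position, new residue.
MUTATION_RE = re.compile(r"^\(?([A-Z](?:/[A-Z])*)\)?(\d+)([A-Z])$")


def apply_mutations(sequence, mutations):
    """Return a copy of sequence with each point mutation applied.

    Positions are 1-based. Raises ValueError if the residue found at a
    position is not among the original residues named in the notation.
    """
    residues = list(sequence)
    for mutation in mutations:
        match = MUTATION_RE.match(mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation}")
        allowed = match.group(1).split("/")
        position = int(match.group(2))
        new_residue = match.group(3)
        current = residues[position - 1]
        if current not in allowed:
            raise ValueError(
                f"{mutation}: found {current} at position {position}")
        residues[position - 1] = new_residue
    return "".join(residues)


# Hypothetical 22-residue placeholder with Q6, A7, S8, Q14, S18, and P22.
toy_framework = "GGGGG" + "QAS" + "GGGGG" + "Q" + "GGG" + "S" + "GGG" + "P"

mutant = apply_mutations(
    toy_framework, ["Q6S", "A7D", "S8T", "(R/Q/E)14R", "S18R", "P22T"])
print(mutant)  # GGGGGSDTGGGGGRGGGRGGGT
```

Checking the original residue before substituting guards against numbering mismatches, for example when a leader peptide or tag shifts the positions.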
While the tested sequences include hexahistidine tags, also contemplated are variants where the histidine tag is of a different length or is absent or is substituted with a different tag useful for protein purification. Similarly contemplated are variants lacking the DDDDK tag for solubility, or having a different tag for improved solubility.
The reported increases in the sdAb's Tm values were obtained using a set of novel mutations while retaining SEB-binding. The mutations increased the Tm of two different sdAb (1.2 and 17.2).
The Tm of the sdAb is significantly above body temperature (37° C.); thus, these sdAb could be used for coating medical devices such as catheters or for shelf-stable, refrigeration-free diagnostics such as lateral flow tests.
The sdAb design strategy described here has a range of advantages relative to the prior art. This methodology uses an ML model and subsequent modification (adding the DDDDK solubility tag) to arrive at new sdAb candidates to characterize, avoiding the need for labor-intensive and costly repetitive rounds of sequential mutation experiments. The solubility tag also eliminated misfolding states from inclusion bodies and enhanced recovery of soluble folded protein by >10-fold for the 1.2DDDDK variant when compared with the 1.1DDDDK variant. Next, the method restricted mutations to constant regions of the sdAb, avoiding the variable CDRs, regions that would be affected by the random mutagenesis used in directed evolution experiments. Finally, the method for obtaining the described sequences is not specific to SEB-binding sdAbs and can be applied to a variety of inputs.
All documents mentioned herein are hereby incorporated by reference for the purpose of disclosing and describing the particular materials and methodologies for which the document was cited.
Although the present invention has been described in connection with preferred embodiments thereof, it will be appreciated by those skilled in the art that additions, deletions, modifications, and substitutions not specifically described may be made without departing from the spirit and scope of the invention. Terminology used herein should not be construed as being “means-plus-function” language unless the term “means” is expressly used in association therewith.
This application claims the benefit of Provisional U.S. Patent Application No. 63/491,169 filed on Mar. 20, 2023, which is incorporated herein by reference in its entirety.
The United States Government has ownership rights in this invention. Licensing inquiries may be directed to Office of Technology Transfer, US Naval Research Laboratory, Code 1004, Washington, DC 20375, USA; +1.202.767.7230; nrltechtran@us.navy.mil, referencing NC 211278.
Number | Date | Country
---|---|---
63491169 | Mar 2023 | US