MUTANT AMINOACYL-TRNA SYNTHETASES

REFERENCE TO AN ELECTRONIC SEQUENCE LISTING

The contents of the electronic sequence listing (BGU-P-098-US.xml; Size: 139,200 bytes; and Date of Creation: Jun. 20, 2023) are herein incorporated by reference in their entirety.

FIELD OF INVENTION

The present invention is in the field of artificial amino acid incorporation.

BACKGROUND OF THE INVENTION

Site-specific modification of proteins is a powerful means for investigation and manipulation of the properties of proteins, and has been utilized for a variety of applications, such as fluorescent labeling, analysis of structure and functions, and manipulation of the chemical, biological, and pharmacological properties of target molecules. Beyond single-site modifications, multi-site modifications have been demonstrated to extend and further exploit the potential of such applications, for example for direct polymerization of target proteins, site-specific conjugation of single protein to multiple ligands, and increased performance in analytical chemistry assays.

Traditional methods for protein modification via the chemically reactive cysteine and lysine residues often result in a heterogeneous product, may disturb the proper folding and function of the modified protein, or may require additional mutations to the protein to achieve single-site modification. In contrast, site-specific introduction of small bio-orthogonal groups to proteins, followed by a chemo-selective reaction, does not require the introduction of additional mutations, enables site selection to minimize interference with the folding and function of the protein, and results in a homogeneously modified product.

One of the most commonly utilized bio-orthogonal reactions is the copper catalyzed azide-alkyne cycloaddition (CuAAC), which has been employed in a variety of applications, such as modifying and labeling nucleic acids, viruses and proteins both in-vitro and in-vivo. In addition to bio-orthogonality, reactions of this family are rapid, regioselective, and result in high yields of the conjugated product. However, application of CuAAC bioconjugation in living systems is hindered by the Cu(I)-induced generation of reactive oxygen species (ROS) and the consequent cellular toxicity. Several methodologies have addressed this limitation by using different ligands to reduce the Cu-mediated toxicity to cells, by utilizing an azide probe that contains an internal copper-chelating moiety, or by using strained alkynes which obviate the need for copper catalysts. However, unlike azides and alkynes, the azide probes with an internal copper-chelating moiety and the strained alkynes are large structures, and may therefore interfere with the function of the labeled protein. In addition, the strain promoted azide-alkyne cycloaddition (SPAAC) is a much slower reaction than the CuAAC and is not strictly bio-orthogonal or regioselective.

As a pre-requisite for site-specific conjugation via CuAAC, an alkyne or azide group must be site-specifically incorporated into the protein. This can be achieved using several methodologies including enzymatic or chemical modification of selected residues (typically post-protein purification), or by incorporation of unnatural amino acids (uAAs) that bear an alkyne or an azide group. Several studies describe the incorporation of such uAAs by substitution of a natural amino acid with a close synthetic analog in an auxotrophic strain, which has been used for labeling in various organisms. However, growth defects due to amino acid substitution and relatively low aminoacylation of the endogenous tRNA with the uAA, compared to the natural amino acid, limit the yield of the target protein. In addition, this method replaces all instances of the relevant natural amino acid with the uAA, and therefore is not site specific. Alternatively, uAAs can be incorporated site specifically via codon reassignment or frameshift codons by using orthogonal translation systems (OTSs) consisting of an aminoacyl tRNA synthetase (aaRS), which is able to charge only a cognate tRNA that is not aminoacylated by endogenous aaRSs. Typically, a TAG stop codon (transcribed to UAG during mRNA synthesis) is assigned to the uAA.

Genetic code expansion can also be used to integrate light-sensitive moieties into proteins. Generally, there are three common classes of light-responsive moieties that are used to impart light-responsive behavior to biopolymers: (1) fusion to a genetically encoded light-responsive protein, whose main drawback is a lack of generality, (2) incorporation of an irreversible photolabile group, and (3) incorporation of a reversible photoswitchable group. Within the third class, azobenzenes, a class of molecules known to undergo reversible light-based isomerization, are the most robust and commonly used photoswitches. Upon irradiation with light of the appropriate wavelength (λtrans→cis), the azobenzene molecule undergoes a dramatic switch from the trans to the cis configuration (shortening by at least ~3.5 Å), with a concomitant change from a hydrophobic to a hydrophilic (polar) molecule (~3 Debye). Importantly, this process is reversible, and with time or upon irradiation with a second, different, wavelength within the blue light range (λcis→trans), the azobenzene molecule relaxes back to the trans configuration.

Incorporation of azobenzene into a polypeptide chain can be mediated by incorporation of an azobenzene-containing non-standard amino acid (nsAA), using the same expanded genetic code method that is used for the alkyne or azide groups. This expansion has enabled template-based incorporation of >100 nsAAs containing diverse chemical groups including post-translational modifications, photocaged amino acids, bio-orthogonal reactive groups, and spectroscopic labels. However, with respect to light-responsive nsAAs, only incorporation of a single nsAA into a single protein has ever been successfully achieved.

Several challenges have limited site-specific uAA incorporation by genetic code modification to only one instance per protein in some applications, and to a few instances per protein in others. Improved translation systems that can circumvent these limitations and improved nsAA integration technology are greatly needed.

SUMMARY OF THE INVENTION

The present invention provides mutant aminoacyl-tRNA synthetase (aaRS) proteins. Nucleic acid molecules encoding the mutant aaRSs are provided. Orthogonal translation systems comprising the mutant aaRSs or the nucleic acid molecules are provided. Cells comprising the orthogonal translation systems, mutant aaRSs or nucleic acid molecules are provided. Methods of using the mutant aaRSs, nucleic acid molecules, orthogonal translation systems and cells are also provided.

According to a first aspect, there is provided a mutant aminoacyl-tRNA synthetase (aaRS) comprising an amino acid sequence of an aaRS comprising at least one amino acid mutation selected from the group consisting of: tyrosine 32 mutated to leucine; tyrosine 32 mutated to threonine; leucine 65 mutated to valine; glutamic acid 107 mutated to alanine; phenylalanine 108 mutated to tyrosine; glutamine 109 mutated to methionine; aspartic acid 158 mutated to serine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to alanine; isoleucine 159 mutated to methionine; isoleucine 159 mutated to cysteine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to glutamic acid; leucine 162 mutated to lysine; leucine 162 mutated to valine; leucine 162 mutated to arginine; leucine 162 mutated to serine; leucine 162 mutated to cysteine; alanine 167 mutated to histidine; alanine 167 mutated to aspartic acid; and alanine 167 mutated to tyrosine.

According to some embodiments, the mutant is selected from the group consisting of:

- a) a mutant comprising tyrosine 32 mutated to leucine, aspartic acid 158 mutated to serine, isoleucine 159 mutated to methionine, leucine 162 mutated to lysine, and alanine 167 mutated to histidine;
- b) a mutant comprising tyrosine 32 mutated to leucine, leucine 65 mutated to valine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to alanine, leucine 162 mutated to glutamic acid, and alanine 167 mutated to histidine;
- c) a mutant comprising tyrosine 32 mutated to threonine, leucine 65 mutated to valine, glutamic acid 107 mutated to alanine, phenylalanine 108 mutated to tyrosine, glutamine 109 mutated to methionine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to cysteine, leucine 162 mutated to arginine, and alanine 167 mutated to aspartic acid;
- d) a mutant comprising tyrosine 32 mutated to leucine, leucine 65 mutated to valine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to methionine, leucine 162 mutated to serine, and alanine 167 mutated to histidine; and
- e) a mutant comprising tyrosine 32 mutated to leucine, leucine 65 mutated to valine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to cysteine, and alanine 167 mutated to tyrosine.
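As an illustration only, the combination of substitutions recited for mutant (a) can be expressed as a small Python sketch that applies point mutations to a sequence and checks the expected wild-type residue at each position. The placeholder sequence below is hypothetical filler built for this example and is not SEQ ID NO: 1; the function names and the `Y32L`-style shorthand are likewise assumptions made for the sketch.

```python
import re

# Shorthand such as "Y32L": wild-type residue, 1-based position, replacement residue.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence: str, mutations: list[str]) -> str:
    """Apply point mutations, verifying the wild-type residue at each position."""
    residues = list(sequence)
    for mutation in mutations:
        match = MUTATION_RE.match(mutation)
        if match is None:
            raise ValueError(f"Unrecognized mutation format: {mutation!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"Position {position} is outside the sequence")
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(f"{mutation}: expected {wild_type} at {position}, found {found}")
        residues[position - 1] = replacement
    return "".join(residues)


def make_placeholder_sequence(length: int, anchors: dict[int, str]) -> str:
    """Build a hypothetical filler sequence with known residues at chosen positions."""
    residues = ["G"] * length
    for position, residue in anchors.items():
        residues[position - 1] = residue
    return "".join(residues)


# Mutant (a) as recited above: Y32L, D158S, I159M, L162K, A167H.
MUTANT_A = ["Y32L", "D158S", "I159M", "L162K", "A167H"]

# Hypothetical placeholder carrying only the wild-type residues named in the text.
placeholder = make_placeholder_sequence(200, {32: "Y", 158: "D", 159: "I", 162: "L", 167: "A"})
mutant_a = apply_mutations(placeholder, MUTANT_A)
```

The wild-type check is the useful part of the sketch: a substitution written against the wrong starting residue (for example, an alanine at a position that holds tyrosine) is rejected rather than silently applied.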

According to some embodiments, the mutant aaRS comprises an amino acid sequence selected from: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6.

According to another aspect, there is provided a mutant aminoacyl-tRNA synthetase (aaRS) comprising an amino acid sequence of an aaRS comprising at least one amino acid mutation selected from the group consisting of: tyrosine 32 mutated to leucine; tyrosine 32 mutated to glycine; leucine 65 mutated to valine; leucine 65 mutated to glycine; glutamic acid 107 mutated to serine; glutamic acid 107 mutated to asparagine; glutamic acid 107 mutated to aspartic acid; phenylalanine 108 mutated to valine; phenylalanine 108 mutated to arginine; glutamine 109 mutated to methionine; glutamine 109 mutated to serine; glutamine 109 mutated to leucine; glutamine 109 mutated to cysteine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; leucine 162 mutated to arginine; and alanine 167 mutated to phenylalanine.

According to some embodiments, the mutant aaRS of the invention comprises:

- a) aspartic acid 158 mutated to glycine;
- b) isoleucine 159 mutated to tyrosine; and
- c) leucine 162 mutated to serine or leucine 162 mutated to arginine.

According to some embodiments, the mutant aaRS of the invention further comprises alanine 167 mutated to phenylalanine.

According to some embodiments, the mutant aaRS of the invention further comprises tyrosine 32 mutated to leucine or tyrosine 32 mutated to glycine.

According to some embodiments, the mutant aaRS of the invention further comprises leucine 65 mutated to valine or leucine 65 mutated to glycine.

According to some embodiments, the mutant is selected from the group consisting of:

- a) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to valine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine;
- b) a mutant comprising tyrosine 32 mutated to glycine; leucine 65 mutated to valine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine;
- c) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to valine; glutamic acid 107 mutated to serine; phenylalanine 108 mutated to valine; glutamine 109 mutated to serine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine;
- d) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to valine; glutamic acid 107 mutated to asparagine; phenylalanine 108 mutated to valine; glutamine 109 mutated to leucine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine;
- e) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to valine; glutamic acid 107 mutated to aspartic acid; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine;
- f) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to valine; glutamic acid 107 mutated to serine; phenylalanine 108 mutated to valine; glutamine 109 mutated to cysteine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine;
- g) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to valine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; and leucine 162 mutated to arginine; and
- h) a mutant comprising tyrosine 32 mutated to leucine; leucine 65 mutated to glycine; glutamic acid 107 mutated to aspartic acid; phenylalanine 108 mutated to arginine; glutamine 109 mutated to methionine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine.

According to some embodiments, the mutant comprises an amino acid sequence selected from: SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19.

According to some embodiments, the amino acid sequence of an aaRS is SEQ ID NO: 1.

According to some embodiments, the mutant aaRS of the invention further comprises a mutation of arginine 257 to glycine, a mutation of aspartic acid 286 to arginine or both.

According to another aspect, there is provided a nucleic acid molecule comprising a coding region encoding a mutant aaRS of the invention.

According to some embodiments, the coding region comprises a nucleic acid sequence selected from SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24; SEQ ID NO: 25, SEQ ID NO: 26, and SEQ ID NO: 27.

According to some embodiments, the coding region is operably linked to at least one regulatory element configured to express the coding region in a target cell.

According to another aspect, there is provided an orthogonal translation system, comprising,

- a) a mutant aaRS of the invention, or a nucleic acid molecule of the invention, and
- b) an orthogonal tRNA compatible with the mutant aaRS and comprising an anticodon that corresponds to a stop codon.
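For illustration, the relationship between a stop codon and the anticodon of a suppressor tRNA can be sketched in Python. The sketch assumes simple Watson-Crick pairing only; wobble pairing and modified bases are deliberately ignored, and the function names are hypothetical.

```python
# Watson-Crick pairing for RNA bases (wobble pairing and modified bases are ignored).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def transcribe_codon(dna_codon: str) -> str:
    """Convert a coding-strand DNA codon (e.g. TAG) to its mRNA form (UAG)."""
    return dna_codon.upper().replace("T", "U")


def anticodon_for(mrna_codon: str) -> str:
    """Return the anticodon, written 5'->3', that pairs with an mRNA codon."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna_codon.upper()))


amber_codon = transcribe_codon("TAG")          # "UAG"
amber_anticodon = anticodon_for(amber_codon)   # "CUA"
```

Under these assumptions, a tRNA decoding the TAG (UAG) stop codon carries the anticodon CUA.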

According to some embodiments, the orthogonal translation system of the invention further comprises a non-standard amino acid (nsAA) recognized by the mutant aaRS.

According to some embodiments, the nsAA is an unnatural amino acid (uAA).

According to some embodiments, the uAA comprises a bio-orthogonal chemical moiety.

According to some embodiments, the mutant aaRS is the mutant aaRS of the invention and the uAA comprises an azide or an alkyne group.

According to some embodiments, the mutant aaRS is the mutant aaRS of the invention and the uAA comprises an azobenzene group.

According to some embodiments, the nsAA is a modified phenylalanine.

According to some embodiments, the modified phenylalanine is 4-propargyloxy-L-phenylalanine (pPR).

According to some embodiments, the uAA comprising an azobenzene group is selected from phenylalanine-4′-azobenzene (AzoPhe), tri-fluorinated azobenzene (Azo3F), and tetra-ortho-fluorinated azobenzene (Azo4F) amino acids.

According to some embodiments, the stop codon is a TAG stop codon.

According to another aspect, there is provided a cell comprising an orthogonal translation system of the invention.

According to some embodiments, the cell of the invention further comprises an expression vector comprising an open reading frame (ORF) comprising at least one of the stop codons within the open reading frame.

According to some embodiments, the ORF comprises a plurality of stop codons.

According to some embodiments, the ORF comprises at least 10 stop codons.
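By way of a non-limiting illustration, whether a candidate ORF contains a given number of in-frame TAG codons can be checked with a short Python sketch. The toy ORF below is hypothetical and is not taken from the sequence listing; the sketch assumes the ORF is supplied in frame, start codon first, with its terminating codon last.

```python
def in_frame_tag_positions(orf: str) -> list[int]:
    """Return 1-based codon indices of in-frame TAG codons, excluding the final codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    # The last codon is treated as the terminating codon and is not counted.
    return [index + 1 for index, codon in enumerate(codons[:-1]) if codon == "TAG"]


# Hypothetical toy ORF: start codon, ten GGT-TAG repeats, and a TAA terminator.
toy_orf = "ATG" + "GGTTAG" * 10 + "TAA"
tag_positions = in_frame_tag_positions(toy_orf)
has_at_least_ten = len(tag_positions) >= 10
```

Only codons in the reading frame are counted, so a TAG triplet that spans two adjacent codons (as in `CTAGCC`) does not contribute.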

According to some embodiments, the ORF is operatively linked to at least one regulatory element capable of inducing expression of the ORF within the cell.

According to some embodiments, the cell is devoid of native TAG stop codons and does not express release factor 1 (RF1).

According to some embodiments, the cell comprises RF1 and at least one native TAG stop codon.

According to another aspect, there is provided a method of producing a protein comprising a nsAA, the method comprising introducing into a cell an expression vector comprising an open reading frame encoding the protein wherein the open reading frame comprises a stop codon, wherein the cell comprises an orthogonal translation system of the invention.

According to some embodiments, the method of the invention is for labeling the protein, and the method further comprises converting the nsAA into a detectably labeled amino acid and wherein the mutant aaRS is the mutant aaRS of the invention.

According to some embodiments, the converting comprises addition of a detectable moiety by Click chemistry.

According to some embodiments, the method of the invention is for producing a light-responsive protein, wherein the mutant aaRS is the mutant aaRS of the invention.

According to another aspect, there is provided a protein comprising a nsAA produced by a method of the invention.

Unless otherwise defined, all technical and/or scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the invention, exemplary methods and/or materials are described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and are not intended to be necessarily limiting.

Further embodiments and the full scope of applicability of the present invention will become apparent from the detailed description given hereinafter. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A-C: (1A) A table depicting amino acid substitutions present in mutant aminoacyl tRNA synthetases capable of incorporating alkyne-containing non-standard amino acids. The mutation sites are with respect to a M. jannaschii tyrosyl-tRNA synthetase. (1B) A table depicting amino acid substitutions present in mutant aminoacyl tRNA synthetases capable of incorporating azobenzene-containing non-standard amino acids. The mutation sites are with respect to a wild-type M. jannaschii tyrosyl-tRNA synthetase. (1C) Production of GFP(3TAG) by chromosomally integrated parent and evolved aaRS variants in E. coli strain C321.ΔRF1.

FIGS. 2A-Z: Multi-site incorporation of pPR by the parent translation systems and evolution of chromosomally integrated pPR-RS variants. (2A) Schematic illustration of reporter proteins for incorporation of 3, 10 and 30 unnatural amino acids (uAAs) and equivalent control wild-type (WT) proteins. (2B) GFP expression from WT GFP reporter, or from GFP(3TAG), ELP(10TAG)-GFP or ELP(30TAG)-GFP reporters produced by the parent pPR-RS, expressed by plasmid (P) or genomic integration (G). Red bars indicate addition of uAA (pPR), blue bars indicate no addition of uAA (pPR). (2C-E) Incorporation of (2C) 3, (2D) 10 and (2E) 30 pPRs in a single protein by evolved aaRS variants, expressed on a plasmid in C321.ΔRF1. *P<0.05, ***P<0.0005, and ****P<0.0001 indicate comparison of each evolved variant with the parent pPR-RS (2C-D) or with the wild-type protein (2E). n=3; error bars indicate S.D. (2F) Production of ELP(10TAG)-GFP by Mut1-RS and previously evolved aaRS variants, in the absence (red bars) and the presence of the respective uAAs, using the C321.ΔRF1 strain. (pPR for Mut1-RS, pAcF for pAcF-RS.2.t1, pAzF for pAzFRS.2.t1). n=3; error bars indicate s.d. (2G-N) Time-course kinetic analysis of GFP(3TAG) production by Mut1-RS and Mut2-RS expressed from multi-copy plasmids. (2G) production by Mut1-RS in C321.ΔRF1 and 2xYT media, (2H) production by Mut2-RS in C321.ΔRF1 and 2xYT media, (2I) production by Mut1-RS in C321.ΔRF1 and minimal media (MM), (2J) production by Mut2-RS in C321.ΔRF1 and minimal media, (2K) production by Mut1-RS in BL21 and 2xYT media, (2L) production by Mut2-RS in BL21 and 2xYT media, (2M) production by Mut1-RS in BL21 and minimal media, (2N) production by Mut2-RS in BL21 and minimal media. (2O-V) Time-course kinetic analysis of ELP(10TAG)-GFP production by Mut1-RS and Mut2-RS expressed from multi-copy plasmids. 
(2O) production by Mut1-RS in C321.ΔRF1 and 2xYT media, (2P) production by Mut2-RS in C321.ΔRF1 and 2xYT media, (2Q) production by Mut1-RS in C321.ΔRF1 and minimal media, (2R) production by Mut2-RS in C321.ΔRF1 and minimal media, (2S) production by Mut1-RS in BL21 and 2xYT media, (2T) production by Mut2-RS in BL21 and 2xYT media, (2U) production by Mut1-RS in BL21 and minimal media, (2V) production by Mut2-RS in BL21 and minimal media. (2W-Z) Time-course kinetic analysis of ELP(30TAG)-GFP production by Mut1-RS and Mut2-RS expressed from multi-copy plasmids. (2W) production by Mut1-RS in C321.ΔRF1 and 2xYT media, (2X) production by Mut2-RS in C321.ΔRF1 and 2xYT media, (2Y) production by Mut1-RS in C321.ΔRF1 and minimal media, (2Z) production by Mut2-RS in C321.ΔRF1 and minimal media. Fluorescence signals were normalized by dividing the fluorescence counts by the final OD600 reading. n=3. Note: error bars indicating s.d. may be too small to be visible.
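The normalization stated in the caption above (fluorescence counts divided by the final OD600 reading, n=3) can be sketched as follows. The replicate values are hypothetical numbers chosen for the example, and the use of the sample standard deviation is an assumption, since the caption does not state which estimator was used.

```python
from statistics import mean, stdev


def normalize_fluorescence(fluorescence_counts: float, final_od600: float) -> float:
    """Divide raw fluorescence counts by the final OD600 reading."""
    if final_od600 <= 0:
        raise ValueError("OD600 must be positive")
    return fluorescence_counts / final_od600


def summarize_replicates(replicates: list[tuple[float, float]]) -> tuple[float, float]:
    """Return (mean, sample s.d.) of normalized fluorescence over replicates."""
    normalized = [normalize_fluorescence(counts, od) for counts, od in replicates]
    return mean(normalized), stdev(normalized)


# Hypothetical (fluorescence counts, final OD600) pairs for n=3 replicates.
example_replicates = [(12000.0, 2.0), (13000.0, 2.0), (11000.0, 2.0)]
example_mean, example_sd = summarize_replicates(example_replicates)
```

Normalizing per replicate before averaging keeps differences in culture density from being mistaken for differences in reporter production.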

FIGS. 3A-D: MALDI-TOF analysis of WT ELP(10Tyrosine)-GFP protein expressed in (3A) BL21 and (3B) C321.ΔRF1, and ELP(10pPR)-GFP protein expressed by Mut1-RS in (3C) BL21 and (3D) C321.ΔRF1, respectively.

FIGS. 4A-B: (4A) Evaluation of pPR incorporation in the presence of standard (1 mM), twofold (0.5 mM) and fourfold (0.25 mM) reduced pPR concentrations. (4B) Multisite incorporation of pPR by Mut1-RS compared with the parent pPR-RS, in the BL21 and C321.ΔRF1 strains. n=3; error bars indicate S.D.

FIGS. 5A-E: (5A) In-gel fluorescence analysis of purified ELPs containing 1 or 10 instances of pPR conjugated to TAMRA-azide at various protein concentrations, namely: (1) 30 μM; (2) 6 μM; (3) 3 μM; (4) 1.5 μM; (5) 0.6 μM; (6) 0.33 μM; (7) 0.165 μM. (5B-E) TAMRA labeling of C321.ΔRF1 cells expressing (5B) ELP(1pPR) by the parent pPR-RS; (5C) ELP(1pPR) by Mut1-RS; (5D) ELP(10pPR) by the parent pPR-RS and (5E) ELP(10pPR) by the Mut1-RS. Percentage of labeled cells was calculated using ImageJ and is given for each image.

FIGS. 6A-F: Conjugation of multiple fluorophores to ELPs in bacteria. (6A) Conjugation of ELP(1TAG) and ELP(10TAG), produced by either the parent pPR-RS (P) or evolved Mut1-RS (E) in C321.ΔRF1. Conjugation of (6B) ELP(1TAG) and (6C) ELP(10TAG) produced by either the parent pPR-RS or Mut1-RS in BL21 compared to C321.ΔRF1. 1: parent pPR-RS, BL21; 2: Mut1-RS, BL21; 3: parent pPR-RS, C321.ΔRF1; 4: Mut1-RS, C321.ΔRF1. (6D) In-vitro TAMRA labeling of cell lysates containing ELP(10pPR) expressed in the BL21 strain by (1) the parent pPR-RS or (2) Mut1-RS, or expressed in the C321.ΔRF1 strain by (3) the parent pPR-RS or (4) Mut1-RS. (6E-F) Labeling of the OTS by conjugation of pPR to TAMRA. (6E) In-vivo and (6F) in-vitro fluorescent labeling of (1) C321.ΔRF1 cells harboring the Mut1-RS plasmid, (2) C321.ΔRF1 cells harboring the Mut1-RS plasmid, with induction of the aaRS, and (3) C321.ΔRF1 cells harboring both the Mut1-RS and ELP(10pPR) plasmids. A double-band (indicated by a red arrow) was detected when the OTS was induced.

FIGS. 7A-B: (7A) Labeling efficiency of cells harboring ELP(10pPR) expressed by Mut1-RS in the presence of reduced copper concentrations. n=3; values are means±s.d. (7B) Growth curves of C321.ΔRF1 cells following an in-vivo click reaction. n=3; values are means±s.d.

FIGS. 8A-D: Incorporation of phenylalanine-4′-azobenzene (AzoPhe) in expressed proteins. (8A) GFP expression in GRO from ELP(10Tyr or 30Tyr)-GFP reporters, or from ELP(1TAG, 5TAG, 10TAG or 30TAG)-GFP reporters produced by literary (L) or Mut7 aaRS. Red bars indicate addition of uAA (AzoPhe), grey bars indicate no addition of uAA. Error bars indicate mean±standard error. *P<0.01 indicates comparison of the literary aaRS with the evolved aaRS. #P<0.01 indicates comparison of the evolved aaRS (10Azo) with the endogenous (10Tyr). (8B) Multi-site-specific incorporation of AzoPhe into E2(10TAG). Mass-spectrometry results (MALDI-TOF) of (top) E2 with 10 TAG, expressed with episomal evolved aaRS, in the presence of AzoPhe, compared with (bottom) ELP(10Tyr). (8C) MALDI (z=2, z=3) spectra of purified proteins expressed in the GRO: ELP(Tyr)-GFP (left) and ELP(10pPR)-GFP (right). The deviation from the theoretical MW is indicated. (8D) MALDI (z=2, z=3) spectra of purified proteins expressed in BL21: ELP(Tyr)-GFP (left) and ELP(10pPR)-GFP (right). The deviation from the theoretical MW is indicated.

FIGS. 9A-G: (9A) Illustration of the reversible trans-to-cis isomerization of an azobenzene molecule. (9B) Illustrations and properties of azobenzene-uAAs 1, 2, and 3. (9C) Illustration of the mechanism for altering the Tt of the ELP by azobenzene isomerization. A change in the transition temperature by cis/trans isomerization generates a “window” in which isothermal (e.g., at T*), light-mediated change in ELP solubility can be achieved. (9D) Schematic illustration of reporter proteins for the incorporation of either 2 (GFP) or 1, 5, or 10 (ELP-GFP) uAAs at TAG codons. (9E-G) Incorporation of either (9E) 1, (9F) 5, or (9G) 10 instances of azobenzene-uAAs 1, 2, or 3 in a single ELP by the previously described AzoRS, expressed from a multi-copy plasmid. Error bars indicate SD (n=3).

FIGS. 10A-D: (10A) Production of GFP(2TAG) by the previously described AzoRS and four evolved variants, expressed from a single chromosomal copy. (10B-D) Production of ELP-GFP fusion proteins containing either (10B) 1, (10C) 5, or (10D) 10 instances of the azobenzene-uAAs depicted in 10B and expressed by episomal versions of the previously described AzoRS, our evolved variants (AzoRS1-4), or MjTyrRS (producing tyrosine-containing control ELPs) in the C321.ΔRF1 strain. The level of GFP fluorescence indicates the production of the ELP-GFP fusion and, therefore, the efficiency of uAA incorporation. *P<0.05, **P<0.001, ***P<0.0005, ****P<0.0001 indicate comparison of each evolved variant with the AzoRS. n=3; error bars indicate s.d.

FIG. 11: MALDI-TOF analysis of ELP60(WT) [expected: 22,760.4, found: 22726.03], ELP60(2×1) [expected: 23,148.87, found: 23083.17], ELP60(6×1) [expected: 23,841.65, found: 23793.87], and ELP60(10×1) [expected: 24,562.47 found: 24519.49].
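For illustration, the difference between the expected and found masses listed in the caption above can be computed with a short Python sketch. The mass values are those given in the caption; the dictionary layout, the function name and the choice to report the deviation both in absolute terms and as a percentage of the expected mass are assumptions made for this example.

```python
# (expected, found) masses as listed in the caption for FIG. 11.
MALDI_MASSES = {
    "ELP60(WT)": (22760.4, 22726.03),
    "ELP60(2x1)": (23148.87, 23083.17),
    "ELP60(6x1)": (23841.65, 23793.87),
    "ELP60(10x1)": (24562.47, 24519.49),
}


def mass_deviation(expected: float, found: float) -> tuple[float, float]:
    """Return (found - expected, deviation as a percentage of the expected mass)."""
    difference = found - expected
    return difference, 100.0 * difference / expected


deviations = {
    name: mass_deviation(expected, found)
    for name, (expected, found) in MALDI_MASSES.items()
}

for name, (difference, percent) in deviations.items():
    print(f"{name}: {difference:+.2f} ({percent:+.3f}% of expected)")
```

Reporting the deviation relative to the expected mass makes the four constructs directly comparable despite their different sizes.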

FIG. 12: Turbidity profile, as a function of temperature and light irradiation for ELP₆₀(tyrosine×10), 25 μM solution in water.

FIGS. 13A-R: Characterization of the light-responsive properties of ELPs containing multiple instances of azobenzene-uAA 1. (13A-C) Turbidity profiles as a function of temperature and light irradiation for ELPs (25 μM solutions in water) containing either (13A) 2 (supplemented with 1 M NaCl), (13B) 6, or (13C) 10 instances of 1. (13D-F) CD spectra of light-irradiated ELPs (7.5 μM solutions in water) containing either (13D) 2, (13E) 6, or (13F) 10 instances of 1 at 10° C. or 30° C. (13G-H) Turbidity profiles as a function of the duration of irradiation with either (13G) UV or (13H) blue-light for ELPs containing 10 instances of 1 (25 μM solutions in water). (13I) Reversibility of the light-mediated transition (600 nm), and of azobenzene isomerization (325 nm), over multiple cycles of 30 s illumination of ELPs (25 μM solutions in water) containing 10 instances of azobenzene-uAA 1 at 26° C. (13J-K) CD spectra of (13J) ELP60(tyrosine×10) and (13K) ELP60(benzophenone×10), both as 7.5 μM solutions in water, tested at 10° C. or 30° C. (13L-N) Turbidity profiles as a function of temperature and light irradiation for ELP60(1×10) at concentrations of (13L) 12.5 μM, (13M) 25 μM, or (13N) 50 μM (C) in water. (13O) Turbidity profiles (heating and cooling) of light-irradiated ELP60(1×10), 25 μM in water. (13P-R) UV-vis spectra of light-irradiated ELPs containing 10 instances of azobenzene-uAA (13P) 1, (13Q) 2, or (13R) 3. Insets show the red-shifted band separation for azobenzene-uAAs 2 and 3.

FIGS. 14A-L: Characterization of the light-responsive properties of ELP containing multiple instances of azobenzene-uAA 2 (25 μM solutions in water, unless otherwise indicated). (14A-C) Turbidity profiles as a function of temperature and light irradiation for ELPs containing either (14A) 2 (supplemented with 1 M NaCl), (14B) 6, or (14C) 10 instances of 2. (14D-E) Turbidity profiles as a function of the duration of irradiation with either (14D) blue or (14E) green light for ELPs containing 10 instances of 2. (14F) Reversibility of the light-mediated transition (600 nm), and azobenzene isomerization (340 nm), over (first) five cycles of 5 min and then 5 cycles of 30 s illumination of ELPs containing 10 instances of azobenzene-uAA 2 at 24° C. (14G-I) Turbidity profiles as a function of temperature and light irradiation for ELP60(2×10) at concentrations of (14G) 12.5 μM, (14H) 25 μM, or (14I) 50 μM in water. (14J-L) Comparison of the CD spectra of (14J) ELP60(1×10), (14K) ELP60(2×10), and (14L) ELP60(3×10), all 7.5 μM solutions in water, as a function of light irradiation at 10° C. or 30° C.

FIG. 15: Turbidity profile as a function of temperature and light irradiation for ELP60(3×10) at concentration of 12.5 μM.

FIGS. 16A-16V: (16A-B) Cryo-TEM images of self-assembled molecules of 1 isomerized to the (16A) trans or (16B) cis conformations. (16C-J) Dynamic light scattering analysis of ELPs containing (16C) 10 instances of tyrosine, (16D) 10 instances of a benzophenone-bearing uAA, (16E) 2 instances of 1, irradiated with blue light, (16F) 2 instances of 1, irradiated with UV light, (16G) 6 instances of 1, irradiated with blue light, (16H) 6 instances of 1, irradiated with UV light, (16I) 10 instances of 1, irradiated with blue light, (16J) 10 instances of 1, irradiated with UV light. (16K-P) Dynamic light scattering analysis of ELPs containing (16K) 2 instances of 2, irradiated with blue light, (16L) 2 instances of 2, irradiated with green light, (16M) 6 instances of 2, irradiated with blue light, (16N) 6 instances of 2, irradiated with green light, (16O) 10 instances of 2, irradiated with blue light, (16P) 10 instances of 2, irradiated with green light. (16Q-V) Dynamic light scattering analysis of ELPs containing (16Q) 2 instances of 3, irradiated with blue light, (16R) 2 instances of 3, irradiated with green light, (16S) 6 instances of 3, irradiated with blue light, (16T) 6 instances of 3, irradiated with green light, (16U) 10 instances of 3, irradiated with blue light, (16V) 10 instances of 3, irradiated with green light.

FIGS. 17A-F: Cryo-TEM images of the self-assembly of ELPs containing 10 instances of either 1 irradiated with (17A) blue or (17B) UV light, 2 irradiated with (17C) blue or (17D) green light, or 3 irradiated with (17E) blue or (17F) green light.

FIGS. 18A-N: Characterization of the self-assembly of diblock ELPs as a function of temperature and azobenzene isomerization. (18A) Turbidity profiles (solid lines) as a function of temperature and light irradiation for ELP₆₀(WT)-ELP₆₀(1×10); dots indicate particle size, as determined by DLS. (18B) Reversibility of the light-mediated self-assembly of ELP₆₀(WT)-ELP₆₀(1×10), over ten cycles of 30 s illumination (25 μM solutions in water) at 25° C. (18C-F) Cryo-TEM images of (18C-D) blue- or (18E-F) UV-light irradiated ELP₆₀(WT)-ELP₆₀(1×10). (18G) Turbidity profiles (solid lines) as a function of temperature and light irradiation for ELP60(WT)-ELP60(2×10); dots indicate particle size, as determined by DLS. (18H) Turbidity profiles (solid lines) as a function of temperature and light irradiation for ELP60(WT)-ELP60(2×10); dots indicate particle size, as determined by DLS. (18I-L) Cryo-TEM images of the self-assembly of ELP60(WT)-ELP60(2×10) irradiated with (18I-J) blue or (18K-L) green light. (18M-N) Dynamic light scattering analysis of (18M) ELP60(WT)-ELP60(1×10) and (18N) ELP60(WT)-ELP60(2×10) as a function of light-irradiation, analyzed at 38° C.

FIGS. 19A-D: Kinetic analysis of GFP production by aaRS variants expressed on plasmids. Time course analysis of (19A) GFP(3TAG) expression by Mut1-RS, (19B) GFP(3TAG) expression by Mut2-RS, (19C) ELP(10TAG)-GFP expression by Mut1-RS and (19D) ELP(10TAG)-GFP expression by Mut2-RS, expressed on multi-copy plasmids in the presence of pPR or with no uAA. n=3; Error bars, mean±s.d.

FIG. 20: Post-purification fluorescent labeling of ELPs. In-gel fluorescence analysis of ELPs containing 1 or 10 instances of pPR, conjugated to TAMRA-azide, at varied protein concentrations. ELP(10pPR) (right) shows improved signals and a reduced limit of detection as compared with proteins bearing only a single pPR residue (ELP(1pPR), left).

FIG. 21: In vitro TAMRA labeling of ELP(10pPR) in non-recoded BL21 strain and in the GRO. Proteins were expressed in either BL21 by the (1) parent or (2) Mut1-RS, or in the GRO by (3) parent pPR-RS or (4) Mut1-RS. Typhoon imaging at 532 nm.

FIGS. 22A-B: Staining of the OTS through conjugation of pPR to TAMRA. (22A) In-vivo, or (22B) in-vitro fluorescent labeling of cells harboring Mut1-RS plasmid (1) without or (2) with induction of the OTS, or (3) cells harboring both Mut1-RS and ELP(10pPR) plasmids. A double band (marked by a red arrow) is detected when the OTS is induced, suggesting that these bands represent the aminoacylated aaRS and the aminoacylated aaRS-tRNA complex.

FIG. 23: Expected and experimental molecular weights of ELP(10TAG)-GFP by MALDI-TOF mass spectrometry analysis. Molecular weights (Da) calculated based on doubly charged proteins. pPR-bearing proteins were expressed by Mut1-RS.

FIGS. 24A-J: Sequence and signal intensities of peptides identified by LC-MS of tryptic fragments. (24A) ELP(10TAG)-GFP MS, expressed by parent pPR-RS in the C321.ΔRF1 strain. (24B) ELP(10TAG)-GFP MS, expressed by parent pPR-RS in the BL21 strain. (24C) ELP(10TAG)-GFP MS, expressed by Mut1-RS in the C321.ΔRF1 using 1 mM pPR. (24D) ELP(10TAG)-GFP MS, expressed by Mut1-RS in the C321.ΔRF1 using 0.25 mM pPR. (24E) ELP(10TAG)-GFP MS, expressed by Mut1-RS in the BL21 E. coli strain using 1 mM pPR. (24F) ELP(10TAG)-GFP MS, expressed by Mut1-RS in the BL21 E. coli strain using 0.25 mM pPR. (24G) ELP(10TAG)-GFP MS, expressed by Mut2-RS in the C321.ΔRF1 using 1 mM pPR. (24H) ELP(30TAG) MS, expressed in the C321.ΔRF1 by Mut1-RS, using different pPR concentrations. (24I) ELP(30TAG) MS, expressed by Mut2-RS in the C321.ΔRF1, using 1 mM pPR. (24J) Identification of Mut1-RS in the fluorescent band following in-vivo click reaction in C321.ΔRF1 and in-gel trypsin digestion.

FIG. 25: Fluorescent quantification of microscopy images.

DETAILED DESCRIPTION OF THE INVENTION

The present invention provides, in some embodiments, mutant aminoacyl-tRNA synthetase (aaRS) proteins. Nucleic acid molecules encoding the mutant aaRSs are also provided, as are orthogonal translation systems comprising the mutant aaRSs or nucleic acid molecules and cells comprising the orthogonal translation system. Methods of use are also provided.

The present invention is based on the surprising development of highly efficient aaRS variants capable of multi-site incorporation of uAAs in a genomically recoded organism (GRO) that lacks all native TAG codons as well as the associated release factor (RF1). Surprisingly, some of the new aaRS variants were functional even in wild-type cells. The toolbox for multi-site and site-selective protein labeling has thus been greatly expanded via evolution of efficient aaRS variants for the multi-site incorporation of the alkyne-bearing uAA 4-propargyloxy-L-phenylalanine (pPR), and of the azobenzene-bearing uAAs phenylalanine-4′-azobenzene (AzoPhe), tri-fluorinated azobenzene (Azo3F) and tetra-ortho-fluorinated azobenzene (Azo4F). While OTSs have been previously developed, they are generally suitable only for single-site pPR incorporation per protein. For example, previous attempts to incorporate pPR found that proteins harboring a single pPR were expressed by the system in only moderate yields (<20% or ˜42% of wild-type proteins in E. coli cell free protein synthesis, depending on the position for uAA incorporation in the protein). The newly evolved aaRS variants are capable of incorporating up to 10 or 30 instances of the uAA in a single protein, in the commonly used, non-recoded E. coli strain BL21 and in the GRO, respectively. Further, it is shown herein that multi-site incorporation of uAAs in proteins allows rapid, robust, and non-toxic fluorescent labeling of these proteins or generation of light responsive polymers in vivo. Also shown herein is the genetic encoding of light-responsive phase transition and self-assembly of a PBB using photo-switchable uAAs. In particular, azobenzene-containing ELPs with a predetermined number of azobenzenes, incorporated at specific positions, were generated. This allowed for manipulation of the transition temperature of the ELP and control of the ELP's self-assembly and geometry.
Finally, light-responsive nanostructures were engineered by incorporating azobenzene-uAAs in the hydrophobic segment of ELP diblock co-polymers.

Mutant aaRS

By a first aspect, the present invention provides a mutant aminoacyl-tRNA synthetase (aaRS).

In some embodiments, the mutant aaRS comprises an amino acid sequence of an aaRS comprising at least one amino acid mutation. In some embodiments, the mutant aaRS comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or 11 mutations. In some embodiments, the mutant aaRS comprises 2 mutations. In some embodiments, the mutant aaRS comprises 5 mutations. In some embodiments, the mutant aaRS comprises 6 mutations. In some embodiments, the mutant aaRS comprises 7 mutations. In some embodiments, the mutant aaRS comprises 8 mutations. In some embodiments, the mutant aaRS comprises 9 mutations. In some embodiments, the mutant aaRS comprises 11 mutations.

As used herein, the term “mutation” refers to any mutation such as can be introduced into an amino acid sequence or into a nucleic acid sequence by any method known in the art. In some embodiments, a mutation is a deletion. In some embodiments, a mutation is an insertion. In some embodiments, a mutation is a substitution. In some embodiments, a mutation is a conversion of one amino acid to another. In some embodiments, a mutation is a conversion of one nucleotide to another. In some embodiments, a mutation is a conversion of a plurality of nucleotides to other nucleotides. In some embodiments, a mutation introduced into a nucleic acid sequence, when translated, results in a mutant amino acid sequence. In some embodiments, the mutation is not a silent mutation.

In some embodiments, the mutation increases the incorporation rate of a non-standard amino acid (nsAA) into a protein. In some embodiments, the mutation increases the rate of recognition by the aaRS of its cognate tRNA. In some embodiments, the mutation increases the rate of recognition by the aaRS of an orthogonal tRNA. In some embodiments, the mutation increases the rate of recognition of an amino acid. In some embodiments, the mutation increases the rate of recognition by the aaRS of its cognate amino acid. In some embodiments, the mutation increases the rate of recognition by the aaRS of an orthogonal amino acid.

In some embodiments, the amino acid is a non-standard amino acid (nsAA). In some embodiments, the nsAA is an unnatural amino acid (uAA). In some embodiments, a nsAA is a uAA. In some embodiments, the amino acid is an orthogonal amino acid. In some embodiments, the amino acid is a non-naturally occurring amino acid. In some embodiments, the amino acid is a man-made amino acid. The term “unnatural amino acid” as used herein refers to any amino acid that is not genetically encoded in an organism. The term “unnatural amino acid” as used herein refers to an amino acid that is not inherently present within the organism. This refers to any amino acid other than the following twenty genetically encoded alpha-amino acids: alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine and valine. Examples of unnatural amino acids are common in the art.

Methods of generating mutations are well-described in the art and include, but are not limited to, site-directed mutagenesis, nucleotide excision, nucleotide addition, clustered regularly interspaced short palindromic repeats (CRISPR), transcription activator-like effector nuclease (TALEN), multiplexed automated genome engineering (MAGE) and polymerase chain reaction (PCR) with mutation generating primers or probes.

Aminoacyl-tRNA synthetase is a well-known protein that catalyzes the attachment of amino acids to the 3′ end of their cognate tRNAs. In some embodiments, the aaRS is an archaeal aaRS. In some embodiments, the aaRS is a Methanocaldococcus jannaschii (Mj) protein. In some embodiments, the aaRS is a Mj aaRS. Mj is also known as Methanococcus jannaschii. In some embodiments, the aaRS recognizes a tRNA molecule. In some embodiments, the aaRS transfers an amino acid to the tRNA molecule. In some embodiments, the aaRS transfers an amino acid derived molecule to the tRNA molecule. In some embodiments, the aaRS is an orthogonal aaRS (o-aaRS). In some embodiments, the aaRS is a uAA-specific o-aaRS. As used herein the term “uAA-specific o-aaRS” refers to an orthogonal aminoacyl-tRNA synthetase that recognizes only the uAA and the tRNA of the system or cell of the invention.

In some embodiments, the amino acid derived molecule is a non-standard amino acid (nsAA). In some embodiments, the nsAA is an unnatural amino acid (uAA). In some embodiments, the uAA is a D amino acid or an L amino acid. In some embodiments, the uAA is a D amino acid. In some embodiments, the uAA is an L amino acid. In some embodiments, the uAA is an azide- or an alkyne-containing amino acid. In some embodiments, the uAA is an azide containing amino acid. In some embodiments, the uAA is an alkyne containing amino acid. In some embodiments, the uAA is an azobenzene-containing amino acid. In some embodiments, the uAA is a modified phenylalanine. In some embodiments, the modified phenylalanine is 4-propargyloxy-L-phenylalanine (pPR). In some embodiments, the modified phenylalanine is phenylalanine-4′-azobenzene (AzoPhe). In some embodiments, the azobenzene-containing amino acid is AzoPhe or tri-fluorinated azobenzene (Azo3F). In some embodiments, the azobenzene-containing amino acid is AzoPhe, Azo3F or tetra-ortho-fluorinated azobenzene (Azo4F). In some embodiments, the azobenzene-containing amino acid is AzoPhe. In some embodiments, the azobenzene-containing amino acid is Azo3F. In some embodiments, the azobenzene-containing amino acid is Azo4F. In some embodiments, the aaRS transfers 4-propargyloxy-L-phenylalanine (pPR) to the tRNA molecule. In some embodiments, the aaRS transfers phenylalanine-4′-azobenzene (AzoPhe), tri-fluorinated azobenzene (Azo3F) or tetra-ortho-fluorinated azobenzene (Azo4F) to the tRNA molecule. In some embodiments, the aaRS transfers phenylalanine-4′-azobenzene (AzoPhe) to the tRNA molecule. In some embodiments, the aaRS transfers tri-fluorinated azobenzene (Azo3F) to the tRNA molecule. In some embodiments, the aaRS transfers tetra-ortho-fluorinated azobenzene (Azo4F) to the tRNA molecule.

In some embodiments, the tRNA molecule is an orthogonal tRNA (o-tRNA). In some embodiments, the tRNA molecule comprises a stop anticodon. In some embodiments, the tRNA molecule comprises an amber anticodon. In some embodiments, the aaRS does not recognize a canonical tRNA in a cell. In some embodiments, the canonical tRNA comprises an anticodon with complementarity to a tyrosine codon. In some embodiments, the cell is a target cell. In some embodiments, the cell is a cell comprising the mutant aaRS. In some embodiments, the cell is a bacterial cell. In some embodiments, the cell is an Escherichia coli cell. In some embodiments, the cell is selected from a bacterium, an Escherichia coli cell, a eukaryotic cell, a yeast cell, a fungal cell, a plant cell, and an animal cell.

The term “orthogonal” as used herein refers to molecules (e.g., “orthogonal tRNA synthetase” and “orthogonal tRNA” pairs) that can process information in parallel with wild-type molecules (e.g., tRNA synthetases and tRNAs), but that do not engage in crosstalk with the wild-type molecules of a cell. As a non-limiting example, the orthogonal tRNA synthetase preferentially aminoacylates a complementary orthogonal tRNA (O-tRNA), but no other cellular tRNAs, with a non-canonical amino acid (e.g., propargyl-L-lysine), and the orthogonal tRNA is a substrate for the orthogonal synthetase but is not substantially aminoacylated by any endogenous tRNA synthetases. In some embodiments, orthogonal is with respect to a target cell. In some embodiments, the target cell is a cell of the invention.

In the context of tRNAs and aminoacyl-tRNA synthetases, the term “orthogonal” refers to an inability or reduced efficiency, e.g., less than 20% efficiency, less than 10% efficiency, less than 5% efficiency, or less than 1% efficiency, of an O-tRNA to function with an endogenous tRNA synthetase (RS) compared to an endogenous tRNA to function with the endogenous tRNA synthetase, or of an O-tRNA synthetase (O-RS) to function with an endogenous tRNA compared to an endogenous tRNA synthetase to function with the endogenous tRNA. As a non-limiting example, an O-tRNA in a cell is aminoacylated by any endogenous RS of the cell with reduced or even zero efficiency, when compared to aminoacylation of an endogenous tRNA by the endogenous RS. In another non-limiting example, an O-tRNA synthetase aminoacylates any endogenous tRNA of a cell of interest with reduced or even zero efficiency, as compared to aminoacylation of the endogenous tRNA by an endogenous RS.

In some embodiments, the O-tRNA anticodon loop recognizes a codon on the mRNA that is not recognized by endogenous tRNAs, and incorporates the uAA at this site in the polypeptide, details of which are further described, for example, in U.S. Patent Application Publication No. 2006/0160175, which is hereby incorporated by reference in its entirety. As a non-limiting example, the unique codon may include nonsense codons, such as stop codons, four or more base codons, rare codons, codons derived from natural or unnatural base pairs and/or the like. In some embodiments, the unique codon is the TAG stop codon.
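As a minimal illustration of the TAG-codon embodiment, the following Python sketch lists the in-frame TAG codons of a coding sequence, i.e., the sites at which an amber-suppressing O-tRNA would be expected to deliver the uAA. The function name `in_frame_tag_codons` and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def in_frame_tag_codons(coding_dna: str) -> list[int]:
    """Return the 1-based codon indices of in-frame TAG codons.

    Assumes the input starts at the first base of the reading frame and
    has a length that is a multiple of three.
    """
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [i // 3 + 1 for i in range(0, len(seq), 3) if seq[i:i + 3] == "TAG"]


# Hypothetical toy reading frame: ATG GTG TAG CCG TAG GGC TAA
toy_orf = "ATGGTGTAGCCGTAGGGCTAA"
print(in_frame_tag_codons(toy_orf))      # [3, 5]

# A TAG that spans two codons (ATG CTA GGC) is out of frame and is not reported.
print(in_frame_tag_codons("ATGCTAGGC"))  # []
```

Only in-frame occurrences are counted, because a TAG triplet that straddles two codons is not read as a codon by the ribosome.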

As used herein, aaRS recognition of a tRNA molecule refers to the association of an aaRS with a specific tRNA molecule including but not limited to contact at the anticodon or the acceptor stem of the tRNA molecule. As used herein, transfer to a tRNA molecule, refers to the process by which an amino acid or an amino acid derived molecule is associated with an aaRS or a mutant aaRS and moved onto the 3′-hydroxyl group on the CCA tail of the tRNA molecule. The process is also referred to in the art as “charging the tRNA molecule”.

As used herein, the term “canonical” describes an endogenous molecule that is present in a cell without any transgenic manipulation to the cell or to the progenitors of the cell.

In some embodiments, the aaRS into which the mutation is introduced comprises or consists of the amino acid sequence

(SEQ ID NO: 1)

MDEFEMIKRNTSEIISEEELREVLKKDEKSAYIGFE

PSGKIHLGHYLQIKKMIDLQNAGFDIIILLADLHA

YLNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGS

EFQLDKDYTLNVYRLALKTTLKRARRSMELIARED

ENPKVAEVIYPIMQVNDIHYLGVDVAVGGMEQRKI

HMLARELLPKKVVCIHNPVLTGLDGEGKMSSSKGN

FIAVDDSPEEIRAKIKKAYCPAGVVEGNPIMEIAK

YFLEYPLTIKRPEKFGGDLTVNSYEELESLFKNKE

LHPMDLKNAVAEELIKILEPIRKRL

or a sequence with at least 95% identity thereto. In some embodiments, an amino acid sequence of Mj aaRS consists of SEQ ID NO: 1. In some embodiments, an amino acid sequence of wild-type aaRS comprises or consists of SEQ ID NO: 1 or a sequence with 95% identity thereto. In some embodiments, an amino acid sequence of aaRS comprises or consists of SEQ ID NO: 1 or a sequence with 95% identity thereto. In some embodiments, an amino acid sequence of a non-mutant aaRS comprises or consists of SEQ ID NO: 1 or a sequence with 95% identity thereto. In some embodiments, the amino acid numbering provided herein is with respect to the sequence of SEQ ID NO: 1. In some embodiments, SEQ ID NO: 1 comprises a wildtype sequence for an aaRS and the isolated peptide is a mutant aaRS.
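The numbering convention and the identity threshold can be made concrete with a short Python sketch. It is a simplification under stated assumptions: identity is computed position by position on equal-length, ungapped sequences (a full comparison would normally use a sequence alignment), the helper names are hypothetical, and the length of 306 residues is taken by counting SEQ ID NO: 1 as printed above.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Position-wise identity for equal-length, ungapped sequences (simplified)."""
    if len(reference) != len(candidate):
        raise ValueError("this simplified comparison needs equal-length sequences")
    matches = sum(a == b for a, b in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def max_substitutions(length: int, min_identity_pct: int = 95) -> int:
    """Largest number of substitutions that keeps identity at or above the threshold."""
    min_matches = -(-min_identity_pct * length // 100)  # integer ceiling division
    return length - min_matches


# Residues 1-36 of SEQ ID NO: 1; numbering is 1-based, so position 32 is index 31.
first_36 = "MDEFEMIKRNTSEIISEEELREVLKKDEKSAYIGFE"
print(first_36[32 - 1])                               # Y (tyrosine 32)

# Substituting tyrosine 32 with leucine changes one residue out of 36.
variant = first_36[:31] + "L" + first_36[32:]
print(round(percent_identity(first_36, variant), 1))  # 97.2

# For a 306-residue sequence, 95% identity tolerates at most 15 substitutions.
print(max_substitutions(306))                         # 15
```

Under this simplified measure, a variant carrying up to 11 substitutions relative to a 306-residue parent remains above 95% identity to it.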

In some embodiments, the mutation is selected from the group consisting of: tyrosine 32 mutated to leucine; tyrosine 32 mutated to threonine; leucine 65 mutated to valine; glutamic acid 107 mutated to alanine; phenylalanine 108 mutated to tyrosine; glutamine 109 mutated to methionine; aspartic acid 158 mutated to serine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to alanine; isoleucine 159 mutated to methionine; isoleucine 159 mutated to cysteine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to glutamic acid; leucine 162 mutated to lysine; leucine 162 mutated to valine; leucine 162 mutated to arginine; leucine 162 mutated to serine; leucine 162 mutated to cysteine; alanine 167 mutated to histidine; alanine 167 mutated to aspartic acid; and alanine 167 mutated to tyrosine.

In some embodiments, the mutation is tyrosine 32 mutated to leucine or threonine. In some embodiments, the mutation is tyrosine 32 mutated to leucine. In some embodiments, the mutation is tyrosine 32 mutated to threonine. In some embodiments, the mutation is leucine 65 mutated to valine. In some embodiments, the mutation is glutamic acid 107 mutated to alanine. In some embodiments, the mutation is phenylalanine 108 mutated to tyrosine. In some embodiments, the mutation is glutamine 109 mutated to methionine. In some embodiments, the mutation is aspartic acid 158 mutated to serine or glycine. In some embodiments, the mutation is aspartic acid 158 mutated to serine. In some embodiments, the mutation is aspartic acid 158 mutated to glycine. In some embodiments, the mutation is isoleucine 159 mutated to alanine, methionine, cysteine, or tyrosine. In some embodiments, the mutation is isoleucine 159 mutated to alanine. In some embodiments, the mutation is isoleucine 159 mutated to methionine. In some embodiments, the mutation is isoleucine 159 mutated to cysteine. In some embodiments, the mutation is isoleucine 159 mutated to tyrosine. In some embodiments, the mutation is leucine 162 mutated to glutamic acid, lysine, valine, arginine, serine or cysteine. In some embodiments, the mutation is leucine 162 mutated to glutamic acid. In some embodiments, the mutation is leucine 162 mutated to lysine. In some embodiments, the mutation is leucine 162 mutated to valine. In some embodiments, the mutation is leucine 162 mutated to arginine. In some embodiments, the mutation is leucine 162 mutated to serine. In some embodiments, the mutation is leucine 162 mutated to cysteine. In some embodiments, the mutation is alanine 167 mutated to histidine, aspartic acid or tyrosine. In some embodiments, the mutation is alanine 167 mutated to histidine. In some embodiments, the mutation is alanine 167 mutated to aspartic acid. In some embodiments, the mutation is alanine 167 mutated to tyrosine.
It will be understood by a skilled artisan that any combination of the above recited mutations is envisioned and may be present in the mutant aaRS of the invention.
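To give a sense of the size of the combinatorial space described in the preceding paragraphs, the following Python sketch enumerates combinations of the substitutions recited in the first group above (positions 32, 65, 107, 108, 109, 158, 159, 162 and 167). It assumes, as an interpretation rather than a statement from the text, that a given variant carries at most one substitution per position; the table and function names are hypothetical.

```python
from itertools import product
from math import prod

# Position -> (parent residue, allowed replacement residues), one-letter codes,
# transcribed from the first recited group of mutations.
SUBSTITUTIONS = {
    32: ("Y", "LT"),
    65: ("L", "V"),
    107: ("E", "A"),
    108: ("F", "Y"),
    109: ("Q", "M"),
    158: ("D", "SG"),
    159: ("I", "AMCY"),
    162: ("L", "EKVRSC"),
    167: ("A", "HDY"),
}


def count_combinations(table: dict) -> int:
    """Number of non-empty combinations with at most one substitution per position."""
    return prod(len(options) + 1 for _, options in table.values()) - 1


def iter_combinations(table: dict):
    """Yield each non-empty combination as a list such as ['Y32L', 'D158G']."""
    positions = sorted(table)
    choices = [[None, *table[pos][1]] for pos in positions]  # None = unchanged
    for pick in product(*choices):
        combo = [f"{table[pos][0]}{pos}{new}"
                 for pos, new in zip(positions, pick) if new is not None]
        if combo:
            yield combo


print(count_combinations(SUBSTITUTIONS))  # 20159
```

The count is the product of (options + 1) over the nine positions, minus one for the unmutated parent.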

In some embodiments, the mutation is selected from the group consisting of: tyrosine 32 mutated to leucine; tyrosine 32 mutated to glycine; leucine 65 mutated to valine; leucine 65 mutated to glycine; glutamic acid 107 mutated to serine; glutamic acid 107 mutated to asparagine; glutamic acid 107 mutated to aspartic acid; phenylalanine 108 mutated to valine; phenylalanine 108 mutated to arginine; glutamine 109 mutated to methionine; glutamine 109 mutated to serine; glutamine 109 mutated to leucine; glutamine 109 mutated to cysteine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; leucine 162 mutated to arginine; and alanine 167 mutated to phenylalanine.

In some embodiments, the mutation is selected from the group consisting of: tyrosine 32 mutated to leucine; tyrosine 32 mutated to threonine; tyrosine 32 mutated to glycine; leucine 65 mutated to valine; leucine 65 mutated to glycine; glutamic acid 107 mutated to alanine; glutamic acid 107 mutated to serine; glutamic acid 107 mutated to asparagine; glutamic acid 107 mutated to aspartic acid; phenylalanine 108 mutated to tyrosine; phenylalanine 108 mutated to valine; phenylalanine 108 mutated to arginine; glutamine 109 mutated to methionine; glutamine 109 mutated to serine; glutamine 109 mutated to leucine; glutamine 109 mutated to cysteine; aspartic acid 158 mutated to serine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to alanine; isoleucine 159 mutated to methionine; isoleucine 159 mutated to cysteine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to glutamic acid; leucine 162 mutated to lysine; leucine 162 mutated to valine; leucine 162 mutated to arginine; leucine 162 mutated to serine; leucine 162 mutated to cysteine; alanine 167 mutated to histidine; alanine 167 mutated to aspartic acid; alanine 167 mutated to phenylalanine; and alanine 167 mutated to tyrosine.

In some embodiments, the mutation is tyrosine 32 mutated to leucine or glycine. In some embodiments, the mutation is tyrosine 32 mutated to leucine. In some embodiments, the mutation is tyrosine 32 mutated to glycine. In some embodiments, the mutation is leucine 65 mutated to valine or glycine. In some embodiments, the mutation is leucine 65 mutated to valine. In some embodiments, the mutation is leucine 65 mutated to glycine. In some embodiments, the mutation is glutamic acid 107 mutated to serine, asparagine or aspartic acid. In some embodiments, the mutation is glutamic acid 107 mutated to serine. In some embodiments, the mutation is glutamic acid 107 mutated to asparagine. In some embodiments, the mutation is glutamic acid 107 mutated to aspartic acid. In some embodiments, the mutation is phenylalanine 108 mutated to arginine. In some embodiments, the mutation is glutamine 109 mutated to methionine, serine, leucine or cysteine. In some embodiments, the mutation is glutamine 109 mutated to methionine. In some embodiments, the mutation is glutamine 109 mutated to serine. In some embodiments, the mutation is glutamine 109 mutated to leucine. In some embodiments, the mutation is glutamine 109 mutated to cysteine. In some embodiments, the mutation is aspartic acid 158 mutated to glycine. In some embodiments, the mutation is isoleucine 159 mutated to tyrosine. In some embodiments, the mutation is leucine 162 mutated to serine or arginine. In some embodiments, the mutation is leucine 162 mutated to serine. In some embodiments, the mutation is leucine 162 mutated to arginine. In some embodiments, the mutation is alanine 167 mutated to phenylalanine.

In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, and isoleucine 159 mutated to tyrosine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine and leucine 162 mutated to serine or arginine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine and leucine 162 mutated to serine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine and leucine 162 mutated to arginine.

In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and tyrosine 32 mutated to leucine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and tyrosine 32 mutated to leucine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and tyrosine 32 mutated to glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and leucine 65 mutated to valine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and leucine 65 mutated to valine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, and leucine 65 mutated to glycine.

In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, alanine 167 mutated to phenylalanine, and tyrosine 32 mutated to leucine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, alanine 167 mutated to phenylalanine, and tyrosine 32 mutated to leucine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, alanine 167 mutated to phenylalanine, and tyrosine 32 mutated to glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, alanine 167 mutated to phenylalanine, and leucine 65 mutated to valine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, alanine 167 mutated to phenylalanine, and leucine 65 mutated to valine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, alanine 167 mutated to phenylalanine, and leucine 65 mutated to glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to leucine or glycine and leucine 65 mutated to valine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to leucine or glycine and leucine 65 mutated to valine.
In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to leucine or glycine and leucine 65 mutated to glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to leucine and leucine 65 mutated to valine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to glycine and leucine 65 mutated to valine or glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to leucine and leucine 65 mutated to valine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to leucine and leucine 65 mutated to glycine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to glycine and leucine 65 mutated to valine. In some embodiments, the mutant aaRS comprises aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to serine or arginine, tyrosine 32 mutated to glycine and leucine 65 mutated to glycine.

In some embodiments, the aaRS further comprises mutation of arginine 257 to glycine, mutation of aspartic acid 286 to arginine, or both. In some embodiments, the aaRS further comprises mutation of arginine 257 to glycine. In some embodiments, the aaRS further comprises mutation of aspartic acid 286 to arginine. In some embodiments, the aaRS further comprises mutation of both arginine 257 to glycine and aspartic acid 286 to arginine. In some embodiments, SEQ ID NO:1 further comprises these two known mutations. In some embodiments, the sequence into which the mutations of the invention are introduced comprises or consists of

(SEQ ID NO: 28)

MDEFEMIKRNTSEIISEEELREVLKKDEKSAYIGFE

PSGKIHLGHYLQIKKMIDLQNAGFDIIILLADLHAY

LNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSEF

QLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNDIHYLGVDVAVGGMEQRKIHMLA

RELLPKKVVCIHNPVLTGLDGEGKMSSSKGNFIAV

DDSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEY

PLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMR

LKNAVAEELIKILEPIRKRL,

or a sequence with 95% identity thereto. In some embodiments, the sequence into which the mutations of the invention are introduced consists of SEQ ID NO: 28.

In some embodiments, the mutant aaRS comprises tyrosine 32 mutated to leucine, aspartic acid 158 mutated to serine, isoleucine 159 mutated to methionine, leucine 162 mutated to lysine, alanine 167 mutated to histidine, arginine 257 mutated to glycine, and aspartic acid 286 mutated to arginine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence

(SEQ ID NO: 2)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFE

PSGKIHLGHYLQIKKMIDLQNAGFDIIILLADLHAY

LNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSEF

QLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNSMHYKGVDVHVGGMEQRKIHMLA

RELLPKKVVCIHNPVLTGLDGEGKMSSSKGNFIAVD

DSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEYP

LTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL,

or a fragment, a derivative or analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 2.
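The relationship between the recited substitutions and the printed sequence can be checked mechanically. The following Python sketch applies the seven substitutions recited for this embodiment, written in one-letter shorthand (e.g., Y32L for tyrosine 32 mutated to leucine), to SEQ ID NO: 1 and compares the result with SEQ ID NO: 2. The sequences are transcribed from the listings above; the helper name `apply_mutations` and the shorthand list are illustrative assumptions.

```python
import re

# SEQ ID NO: 1 (parent aaRS), joined from the listing lines above.
SEQ_ID_NO_1 = (
    "MDEFEMIKRNTSEIISEEELREVLKKDEKSAYIGFE"
    "PSGKIHLGHYLQIKKMIDLQNAGFDIIILLADLHA"
    "YLNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGS"
    "EFQLDKDYTLNVYRLALKTTLKRARRSMELIARED"
    "ENPKVAEVIYPIMQVNDIHYLGVDVAVGGMEQRKI"
    "HMLARELLPKKVVCIHNPVLTGLDGEGKMSSSKGN"
    "FIAVDDSPEEIRAKIKKAYCPAGVVEGNPIMEIAK"
    "YFLEYPLTIKRPEKFGGDLTVNSYEELESLFKNKE"
    "LHPMDLKNAVAEELIKILEPIRKRL"
)

# SEQ ID NO: 2 (mutant aaRS), joined from the listing lines above.
SEQ_ID_NO_2 = (
    "MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFE"
    "PSGKIHLGHYLQIKKMIDLQNAGFDIIILLADLHAY"
    "LNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSEF"
    "QLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP"
    "KVAEVIYPIMQVNSMHYKGVDVHVGGMEQRKIHMLA"
    "RELLPKKVVCIHNPVLTGLDGEGKMSSSKGNFIAVD"
    "DSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEYP"
    "LTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK"
    "NAVAEELIKILEPIRKRL"
)

# Substitutions recited for this embodiment, in one-letter shorthand.
MUTATIONS = ["Y32L", "D158S", "I159M", "L162K", "A167H", "R257G", "D286R"]


def apply_mutations(parent: str, mutations: list[str]) -> str:
    """Apply substitutions such as 'Y32L' using 1-based numbering of the parent."""
    residues = list(parent)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        expected, position, replacement = match[1], int(match[2]), match[3]
        if residues[position - 1] != expected:
            raise ValueError(f"{mutation}: parent has {residues[position - 1]} at {position}")
        residues[position - 1] = replacement
    return "".join(residues)


mutant = apply_mutations(SEQ_ID_NO_1, MUTATIONS)
print(len(mutant))            # 306
print(mutant == SEQ_ID_NO_2)  # True
```

The check that the parent residue matches before each substitution guards against off-by-one errors in the numbering, which is defined with respect to SEQ ID NO: 1.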

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to alanine, leucine 162 mutated to glutamic acid, alanine 167 mutated to histidine, arginine 257 mutated to glycine, and aspartic acid 286 mutated to arginine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 3)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFE

PSGKIHLGHYLQIKKMIDLQNAGFDIIIVLADLHAY

LNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSEF

QLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGAHYEGVDVHVGGMEQRKIHMLA

RELLPKKVVCIHNPVLTGLDGEGKMSSSKGNFIAVD

DSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEYP

LTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL,

or a fragment, a derivative or analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 3.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to threonine, leucine 65 mutated to valine, glutamic acid 107 mutated to alanine, phenylalanine 108 mutated to tyrosine, glutamine 109 mutated to methionine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to cysteine, leucine 162 mutated to arginine, alanine 167 mutated to aspartic acid, arginine 257 mutated to glycine and aspartic acid 286 mutated to arginine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 4)

MDEFEMIKRNTSEIISEEELREVLKKDEKSATIGFE

PSGKIHLGHYLQIKKMIDLQNAGFDIIIVLADLHAY

LNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSAY

MLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGCHYRGVDVDVGGMEQRKIHMLA

RELLPKKVVCIHNPVLTGLDGEGKMSSSKGNFIAVD

DSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEYP

LTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL,

or a fragment, a derivative or analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 4.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to methionine, leucine 162 mutated to serine, alanine 167 mutated to histidine, arginine 257 mutated to glycine and aspartic acid 286 mutated to arginine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 5)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGMHYSGVDVHVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL,

or a fragment, a derivative or analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 5.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine, aspartic acid 158 mutated to glycine, isoleucine 159 mutated to tyrosine, leucine 162 mutated to cysteine, alanine 167 mutated to tyrosine, arginine 257 mutated to glycine, and aspartic acid 286 mutated to arginine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 6)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYCGVDVYVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL,

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 6.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 12)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 12.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to glycine, leucine 65 mutated to valine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 13)

MDEFEMIKRNTSEIISEEELREVLKKDEKSAGIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 13.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine; glutamic acid 107 mutated to serine, phenylalanine 108 mutated to valine, glutamine 109 mutated to serine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 14)

MDEFEMIKRNTSEIISEEKLREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSSVSLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 14.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine; glutamic acid 107 mutated to asparagine, phenylalanine 108 mutated to valine, glutamine 109 mutated to leucine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 15)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSNVLLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 15.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine; glutamic acid 107 mutated to aspartic acid; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 16)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSDFQLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 16.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to valine; glutamic acid 107 mutated to serine, phenylalanine 108 mutated to valine, glutamine 109 mutated to cysteine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 17)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSSVCLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 17.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to glycine, leucine 65 mutated to valine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; and leucine 162 mutated to arginine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 18)

MDEFEMIKRNTSEIISEEELREVLKKDEKSAGIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIVLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYRGVDVAVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 18.

In some embodiments, the mutant aaRS comprises: tyrosine 32 mutated to leucine, leucine 65 mutated to glycine; glutamic acid 107 mutated to aspartic acid, phenylalanine 108 mutated to arginine, glutamine 109 mutated to methionine; aspartic acid 158 mutated to glycine; isoleucine 159 mutated to tyrosine; leucine 162 mutated to serine; and alanine 167 mutated to phenylalanine. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence:

(SEQ ID NO: 19)

MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQ

IKKMIDLQNAGFDIIIGLADLHAYLNQKGELDEIRKIGDYNKKVFEAM

GLKAKYVYGSDRMLDKDYTLNVYRLALKTTLKRARRSMELIAREDENP

KVAEVIYPIMQVNGYHYSGVDVFVGGMEQRKIHMLARELLPKKVVCIH

NPVLTGLDGEGKMSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNP

IMEIAKYFLEYPLTIKGPEKFGGDLTVNSYEELESLFKNKELHPMRLK

NAVAEELIKILEPIRKRL

or a fragment, a derivative or an analog thereof. In some embodiments, the mutant aaRS comprises or consists of the amino acid sequence of SEQ ID NO: 19.

In some embodiments, the fragment, derivative or analog comprises at least one of the recited mutations. In some embodiments, the fragment, derivative or analog is an active fragment, derivative or analog. In some embodiments, active refers to possessing an aaRS activity. In some embodiments, the aaRS activity is the ability to catalyze the attachment of an amino acid to its cognate tRNA. In some embodiments, the aaRS activity is the ability to recognize an amino acid. In some embodiments, the aaRS activity is the ability to recognize a tRNA. In some embodiments, the aaRS activity is the ability to transfer an amino acid to a tRNA.

The term “derivative” as used herein, refers to any polypeptide that is based on the polypeptide of the invention and still comprises the recited mutations. A derivative is not merely a fragment of the polypeptide, nor does it need to have amino acids replaced or removed (an analog); rather, it may have additional modifications made to the polypeptide, such as post-translational modifications. Further, a derivative may be a derivative of a fragment of the polypeptide of the invention. In some embodiments, a derivative of a sequence comprises at least 70, 75, 80, 85, 90, 92, 93, 95, 97, 99 or 100% identity to that sequence. Each possibility represents a separate embodiment of the invention. In some embodiments, a derivative of a sequence comprises at least 90% identity to that sequence. In some embodiments, a derivative of a sequence comprises at least 95% identity to that sequence. In some embodiments, a derivative of a sequence comprises at least 97% identity to that sequence. In some embodiments, a derivative of a sequence comprises at least 99% identity to that sequence.
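The identity thresholds recited above can be illustrated with a simple calculation. The following Python sketch assumes two sequences of equal length compared position by position without gaps; this is a simplifying assumption made for illustration, since identity is ordinarily determined from a sequence alignment. The function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions with identical residues (ungapped, equal length)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(candidate, reference, threshold_percent):
    """True if the candidate is at least `threshold_percent` identical."""
    return percent_identity(candidate, reference) >= threshold_percent
```

Under this simplified measure, a single substitution in a 306-residue sequence leaves roughly 99.7% identity, so a candidate of that kind would satisfy each of the thresholds below 100% listed above.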

In some embodiments, a fragment comprises at least 50, 100, 150, 200, or 250 amino acids of the aaRS. Each possibility represents a separate embodiment of the invention. In some embodiments, a fragment is a functional fragment. In some embodiments, a fragment comprises at least 50 amino acids of the aaRS. In some embodiments, a fragment comprises at least 100 amino acids of the aaRS.

In some embodiments, the fragment is a portion of the polypeptide that comprises any one of a leucine at position 32, a threonine at position 32, a valine at position 65, an alanine at position 107, a tyrosine at position 108, a methionine at position 109, a serine at position 158, a glycine at position 158, an alanine at position 159, a methionine at position 159, a cysteine at position 159, a tyrosine at position 159, a glutamic acid at position 162, a lysine at position 162, a valine at position 162, an arginine at position 162, a serine at position 162, a cysteine at position 162, a histidine at position 167, an aspartic acid at position 167, and a tyrosine at position 167. Such a fragment will still be recognizable as being from the polypeptide of the invention, and as such will be at least 10 amino acids in length. As such, any fragment of the isolated polypeptide of the invention will still comprise at least 10, at least 20, at least 30, at least 40, at least 50, at least 80, or at least 100 amino acids surrounding position 32, position 65, position 107, position 108, position 109, position 158, position 159, position 162, or position 167 of the polypeptide. Each possibility represents a separate embodiment of the present invention.

In some embodiments, the fragment is a portion of the polypeptide that comprises any one of a leucine at position 32, a glycine at position 32, a valine at position 65, a glycine at position 65, a serine at position 107, an asparagine at position 107, an aspartic acid at position 107, a valine at position 108, an arginine at position 108, a methionine at position 109, a serine at position 109, a leucine at position 109, a cysteine at position 109, a glycine at position 158, a tyrosine at position 159, an alanine at position 162, a serine at position 162, and a phenylalanine at position 167. Such a fragment will still be recognizable as being from the polypeptide of the invention, and as such will be at least 10 amino acids in length. As such, any fragment of the isolated polypeptide of the invention will still comprise at least 10, at least 20, at least 30, at least 40, at least 50, at least 80, or at least 100 amino acids surrounding position 32, position 65, position 107, position 108, position 109, position 158, position 159, position 162, or position 167 of the polypeptide. Each possibility represents a separate embodiment of the present invention.
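A fragment of a given length that contains a recited position can be illustrated as a window taken from the full-length sequence. The following Python sketch rests on one possible reading of "surrounding", namely a contiguous window roughly centered on the 1-based position and shifted inward at the ends of the sequence; the function name `fragment_around` and this centering rule are assumptions made for illustration only.

```python
def fragment_around(sequence, position, length):
    """Return (start, fragment): a window of `length` residues that
    contains the 1-based `position`; `start` is also 1-based."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if not 1 <= length <= len(sequence):
        raise ValueError("length must be between 1 and the sequence length")
    index = position - 1                 # 0-based index of the residue
    start = index - length // 2          # try to center the window
    # Shift the window so that it stays within the sequence bounds.
    start = max(0, min(start, len(sequence) - length))
    return start + 1, sequence[start:start + length]
```

For example, `fragment_around(full_sequence, 32, 10)` would return a 10-residue fragment that includes position 32, where `full_sequence` is a placeholder for any of the sequences listed above.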

As used herein, the term “analog” includes any peptide having an amino acid sequence substantially identical to one of the sequences specifically shown herein in which one or more residues have been conservatively substituted with a functionally similar residue and which displays the abilities as described herein. Examples of conservative substitutions include the substitution of one non-polar (hydrophobic) residue such as isoleucine, valine, leucine or methionine for another, the substitution of one polar (hydrophilic) residue for another such as between arginine and lysine, between glutamine and asparagine, between glycine and serine, the substitution of one basic residue such as lysine, arginine or histidine for another, or the substitution of one acidic residue, such as aspartic acid or glutamic acid for another. Each possibility represents a separate embodiment of the present invention.
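The example substitution classes named above can be expressed as a simple lookup. The following Python sketch encodes only the residue groups given as examples in the preceding paragraph (it is not an exhaustive definition of conservative substitution), using one-letter amino acid codes; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical.

```python
# Example groups taken from the paragraph above, in one-letter codes.
CONSERVATIVE_GROUPS = (
    frozenset("IVLM"),  # non-polar: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # polar pair: arginine and lysine
    frozenset("QN"),    # polar pair: glutamine and asparagine
    frozenset("GS"),    # polar pair: glycine and serine
    frozenset("KRH"),   # basic: lysine, arginine, histidine
    frozenset("DE"),    # acidic: aspartic acid, glutamic acid
)


def is_conservative(original, replacement):
    """True if both residues fall within one of the example groups."""
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

Under these example groups, `is_conservative("I", "V")` is true, whereas a change such as tyrosine to leucine is not classified as conservative.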

In some embodiments, the mutant aaRS comprises or consists of an amino acid sequence selected from: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6. In some embodiments, the mutant aaRS comprises or consists of an amino acid sequence selected from: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 or a fragment, analog or derivative thereof. In some embodiments, the mutant aaRS consists of an amino acid sequence selected from: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6. In some embodiments, the mutant aaRS consists of an amino acid sequence selected from: SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6 or a fragment, analog or derivative thereof.

In some embodiments, the mutant aaRS comprises or consists of an amino acid sequence selected from: SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19. In some embodiments, the mutant aaRS comprises or consists of an amino acid sequence selected from: SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19 or a fragment, analog or derivative thereof. In some embodiments, the mutant aaRS consists of an amino acid sequence selected from: SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19. In some embodiments, the mutant aaRS consists of an amino acid sequence selected from: SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19 or a fragment, analog or derivative thereof.

By another aspect, the present invention provides an isolated polypeptide, comprising or consisting of an amino acid sequence selected from SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18 and SEQ ID NO: 19.

As used herein, the terms “peptide”, “polypeptide” and “protein” are used interchangeably to refer to a polymer of amino acid residues. In some embodiments, the peptides, polypeptides and proteins described herein have modifications rendering them more stable while in the body, more capable of penetrating into cells or capable of eliciting a more potent effect than previously described. In some embodiments, the terms “peptide”, “polypeptide” and “protein” apply to naturally occurring amino acid polymers. In another embodiment, the terms “peptide”, “polypeptide” and “protein” apply to amino acid polymers in which one or more amino acid residues is an artificial chemical analogue of a corresponding naturally occurring amino acid.

As used herein, the term “isolated polypeptide” refers to a peptide that is essentially free from contaminating cellular components, such as carbohydrate, lipid, or other proteinaceous impurities associated with the peptide in nature. Typically, a preparation of isolated peptide contains the peptide in a highly-purified form, i.e., at least about 80% pure, at least about 90% pure, at least about 95% pure, greater than 95% pure, or greater than 99% pure. Each possibility represents a separate embodiment of the invention.

Nucleic Acid Molecules

In another aspect, there is provided a nucleic acid molecule encoding a mutant aaRS of the invention, or a fragment, a derivative or an analog thereof.

In another aspect, there is provided a nucleic acid molecule comprising a coding region encoding a mutant aaRS of the invention, or a fragment, a derivative or an analog thereof.

In some embodiments, the nucleic acid molecule encodes a mutant aaRS of the invention. In some embodiments, the nucleic acid molecule comprises a coding region encoding a mutant aaRS of the invention.

In some embodiments, the nucleic acid molecule is selected from DNA, RNA, cDNA, genomic DNA (gDNA), vector DNA, vector RNA, LNA, PNA and a combination thereof. In some embodiments, the nucleic acid molecule is DNA. In some embodiments, the nucleic acid molecule is RNA. In some embodiments, the nucleic acid molecule is cDNA. In some embodiments, the nucleic acid molecule is gDNA. In some embodiments, the nucleic acid molecule is LNA. In some embodiments, the nucleic acid molecule is PNA. In some embodiments, the nucleic acid molecule is a hybrid molecule comprising more than one type of nucleic acid.

As used herein, the phrases “coding sequence” and “coding region” are interchangeable and refer to the region that when translated results in the production of an expression product, such as a polypeptide, protein, or enzyme, and specifically the mutant aaRS. In some embodiments, the coding region is operably linked to at least one regulatory element. In some embodiments, the regulatory element is configured to express the coding region in a target cell. In some embodiments, the regulatory element is configured to express a protein encoded by the coding region in a target cell. In some embodiments, the regulatory element is a promoter. In some embodiments, the regulatory element is an enhancer. In some embodiments, the regulatory element is a silencer. The term “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory element(s) in a manner that allows for expression of the nucleotide sequence (e.g. in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). In some embodiments, expression of the coding region refers to a state in which mRNA is transcribed from the coding region acting as a template. In some embodiments, expression of the coding region refers to a state in which polypeptide is translated from the mRNA transcribed from the coding region.
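The relationship between a coding region and its expression product can be illustrated by translating codons into amino acids. The following Python sketch assumes the standard genetic code and a DNA or RNA string read in frame from its first base; the names `CODON_TABLE` and `translate` are hypothetical, and the example nucleotide string in the usage note is illustrative rather than taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third base varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): amino_acid
               for codon, amino_acid in zip(product(BASES, repeat=3),
                                            AMINO_ACIDS)}


def translate(coding_region):
    """Translate an in-frame coding region up to the first stop codon."""
    dna = coding_region.upper().replace("U", "T")  # accept RNA input
    if len(dna) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    residues = []
    for i in range(0, len(dna), 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon ends translation
        residues.append(amino_acid)
    return "".join(residues)
```

For instance, `translate("ATGGACGAATTTGAAATGTAA")` yields the six residues `MDEFEM` before the stop codon is reached.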

The term “promoter” as used herein refers to a group of transcriptional control modules that are clustered around the initiation site for an RNA polymerase, i.e., RNA polymerase II. Promoters are composed of discrete functional modules, each consisting of approximately 7-20 bp of DNA, and containing one or more recognition sites for transcriptional activator or repressor proteins. In some embodiments, nucleic acid sequences are transcribed by RNA polymerase II (RNAP II or Pol II). RNAP II is an enzyme found in eukaryotic cells. It catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA and microRNA.

In some embodiments, the nucleic acid molecule is a vector. In some embodiments, the vector is a DNA vector. In some embodiments, the vector is an RNA vector. In some embodiments, the vector is an expression vector. In some embodiments, the expression vector is configured for expression in a bacterial cell. In some embodiments, the expression vector is configured for expression in a mammalian cell. In some embodiments, the expression vector is configured for expression in a target cell.

Expression of a gene or protein within a cell is well known to one skilled in the art. It can be carried out by, among many methods, transfection, viral infection, or direct alteration of the cell's genome. In some embodiments, the gene is in an expression vector such as a plasmid or viral vector.

In some embodiments, the vector is introduced into a cell by standard methods including electroporation (e.g., as described in Fromm et al., Proc. Natl. Acad. Sci. USA 82, 5824 (1985)), heat shock, infection by viral vectors, high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al., Nature 327, 70-73 (1987)), and/or the like. A vector of the invention may be introduced into a target cell by any method known in the art, including but not limited to those provided herein. In some embodiments, the introducing produces a cell of the invention.

Various methods can be used to introduce the expression vector of the present invention into cells. Such methods are generally described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York (1989, 1992), in Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md. (1989), Chang et al., Somatic Gene Therapy, CRC Press, Ann Arbor, Mich. (1995), Vega et al., Gene Targeting, CRC Press, Ann Arbor Mich. (1995), Vectors: A Survey of Molecular Cloning Vectors and Their Uses, Butterworths, Boston Mass. (1988) and Gilboa et al. [Biotechniques 4 (6): 504-512, 1986], and include, for example, stable or transient transfection, lipofection, electroporation and infection with recombinant viral vectors. In addition, see U.S. Pat. Nos. 5,464,764 and 5,487,992 for positive-negative selection methods.

A vector nucleic acid sequence generally contains at least an origin of replication for propagation in a cell and optionally additional elements, such as a heterologous polynucleotide sequence, an expression control element (e.g., a promoter or enhancer), a selectable marker (e.g., antibiotic resistance), and a poly-adenine sequence.

The vector may be a DNA plasmid delivered via non-viral methods or via viral methods. The viral vector may be a retroviral vector, a herpesviral vector, an adenoviral vector, an adeno-associated viral vector or a poxviral vector. The promoter may be active in mammalian cells. The promoter may be a viral promoter.

In some embodiments, mammalian expression vectors include, but are not limited to, pcDNA3, pcDNA3.1 (±), pGL3, pZeoSV2(±), pSecTag2, pDisplay, pEF/myc/cyto, pCMV/myc/cyto, pCR3.1, pSinRep5, DH26S, DHBB, pNMT1, pNMT41, pNMT81, which are available from Invitrogen, pCI which is available from Promega, pMbac, pPbac, pBK-RSV and pBK-CMV which are available from Stratagene, pTRES which is available from Clontech, and their derivatives.

In some embodiments, expression vectors containing regulatory elements from eukaryotic viruses such as retroviruses are used by the present invention. SV40 vectors include pSVT7 and pMT2. In some embodiments, vectors derived from bovine papilloma virus include pBV-1MTHA, and vectors derived from Epstein-Barr virus include pHEBO and p205. Other exemplary vectors include pMSG, pAV009/A+, pMTO10/A+, pMAMneo-5, baculovirus pDSVE, and any other vector allowing expression of proteins under the direction of the SV-40 early promoter, SV-40 late promoter, metallothionein promoter, murine mammary tumor virus promoter, Rous sarcoma virus promoter, polyhedrin promoter, or other promoters shown effective for expression in eukaryotic cells.

In some embodiments, recombinant viral vectors, which offer advantages such as lateral infection and targeting specificity, are used for in vivo expression. In one embodiment, lateral infection is inherent in the life cycle of, for example, retrovirus and is the process by which a single infected cell produces many progeny virions that bud off and infect neighboring cells. In one embodiment, the result is that a large area becomes rapidly infected, most of which was not initially infected by the original viral particles. In one embodiment, viral vectors are produced that are unable to spread laterally. In one embodiment, this characteristic can be useful if the desired purpose is to introduce a specified gene into only a localized number of targeted cells.

In one embodiment, plant expression vectors are used. In one embodiment, the expression of a polypeptide coding sequence is driven by a number of promoters. In some embodiments, viral promoters such as the 35S RNA and 19S RNA promoters of CaMV [Brisson et al., Nature 310:511-514 (1984)], or the coat protein promoter of TMV [Takamatsu et al., EMBO J. 6:307-311 (1987)] are used. In another embodiment, plant promoters are used such as, for example, the small subunit of RUBISCO [Coruzzi et al., EMBO J. 3:1671-1680 (1984); and Broglie et al., Science 224:838-843 (1984)] or heat shock promoters, e.g., soybean hsp17.5-E or hsp17.3-B [Gurley et al., Mol. Cell. Biol. 6:559-565 (1986)]. In one embodiment, constructs are introduced into plant cells using Ti plasmid, Ri plasmid, plant viral vectors, direct DNA transformation, microinjection, electroporation and other techniques well known to the skilled artisan. See, for example, Weissbach & Weissbach [Methods for Plant Molecular Biology, Academic Press, NY, Section VIII, pp 421-463 (1988)]. Other expression systems such as insect and mammalian host cell systems, which are well known in the art, can also be used by the present invention.

It will be appreciated that other than containing the necessary elements for the transcription and translation of the inserted coding sequence (encoding the polypeptide), the expression construct of the present invention can also include sequences engineered to optimize stability, production, purification, yield or activity of the expressed polypeptide.

A person with skill in the art will appreciate that a gene or protein can also be expressed from a nucleic acid construct administered to the individual employing any suitable mode of administration, described hereinabove (i.e., in vivo gene therapy). In one embodiment, the nucleic acid construct is introduced into a suitable cell via an appropriate gene delivery vehicle/method (transfection, transduction, homologous recombination, etc.) and an expression system as needed and then the modified cells are expanded in culture and returned to the individual (i.e., ex vivo gene therapy).