PLANT SERINE PROTEASES

Incorporated herein by reference in its entirety is a computer-readable sequence listing identified as follows: one 138,364-byte ASCII (text) file named “Sequence_Listing_ST25.txt,” created on Nov. 29, 2021.

TECHNICAL FIELD

The present invention relates to the field of the production of recombinant proteins and other polypeptides in plants.

BACKGROUND ART

Many proteins, in particular in eukaryotic systems, are susceptible to proteolytic enzymes (proteases).

Unwanted proteolysis is important because it may alter physical and chemical properties, conformation, distribution, stability, activity, folding and consequently, function of the proteins. Moreover, many proteins show significantly less or no activity when they are modified by proteases. Secreted proteins, membrane proteins and proteins targeted to vesicles or certain intracellular organelles are likely to be exposed to proteases during their biosynthesis or post harvesting. Types of human proteins which often undergo proteolysis include serum proteins, antibodies, metal carriers, enzymes, coagulation factors, protease inhibitors, extracellular matrix proteins, growth factors and hormones. The sequences targeted by the proteases sometimes contain the amino acids lysine or arginine at the two positions preceding the cleaved peptide bond. However, other sequence motifs recognized by proteases are also known. The proteases responsible for these cleavage reactions are typically derived from extracellular fluids or components of the secretory machinery of eukaryotic cells (e.g. Golgi apparatus, lysosomes or related compartments).

Proteases as well as their regulation play an important role in tobacco plants for modulating the curing of the leaves of said tobacco plants (see WO 2016/009006). Controlling the expression of subtilisin-like serine endopeptidase-like proteins in plants allows the manipulation of water and/or CO2 exchange through plant stomata (see WO 2013/192545).

As mentioned above, the preservation of the full-length sequence is often essential for the biological activity of polypeptides and proteins. This has to be considered when host cells are selected for the recombinant production of polypeptides and proteins.

US 2004/106198 discloses a system for the expression of heterologous protease sensitive proteins in a plant cell.

Especially the recombinant expression of animal polypeptides and proteins in plants or plant cells often requires a modification of the host cell to prevent proteolysis of the polypeptides and proteins to be expressed. However, targeted strategies to protect recombinantly produced proteins of animal origin in plant cells against proteolytic degradation depend on knowledge of the responsible proteases, which have not been identified yet.

It is therefore an object of the present invention to provide methods and means enabling a host cell, preferably of non-animal origin, to prevent the unwanted proteolysis of a recombinant protein, preferably derived from an animal, based on the identity and properties of the proteases involved therein.

SUMMARY OF THE INVENTION

The present invention relates to a genetically modified plant or plant cell derived from a wild-type plant or plant cell, said wild-type plant or plant cell producing at least one serine protease comprising the motif SSRGPX₁LKPDX₂X₃APGX₄SGTSMSCPHX₅PX₆WSPX₇AX₈X₉SAX₁₀MTT (SEQ ID No. 1), wherein

X₁ is a peptide consisting of 7 amino acid residues, X₂ is I or L, X₃ is T or M, X₄ is a peptide consisting of 27 or 28 amino acid residues, X₅ is a peptide consisting of 12 amino acid residues, X₆ is T or E, X₇ is S or A, X₈ is V or I, X₉ is K or R and X₁₀ is I or M, wherein the proteolytic activity of the at least one serine protease in the genetically modified plant or plant cell is reduced compared to its activity in the wild-type plant or plant cell, wherein the genetically modified plant or plant cell comprises at least one exogenous nucleic acid molecule encoding for at least one protein or polypeptide of interest. Surprisingly, it turned out that proteolysis of proteins and polypeptides recombinantly produced in plant cells could be significantly reduced or even completely prevented when the activity of at least one serine protease as mentioned above is significantly reduced or prevented.

Another aspect of the present invention relates to a serine protease consisting of an amino acid sequence selected from the group consisting of SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 14, SEQ ID No. 16, SEQ ID No. 20 and SEQ ID No. 22, preferably SEQ ID No. 2 or SEQ ID No. 14.

A further aspect of the present invention relates to a nucleic acid molecule encoding a serine protease according to the present invention.

Another aspect of the present invention relates to a method for producing at least one protein or polypeptide of interest by cultivating a genetically modified plant or plant cell according to the present invention.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows the time-dependent degradation of 2F5, 2G12 and PG9 by proteases present in apoplastic fluid obtained from Nicotiana benthamiana C105 leaves. A characteristic heavy-chain (hc) degradation product (*) is produced due to cleavage within or close to the complementarity-determining region (CDR) H3 loop. The identified cleavage sites are shown in Table 1.

FIG. 2a-c shows that cleaved 2F5, 2G12 and PG9 have a much lower antigen-binding activity than the intact antibodies as determined by ELISA with the respective cognate antigens as ligands. The constant domains of the antibodies are proteolysis-resistant under the conditions used, as demonstrated by equal binding of anti-Fc antibodies.

FIG. 3 shows that the dissociation constant of cleaved 2F5 for its antigen gp41 peptide is far higher than for the intact antibody as determined by surface plasmon resonance (SPR) spectroscopy.

FIG. 4a-b shows that purified SBT1 and SBT2 can cleave the heavy chains of 2F5, 2G12 and PG9 at the same positions as apoplastic fluid. The observed cleavage sites (listed in Tables 2 and 3) are identical to those observed for apoplastic fluid. Other serine proteases (trypsin, SBT3) cannot cleave the monoclonal antibody 2F5 at these positions.

FIG. 5a-h shows that SBT1 and SBT2 are effectively inhibited by subtilisin propeptide-like inhibitors.

DESCRIPTION OF EMBODIMENTS

Surprisingly, reducing the proteolytic activity of the at least one serine protease of the present invention comprising the motif consisting of SEQ ID No. 1 in plants and plant cells resulted in reduced proteolysis of exogenous polypeptides and proteins synthesised in said plants and plant cells.

Serine proteases are enzymes that cleave peptide bonds in proteins and polypeptides and in which serine serves as the nucleophilic amino acid at the enzyme's active site. Serine proteases are characterised by a distinctive structure, consisting of two domains that converge at the catalytic active site. Furthermore, these proteases are usually categorised based on their substrate specificity as either trypsin-like, chymotrypsin-like, thrombin-like, subtilisin-like or elastase-like. The serine proteases of the present invention whose proteolytic activity shall be reduced are preferably subtilisin-like serine proteases.

According to a preferred embodiment of the present invention the protease is a plant protease, preferably a serine protease, more preferably a subtilisin-like serine protease, preferably from Nicotiana spp., more preferably from Nicotiana benthamiana, in particular from the Nicotiana benthamiana line C105 (Strasser R et al. Plant Biotechnol J 6(2008):392-402).

The term “plant cell”, as used herein, refers to protoplasts, gamete producing cells and cells which regenerate into whole plants. Plant cells, as used herein, further include cells obtained from or found in seeds, suspension cultures, embryos, meristematic regions, callus tissue, leaves, roots, shoots, gametophytes, sporophytes, pollen and microspores.

“Proteolytic activity” of enzymes refers to their ability to cleave peptide bonds present in polypeptides and proteins. Proteolytic activity can be measured by using methods known in the art (e.g. Twining, 1984, Anal. Biochem. 143, 30-34).

The serine protease motif of the present invention having SEQ ID No. 1 comprises, at positions X₁, X₄ and X₅, peptides of a defined length. However, the amino acid composition of these peptides is variable.

The motif of the serine protease of the present invention is SSRGPX₁LKPDX₂X₃APGX₄SGTSMSCPHX₅PX₆WSPX₇AX₈X₉SAX₁₀MTT (SEQ ID No. 1), wherein X₁ is a peptide consisting of 7 amino acid residues, X₂ is I or L, X₃ is T or M, X₄ is a peptide consisting of 27 or 28 amino acid residues, X₅ is a peptide consisting of 12 amino acid residues, X₆ is T or E, X₇ is S or A, X₈ is V or I, X₉ is K or R and X₁₀ is I or M.
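For illustration only, the motif of SEQ ID No. 1 can be written as a regular expression and searched for in a candidate protein sequence. The following is a minimal sketch, not part of the sequence listing: the names MOTIF and find_motif are hypothetical, the variable positions X₁, X₄ and X₅ are encoded as "any residue" runs of the stated lengths, and the test fragment is residues 501 to 600 of SEQ ID No. 2 as printed in this description.

```python
import re

# SEQ ID No. 1 encoded as a regular expression (illustrative encoding).
# "." stands for any amino acid residue in one-letter code.
MOTIF = re.compile(
    r"SSRGP"
    r".{7}"       # X1: peptide of 7 residues
    r"LKPD"
    r"[IL]"       # X2: I or L
    r"[TM]"       # X3: T or M
    r"APG"
    r".{27,28}"   # X4: peptide of 27 or 28 residues
    r"SGTSMSCPH"
    r".{12}"      # X5: peptide of 12 residues
    r"P"
    r"[TE]"       # X6: T or E
    r"WSP"
    r"[SA]"       # X7: S or A
    r"A"
    r"[VI]"       # X8: V or I
    r"[KR]"       # X9: K or R
    r"SA"
    r"[IM]"       # X10: I or M
    r"MTT"
)


def find_motif(protein):
    """Return the 1-based inclusive (start, end) of the first motif match,
    or None if the sequence does not contain the motif."""
    match = MOTIF.search(protein.upper())
    if match is None:
        return None
    return (match.start() + 1, match.end())


# Residues 501-600 of SEQ ID No. 2, copied from the listing in this description.
SBT1_501_600 = (
    "FSSRGPAYNTPNLLKPDITAPGVAILAAWPGNDTSEALPGQKPPIFNLLS"
    "GTSMSCPHVSGIAATVKAVNPTWSPSAVKSAIMTTAIQTNNLKAPITTVS"
)

if __name__ == "__main__":
    print(find_motif(SBT1_501_600))
```

The returned coordinates are relative to the fragment passed in; adding 500 to both values would give positions in the full-length SEQ ID No. 2.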

According to another preferred embodiment of the present invention the at least one serine protease has at least 80%, preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 96%, more preferably at least 98%, more preferably at least 99%, in particular 100%, sequence identity to a serine protease consisting of an amino acid sequence selected from the group consisting of SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 14, SEQ ID No. 16, SEQ ID No. 18, SEQ ID No. 20 and SEQ ID No. 22, most preferably SEQ ID No. 2 or SEQ ID No. 14.

“% sequence identity”, as used herein, is obtained by aligning two sequences to be compared to give a maximum correlation between the sequences. This may include inserting “gaps” in either one or both sequences, to enhance the degree of alignment. A % identity may then be determined over the whole length of each of the sequences being compared. In the above context, an amino acid sequence having a “sequence identity” of at least, for example, 80% to a query amino acid sequence, is intended to mean that the sequence of the subject amino acid sequence is identical to the query sequence except that the subject amino acid sequence may include up to twenty amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain an amino acid sequence having a sequence of at least 80% identity to a query amino acid sequence, up to 20% (20 of 100) of the amino acid residues in the subject sequence may be inserted or substituted with another amino acid or deleted.
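The arithmetic described above can be sketched as follows. This is an illustrative helper only, assuming the two sequences have already been aligned (gaps written as "-") by a separate alignment program; the function names are hypothetical, and the choice of the number of alignment columns as the denominator is one common convention among several. The short example alignment is a hand-placed illustration using the first residues of SEQ ID No. 2 and SEQ ID No. 4, not the output of an alignment tool.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Identical residues are counted column by column; a column in which
    either sequence has a gap counts as a non-identical position.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment contains no residues")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def max_alterations(query_length, min_identity_percent):
    """Largest number of altered positions (insertions, substitutions or
    deletions) compatible with a given minimum percent identity."""
    return int(query_length * (100.0 - min_identity_percent) / 100.0)


if __name__ == "__main__":
    # Hand-aligned illustration: one substitution and one gap in 10 columns.
    print(percent_identity("MKGIISLFFC", "MKGL-SLFFC"))
    # "up to twenty amino acid alterations per each 100 amino acids"
    print(max_alterations(100, 80))
```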

Methods for comparing the identity of two or even more sequences are well known in the art. The percentage to which two sequences are identical can for example be determined by using a mathematical algorithm. Such algorithms are integrated in the BLAST family of programs, e.g. the BLAST or NBLAST program (see also Altschul et al., 1990, J. Mol. Biol. 215, 403-410 or Altschul et al., 1997, Nucleic Acids Res., 25, 3389-3402), accessible through the home page of the NCBI (www.ncbi.nlm.nih.gov). According to the present invention sequence identity is preferably determined using the following parameters: Matrix BLOSUM62; Open gap 11 and extension gap 1 penalties; gap x_dropoff 50; expect 10.0; word size 3; Filter: none.

According to a further preferred embodiment of the present invention the protease is encoded by a nucleic acid sequence selected from the group consisting of SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, SEQ ID No. 9, SEQ ID No. 11, SEQ ID No. 15, SEQ ID No. 17, SEQ ID No. 19, SEQ ID No. 21 and SEQ ID No. 23, preferably SEQ ID No. 3 or SEQ ID No. 15.

Particularly preferred is a Nicotiana benthamiana subtilisin-like serine protease (SBT1) comprising amino acid sequence SEQ ID No. 2, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 109, of SEQ ID No. 2 represent the signal peptide and propeptide region. The amino acids of the catalytic triad (Asp148, His215, Ser553) are underlined:

(SEQ ID No. 2)

MKGIISLFFCFSLFIVSFILREADSASQAQNNGIYIVYMGAAASSNGGTR

HDQARLISSLIRRNKHAVVHSYNNGFSGFAARLSESEAKSMAQRPGVISV

FPDPVLQLHTTHSWDFLKYQTDEKINSSPSSGSDSSLIGADTIIGILDTG

IWPESESFNDKDMGPIPSRWNGTCMDGQDFGSSKCNKKIVGARFYEESDD

SGTKIAGSARDENGHGTHVASTAAGSPVAGASYYGLAAGTATGGSPGSRI

SMYRVCTTFGCRGSAIMKAFDDAIADGVDVLSLSLGSSPGLEPDFPSNPI

AIGAFHAVEKGITVVCSAGNSGPGPKTVVNTAPWILTVAATTIDRDFETD

IVLGGNKLIKGGGINFGNMTKSSVYPLIHGNSTKSNDNVSEADARSCVPG

SLDENKVKGKIVLCENLDDGEYFPSDKLDEVKSRGGVGFILIDDDERTVA

PKFNSFSAGVVSKKDGNEILAYINSTRNPVASILPTVSITKYKPAPVVAY

FSSRGPAYNTPNLLKPDITAPGVAILAAWPGNDTSEALPGQKPPIFNLLS

GTSMSCPHVSGIAATVKAVNPTWSPSAVKSAIMTTAIQTNNLKAPITTVS

GSKATPYDIGAGEASTSGPLKPGLVYETDVADYLQFLCSVGFNISQIKLI

SITVPEDFSCPKNSTSELVSNMNYPSIAISSLKENEPKKVTRTVTNTGEE

ASVYTTVIEAPKGLEVQVIPTKLEFTNKSKKLSYDVSFKASSTSKEDLFG

SITWTNGKYKVRSPFVVSSN
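Because the underlining may not survive in plain-text renderings of this description, the stated catalytic triad positions can be checked directly against the listing. The sketch below is illustrative only: it assumes the listing is printed in rows of 50 residues (as above), copies only the three rows of SEQ ID No. 2 that contain the triad residues, and uses hypothetical names (ROWS, residue_at, check_triad).

```python
ROW_WIDTH = 50  # residues per printed row of the listing (assumption from the layout above)

# Selected rows of SEQ ID No. 2, keyed by the 1-based position of their first residue.
ROWS = {
    101: "FPDPVLQLHTTHSWDFLKYQTDEKINSSPSSGSDSSLIGADTIIGILDTG",
    201: "SGTKIAGSARDENGHGTHVASTAAGSPVAGASYYGLAAGTATGGSPGSRI",
    551: "GTSMSCPHVSGIAATVKAVNPTWSPSAVKSAIMTTAIQTNNLKAPITTVS",
}

# Catalytic triad as stated in the description: residue name -> one-letter code.
TRIAD = {"Asp148": "D", "His215": "H", "Ser553": "S"}


def residue_at(position):
    """Return the one-letter residue at a 1-based position of SEQ ID No. 2.

    Raises KeyError if the row containing that position was not copied into ROWS.
    """
    row_start = ((position - 1) // ROW_WIDTH) * ROW_WIDTH + 1
    return ROWS[row_start][position - row_start]


def check_triad():
    """Map each triad residue name to True if the listing agrees with it."""
    return {
        name: residue_at(int(name[3:])) == one_letter
        for name, one_letter in TRIAD.items()
    }


if __name__ == "__main__":
    print(check_triad())
```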

A nucleic acid sequence encoding the Nicotiana benthamiana subtilisin-like serine protease may comprise SEQ ID No. 3, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 327, of SEQ ID No. 3 encode the signal peptide and the propeptide region:

(SEQ ID No. 3)

atgaaaggaattatatctttgtttttctgcttttctcttttcattgtctc

tttcatattaagagaagccgactcagcttctcaagcacaaaacaatggaa

tttatattgtttatatgggtgctgcagcttcatctaatggtggtaccaga

catgatcaagcgcggcttatcagctccttgatcagaaggaacaagcatgc

agtggtacacagctacaacaatggtttctcaggattcgcggcacgtttat

cagaatctgaagctaaatccatggctcaaagacctggagttatctccgta

tttcctgatccagtactgcaactccacactacacattcatgggatttctt

gaagtatcaaactgatgaaaaaatcaattcaagtccaagctctggttctg

attcatcattaattggagctgataccataattggcatattggatacgggt

atatggccagaatctgagagtttcaatgacaaggatatgggtccaattcc

atcccggtggaatggaacttgcatggatggtcaagattttggctcttcaa

aatgcaacaagaagatagttggtgcaagattttatgaggagtctgatgac

agtggaacaaaaatcgctggatcagccagggacgagaacggacatggtac

tcatgttgcgtctactgcagctgggagtcctgttgcaggtgcatcctact

atggcctagctgcaggaactgccacgggtggatctcctggttcgaggatt

tccatgtatcgtgtctgtactacttttggatgccgcggatcagctatcat

gaaagcattcgatgatgcaattgcagatggggttgatgttttatcactat

cacttggttcatcacctggacttgaacctgattttccaagcaatcctatt

gccataggagcatttcatgctgtagaaaagggcattactgttgtttgctc

tgctggaaatagtggccctggaccaaaaactgttgtcaatacagctcctt

ggattcttactgttgcagccaccaccattgatcgtgacttcgagacagat

attgtcttgggtggaaacaagcttattaagggtggaggtataaactttgg

taacatgacaaaatcgtcagtctaccctttgattcatggcaattcaacca

aatcaaacgataatgtttctgaggcggatgcaaggagttgtgttcctggt

tcattagatgaaaacaaagtcaaggggaagattgtcctttgtgaaaatct

tgatgatggtgaatattttcccagtgacaagctagatgaagtgaaaagcc

gaggtggagttggatttatacttatagatgatgatgaaagaactgtggca

cccaaattcaattccttctcagctggtgtagtctctaaaaaggatggaaa

tgagatcctcgcctacattaactcgacgaggaatccagttgcatcaattt

taccaactgtatccataacaaagtacaaaccagctccagttgtggcttac

ttctcatcaagaggccctgcgtacaacacccctaacctcctcaaaccgga

tattacagcaccaggggttgccattcttgctgcttggcctggaaatgaca

cgagtgaggctctccccggccaaaaaccaccaattttcaacctactctca

ggcacttccatgtcctgccctcatgtatccggtattgctgctactgtcaa

agcagtaaaccctacctggagtccttcagctgtcaaatcagctattatga

ccacagctattcagacaaacaatttgaaggctccaatcactacagtctca

ggatccaaagcaacaccatatgacataggtgcaggagaagcaagcacttc

aggtccattaaaaccaggtctagtctacgagacagatgtcgccgactact

tgcagttcctatgttctgttggttttaacatatcacagataaagctgatc

tcaattacagttcctgaagacttttcatgcccaaaaaactcaacctctga

attggtttctaatatgaattatccatcaatagctatttctagtctcaaag

aaaacgagccgaagaaagttactagaactgtaacaaatactggtgaagaa

gcatcagtatatactacagttattgaggcaccaaaaggattggaagtcca

agtgatcccaactaaattggaatttacaaataaaagcaagaaattaagct

atgatgtgtctttcaaagcttcatctacctcaaaggaagatctgtttgga

tcaattacttggactaatggtaagtacaaagtccggagtccattcgtcgt

aagtagcaactaa
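The nucleotide ranges given for SEQ ID No. 3 correspond to the amino acid ranges given for SEQ ID No. 2 multiplied by three (for example, residues 1 to 109 correspond to nucleotides 1 to 327). The following minimal sketch illustrates this relationship and translates the first printed row of SEQ ID No. 3 using the standard genetic code; the helper names are hypothetical and the code is an illustration, not a tool used in the invention.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a coding sequence in frame 1; "*" marks a stop codon.
    Trailing bases that do not form a complete codon are ignored."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def nucleotide_range(aa_start, aa_end):
    """1-based inclusive nucleotide range encoding residues aa_start..aa_end."""
    return (3 * (aa_start - 1) + 1, 3 * aa_end)


# First printed row of SEQ ID No. 3 (nucleotides 1 to 50).
SEQ3_FIRST_ROW = "atgaaaggaattatatctttgtttttctgcttttctcttttcattgtctc"

if __name__ == "__main__":
    print(translate(SEQ3_FIRST_ROW))   # compare with the start of SEQ ID No. 2
    print(nucleotide_range(1, 109))    # signal peptide and propeptide region
```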

Particularly preferred is a Nicotiana tabacum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 4, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 107, of SEQ ID No. 4 represent the signal peptide and propeptide region:

(SEQ ID No. 4)

MKGLSLFFCFSLFIVAFIREADSASQAQNNGIYIVYMGAAASSNGGTRHD

QAQLISSLIRRNKNAVVHSYNNGFSGFAARLSEAEAKSMAQRPGVISVFP

DPVLQLHTTHSWDFLKYQTDEKINSSPSSGSDSSLIGADTIIGILDTGIW

PESESFNDKDMGPIPARWNGTCMDGQDFGSSKCNKKIVGARFYEESDDSG

IKIAGSARDENGHGTHVASTAAGSPVAGASYYGLAAGTATGGSPGSRIAM

YRVCTTFGCRGSAIMKAFDDAIADGVDVLSLSLGSSPGLEPDFPSNPIAI

GAFHAVEKGIVVVCSAGNSGPGPKTVVNTAPWILTVAATTIDRDFETDIV

LGGNKLIKGGGINFGNMTKSSVYPLIHGNSTKSNNNVSEADARSCVPGSL

DENKVKGKIVLCENLDDGEYFPSDKLDEVKSRGGVGFILIDDDERTVAPK

FNSFPAGVVSKKDGTEILAYINSTRNPIASILPTVSITKYKPAPVVAYFS

SRGPAYNTPNLLKPDITAPGVAILAAWPGNDTSEALPGQKPPIFNLLSGT

SMSCPHVSGIAATVKAVNPAWSPSAVKSAIMTTAIQINNLKAPITTVSGS

KATPYDIGAGEASTSGPLKPGLVYETDVADYLQFLCSAGFNISQIKLISS

TVPKDFSCPKNPTSELVSNMNYPSIAISSLKENEPKKVTRTVTNTGEEAS

VYTAIIEAPKGLEVQVIPSKLEFNYKSKKLSYEVSFKASSPSKEDLFGSI

TWTNGKYKVRSPFVVSSN

A nucleic acid sequence encoding the Nicotiana tabacum subtilisin-like serine protease may comprise SEQ ID No. 5, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 321, of SEQ ID No. 5 encode the signal peptide and the propeptide region:

(SEQ ID No. 5)

atgaaaggcctttctttgttcttctgcttttctcttttcattgtcgcttt

cataagagaagccgactcagcttctcaagcacaaaacaatggcatttata

ttgtttatatgggtgctgcggcttcttctaacggtggtaccagacatgat

caagcgcagcttatcagctccttgatcagaaggaacaagaatgcagtggt

acacagctacaacaatggtttctcaggattcgcggcacgtttatcagaag

ctgaagctaaatccatggctcaaagacctggagttatctctgtatttcct

gatccagtactgcaactccacactacacattcatgggatttcttgaagta

tcaaactgatgaaaaaatcaattcaagtccaagctctggttctgattcat

cattaattggagctgataccataattggcatattggatacgggtatatgg

ccagaatctgagagtttcaatgacaaggatatgggtccaattccagcccg

gtggaatggaacttgcatggatggtcaagattttggctcttcaaaatgca

acaagaagatagttggtgcaagattttatgaggagtctgatgacagtgga

ataaaaatcgctggatcagccagggacgagaacggacatggtactcatgt

tgcgtctactgcagctgggagtcctgttgcaggtgcatcctactatggcc

tagctgcaggaactgccacgggtggatctcctggttcaaggattgccatg

tatcgtgtctgtactacttttggatgccgtggatcagctatcatgaaagc

attcgatgacgcaattgcagatggggttgatgttttatcactatcacttg

gttcatcacccggacttgaacctgattttccgagcaatcctattgccata

ggagcatttcatgccgtagaaaagggcattgttgttgtttgctctgctgg

aaatagtggccctggcccaaaaactgttgtcaatacagctccttggattc

taacggttgcagccaccaccattgatcgtgacttcgagacagatattgtc

ttgggtggaaacaagcttattaagggtggaggtataaactttggtaacat

gacaaaatcgtcagtctaccctttgattcatggcaattcaaccaaatcaa

acaataatgtttctgaggcggatgcaaggagttgtgttcctggttcatta

gatgaaaacaaagtcaaggggaagattgttctttgtgaaaatcttgatga

tggcgaatattttcccagtgacaagctagacgaagtgaagagtcgaggtg

gagttggatttatactaatagatgatgatgaaagaactgtggcacccaaa

ttcaattccttcccagcaggtgtagtctctaaaaaggatggtactgagat

ccttgcctacattaactcgactaggaatcccattgcatcaattttaccaa

ctgtttccataacaaagtacaaaccagctccagttgtcgcttacttctca

tcaagaggccctgcgtacaacacccctaacctcctcaaaccggatattac

agcaccaggggttgccattcttgctgcttggcctggaaatgacacgagtg

aggctctccccggccaaaaaccaccaattttcaacctactctcaggcact

tccatgtcctgccctcacgtatccggtattgctgcaactgtcaaagcagt

taaccctgcctggagtccttcagctgtcaaatcagctattatgaccacag

ctattcagataaacaatttgaaggctccaatcactacagtctcaggatcc

aaagcaacaccatatgacataggtgcaggagaagcaagcacttcaggtcc

attaaaaccaggtctagtctacgagacagatgtcgccgactacttgcagt

tcctatgttctgctggctttaacatatcacagataaagctgatctcaagt

acagttcctaaggacttttcatgcccaaaaaacccaacctctgaattggt

ttctaatatgaattatccatcaatagctatttctagtctcaaagaaaacg

agccgaagaaagttactagaacagttacaaatactggtgaagaagcatca

gtatatactgcaatcattgaggcaccaaaaggattggaagtccaagtgat

tccaagtaaattggaatttaattataaaagcaagaaattaagctatgaag

tgtctttcaaagcttcatctccctcaaaggaggatctgtttggctcaatt

acttggacaaatggtaagtacaaagtccggagtccatttgttgtaagtag

caactaa

Particularly preferred is a Nicotiana tabacum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 6, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 107, of SEQ ID No. 6 represent the signal peptide and propeptide region:

(SEQ ID No. 6)

MKGLFLFICFSLFIVSFIIEADSASQSQNNGIYIVYMGAAASSNGGTRHD

QAQLISSLIRRNKNAVVHSYKNGFSGFAARLSEAEAKSMAQRPGVVSVFP

DPVLQLHTTHSWDFLKYQTDEKINSSPSSGSDSSLIGADTIIGILDTGIW

PESESFNDKDMGPIPARWNGTCMDGQDFGSSNCNKKIVGARFYEESDDSG

IKITGSARDENGHGTHVASTAAGSPVAGASYYGLAAGTATGGSPGSRIAM

YRVCTTFGCRGSAIMKAFDDAIADGVDVLSLSLGSSPGLEPDFPTNPIAI

GAFHAVEKGIVVVCSAGNSGPGPKTVVNTAPWILTVAATTIDRDFETDVL

LGGNKLIKGGGINFGNMTKSTVYPLIHGNSTKSNDNVSEADARSCVPGSL

DENKVKGKIVLCENLDDGEYFPSDKLDEVKSRGGVGFILIDDDERTVAPK

FKAFPAGVVSKKDGTEILAYINSTRNPVASILPTISITKYKPAPVVAYFS

SRGPAYNTPNLLKPDITAPGVAILAAWPGNDTSEALPGQKPPIFNLLSGT

SMSCPHVSGIAATVKAVNPTWSPSAVKSAIMTTAIQTNNLKAPITTVSGS

KATPYDIGAGEASTSGPLKPGLVYETDVADYLQFLCSVGFNISQIKLISS

TVPKDFSCPKNSSSELVSNMNYPSIATSSLKENEPKKVTRTVTNTGEEAS

VYTAIIEAPKGLEVQVIPTKLEFTNKRKKVSYDVSFKASSTSKEDLFGSI

TWTNGKYKVRSPFVVSSN

A nucleic acid sequence encoding the Nicotiana tabacum subtilisin-like serine protease may comprise SEQ ID No. 7, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 321, of SEQ ID No. 7 encode the signal peptide and the propeptide region:

(SEQ ID No. 7)

atgaaaggcctttttttgttcatatgcttctctcttttcattgtctcttt

cataatagaagccgactcagcttctcaatcacaaaacaatggcatttata

ttgtttatatgggtgctgcggcttcatctaacggtggtaccagacatgat

caagcgcagcttatcagctccttgatcagaaggaacaagaatgcagtggt

acacagctacaagaatggtttctcaggatttgcggcacgtttatcagaag

ctgaagctaaatcgatggctcaaagacctggagttgtctctgtatttcct

gatccagtactgcaactccacactacacattcatgggatttcttgaagta

tcaaactgatgaaaaaatcaattcaagtccaagctctggttctgattcat

cattaattggagctgataccataattggaatcttggatacgggtatatgg

ccagaatcagagagtttcaatgacaaggatatgggtccaattccagcccg

gtggaatggaacttgcatggatggtcaagattttggctcttccaattgca

acaagaagatagttggtgcaagattttatgaggagtctgatgacagtgga

ataaaaatcactggatcagccagggacgagaacggacatggtactcacgt

tgcatccactgcagctgggagccctgttgcaggtgcatcctactatggcc

tagctgcagggactgcaacaggtggatctccgggttccaggatcgccatg

tatcgtgtctgtactacttttggatgccgtggatcagctatcatgaaagc

attcgatgatgcaattgcagatggggttgatgttttatctctatcactcg

gttcatcacctggacttgaacctgattttccaaccaatcctattgccata

ggagcatttcatgccgtagaaaagggcattgttgttgtttgctctgctgg

aaatagtggccctggcccaaaaactgttgtcaatacagctccttggattc

ttactgttgcagccaccaccattgatcgtgacttcgagacagatgttctc

ttgggtggaaacaagttgattaagggtggaggtataaactttggtaacat

gacaaaatcaacagtctaccccttgattcatggcaattcaaccaaatcaa

acgataatgtttctgaggcagatgcaaggagttgtgttcctggttcatta

gatgaaaacaaagtcaaggggaagattgttctttgtgaaaatcttgatga

tggcgaatattttcccagtgacaagctagacgaagtgaagagtcgaggtg

gagttggatttatactaatagatgatgatgaaagaactgtggcacctaaa

ttcaaagccttcccagcgggtgtagtctctaaaaaggatggtactgagat

cctcgcctacattaactcgacaaggaatccagttgcatcaattttgccaa

ctatttccataacaaagtacaaaccagctccagttgttgcttacttctca

tcaagaggtcctgcatacaatacacctaacctcctcaaaccagatattac

agcaccaggggttgccattcttgctgcttggcctggaaatgacacgagtg

aggctctccccggccaaaaaccaccaattttcaacctactctcaggcact

tccatgtcctgccctcacgtttctggcattgctgcaacggtcaaagcagt

gaacccaacctggagtccttcagctgtcaaatcagctattatgaccacag

ctattcagacaaacaatttgaaggctccaatcactacagtctcaggatcc

aaagcaacaccatatgacataggagcaggagaagcaagcacttcaggtcc

attaaaaccaggtctagtctacgagacagatgtcgccgactacttgcagt

tcctctgttctgttggctttaacatatcacagataaagctgatctcaagt

acagttcctaaggacttttcatgcccaaaaaactcaagctctgaattggt

ttctaacatgaattatccttcaatagctacttctagtctcaaagaaaacg

agcccaagaaagttactagaactgttacaaatactggtgaagaagcatca

gtatatactgcaattattgaggcaccaaaaggattggaagtccaagtgat

tccaactaaattggaatttacaaataaaaggaagaaagtaagctatgatg

tatctttcaaagcctcatctacctcaaaggaagatctgtttggatcaatt

acttggacaaatggtaagtacaaagtccggagtccatttgtcgtaagtag

caactaa

Particularly preferred is a Solanum lycopersicum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 8, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 107, of SEQ ID No. 8 represent the signal peptide and propeptide region:

(SEQ ID No. 8)

MRDIVLFFCFLLFLLSLLRETNAVSQEKNNGVYIVYMGAADSSNDGTKNQ

QAELMSSLIKRKKDAVVHSYNNGFSGFAARLSEAEAKSIAQKPGVISVFP

DPILQLHTTRSWDFLQYQTEVESSSGPISGSDNASPKGVDTIIGILDTGI

WPESESFSDNDMSEVPSKWKGTCMGSHDSISFKCNKKLVGARFYDDSDED

GVRPFGSARDDNGHGTHVASTAAGSLISGASYYGLASGTAKGGSPGSRIA

MYRVCTADGCHGSAIMKAFDDAIADGVDVLSLSLGSSSGLEVEFSRDPIA

IGAFHAVEKGILVSCSAGNDGPGPATVVNVAPWILTVAATTIDRDFETDI

VLGGNKLIKGGGISLGNLTRSPVYPLISGDLAKSSNNVVMEKGARYCYPN

SLDETKVKGKIVLCDNRDGYFSLTEKLTEVKKKGGIGFILIDDNARTVAP

KFNSFPAAVVTEKDSNEILSYINSTKKPVASVLPTVTIANYKPAPLVAYF

SSRGPTYNTHNLLKPDITAPGVAILAAWPGNDTTEAVAGQALPLYNIISG

TSMSCPHVSGIAALVKAQNPSWSPSAIRSAIMTSALQTNNLKAPITTVSG

SVATPYDIGAGEASPSLALNPGLVYETNTADYLQYLCSVGYDKSKIKLIS

NTVPDDFSCPTNSSSESVSQMNYPSIAVSNIKENEIKKVTRTVTNVGQDD

ATYTASIKAPVGLEVQVTPNKLVFTNNSKKLSYEMSFKASSKPKEDLFGS

ITWTNGKYKVRSPFVISTNSQGEHSKTADRRSN

A nucleic acid sequence encoding the Solanum lycopersicum subtilisin-like serine protease may comprise SEQ ID No. 9, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 321, of SEQ ID No. 9 encode the signal peptide and the propeptide region:

(SEQ ID No. 9)

atgagagacattgttctgtttttttgcttccttttatttttactctcttt

gcttagagaaactaatgcagtttctcaagaaaaaaacaatggtgtttata

ttgtttacatgggtgctgctgattcgtcgaacgatggcacgaaaaatcag

caagcagaactaatgagctcattgataaaaaggaaaaaggatgcagttgt

acacagttacaacaatggtttctcaggattcgcagcgcgtttatcagaag

ctgaggctaaatctattgctcaaaaacctggagttatatcagtattccct

gatccaatattgcaactacacacaacgcgttcgtgggactttttgcaata

tcaaactgaagtagaaagcagttctggtccaatatctggttctgataacg

cgtcaccaaaaggcgttgatactataattggaatcctggatacaggaata

tggcctgaatcagagagttttagtgataatgatatgagtgaagttccatc

taagtggaaaggaacttgtatgggaagtcatgattccatctctttcaaat

gcaacaagaagttagttggtgcaaggttctatgatgactctgatgaggat

ggtgtaagaccttttggttcagctagggatgataatggacatggaactca

tgttgcatctacagcagctgggagtttgatttcaggagcatcttattatg

gtttggcttctgggacagcaaagggtggatccccgggttcaagaatagcc

atgtatcgtgtctgcacggctgatggatgtcacggatcagctataatgaa

agcatttgatgatgcaattgcagatggggttgatgttttatcgctatcac

ttggttcatcatctggtcttgaagtcgagttttctagagatcctatagct

attggagcattccatgctgttgaaaagggaattcttgtttcctgttctgc

tggaaatgatggccctggtccggcaactgttgtcaatgttgcaccttgga

ttctcactgttgcagctactacaattgaccgtgacttcgagacagatatt

gtcttaggtggaaacaagttgataaagggaggaggtataagtttaggtaa

cttgacaagatctccagtatacccgttgattagtggcgatttagccaaat

ccagcaataatgttgttatggagaaaggtgcaaggtattgttatccgaat

tcattagatgaaaccaaagttaaggggaaaattgttctctgtgataatcg

cgacggatacttttcacttactgagaaactaacagaagtgaagaagaaag

gtggcattggatttatactaatagatgataatgcaagaactgtggcccca

aaattcaattcctttccagcagctgttgtaactgaaaaggatagcaatga

gatcctttcttacatcaactcaacaaagaaaccagttgcatcagttctgc

caactgttaccatagctaactacaaaccagctcctcttgtggcttacttc

tcatcaaggggtcctacatacaacacacataatctcctcaaaccagatat

tacagcaccaggtgttgcaattctcgcggcttggcctggaaacgacacaa

cagaggctgtcgctggccaggcgttaccactttacaacataatatcaggg

acttccatgtcctgccctcatgtttctggtattgccgcgttagtcaaggc

gcaaaatccttcttggagtccttcagcaatcagatcagctattatgacct

cagctttacagactaacaatttgaaggctccaatcacaacagtctctgga

tccgttgcaacaccatacgacataggagctggagaagcaagcccttcatt

ggcacttaatccaggattggtctacgagactaacaccgcagactacttgc

agtacctatgctcagttggctacgataaatcaaagataaaactcatttca

aatacagttcctgatgatttttcatgtcccaccaactcaagttctgaatc

cgtctcacaaatgaactacccttccattgctgtttcaaatatcaaagaaa

atgaaatcaagaaagtaacaagaactgtaactaacgtaggacaagacgac

gcaacatacacagcaagtataaaggcaccagttggtttggaagtccaagt

gaccccgaataaattggtatttacaaataatagcaagaagttgagctatg

aaatgtctttcaaagcttcatctaaaccaaaggaagacttgtttggatca

attacatggacaaatggtaaatacaaagtccgaagtccattcgttatatc

tactaattcacaaggtgaacactccaaaacagcagatcgcagatcaaact

ag

Particularly preferred is a Solanum tuberosum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 10, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 107, of SEQ ID No. 10 represent the signal peptide and propeptide region:

(SEQ ID No. 10)

MKVIVLFFCFFLLLLSFLRETNAVSQEKNNGVYIVYMGAADSSNDGTKNQ

RAELMSSLIRRKKDAVVHSYSNGFSGFAARLSEAEAKSIAQKPGVISVFP

DPILQLHTTRSWDFLQYQTEVESSSGPISGSDNASPKGVDTIIGILDTGI

WPESESFSDNDMSEVPSKWKGTCMASHDSISFKCNKKLVGARFYDDSDED

GVRPSGSARDENGHGTHVASTAAGSPISGASYYGLASGTAKGGSPGSRIA

MYRVCMTDGCHGSAIMKAFDDAIADGVDVLSLSLGSSSGLEVEFSSDPIA

IGAFHAVEKGILVSCSAGNDGPGPATVVNVAPWILTVAATTIDRDFETDI

VLGGNKLIKGGGISLGNLTRSPVYPLISGDLAKSGNTVVSEKNARFCNPN

SLDGTKVKGKVVLCDNRDGYYSLTEKLTEVKSKGGIGFIVVDDNARTVAP

KFKSFPAAVVTEKDSNEILSYINSTKKPVASVLPTVTIANYKPAPLVAYF

SSRGPTYNTHNLLKPDITAPGVAILAAWPGNDTNEAVAGQAPPLYNIISG

TSMSCPHVSGIAALVKAQNPSWSPSAIKSAIMTSALQTNNLKAPITTVSG

SVATPYDIGAGEASPSLALNPGLVYETNTADYLQYLCSVGYDKSKIKLIS

NTVPNDFSCPTNSSSESVSQMNYPSIAVSNIKENEIKKVTRTVTNVGQED

ATYTASIKAPVGLEVQVTPNKLVFTNNSKKLSYEVSFKASSKPKEDLFGS

ITWINGKYKVRSPFVVSTNSQGV

A nucleic acid sequence encoding the Solanum tuberosum subtilisin-like serine protease may comprise SEQ ID No. 11, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 321, of SEQ ID No. 11 encode the signal peptide and the propeptide region:

(SEQ ID No. 11)

atgaaagtcattgttctgtttttttgcttctttttattattactctcttt

tcttagagaaactaatgcagtttctcaagaaaaaaacaatggtgtataca

ttgtttacatgggcgctgctgattcgtcgaacgatggcacgaaaaatcag

cgagcagaacttatgagctctttgataagaagaaaaaaggatgcagttgt

acacagttacagcaatggtttctcaggattcgcagcgcgtttatcagaag

ctgaggctaaatccattgctcaaaaacctggagttatatcagtattccct

gatccaatattgcaactccacacaacgcgttcgtgggactttttgcaata

tcaaactgaagtagaaagcagttctggtccaatatcaggttctgataacg

cgtcaccaaaaggcgttgatactataattggaatcttggatacaggaata

tggcctgaatcagagagttttagtgataatgatatgagtgaagttccatc

taagtggaaaggaacttgtatggcaagtcatgattccatctctttcaaat

gcaacaagaagttagttggtgcaaggttctatgatgactctgatgaggat

ggtgtaagaccttctggttcagctagggatgagaatggacatggaactca

tgttgcatctacagcagctgggagtccgatttcaggagcatcttattatg

gtttggcttctgggaccgcgaagggtggatccccgggttcaagaatagcc

atgtatcgtgtctgcatgactgatggatgtcacggatcagctataatgaa

agcatttgatgatgcaattgcagatggggttgatgttttatcgctatcac

ttggttcatcatctggtcttgaagttgagttttcgagtgatcctatagct

attggagcattccatgctgttgaaaagggaattcttgtttcctgttctgc

tggaaatgatggccctggtccggcaactgttgtcaacgtcgcgccttgga

ttctcacagttgcagctactacaattgaccgtgacttcgagacagatatt

gtcttaggtggaaacaagttgattaagggtggaggtataagtcttggtaa

tttgacaagatctccagtatacccgttgattagtggcgatttagccaagt

ccggcaatactgttgtttcggagaaaaatgcaaggttttgtaatccgaat

tcattagatggaaccaaagttaaggggaaagttgttctttgtgataatcg

cgacggatactattcacttactgagaaactaacagaagtgaagagcaaag

gtggcattggatttatagtagtagatgataatgcaagaactgtggcacct

aaattcaaatcctttccagcagctgttgtaactgaaaaggatagcaatga

gatcctttcttacatcaactcaacaaagaaaccagttgcatcagttctgc

caactgttaccatagctaactacaaaccagctcctcttgtggcttacttc

tcatcaaggggtcctacatacaacacacataatctcctcaaaccagatat

tacagcaccaggtgttgcaattctcgcggcttggcctggaaacgacacaa

acgaggctgtcgctggccaggcgccaccactttacaacataatatcaggg

acttccatgtcctgccctcatgtttctggtattgccgcattagtcaaggc

gcaaaatccttcttggagtccttcagcaatcaaatcagctatcatgacct

cagctttacagactaacaatttgaaggctccaatcacaacagtctccggc

tccgttgcaacaccatacgacataggagctggagaagcaagcccttcatt

ggcacttaatccaggattggtctacgagactaacactgcagactacttgc

agtacctatgctcagttggctacgataaatcaaagataaaactcatctca

aatactgttcctaatgatttttcatgtcccaccaactcaagttctgaatc

cgtctcacaaatgaactacccttccattgctgtttcaaatatcaaagaaa

atgaaatcaagaaagtaacaagaactgtaactaacgtaggacaagaagac

gcaacatacacagcaagtataaaagcaccagttggtttggaagtccaagt

gaccccgaataaattggtatttacaaataatagcaagaagttgagctatg

aagtgtctttcaaagcttcatctaaaccaaaggaagacttgtttggatca

attacatggacaaatggtaaatacaaagtccgaagtccattcgttgtatc

tactaattcacaaggtgtctga

Another plant subtilisin-like serine protease of Arabidopsis thaliana may comprise amino acid sequence SEQ ID No. 12, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 108, of SEQ ID No. 12 represent the signal peptide and propeptide region:

(SEQ ID No. 12)

MKGITFFTPFLSFLYLLCILFMTETEAGSRNGDGVYIVYMGSASSAANAN

RAQILINTMFKRRANDLLHTYKHGFSGFAARLTAEEAKVIAKKPGVVSVF

PDPHFQLHTTHSWDFLKYQTSVKVDSGPPSSASDGSYDSIVGILDTGIWP

ESESFNDKDMGPIPSRWKGTCMEAKDFKSSNCNRKIIGARYYKNPDDDSE

YYTTRDVIGHGSHVSSTIAGSAVENASYYGVASGTAKGGSQNARIAMYKV

CNPGGCTGSSILAAFDDAIADGVDVLSLSLGAPAYARIDLNTDPIAIGAF

HAVEQGILVICSAGNDGPDGGTVTNTAPWIMTVAANTIDRDFESDVVLGG

NKVIKGEGIHFSNVSKSPVYPLIHGKSAKSADASEGSARACDSDSLDQEK

VKGKIVLCENVGGSYYASSARDEVKSKGGTGCVFVDDRTRAVASAYGSFP

TTVIDSKEAAEIFSYLNSTKDPVATILPTATVEKFTPAPAVAYFSSRGPS

SLTRSILKPDITAPGVSILAAWTGNDSSISLEGKPASQYNVISGTSMAAP

HVSAVASLIKSQHPTWGPSAIRSAIMTTATQTNNDKGLITTETGATATPY

DSGAGELSSTASMQPGLVYETTETDYLNFLCYYGYNVTTIKAMSKAFPEN

FTCPADSNLDLISTINYPSIGISGFKGNGSKTVTRTVTNVGEDGEAVYTV

SVETPPGFNIQVTPEKLQFTKDGEKLTYQVIVSATASLKQDVFGALTWSN

AKYKVRSPIVISSESSRTN

A nucleic acid sequence encoding the Arabidopsis thaliana subtilisin-like serine protease may comprise SEQ ID No. 13, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 324, of SEQ ID No. 13 encode the signal peptide and the propeptide region:

(SEQ ID No. 13)

atgaaaggcattacattcttcacaccctttttatcatttctatatctctt

atgcatcttgtttatgacagaaactgaagctgggtcgagaaatggtgatg

gggtctacattgtctacatgggatcagcttcctctgctgcaaacgctaat

agagctcaaatactcataaacaccatgtttaaaaggagagcaaacgatct

tctccacacatataagcatggcttctcaggttttgcagctcgtttgacag

cagaagaggccaaggtcatagccaagaaaccgggagtggtttcagttttt

cctgacccacacttccagcttcatacaactcattcatgggactttctcaa

gtaccaaacatcagtaaaggtcgattccggtccaccttcatcagcctcag

atggatcatatgatagcattgtcggaattcttgacacagggatatggcca

gagtcagagagtttcaatgacaaagacatgggtccaattccgtctcggtg

gaaaggtacatgcatggaagcaaaggacttcaagtcttccaactgtaaca

gaaagatcattggagcaagatactacaaaaatcctgatgatgattcagaa

tactataccacaagggatgtcatcggtcacggttctcatgtgtcctccac

catagctggatctgccgtggagaatgcttcctactatggtgtagcttccg

ggactgcaaagggaggttcacaaaacgctagaatcgctatgtacaaagta

tgcaatccagggggatgcactggctcctctatcttagctgctttcgatga

tgcaatcgcagatggagttgatgttctatctctgtctcttggagctccag

catacgctcgcatcgacttgaacactgatcctattgccattggagcgttt

cacgcggtggagcaaggaatcttggtgatctgctctgcgggtaatgatgg

acctgatggcggtacagttactaatactgcaccttggataatgaccgttg

ctgccaacactattgatagagactttgagtctgatgttgtactaggcggc

aataaagtcatcaagggtgaaggtatacacttttcaaacgttagtaaatc

tcctgtgtatcctctgattcatggcaagtctgctaagagcgctgatgcat

cagaaggatcagccagggcctgtgactctgattctctagatcaagagaag

gtaaaagggaagattgtgttatgcgagaacgttggtggatcatattatgc

atcatccgctagggacgaggtgaagagcaaaggaggtactggttgcgtct

ttgtagatgacagaactagagcagttgctagtgcttatggtagctttcct

actaccgtaattgactcaaaggaagcagctgagatcttctcctacctcaa

ctcaaccaaagatcctgttgcaacaattcttcccactgcaacagttgaaa

agttcacacctgcccctgctgttgcatatttttcttccagaggaccttca

agcctcacaagaagcattctcaaacctgacattaccgcaccaggagtctc

gatactcgctgcatggactggaaacgactcaagcatttcactggaaggca

agccggcttctcagtataacgtcatatcaggaacttccatggcagctcct

catgtttcagctgttgcatctctgatcaaatcacagcatcccacatgggg

tccatccgcgatcagatcagcaattatgacaacagcgactcaaacaaaca

acgacaaaggtcttataacaacagaaactggtgcaacagccacaccttat

gactctggagcaggagaactaagctcaacagcatcaatgcaaccaggact

agtttacgagactactgaaactgactacctgaactttctctgttactatg

gatataacgtaaccacaataaaggctatgtcaaaagcttttccagagaat

tttacttgccctgcagattccaacttagacttgatctccaccatcaatta

cccgtcaattggaatctctggattcaaaggaaatggtagcaagacagtta

caagaacagtgaccaatgttggtgaggacggtgaggcggtttacacagtc

agcgtggaaacaccaccagggtttaacattcaagtgacgccagaaaaact

ccaatttacaaaagatggtgagaagttgacataccaggtgatagtgtctg

ctactgcttcacttaagcaagatgtatttggggctcttacttggtctaat

gccaagtacaaggtcagaagcccaattgtaattagtagcgagagtagccg

cacaaactga
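The nucleotide ranges given for SEQ ID No. 13 follow from the amino acid ranges given for SEQ ID No. 12 by multiplying by three, since each residue is encoded by one codon. The following minimal sketch illustrates that arithmetic; the function name is hypothetical, and it assumes that the reading frame begins at nucleotide 1 of the coding sequence.

```python
from typing import Tuple


def aa_range_to_nt_range(aa_start: int, aa_end: int) -> Tuple[int, int]:
    """Map a 1-based inclusive amino acid range to the 1-based inclusive
    nucleotide range of the codons encoding it.

    Assumption: the reading frame starts at nucleotide 1.
    """
    if aa_start < 1 or aa_end < aa_start:
        raise ValueError("expected 1 <= aa_start <= aa_end")
    # Residue k is encoded by nucleotides 3k-2 to 3k.
    return 3 * (aa_start - 1) + 1, 3 * aa_end


# Signal peptide and propeptide ranges stated for SEQ ID No. 12
for aa_end in (150, 140, 130, 120, 110, 108):
    print(aa_end, aa_range_to_nt_range(1, aa_end))
```

For residues 1 to 108 this yields nucleotides 1 to 324, which matches the range stated for SEQ ID No. 13.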

Particularly preferred is a Nicotiana benthamiana subtilisin-like serine protease (SBT2) comprising amino acid sequence SEQ ID No. 14, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 115, in particular 1 to 110, of SEQ ID No. 14 represent the signal peptide and propeptide region. The amino acids of the catalytic triad (Asp142, His213, Ser529) are underlined:

(SEQ ID No. 14)

MANHITLCIWLLFFFISIFSLAKSETYIIHMDLSAMPKAFSSHHNWYLTT

LSSVSDSSTNHKDFLSSKLVYAYTNAIHGFSASLSPSELEAIKYSPGYVS

SIKDISVKIDTTHTSQFLGLNSKSGVWPTSDYGKDIIIGLVDTGIWPESK

SYSDDGISEVPSRWKGECESGTEFNSSLCNKKLIGARYFNKGLLANNPNL

NISMNSSRDTDGHGTHTSSTAAGSYVEGASYFGYATGTAIGIAPKAHVAM

YKALWEEGVYLSDVLAAIDQATTDGVDVLSLSLGIDAIPLHEDPVAIAAF

AALEKGIFVSTSAGNEGPYYETLHNGTPWVLTVAAGTVDREFIGTLTLGN

GVSVPGLSLYPGNSSSSESSLVYVECQDDKELQKNAHKFVVCLDKNDSVG

EHVYNVRNSKVAGAVFITNTTDLEFYLQSEFPAVFLNLQEGDKVLEYIKS

NSAPKGKLEFQVTHIGAKPAPKVATYSSRGPSPSCPSILKPDLMAPGALI

LASWPQQSPVTDVTSGKLFSNFNIISGTSMSCPHASGVAALLKAAHPEWS

PAAIRSAMMTTSTALDNTQSPIRDIGNKNAAATPLAMGAGHIDPNKALDP

GLIYDATPQDYVNLLCALNFTSKQIKTITRSSSYTCSNPSLDLNYPSFIG

FFNGNSSESDPKRIQEFKRTVTNLQDGTSIYTANFTPMGKFKVSVVPEKL

VFKEKYEKLSYKLRIEGPIVMDDNVVYGSLSWVETGGNYVVRSPIVATSI

KVDPLTGHN

A nucleic acid sequence encoding the Nicotiana benthamiana subtilisin-like serine protease may comprise SEQ ID No. 15, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 345, in particular 1 to 330, of SEQ ID No. 15 encode the signal peptide and the propeptide region:

(SEQ ID No. 15)

atggcaaatcatattaccttgtgtatttggttgcttttcttctttatttc

tatattttcactagcaaagtcagaaacatatatcattcatatggatttgt

cagccatgccaaaagctttttctagccatcataattggtacttgacaaca

ctttcttctgtatcagacagcagtacaaatcacaaagacttcttgtcctc

aaaactagtctatgcttatactaatgctatacatggttttagtgcaagtc

tctctccttctgaacttgaagccataaaatattctccaggttatgtttct

tcaattaaggatatttcagttaaaattgacacaactcacacatcccagtt

tcttggcctaaactctaagtctggtgtatggccaacgtccgactatggga

aagatatcataattggcttagttgatactggaatatggcctgagagtaaa

agctatagtgatgatgggattagtgaagtaccatcaagatggaaaggaga

atgtgaaagtggtactgagttcaattcctctttgtgtaacaagaagctca

ttggcgctcgttacttcaataaaggcctacttgccaacaatccaaatctt

aatatttcaatgaattcttctagagataccgatggacatggaactcacac

ttcttctacagctgcgggaagttatgttgagggtgcatcttattttggct

atgccaccggtactgctattggcatagcgccaaaggctcatgtggctatg

tacaaggctctatgggaagaaggtgtatacttgtctgatgttcttgctgc

aattgatcaagcaattacagatggtgtggatgtcttatccttgtcgttag

gcatagacgcgattccactacacgaagatcctgtggcaattgccgcattt

gctgcattggagaaaggtatatttgtttccacctctgcaggaaatgaagg

gccttattatgagactttgcacaatggaacaccttgggtgttaactgttg

cagctggcacagttgaccgcgaatttattggaacattaactcttggaaat

ggagtttcagtccctggtttatcgctataccctgggaattctagttcaag

cgaaagctcccttgtctatgtcgaatgccaagatgacaaggaactgcaga

aaaatgcacacaaatttgttgtctgtctcgacaagaatgattcggttggt

gagcatgtgtacaatgtaagaaattcaaaagttgctggggctgtctttat

aactaatacaactgacttggaattctacctccaaagcgaattcccggctg

tgttcttgaacttacaagagggtgataaagttctagagtacattaagagc

aactctgcacctaaaggaaaacttgaattccaagttacacatattggtgc

taaaccagcgccaaaagttgctacctatagctcaagaggaccgtcaccga

gttgtccaagtatcctcaaacctgatctcatggctcctggtgccttaata

cttgcttcatggccacaacaatcaccagttactgatgttacctcaggaaa

actttttagtaacttcaatattatatctggtacatcaatgtcatgtccac

atgcttctggtgtagcagcacttctaaaagctgcacaccctgaatggagc

cctgcagccatccgatctgccatgatgaccacttccactgcgttggacaa

cacacagagccccatccgagacataggtaacaagaatgcggctgctactc

ctctagccatgggagctggccatatcgatccaaacaaggcgttagatcct

ggacttatctatgatgcaacaccacaagattatgtcaatctcctctgtgc

tctgaacttcacatccaaacaaataaaaaccatcacaagatcctcatctt

atacttgctccaacccatcattggacttaaactatccatccttcattggc

tttttcaatggtaacagcagcgagtcggatcctaaaaggattcaagaatt

taaaagaacagtgacaaacttacaagatggtacatcaatatatacagcaa

acttcactccaatgggtaaatttaaagttagtgttgtacctgaaaagttg

gttttcaaagagaagtatgaaaagctgagctacaagcttagaatagaagg

tccaatagttatggatgataatgtggtttatggttctttgagctgggtag

aaactggaggtaactatgtggttagaagtccaattgttgccacaagcata

aaagtggatcctttgacaggacacaactga

Particularly preferred is a Nicotiana tabacum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 16, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 115, in particular 1 to 110, of SEQ ID No. 16 represent the signal peptide and propeptide region:

(SEQ ID No. 16)

MANHITLCIWLLFFFISIISLAKSETYIIHMDFSAMPKAFSSHHNWYLTT

LSSVSDSSTNYKDFLSSKLVYSYTNAIHGFSASLSPSELEAIKNSPGYVS

SIKDISVKIDTTHTSQFLGLNSESGVWPTSEYGKDIIIGLVDTGIWSESK

SYSDDGISEVPSRWKGECESGTEFNSSLCNKKLIGARYFNKGLLANNPNL

NISMNSARDTDGHGTHTSSTAAGSYVEGASYFGYATGTAIGIAPKAHVAM

YKALWEEGVYLSDVLAAIDQAITDGVDVLSLSLGIDAIPLHEDPVAIAAF

AALEKGIFVSTSAGNEGPYYETLHNGTPWVLTVAAGTVDREFIGTLTLGN

GVSVTGLSLYPGNSSSSESSIVYVECQDDKELQKNAHKFVVCLDKNDSVG

EHVYNVRNSKVAGAVFITNTTDLEFYLQSEFPAVFLNLQEGDKVLEYIKS

NSAPKGKLEFQVTHIGAKRAPEVATYSSRGPSPSCPSILKPDLMAPGALI

LASWPQQSPVTDVTSGKLFSNFNIISGTSMSCPHASGVAALLKGAHPEWS

PAAIRSAMMTTSSALDNTQSPIRDIGNVELRNAAATPLAMGAGHIDPNKA

LDPGLIYDATPQDYVNLLCALNFTSKQIKIITRSSSYTCSNPSLDLNYPS

FIGFFNGNSRESDPKRIQEFKRTVTNLQDGTSVYTANLIPMGKFKVSVVP

EKLVFKEKYEKLSYKLRIEGPIVMDDNVVYGSLSWVETGGKYVVRSPIVA

TSIKVDPLTGHN

A nucleic acid sequence encoding the Nicotiana tabacum subtilisin-like serine protease may comprise SEQ ID No. 17, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 345, in particular 1 to 330, of SEQ ID No. 17 encode the signal peptide and the propeptide region:

(SEQ ID No. 17)

atggcaaatcatattaccttgtgtatttggttgcttttcttctttatttc

tataatttcactagcaaagtcagaaacatatatcattcatatggattttt

cagccatgccaaaagctttttctagccatcataattggtacttgacaaca

ctttcttctgtatcagacagcagtacaaattacaaagacttcttgtcctc

aaaactagtctattcttatactaatgccatacatggttttagtgcaagtc

tttctccttctgaacttgaagccataaaaaattctccaggctatgtttct

tcaattaaggatatttcagttaaaattgacacaactcacacatcccaatt

tcttggcctaaactctgagtctggtgtatggccaacgtccgagtatggta

aagatatcataattggcttagttgatactggaatatggtcggagagtaaa

agctatagtgatgatgggattagtgaagtaccatcaagatggaaaggaga

atgtgaaagtggtactgagttcaattcctctttgtgtaacaagaaactca

ttggcgctcgttacttcaataaaggcctacttgccaacaatccaaatctt

aacatttcaatgaattctgctagagataccgatggacatggaactcacac

ttcttctacagctgcgggaagttatgtcgagggtgcatcttattttggct

atgccactggcactgctataggcatagcaccaaaggctcatgtggctatg

tacaaggctctatgggaagaaggtgtatacttgtctgatgttcttgctgc

aattgatcaagcaattacagatggtgtggatgtcttatccttgtctttag

gcatagacgcgattccactacacgaagatcctgtggcaattgccgcattt

gctgcattggagaaaggtatatttgtttccacctctgcaggaaatgaagg

gccttattatgagactttgcacaatggaacaccttgggtgctaactgttg

cagctggcacagttgaccgcgaatttattggaacattaactcttggaaat

ggagtttcagtcactggcttatcgctataccctgggaattctagttcaag

cgaaagctccattgtctatgtcgaatgccaagatgacaaggaactacaga

aaaatgcacataaatttgttgtctgtctcgacaagaatgattcggtcggt

gagcatgtgtacaatgtaagaaattcaaaagttgctggggctgtctttat

aactaatacaactgacttggaattctacctccaaagcgaattcccggctg

tgttcttgaacttacaagagggtgataaagttctagagtacattaagagc

aactctgcacctaaaggaaaacttgaattccaagtgacacatattggtgc

taaacgagcaccagaagttgctacctatagctcaagaggaccgtcaccga

gttgtccaagtatcctcaagcctgatctcatggctcctggtgccttaata

cttgcttcatggccacagcaatcaccagttactgatgttacctcaggaaa

gctttttagtaacttcaatattatatctggtacatcaatgtcatgtccac

atgcttctggtgtagcagcacttctaaaaggcgcacaccctgaatggagc

cccgcagccatccgatctgccatgatgaccacttccagtgcgttggacaa

cacacagagccccatccgagacataggtaatgttgaactcaggaatgctg

ctgctactcctctagccatgggagctggccatatcgatccaaacaaggca

ttagatcctggacttatctatgatgcgacaccacaagattatgtcaatct

tctctgtgctctgaacttcacatccaaacaaataaaaatcatcacaagat

cctcatcttatacttgctccaacccatcattggacttaaactatccatcc

ttcattggctttttcaatgggaacagcagagagtcggatcctaaaaggat

tcaagaattcaaaagaacagtgacaaacttacaagatggtacatcagtat

atacagcaaatctcactccaatgggtaaatttaaagttagtgttgtacct

gaaaagttggttttcaaagagaagtatgaaaagctgagctacaagcttag

aatagaaggtccaatagttatggatgataatgtggtttatggttctttga

gctgggtagaaactggaggtaaatatgtggttagaagtccaattgttgcc

acaagcataaaagtggatcctttgacaggacacaactga

Particularly preferred is a Nicotiana tabacum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 18, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 115, in particular 1 to 110, of SEQ ID No. 18 represent the signal peptide and propeptide region:

(SEQ ID No. 18)

MASHITLCIWLLFFFISIISLAKPETYIIHMDLSAMPKAFASHHNWYLTT

LASLSDSSTNHKEFLSSKLVYAYTNAINGFSASLSPSEFEAIKNSPGYVS

SIKDMSVKIDTTHTSQFLGLNSESGVWPTSDYGKDIIIGLVDTGIWPESK

SYSDYGISEVPSRWKGECESGIEFNSSLCNKKIIGARYFNKGLLANNPNL

NISMNSARDTDGHGTHTSSTAAGSYVEGASYFGYATGTAIGIAPKAHVAM

YKALWEEGVYLSDVLAAIDQAITDGVDVLSLSLGIDAIPLHEDPVAIAAF

AALEKGIFVSTSAGNEGPYYETLHNGTPWVLTVAAGTVDREFIGALTLGN

GVSVTGLSLYPGNSSSSESSIVYVECQDDKELQKSAHNIVVCLDKNDSVS

EHVYNVRNSKVAGAVFITNITDLEFYLQSEFPAVFLNLQEGDKVLEYIKS

NSAPKGKLEFRVTHIGAKPAPKVATYSSRGPSPSCPSILKPDLMAPGALI

LASWPQQSPVTDVTSGKLFSNFNIISGTSMSCPHASGVAALLKAAHPEWS

PAAIRSAMMTTSNAMDNTQSPIRDIGSKNAAATPLAMGAGHIDPNKALDP

GLIYDATPQDYVNLLCALNFTSKQIKTITRSSSYTCSNPSLDLNYPSFIG

FFNGNSSESDPRRIQEFQRTVTNIGDGMSVYTAKLTTMGKFKVNLVPEKL

VFKEKYEKLSYKLRIEGPLVMDDIVVYGSLSWVETEGKYVVRSPIVATSI

KVDPLTGHN

A nucleic acid sequence encoding the Nicotiana tabacum subtilisin-like serine protease may comprise SEQ ID No. 19, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 345, in particular 1 to 330, of SEQ ID No. 19 encode the signal peptide and the propeptide region:

(SEQ ID No. 19)

atggcaagtcatattaccttgtgtatttggttgcttttcttctttatttctataatt

tcactagcaaagccagaaacatatatcattcatatggatttgtcagccatgccaaaa

gcttttgctagccatcataattggtacttgacaacacttgcttctttatcagacagt

agtacaaatcacaaagaattcttgtcctcaaaactagtctatgcttatactaatgcc

atcaatggttttagtgcaagtctttctccttctgaatttgaagccataaaaaattct

ccaggttatgtttcttcaattaaggatatgtcagttaaaattgacacaactcacaca

tcccaattccttggcctaaactctgagtctggtgtatggccaacgtccgactatggt

aaagatatcataattggcttagttgatactggaatatggccagagagtaaaagctat

agtgattatgggattagtgaagtaccatcaagatggaaaggagaatgtgaaagtggc

attgagttcaattcctctttgtgtaacaagaaaatcattggcgctcgttacttcaat

aaaggcctacttgccaacaatccaaatcttaacatttcaatgaattctgctagagat

acagatggacatggaactcacacttcttccacagctgcgggaagttatgtcgagggt

gcatcttattttggctatgccaccggcactgctattggcatagcaccaaaggctcat

gtggctatgtacaaggctctatgggaagaaggtgtatacttgtctgatgttcttgct

gcaattgatcaagcaattacagatggtgtagatgttttatccttgtcattaggcata

gacgcgattccactacacgaagatcctgtggcaattgccgcatttgctgcattggag

aaaggtatatttgtttccacctctgcaggaaatgaagggccttattatgagactttg

cacaatggaacaccttgggtgctaactgttgcagctggcacagttgaccgcgaattt

attggcgcattaactcttggaaatggagtttcagtcactggcttatcgctctaccct

gggaattctagttcaagtgaaagctccattgtctatgttgaatgccaagatgacaag

gaactgcaaaaaagtgcacacaatattgttgtctgccttgacaagaatgattcggtc

agtgagcatgtgtacaatgtgagaaattcaaaagttgctggggctgtcttcataact

aatataactgatttggaattctacctccaaagcgaattcccggctgtgttcttgaac

ttacaagagggtgataaagttctagagtacattaagagcaactctgcacctaaagga

aaacttgaattccgagtgacacatattggtgctaaaccagcaccaaaagttgctacc

tatagctcaagaggaccgtcaccgagctgtccaagtatcctcaagcctgatctcatg

gctcctggtgccttaatactagcttcatggccacaacaatcaccagtgactgatgtt

acctcaggaaaactttttagtaacttcaatattatatctggtacatcaatgtcatgt

ccacatgcttctggtgtagcagcacttctaaaagccgcacaccctgaatggagccct

gcagccatccgatctgccatgatgaccacttccaatgcgatggacaacacacaaagt

cccatccgagacataggtagtaagaatgctgctgctactcctctagccatgggagct

ggccatatcgatccaaacaaggcactagatcctggacttatctatgatgcgacacca

caagattatgtcaatcttctctgtgctctgaacttcacatccaaacaaataaaaacc

atcacaagatcctcatcttatacttgctccaacccatcattggacttaaactatcca

tctttcattggatttttcaatgggaacagcagcgagtcggatcctagaaggatacaa

gaattccagaggaccgtgactaatattggagatggtatgtcagtatacacagcaaaa

ttgaccacaatgggtaaatttaaagttaatcttgtacctgaaaagttggttttcaaa

gagaagtatgaaaagttgagctacaagctaagaatagaaggtccattagttatggat

gatattgtggtttatggttctttgagctgggtagaaactgaaggtaaatatgtggtt

agaagtccaattgttgccacaagcataaaagtagatcctttgacaggacacaactga

Particularly preferred is a Solanum lycopersicum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 20, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 102, of SEQ ID No. 20 represent the signal peptide and propeptide region:

(SEQ ID No. 20)

MANYIALCIWLLSIIQLAKSETYIIHMDLSAMPKAFSSHYNWYLTTLFSVSDSKDLL

SSKLVYTYTNAINGFSASLSPSEIEAIKNSPGYVSSIKDMSVKVDTTHTSQFLGLNS

ESGVWPKSDYGKDVIVGLVDTGIWPESRSYSDDGMNEVPSRWKGECESGTQFNTSLC

NKKLIGARYFNKGLLANNPNLTISMDSARDTDGHGTHTSSTAAGSRVEGASFFGYAA

GTATGVAPKAHVAMYKALWEEGVFLSDILAAIDQATADGVDVLSLSLGIDALPLYED

PVAIAAFAALEKGIFVSTSAGNEGPFLETLHNGTPWVLTVAAGTVDREFIGTVTLGN

GVSVTGLSLYPGNSSSSESSISFVDCQDDKELQKNAHRIVVCIDNNDSVSEDVYNVR

NSKVSGAVFITNSTDLEFYLQSEFPAVFLNIQEGDKVLEYVRSDSAPNAKLEFQVTR

IGAKPAPKVASYSSRGPSPSCPTILKPDLMGPGALILASWPQQTPVTEVTSGKLYSN

FNIISGTSMSCPHASGVAALLKSAHPEWSPAAIRSAMMTTAYVLDNTQSPIQDVGLK

NGVATPLAMGAGHIDPNKALDPGLIYDATPQDYVNHLCGLNFTSKQIQTITRSSTYT

CSNPSLDLNYPSFIGYFNRNSSDSDPKRIQEFKRTVTNLQDGTSVYTAKLTPMGKFK

VSVVPNKLTFKEKYEKQSYKLRIEGPIIMDDIVVDGSLSWMETRGKYIVKSPIVATS

IRVDPLRGHN

A nucleic acid sequence encoding the Solanum lycopersicum subtilisin-like serine protease may comprise SEQ ID No. 21, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 306, of SEQ ID No. 21 encode the signal peptide and the propeptide region:

(SEQ ID No. 21)

atggcaaattatattgccttgtgtatttggttgctttctataattcaattggcaaag

tcagaaacttatatcattcatatggatttgtcagccatgccaaaagctttttctagc

cattataattggtacttgactacacttttttcggtgtcagatagcaaagacttgttg

tcctctaaactagtttatacttatactaatgccatcaatggtttcagtgcaagtctt

tctccttctgagatagaagctattaagaattctccgggttatgtttcttcgattaag

gatatgtcggttaaagttgacacaacccacacatctcaattccttggccttaactct

gagtctggtgtatggccaaagtcggattatggcaaagatgttatagttggattagtt

gacactgggatatggccagagagtagaagttatagtgatgatgggatgaatgaagta

ccatcaagatggaaaggagaatgtgaaagtggaactcagttcaatacctctttgtgc

aataagaaactcattggcgctcgttacttcaataaaggcctacttgctaacaacccg

aatcttaccatttcgatggattctgctcgtgatacggatggacatgggactcacact

tcttccacggctgctggaagtcgtgtagagggtgcatctttttttggatatgctgct

ggaactgctacaggtgtagcaccaaaggctcatgtggccatgtacaaggctctatgg

gaagaaggtgtattcttgtctgatattcttgcagcaattgatcaagcaattgcagat

ggtgtagatgtattgtccttgtcattaggcattgatgcgcttcctttatacgaagat

cctgttgcaattgccgcatttgctgcactggagaaaggcatatttgtttctacatct

gcaggaaatgaagggccttttttggaaactttgcacaatggaacgccttgggtgctc

actgttgctgctggcacagttgaccgggagtttattggcacagtaacacttggaaat

ggagtttcggtcactggattatcgctctaccctgggaactctagttcaagtgaaagc

tccattagcttcgttgattgccaagatgataaggaactgcagaaaaatgcacacaga

atagtggtctgcattgacaataatgattcagttagcgaggacgtctacaatgtaaga

aattcaaaagtttctggtgcagtcttcataactaattcaactgacttggaattctat

ctccaaagtgaatttcccgcggtgtttttgaacattcaagagggtgataaagttctt

gagtacgttaggagtgactctgcacctaacgcaaagcttgaattccaagtgacgcgt

attggtgctaaaccagcaccaaaagttgctagctatagctcaagaggaccatcacca

agctgtcctacaattctaaagcctgaccttatgggtcctggtgcattaatacttgct

tcatggccacaacaaacaccagtaactgaagttacctcgggaaaactttatagtaac

ttcaacattatatcaggcacatcaatgtcttgtccacatgcttctggtgtagcagca

cttctaaaaagtgcacaccctgaatggagccctgctgctatccgatccgccatgatg

accacagcctatgtattggacaacacacaaagccccatccaagacgtaggtctgaag

aatggtgttgctactcctctagctatgggagctggccatatcgatccaaacaaagcg

ttggatcccggactcatctatgatgcaacaccacaagattatgttaaccacctctgt

ggtttgaacttcacatccaaacaaatacaaaccatcacaagatcctcaacttacact

tgctccaacccatcattagacttgaactatccatcattcattggctatttcaaccgg

aatagcagcgattcggatcctaaaaggattcaagaattcaaaagaacagtgactaac

ttacaagatggtacatcagtatacacagcaaagctcactccaatgggtaaatttaaa

gttagtgttgtacctaataagttgactttcaaagaaaagtatgaaaaacaaagctat

aagctaagaatagaaggtccaataattatggatgatattgtggttgatggttcattg

agctggatggaaactagaggaaaatacatagttaaaagtccaattgttgcaacaagt

ataagagtggatcctttgagaggacataactga

Particularly preferred is a Solanum tuberosum subtilisin-like serine protease comprising amino acid sequence SEQ ID No. 22, whereby amino acid residues 1 to 150, preferably 1 to 140, more preferably 1 to 130, more preferably 1 to 120, more preferably 1 to 110, in particular 1 to 102, of SEQ ID No. 22 represent the signal peptide and propeptide region:

(SEQ ID No. 22)

MANHIALCIWLLSIIQLAKSETYIIHMDLSAMPKAFSSHHNWYLTTLSSVSDSKDLL

SSKLVYTYTNAINGFSASLSPSEIEAIKNSPGYVSSIKDMSVKVDTTHTSQFLGLNS

ESGVWPKSDYGKDVIVGLVDTGIWPESRSYSDDGMNEVPSRWKGECESGTQFNSSLC

NKKLIGARYFNKGLLANNPNLTISMDSSRDTDGHGTHTSSTAAGSRVEGASFFGYAA

GTATGVAPKAHVAMYKALWEEGVFLSDILAAIDQAIEDGVDVLSLSLGIDALPLYED

PVAIAAFAALEKGIFVSTSAGNEGPFLETLHNGTPWVLTVAAGTVDREFIGTLTLGN

GVSVTGLSLYPGNSSSSERSISYVDCQDDKELQKNAHKIVVCIDKSDSVSEDVYNVR

NSKVSGAVFITNSTDLEFYLQSEFPAVFLNIQEGDKVLEYVRSEAAPNAKLEFQVTR

IGAKPAPKVASYSSRGPSPSCPIILKPDLMGPGALILASWPQQTPVTEVTSGKLYSN

FNIISGTSMSCPHASGVAALLKSAHPEWSPAAIRSAMMTTAYVLDNTQSPIQDIGLK

NGVATPLAMGAGHIDPNKALDPGLIYDATPQDYVNLLCALNFTSKQIQTITRSSTYT

CSNPSLDLNYPSFIGYFNRNSSDSDPKRIQEFIRTVTNLQDGTSVYTAKITPMGKFK

VSVVPNKLIFKEKYEKQSYKLRIEGPIIMDDIVVDGSLSWMETRGKYIVKSPIVATS

IRVDPLRGHN

A nucleic acid sequence encoding the Solanum tuberosum subtilisin-like serine protease may comprise SEQ ID No. 23, whereby nucleotide residues 1 to 450, preferably 1 to 420, more preferably 1 to 390, more preferably 1 to 360, more preferably 1 to 330, in particular 1 to 306, of SEQ ID No. 23 encode the signal peptide and the propeptide region:

(SEQ ID No. 23)

atggcaaatcatattgccttgtgtatttggttgctttctataattcaattggcaaag

tcagaaacttatatcattcatatggatttgtcagccatgccaaaagctttttctagc

catcataattggtacttgactacactttcttctgtatcagatagcaaagacttgttg

tcctctaaactagtttatacttatactaatgccatcaatggtttcagtgcaagtctt

tctccttctgagatagaagctattaagaattctcccggctatgtttcttcgattaag

gatatgtcggttaaagttgacacaactcacacatctcaattccttggccttaactct

gagtctggtgtatggccaaagtctgattatggcaaagatgtcatagttggattagtt

gacactgggatatggccagagagtagaagttatagtgatgatgggatgaatgaagta

ccatcaagatggaaaggagaatgtgaaagtggaactcagttcaattcctctttgtgc

aataagaaactcattggcgctcgttacttcaataaaggcctacttgctaacaacccg

aatcttaccatttcgatggattcttctcgtgatacggatggacatgggactcatact

tcttccacggctgctggaagtcgtgtagagggtgcatctttttttggctatgccgct

ggaacagctacaggtgtagcaccaaaggctcatgtggccatgtacaaggctctatgg

gaagaaggtgtattcttgtctgatattcttgcagcaattgatcaagcaattgaagat

ggtgtagatgtattgtcgttgtcattaggcattgacgcgcttccattatacgaagat

cctgttgcaattgctgcatttgctgcactggagaaaggcatatttgtttctacatct

gcaggaaatgaagggccatttttggaaactttgcacaatggaacaccttgggtgctc

actgttgctgctggcacggttgaccgtgagtttattggcacactaacacttggaaat

ggagtttcggtcactggattatcgctctaccctgggaattctagttcaagtgaacgc

tccattagctatgttgattgccaagatgataaggaactgcagaaaaatgcacacaaa

atagtggtctgcattgacaagagtgattcagttagcgaagacgtctacaatgtaaga

aattcaaaagtttctggtgcagtcttcataactaattcaactgacttggaattctac

ctccaaagtgaatttcctgcagtgtttttgaacattcaagagggtgataaagttctt

gagtatgttaggagtgaggctgcacctaacgcaaagcttgaattccaagtgacgcgt

attggtgctaaaccagcaccaaaagttgctagctatagctcaagaggaccatcacca

agttgtcctataattctcaagcctgacctcatgggtcctggtgccttaatacttgct

tcatggccacaacaaacaccagtaactgaagttacctcaggaaaactttatagtaac

ttcaacattatatcaggcacatcaatgtcttgtccacatgcttctggtgtagcagca

cttctaaaaagtgcacaccctgaatggagccctgctgctatccgatccgccatgatg

accacagcctatgtattggacaacacacaaagccccatccaagacataggtttgaag

aatggtgttgctactcctctagctatgggagctggccatatcgatccaaacaaggcg

ttggatcccggactcatctatgatgcaacaccacaagattatgttaatctcctttgt

gctttgaacttcacatccaaacaaatacaaaccatcacaagatcctcaacttacact

tgctccaacccatcattagacttgaactatccatctttcattggctatttcaaccgg

aacagcagcgattcagatcctaaaaggattcaagaattcataagaacagtgacaaac

ttacaagatggtacatcagtatacacagcaaagatcactccaatgggtaaatttaaa

gttagtgttgtacctaataagttgattttcaaagaaaagtatgaaaagcaaagctac

aagctaagaatagaaggtccaataattatggatgatattgtggttgatggttccttg

agttggatggaaactagaggaaaatacatagtcaaaagtccaattgttgcaacaagt

ataagagtggatcctttgagaggacataactga

According to a preferred embodiment of the present invention the proteolytic activity of the at least one serine protease in the genetically modified plant or plant cell is at least 10%, preferably at least 25%, more preferably at least 50%, more preferably at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95%, in particular 100%, reduced compared to its activity in the wild-type plant or plant cell. A preferred method for determining the proteolytic activity is for example the method described by Twining, 1984, Anal. Biochem. 143, 30-34.
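The percentage reduction referred to above can be illustrated by a short calculation. The sketch below is not part of the cited assay; it assumes only that the same assay yields one activity value for the wild-type and one for the genetically modified plant or plant cell, in identical arbitrary units. The function name and the example values are hypothetical.

```python
def percent_reduction(wild_type_activity: float, modified_activity: float) -> float:
    """Return the reduction of proteolytic activity in percent,
    relative to the wild-type activity.

    Both values must come from the same assay and be in the same units.
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    return 100.0 * (wild_type_activity - modified_activity) / wild_type_activity


# Hypothetical readings: wild-type 200 units, modified plant 50 units
reduction = percent_reduction(200.0, 50.0)
print(reduction)          # 75.0
print(reduction >= 70.0)  # True: meets the "at least 70%" threshold
```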

According to a particularly preferred embodiment of the present invention said at least one serine protease of the wild-type plant or plant cell is mutated in the genetically modified plant or plant cell at at least one position of its amino acid sequence to reduce its proteolytic activity.

The proteolytic activity of a protease can be influenced by introducing one or more mutations within the amino acid sequence of said protease. The mutations may involve substitution, deletion or introduction of at least one amino acid residue within the amino acid sequence of the protease. The effect of the at least one mutation can be determined by assaying the proteolytic activity of the mutated variant and comparing the result with that obtained for a non-mutated version of the same protease in the same assay.

Serine proteases require a catalytic triad for their enzymatic activity. In order to reduce the proteolytic activity of such proteases one or more amino acid residues of this catalytic triad may be substituted or deleted. Hence, the serine protease activity is preferably reduced by at least one amino acid substitution in the catalytic triad of a serine protease motif.

The proteolytic activity of the serine proteases of the present invention can be preferably reduced by mutating at least one amino acid residue (substitution or deletion) within the catalytically active fragment of said serine proteases.
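A substitution within the catalytic triad can be illustrated as a simple string operation on a protein sequence. The sketch below is illustrative only. The function name and the short toy sequence are hypothetical, and the use of alanine as the replacement residue is an assumption that is not taken from the text. For SEQ ID No. 14 the positions to target would be those stated above (Asp142, His213, Ser529).

```python
def substitute_residue(sequence: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position, after checking that the
    residue found there is the expected one."""
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Hypothetical toy sequence with Asp at 3, His at 5 and Ser at 8
toy = "MKDAHGTSMA"
# Assumption: alanine is used as the replacement residue
mutant = substitute_residue(toy, 8, "S", "A")
print(mutant)  # MKDAHGTAMA
```

Checking the expected residue before substituting guards against off-by-one errors when positions are taken from a numbered sequence listing.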

“Catalytically active fragment” or “enzymatically active fragment”, as used herein, refers to a polypeptide fragment that contains the catalytically active domain of a protease sufficient to exhibit activity. A catalytically active fragment is the portion of a protease that, under appropriate conditions, can exhibit catalytic activity and is able to cleave a bond in a peptide, polypeptide or protein. Typically, a catalytically active fragment is a contiguous sequence of amino acid residues of a protease that contains the catalytic domain and required portions for recognizing the substrate to be cleaved. A preferred enzymatically/catalytically active fragment of a protease lacks amino acid residues 1 to 200, preferably 5 to 190, more preferably 10 to 180, more preferably 15 to 170, more preferably 15 to 160, more preferably 20 to 150, more preferably 25 to 125, more preferably 1 to 115, even more preferably 1 to 102-110 of a wild-type serine protease according to the present invention (e.g. SEQ ID No. 2, 4, 6, 8, 10, 14, 16, 18, 20 or 22).

According to a preferred embodiment of the present invention the catalytically active fragment of a subtilisin-like serine protease comprises or consists of an amino acid sequence selected from the group consisting of amino acid residues 101 to 770, preferably 106 to 770, more preferably 108 to 770, more preferably 109 to 770, more preferably 111 to 770, more preferably 112 to 770, more preferably 113 to 770, more preferably 116 to 770, more preferably 121 to 770, in particular 110 to 770, of SEQ ID No. 2; amino acid residues 101 to 768, preferably 106 to 768, more preferably 107 to 768, more preferably 109 to 768, more preferably 111 to 768, more preferably 112 to 768, more preferably 113 to 768, more preferably 116 to 768, more preferably 121 to 768, in particular 108 to 768, of SEQ ID No. 4; amino acid residues 101 to 768, preferably 106 to 768, more preferably 107 to 768, more preferably 109 to 768, more preferably 111 to 768, more preferably 112 to 768, more preferably 113 to 768, more preferably 116 to 768, more preferably 121 to 768, in particular 108 to 768, of SEQ ID No. 6; amino acid residues 101 to 783, preferably 106 to 783, more preferably 107 to 783, more preferably 109 to 783, more preferably 111 to 783, more preferably 112 to 783, more preferably 113 to 783, more preferably 116 to 783, more preferably 121 to 783, in particular 108 to 783, of SEQ ID No. 8; amino acid residues 101 to 773, preferably 106 to 773, more preferably 107 to 773, more preferably 109 to 773, more preferably 111 to 773, more preferably 112 to 773, more preferably 113 to 773, more preferably 116 to 773, more preferably 121 to 773, in particular 108 to 773, of SEQ ID No. 10; amino acid residues 102 to 759, preferably 107 to 759, more preferably 109 to 759, more preferably 110 to 759, more preferably 112 to 759, more preferably 113 to 759, more preferably 114 to 759, more preferably 117 to 759, more preferably 122 to 759, in particular 111 to 759, of SEQ ID No. 14; amino acid residues 102 to 762, preferably 107 to 762, more preferably 109 to 762, more preferably 110 to 762, more preferably 112 to 762, more preferably 113 to 762, more preferably 114 to 762, more preferably 117 to 762, more preferably 122 to 762, in particular 111 to 762, of SEQ ID No. 16; amino acid residues 102 to 759, preferably 107 to 759, more preferably 109 to 759, more preferably 110 to 759, more preferably 112 to 759, more preferably 113 to 759, more preferably 114 to 759, more preferably 117 to 759, more preferably 122 to 759, in particular 111 to 759, of SEQ ID No. 18; amino acid residues 94 to 751, preferably 99 to 751, more preferably 101 to 751, more preferably 102 to 751, more preferably 104 to 751, more preferably 105 to 751, more preferably 106 to 751, more preferably 109 to 751, more preferably 114 to 751, in particular 103 to 751, of SEQ ID No. 20; amino acid residues 94 to 751, preferably 99 to 751, more preferably 101 to 751, more preferably 102 to 751, more preferably 104 to 751, more preferably 105 to 751, more preferably 106 to 751, more preferably 109 to 751, more preferably 114 to 751, in particular 103 to 751, of SEQ ID No. 22.
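The residue ranges above are 1-based and inclusive, so extracting a fragment from a sequence string requires a small index adjustment in most programming languages. The following sketch shows one way to do this; the function name is hypothetical, and the example uses only the first 20 residues of SEQ ID No. 14 rather than a complete sequence.

```python
def extract_fragment(sequence: str, first_residue: int, last_residue: int) -> str:
    """Return residues first_residue..last_residue (1-based, inclusive)."""
    if not 1 <= first_residue <= last_residue <= len(sequence):
        raise ValueError("residue range lies outside the sequence")
    # Python slices are 0-based and end-exclusive.
    return sequence[first_residue - 1:last_residue]


# First 20 residues of SEQ ID No. 14, used here only as a short example
n_terminus = "MANHITLCIWLLFFFISIFS"
print(extract_fragment(n_terminus, 11, 20))  # LLFFFISIFS
```

Applied to the complete SEQ ID No. 14, the call `extract_fragment(seq, 111, 759)` would return the fragment designated above as "in particular 111 to 759".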

The proteolytic activity of the serine proteases of the present invention can also be reduced by influencing the promoter region of their respective genes naturally occurring within the plant or plant cell. The promoter region is located at the 5′ end of the nucleic acid region coding for the respective serine proteases and typically comprises 500 to 2000 nucleotides. Deletion, substitution or insertion of nucleotides within this region influences the transcription rate and consequently the proteolytic activity of the serine proteases within the plant or plant cell.

According to a preferred embodiment of the present invention a promoter of a gene encoding said at least one serine protease of the wild-type plant or plant cell is mutated in the genetically modified plant or plant cell at at least one position of its nucleic acid sequence to reduce its expression rate.

A mutation within said promoter may be a deletion or substitution within 100, preferably within 200, more preferably within 300, more preferably within 500, more preferably within 700, more preferably within 1000, nucleotides located adjacent and upstream to the 5′ end of the coding region encoding said at least one serine protease of a plant or plant cell. Whether a mutation within said region reduces or prevents the expression of said at least one serine protease can be determined by methods known in the art and by comparing the expression rate of said at least one serine protease in a plant or plant cell modified within the aforementioned region with the expression rate of the same enzyme(s) in a wild-type plant or plant cell. In a preferred embodiment of the present invention up to 100, preferably up to 200, more preferably up to 300, more preferably up to 500, more preferably up to 700, more preferably up to 1000, nucleotides located adjacent and upstream to the 5′ end of the coding region encoding said at least one serine protease of the wild-type plant or plant cell are deleted. In another preferred embodiment of the present invention 10 to 100, preferably 10 to 200, more preferably 50 to 300, more preferably 50 to 500, more preferably 100 to 700, more preferably 200 to 1000, nucleotides located adjacent and upstream to the 5′ end of the coding region encoding said at least one serine protease of the wild-type plant or plant cell are deleted.
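The deletion of nucleotides located adjacent and upstream to the 5′ end of the coding region can be pictured as an operation on a genomic sequence string. The sketch below is purely illustrative and does not describe a laboratory method. The function name, the toy sequence and the convention that the start of the coding region is given as a 1-based position are all assumptions.

```python
def delete_upstream(genomic: str, cds_start: int, n_deleted: int) -> str:
    """Remove n_deleted nucleotides immediately upstream of the coding region.

    cds_start is the 1-based position of the first coding nucleotide.
    """
    upstream_length = cds_start - 1
    if cds_start < 1 or cds_start > len(genomic):
        raise ValueError("cds_start lies outside the sequence")
    if not 0 <= n_deleted <= upstream_length:
        raise ValueError("cannot delete more nucleotides than lie upstream")
    # Keep the distal upstream part, drop the adjacent part, keep the coding region.
    return genomic[:upstream_length - n_deleted] + genomic[upstream_length:]


# Hypothetical toy locus: 12 upstream nucleotides followed by a coding start
toy_locus = "ttgacatataat" + "atggcaaat"
print(delete_upstream(toy_locus, cds_start=13, n_deleted=6))  # ttgacaatggcaaat
```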

In a particularly preferred embodiment of the present invention at least 30%, preferably at least 40%, more preferably at least 50%, more preferably at least 60%, more preferably at least 70%, more preferably at least 80%, more preferably at least 90%, more preferably at least 95%, in particular 100%, of the coding region of the serine proteases of the present invention having at least 80%, preferably at least 85%, more preferably at least 90%, more preferably at least 95%, more preferably at least 96%, more preferably at least 98%, more preferably at least 99%, in particular 100%, sequence identity to a serine protease consisting of an amino acid sequence selected from the group consisting of SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 14, SEQ ID No. 16, SEQ ID No. 18, SEQ ID No. 20 and SEQ ID No. 22, preferably SEQ ID No. 2 or SEQ ID No. 14, i.e. a nucleic acid sequence selected from the group consisting of SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, SEQ ID No. 9, SEQ ID No. 11, SEQ ID No. 15, SEQ ID No. 17, SEQ ID No. 19, SEQ ID No. 21 and SEQ ID No. 23, preferably SEQ ID No. 3 or SEQ ID No. 15, are deleted in the plant or plant cell of the present invention by methods known in the art.
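The sequence identity thresholds above can be illustrated with a simple calculation. The passage does not state how identity is to be computed, so the following sketch rests on assumptions: the two sequences have already been aligned by some external tool, gaps are written as "-", and identity is taken as the number of identical columns divided by the number of alignment columns. Other definitions exist and may give different values. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = identical columns / alignment columns,
    ignoring columns in which both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment contains no comparable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# First ten residues of SEQ ID No. 14 and SEQ ID No. 18 (no gaps needed)
print(percent_identity("MANHITLCIW", "MASHITLCIW"))  # 90.0
```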

According to a preferred embodiment of the present invention an inhibitor (e.g. Arabidopsis thaliana subtilisin propeptide-like inhibitor 1, At1g71950; A. thaliana subtilisin propeptide-like inhibitor 2, At2g39851; S106g065370; S111g018590; Hohl M et al., 2017, J. Biol. Chem. 292, 6389-6401) of said at least one serine protease is overexpressed in the genetically modified plant or plant cell to reduce the expression rate of the at least one serine protease or to inhibit the proteolytic activity of the at least one serine protease.

In a particularly preferred embodiment of the present invention the serine protease inhibitor is Arabidopsis thaliana subtilisin propeptide-like inhibitor 1 (GenBank: AEE35257.1), A. thaliana subtilisin propeptide-like inhibitor 2 (GenBank: AEC09739.1), S106g065370 (SEQ ID No. 33) or S111g018590 (SEQ ID No. 34).

(SEQ ID No. 33)

MMKKDQILSSILLFFFLFTAFITTMADSQASSQPSNESKVHIVYTEQPKDQEPEEYH

IKTLTSVLGSEEAAKEALLYSYKHAASGFSAKLTAEQVSELSKLPGVLQVVPSQTVQ

LHTGRV

(SEQ ID No. 34)

MQKIQIIFLVFLLLFVADCEEAKVYIVFTENPQPKEFHIKTLASVLGSEDAAREALI

YSYKHVISGFAARLTPEQVSELAKKPGVLEIVPSRTYHLDGPKLK

According to a preferred embodiment of the present invention the genetically modified plant or plant cell comprises at least one exogenous nucleic acid molecule encoding for at least one protein or polypeptide of interest.

The genetically modified plant or plant cell can be used to produce/express at least one protein or polypeptide of interest which is not naturally occurring in said plant or plant cell. In order to allow the recombinant expression of heterologous polypeptides and proteins, the respective nucleic acid molecules encoding said polypeptides and proteins are introduced into the plant or plant cell of the present invention.

The terms “exogenous protein”, “exogenous polypeptide”, “heterologous protein” or “heterologous polypeptide”, as defined herein, all refer to a protein or polypeptide that is not expressed by the plant or plant cell in nature, i.e. is not naturally occurring in plants and plant cells. This is in contrast with a homologous protein which is a protein naturally expressed by a plant or plant cell. The heterologous expression of polypeptides and proteins within a host cell can be achieved by means and methods known in the art and described below for the polypeptide of the present invention.

The exogenous nucleic acid molecule of the present invention can be a vector, preferably a plant vector, or a nucleic acid molecule carrying elements allowing cloning and/or the recombinant expression of heterologous proteins or polypeptides within a plant or plant cell.

The vector of the present invention can be used to clone the nucleic acid molecule of the present invention, as a shuttle or as an expression vector in host cells. Expression vectors may include an expression cassette which comprises various specified nucleic acid elements which permit transcription of a particular nucleic acid in a host cell. Typically, expression cassettes comprise, among other sequences, a nucleic acid to be transcribed, a promoter and a terminator.

The vector of the present invention can be designed to be integrated into the genome of the host cell. Alternatively the vector is designed to not integrate into the genome of a host cell allowing transient expression of the nucleic acid molecule of the present invention. In the latter case the vector remains in a non-integrated state free within the cell.

A “vector” according to the present invention refers to a nucleic acid used to introduce the nucleic acid molecule of the present invention into a host cell. Expression vectors permit transcription of a nucleic acid inserted therein.

In order to enable a host cell to express a polypeptide of the present invention encoded by the nucleic acid molecule as defined above the vector of the present invention comprises a promoter operably linked to the nucleic acid molecule.

As used herein, “operably linked” refers to a functional linkage between a promoter and the nucleic acid molecule encoding the polypeptide of the present invention, wherein the promoter sequence initiates and mediates transcription of the DNA sequence corresponding to said nucleic acid molecule.

The term “promoter”, as used herein, refers to a region of a nucleic acid molecule upstream from the start of transcription and involved in recognition and binding of RNA polymerase and other proteins to initiate transcription. Promoters are able to control (initiate) transcription in a cell. Plant promoters are capable of initiating transcription in plant cells whether or not their origin is a plant cell. Such promoters include promoters obtained from plants, plant viruses and bacteria which comprise genes expressed in plant cells such as Agrobacterium or Rhizobium. The promoter used in the vector of the present invention can be “inducible” or “repressible”, i.e. under environmental control. Such promoters can be controlled by changing the cultivation conditions (e.g. temperature) or by adding specific substances. Of course, the promoter used in the vectors of the present invention may be a “constitutive” promoter. Constitutive promoters are active under most environmental conditions and continuously express a protein or polypeptide of interest.

According to a preferred embodiment of the present invention the promoter is selected from the group consisting of promoters active in plants and plant cells, like the cauliflower mosaic virus 35S promoter, opine (octopine, nopaline, etc.) synthase promoters, actin promoter, ubiquitin promoter, etc.

In order to prevent transcriptional activation of downstream nucleic acid sequences by upstream promoters the vector of the present invention may comprise a “terminator” or “terminator sequence”. According to a preferred embodiment of the present invention the vector comprises a terminator which is preferably a g7T terminator.

According to a preferred embodiment of the present invention the exogenous/heterologous polypeptide or protein is of animal origin, preferably a mammalian, more preferably a human, polypeptide.

Polypeptides and proteins “of animal origin”, as used herein, refers to polypeptides which are not naturally occurring in plants and plant cells or any other non-animal organism. Polypeptides of animal origin can be derived from the genome of animals, preferably mammals, and are usually expressed in animals.

According to a further embodiment of the present invention the heterologous animal polypeptide is an antibody or a variant thereof.

Examples of heterologous polypeptides and proteins that can be advantageously produced by the plants or plant cells of the present invention include antibodies or variants thereof which are selected from the group consisting of monoclonal antibodies, chimeric antibodies, humanized antibodies, human antibodies, multispecific antibodies, Fab, Fab′, F(ab′)₂, antigen-binding Fc fragments (Fcab) and derivatives thereof, Fv, domain antibodies (dAb), complementarity determining region (CDR) fragments, CDR-grafted antibodies, single-chain antibodies (ScFv), single chain antibody fragments, diabodies, triabodies, tetrabodies, minibodies, linear antibodies, chelating recombinant antibodies, tribodies, bibodies, intrabodies, nanobodies, small modular immunopharmaceuticals (SMIP), camelized antibodies, VHH containing antibodies and polypeptides that comprise at least a portion of an immunoglobulin that is sufficient to confer specific antigen binding to the polypeptide, such as one, two, three, four, five or six CDR sequences.

According to a particularly preferred embodiment of the present invention the antibody is an antibody binding to a human immunodeficiency virus (HIV) surface protein.

As used herein, the term “specific for” can be used interchangeably with “binding to” or “binding specifically to”. These terms characterize molecules that bind to an antigen or a group of antigens with greater affinity (as determined by, e.g., ELISA or surface plasmon resonance spectroscopy assays) than other antigens or groups of antigens. According to the present invention molecules “specific for” an antigen may also be able to bind to more than one, preferably more than two, more preferably more than three, even more preferably more than five, antigens. Such molecules are defined to be “cross-reactive” (e.g. cross-reactive immunoglobulins, cross-reactive antigen binding sites).

The antibody expressed by the plant cell and plant of the present invention is preferably an antibody selected from the group consisting of 2F5 (Zwick et al., 2004, J. Virol. 78, 3155-3161), 2G12 (Doores et al., 2010, J. Virol. 84, 10690-10699), PG9 (Loos et al., 2015, Proc. Natl. Acad. Sci. USA 112, 12675-12680), PG16 (Pancera et al., 2013, Nat. Struct. Mol. Biol. 20, 804-813) and variants thereof.

It is known that non-cleaved antibodies may have a greater binding affinity to an antigen compared to their cleaved versions. This in effect is particularly significant for antibodies like 2F5, 2G12, PG9, PG16 and variants thereof. Therefore, it is particularly advantageous to provide efficient tools for producing non-cleaved 2F5, 2G12, PG9, PG16 and variants thereof.

According to a preferred embodiment of the present invention the antibody variant is selected from the group consisting of 2G12 comprising modifications at IH¹⁹ (Doores et al., 2010, J. Virol. 84, 10690-10699).

According to a preferred embodiment of the present invention the antibody variant is selected from the group consisting of PG9 comprising modifications at NL²³ (Loos et al., 2015, Proc. Natl. Acad. Sci. USA 112, 12675-12680).

According to a preferred embodiment of the present invention the antibody variant is selected from the group consisting of PG9 comprising modifications at TL⁹⁴, RL⁹⁵ and RL⁹⁵ᴬ (Pancera et al., 2013, Nat. Struct. Mol. Biol. 20, 804-813).

According to a preferred embodiment of the present invention the plant or plant cell is of the genus Nicotiana and preferably a species selected from the group consisting of Nicotiana benthamiana and Nicotiana tabacum.

Further preferred plants to be used according to the present invention include Nicotiana benthamiana, tobacco (Nicotiana tabacum), Nicotiana spp, Arabidopsis thaliana, tomato (Solanum lycopersicum), potato (Solanum tuberosum), duckweed (Lemna minor), mosses (e.g. Physcomitrella), corn (Zea mays), rice (Oryza sativa), wheat (Triticum aestivum), peas (Pisum sativum), flaxseed (Linum usitatissimum) and rapeseed (Brassica napus), in particular the Nicotiana benthamiana line C105 (Strasser R et al. Plant Biotechnol J 6(2008):392-402).

A further aspect of the present invention relates to a serine protease consisting of an amino acid sequence selected from the group consisting of SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 14, SEQ ID No. 16, SEQ ID No. 20 and SEQ ID No. 22, preferably SEQ ID No. 2 or SEQ ID No. 14.

Another aspect of the present invention relates to a nucleic acid molecule encoding a serine protease according to the present invention.

The nucleic acid molecule of the present invention encoding a serine protease as defined herein can be used to recombinantly express said serine protease. The specificity of the serine proteases of the present invention can be used in methods or organisms where such a cleavage specificity is desired. These methods may involve the use of non-plant organisms like animals or microorganisms.

The nucleic acid molecule of the present invention can be part of a vector which can be used for cloning or expression purposes. Respective vectors are well-known in the art.

The nucleic acid according to the present invention may encode a subtilisin-like serine protease and may comprise or consist of a nucleic acid sequence selected from the group consisting of SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, SEQ ID No. 9, SEQ ID No. 11, SEQ ID No. 15, SEQ ID No. 17, SEQ ID No. 21 and SEQ ID No. 23, preferably SEQ ID No. 3 or SEQ ID No. 15.

The nucleic acid molecule encoding a subtilisin-like serine protease comprising or consisting of a nucleic acid sequence selected from the group consisting of SEQ ID No. 3, SEQ ID No. 5, SEQ ID No. 7, SEQ ID No. 9, SEQ ID No. 11, SEQ ID No. 15, SEQ ID No. 17, SEQ ID No. 21 and SEQ ID No. 23, preferably SEQ ID No. 3 or SEQ ID No. 15, may comprise a nucleotide stretch encoding a tag sequence for detection and affinity purification, preferably at the N- or C-terminus, more preferably at the C-terminus of the polypeptide. This tag may consist of 3 to 60, preferably 3 to 45, more preferably 6 to 30, more preferably 3 to 18, in particular 18, nucleotides (n). In a particularly preferred embodiment of the present invention the tag consists of or comprises the nucleic acid stretch encoding the amino acid sequence HHHHHH.

Another aspect of the present invention relates to a polypeptide encoded by a nucleic acid molecule according to the present invention, preferably a polypeptide comprising or consisting of an amino acid sequence selected from the group consisting of SEQ ID No. 2, SEQ ID No. 4, SEQ ID No. 6, SEQ ID No. 8, SEQ ID No. 10, SEQ ID No. 14, SEQ ID No. 16, SEQ ID No. 20 and SEQ ID No. 22, most preferably SEQ ID No. 2 or SEQ ID No. 14.

The serine proteases of the present invention may comprise a tag sequence for detection and/or affinity purification, preferably at the N- or C-terminus, more preferably at the C-terminus. This tag may consist of 1 to 20, preferably 1 to 15, more preferably 2 to 10, more preferably 2 to 6, in particular 6 amino acid residues. In a particularly preferred embodiment of the present invention the tag consists of or comprises the amino acid sequence HHHHHH.
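The relationship between the tag lengths given in amino acid residues and in nucleotides follows from the triplet code (three nucleotides per residue). The following is a minimal illustrative sketch only; the function name `tag_coding_length` is hypothetical and not part of the disclosure.

```python
# Illustrative sketch: coding length of a peptide tag under the triplet code.
# The helper name is hypothetical; only the HHHHHH tag is taken from the text.

def tag_coding_length(tag: str) -> int:
    """Return the number of nucleotides needed to encode the tag (no stop codon)."""
    return 3 * len(tag)


his_tag = "HHHHHH"
print(len(his_tag), "residues ->", tag_coding_length(his_tag), "nucleotides")
```

For the HHHHHH tag this gives 6 residues and 18 nucleotides, in line with the particularly preferred lengths stated above.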

The terms “polypeptide” and “protein” refer to a polymer of amino acid residues, wherein a “polypeptide” polymer may comprise more than 5 amino acid residues and a “protein” polymer may comprise more than 100 amino acid residues. The term “polypeptide” may, thus, include also “proteins”.

Yet another aspect of the present invention relates to a plant or plant cell capable of cleaving a polypeptide, preferably of animal origin, heterologously produced in said plant or plant cell, said plant or plant cell comprising a nucleic acid molecule or a vector according to the present invention.

In order to support the correct folding of heterologous, preferably mammalian, more preferably human, proteins or polypeptides expressed in plants and plant cells said proteins or polypeptides may be fused to another polypeptide or peptide, i.e. a fusion partner (see e.g. Sainsbury et al., 2016, Front. Plant Sci. 7, 141). Said fusion partner is linked via a peptide to the protein or polypeptide to be heterologously expressed whereby said peptide comprises cleavage sites which are recognized and cut by the serine proteases of the present invention. The removal of the aforementioned fusion partner can be done directly in the plant and plant cell of the present invention which expresses said serine protease.

The proteins and polypeptides heterologously produced in said plant or plant cell may be fused to peptidic tags which can be used to facilitate the purification of the proteins. Said peptidic tags may have, for instance, a binding affinity to other chemical moieties which may be immobilized on a solid support. The peptidic tags are linked via a peptide comprising a cleavage site which is recognized and cut by the serine proteases of the present invention to the heterologously produced proteins and polypeptides.

The plant and plant cell of the present invention may carry the nucleic acid molecule of the present invention in a non-integrated form (e.g. as a vector) or may be a “transgenic plant” or “transgenic plant cell”. Such a plant or plant cell comprises within its genome a heterologous nucleic acid molecule. This heterologous nucleic acid molecule is usually stably integrated within the genome such that the polynucleotide is passed on to successive generations. In order to allow the plant and plant cell to express the polypeptide of the present invention, its encoding nucleic acid molecule is operably linked to a promoter.

The nucleic acid molecule and the vector according to the present invention enable a plant or plant cell to cleave polypeptides and proteins expressed in said plant or plant cell. Therefore, the transgenic plant or plant cell of the present invention preferably comprises a nucleic acid molecule encoding a heterologous polypeptide operably linked to a promoter region.

Yet another aspect of the present invention relates to a method of recombinantly producing a non-cleaved polypeptide of animal origin comprising the step of cultivating a plant or plant cell according to the present invention.

Methods and means to cultivate recombinant plants and plant cells are known in the art (e.g. Plant Biotechnology and Genetics: Principles, Techniques, and Applications, edited by C. N. Stewart, 2008, Wiley, ISBN 978-0-470-04381-3).

EXAMPLES
Materials and Methods
1. Cloning of Neutralizing Anti-HIV Antibody 2G12 and Variant Thereof

The signal peptide of barley α-amylase (amino acid residues 1 to 24 of the amino acid sequence of acc. no. CAX_51374.1) was cloned into MagnIcon vectors pICH26033 and pICH31160 (Niemer M et al. Biotechnol J 9(2014):493-500) to give rise to pICHα26033 and pICHα31160. The sequences encoding the 2G12 heavy and light chains (omitting their authentic signal peptides) were amplified from the corresponding pDONR221 constructs (Loos A et al. Plant Biotechnol J 9(2011):179-192) by PCR. The fragments thus generated were then inserted into the BsaI sites of pICHα31160 (heavy chain) and pICHα26033 (light chain). The domain-swapping point mutation I19R (Doores KJ et al. J Virol 84(2010):10690-10699) was introduced into the 2G12 heavy-chain sequence by site-directed mutagenesis. The mutated product was then cloned into pICHα31160 as outlined above.

2G12HC (SEQ ID No. 24; PDB-database: 1ZLS, 1ZLU, 1ZLV, 1ZLW, 2OQJ, 3OAY, 3OAZ, 3OB0) cDNA was PCR-amplified from pDONR221/2G12HC (Loos A et al. Plant Biotechnol J 9(2011):179-192), without signal peptide but with 5′- and 3′ BsaI restriction sites.

2G12HC consists of amino acid sequence SEQ ID No. 24, whereby the signal peptide of barley α-amylase is marked in bold and italic:

EVQLVESGGGLVKAGGSLILSCGVSNFRISAHTM

NWVRRVPGGGLEWVASISTSSTYRDYADAVKGRFTVSRDDLEDFVYLQMHKMRVEDTA

IYYCARKGSDRLSDNDPFDAWGPGTVVTVSPASTKGPSVFPLAPSSKSTSGGTAALGC

LVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNV

NHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVT

CVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKE

YKCKVSNKAFPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSD

IAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHN

HYTQKSLSLSPGK

2G12HC is encoded by nucleotide sequence SEQ ID No. 25 (including a stop codon), whereby the nucleic acid stretch encoding the signal peptide of barley α-amylase is marked in bold and italic:

gaggtgcagctggtggagtctgggggaggcctggtcaaggcggg

aggatccctcatactctcctgtggagtctctaattttagaatctctgcccataccatg

aattgggtccgccgggttccagggggggggctggagtgggtcgcttccattagtacga

gttccacttatagagactatgcagacgctgtgaagggccgattcaccgtttccagaga

cgacctcgaagactttgtgtatttgcaaatgcacaaaatgagagtcgaagacacggct

atttattactgcgccagaaagggatctgacagactaagcgacaacgatccttttgatg

cctgggggccaggaacagtggtcaccgtctctcccgcctccaccaagggcccatcggt

cttccccctggcaccctcctccaagagcacctctgggggcacagcggccctgggctgc

ctggtcaaggactacttccccgaaccggtgacggtgtcgtggaactcaggcgccctga

ccagcggcgtgcacaccttcccggctgtcctacagtcctcaggactctactccctcag

cagcgtggtgaccgtgccctccagcagcttgggcacccagacctacatctgcaacgtg

aatcacaagcccagcaacaccaaggtggacaagaaagttgagcccaaatcttgtgaca

aaactcacacatgcccaccgtgcccagcacctgaactcctggggggaccgtcagtctt

cctcttccccccaaaacccaaggacaccctcatgatctcccggacccctgaggtcaca

tgcgtggtggtggacgtgagccacgaagaccctgaggtcaagttcaactggtacgtgg

acggcgtggaggtgcataatgccaagacaaagccgcgggaggagcagtacaacagcac

gtaccgtgtggtcagcgtcctcaccgtcctgcaccaggactggctgaatggcaaggag

tacaagtgcaaggtctccaacaaagccttcccagcccccatcgagaaaaccatctcca

aagccaaagggcagccccgagaaccacaggtgtacaccctgcccccatcccgggatga

gctgaccaagaaccaggtcagcctgacctgcctggtcaaaggcttctatcccagcgac

atcgccgtggagtgggagagcaatgggcagccggagaacaactacaagaccacgcctc

ccgtgctggactccgacggctccttcttcctctacagcaagctcaccgtggacaagag

caggtggcagcaggggaacgtcttctcatgctccgtgatgcatgaggctctgcacaac

cactacacgcagaagagcctctccctgtctccgggtaaatga

2G12HC cDNA encoding an I19R mutation (SEQ ID No. 26; PDB-database: 3OAU) was generated by site-directed mutagenesis, without signal peptide but with 5′- and 3′ BsaI restriction sites.

2G12HC (encoding an I19R mutation) consists of amino acid sequence SEQ ID No. 26, whereby the signal peptide of barley α-amylase is marked in bold and italic:

EVQLVESGGGLVKAGGSLRLSCGVSNFRISAHTM

NWVRRVPGGGLEWVASISTSSTYRDYADAVKGRFTVSRDDLEDFVYLQMHKMRVEDTA

IYYCARKGSDRLSDNDPFDAWGPGTVVTVSPASTKGPSVFPLAPSSKSTSGGTAALGC

LVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNV

NHKPSNTKVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVT

CVVVDVSHEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKE

YKCKVSNKAFPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSD

IAVEWESNGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHN

HYTQKSLSLSPGK

2G12HC (encoding an I19R mutation) is encoded by nucleotide sequence SEQ ID No. 27 (including a stop codon), whereby the nucleic acid stretch encoding the signal peptide of barley α-amylase is marked in bold and italic:

gaggtgcagctggtggagtctgggggaggcctggtcaaggcggg

aggatccctcagactctcctgtggagtctctaattttagaatctctgcccataccatg

aattgggtccgccgggttccagggggggggctggagtgggtcgcttccattagtacga

gttccacttatagagactatgcagacgctgtgaagggccgattcaccgtttccagaga

cgacctcgaagactttgtgtatttgcaaatgcacaaaatgagagtcgaagacacggct

atttattactgcgccagaaagggatctgacagactaagcgacaacgatccttttgatg

cctgggggccaggaacagtggtcaccgtctctcccgcctccaccaagggcccatcggt

cttccccctggcaccctcctccaagagcacctctgggggcacagcggccctgggctgc

ctggtcaaggactacttccccgaaccggtgacggtgtcgtggaactcaggcgccctga

ccagcggcgtgcacaccttcccggctgtcctacagtcctcaggactctactccctcag

cagcgtggtgaccgtgccctccagcagcttgggcacccagacctacatctgcaacgtg

aatcacaagcccagcaacaccaaggtggacaagaaagttgagcccaaatcttgtgaca

aaactcacacatgcccaccgtgcccagcacctgaactcctggggggaccgtcagtctt

cctcttccccccaaaacccaaggacaccctcatgatctcccggacccctgaggtcaca

tgcgtggtggtggacgtgagccacgaagaccctgaggtcaagttcaactggtacgtgg

acggcgtggaggtgcataatgccaagacaaagccgcgggaggagcagtacaacagcac

gtaccgtgtggtcagcgtcctcaccgtcctgcaccaggactggctgaatggcaaggag

tacaagtgcaaggtctccaacaaagccttcccagcccccatcgagaaaaccatctcca

aagccaaagggcagccccgagaaccacaggtgtacaccctgcccccatcccgggatga

gctgaccaagaaccaggtcagcctgacctgcctggtcaaaggcttctatcccagcgac

atcgccgtggagtgggagagcaatgggcagccggagaacaactacaagaccacgcctc

ccgtgctggactccgacggctccttcttcctctacagcaagctcaccgtggacaagag

caggtggcagcaggggaacgtcttctcatgctccgtgatgcatgaggctctgcacaac

cactacacgcagaagagcctctccctgtctccgggtaaatga
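The I19R substitution can be located by a position-wise comparison of the wild-type and mutated sequences. The following minimal sketch uses only the N-terminal fragments of SEQ ID No. 24 to 27 as printed above; the function name `point_differences` is hypothetical and the comparison assumes sequences of equal length without insertions or deletions.

```python
# Illustrative sketch: locate point differences between two equal-length sequences.
# Sequence fragments are the N-terminal portions printed above (SEQ ID No. 24-27).

def point_differences(ref: str, alt: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference symbol, altered symbol) for each mismatch."""
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(ref, alt)) if a != b]


wt_protein = "EVQLVESGGGLVKAGGSLILSCGVSNFRISAHTM"
mut_protein = "EVQLVESGGGLVKAGGSLRLSCGVSNFRISAHTM"

wt_dna = "gaggtgcagctggtggagtctgggggaggcctggtcaaggcggg" + "aggatccctcatactctcc"
mut_dna = "gaggtgcagctggtggagtctgggggaggcctggtcaaggcggg" + "aggatccctcagactctcc"

print(point_differences(wt_protein, mut_protein))  # amino acid level
for pos, ref_nt, alt_nt in point_differences(wt_dna, mut_dna):
    codon_number = (pos - 1) // 3 + 1  # codon containing the changed nucleotide
    print(pos, ref_nt, alt_nt, "codon", codon_number)
```

On these fragments the comparison reports a single amino acid difference at position 19 (I to R) and a single nucleotide difference within codon 19 (ata to aga).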

2G12LC (SEQ ID No. 28; PDB-database: 1ZLS, 1ZLU, 1ZLV, 1ZLW, 2OQJ, 3OAU, 3OAY, 3OAZ, 3OB0) cDNA was PCR-amplified from pDONR221/2G12LC (Loos A et al. Plant Biotechnol J 9(2011):179-192), without signal peptide but with 5′- and 3′ BsaI restriction sites.

2G12LC consists of amino acid sequence SEQ ID No. 28, whereby the signal peptide of barley α-amylase is marked in bold and italic:

DVVMTQSPSTLSASVGDTITITCRASQSIETWLA

WYQQKPGKAPKLLIYKASTLKTGVPSRFSGSGSGTEFTLTISGLQFDDFATYHCQHYA

GYSATFGQGTRVEIKRTVAAPSVFIFPPSDEQLKSGTASVVCLLNNFYPREAKVQWKV

DNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVTHQGLSSPVTKS

FNRGEC

2G12LC is encoded by nucleotide sequence SEQ ID No. 29 (including a stop codon), whereby the nucleic acid stretch encoding the signal peptide of barley α-amylase is marked in bold and italic:

gatgttgtgatgactcagtctccttccaccctgtctgcatctgt

cggagacacaatcaccatcacttgccgggccagtcagagtattgaaacctggttggcc

tggtatcagcagaagccagggaaagccccaaaactcctaatctacaaggcgtctactt

taaaaactggagtcccgtcaagattcagcggcagtggatctggaacagagttcactct

taccatcagtggcctgcagttcgatgactttgcaacttatcactgtcagcactatgct

ggttattcagcgacgttcggccaggggaccagggtggagatcaaacgaactgtggctg

caccatctgtcttcatcttcccgccatctgatgagcagttgaaatctggaactgcctc

tgttgtgtgcctgctgaataacttctatcccagagaggccaaagtacagtggaaggtg

gataacgccctccaatcgggtaactcccaggagagtgtcacagagcaggacagcaagg

acagcacctacagcctcagcagcaccctgacgctgagcaaagcagactacgagaaaca

caaagtctacgcctgcgaagtcacccatcagggcctgagctcgcccgtcacaaagagc

ttcaacaggggagagtgttag

2G12HC and 2G12HC (encoding an I19R mutation) were inserted into pICHα31160 in frame after the barley α-amylase signal peptide. 2G12LC was inserted into pICHα26033 in frame after the barley α-amylase signal peptide.

The vectors used for the expression of antibodies 2F5 and PG9 have been described (Niemer M et al. Biotechnol J 9(2014):493-500).

All vectors were transformed into Escherichia coli by electroporation and upon sequence confirmation into the Agrobacterium tumefaciens strains GV3101pMP90RK or UIA143.

2. Cloning of Subtilisin-Like Serine Protease Constructs

RNA was extracted from 35-mg samples of leaves from 4-week-old Nicotiana benthamiana C105 plants lacking plant-specific α1,3-fucosylation and β1,2-xylosylation (Strasser R et al. Plant Biotechnol J 6(2008):392-402) using the SV Total RNA Isolation Kit (Promega). First-strand cDNA was synthesized from 1 μg RNA using the RevertAid H Minus First Strand cDNA Synthesis Kit (Thermo Scientific) and oligo(dT)₁₈ as primer. For expression in Nicotiana benthamiana C105 plants, the coding sequences of the subtilisin-like serine proteases SBT1 and SBT2 (SEQ ID No. 3 and SEQ ID No. 15) were PCR-amplified using Q5 High-Fidelity DNA polymerase (New England Biolabs) and then inserted into a suitable expression vector, e.g. pPT2 (Strasser R et al. Biochem J 387(2005):385-391) or pART27 (Schardon K et al. Science 354(2016):1594-1597). After transformation into E. coli and sequence confirmation, all constructs were transformed into Agrobacterium tumefaciens, e.g. strain UIA143pMP90.

3. In Planta Expression of Monoclonal Antibodies

Nicotiana benthamiana C105 plants were grown for 4-5 weeks at 24° C. with a 16-h light:8-h dark photoperiod. Infiltration with agrobacteria carrying the respective expression vectors was then performed as reported previously (Strasser R et al. Plant Biotechnol J 6(2008):392-402). Briefly, overnight cultures were pelleted and then resuspended in infiltration buffer (25 mM MES buffer (pH 5.5), 25 mM MgSO₄, 0.1 mM acetosyringone) at an OD₆₀₀ of 0.2 (1.0 OD₆₀₀ corresponds to 5×10⁸ cells/mL). In the case of 2G12 and PG9 expression, equal amounts of the strains carrying the respective heavy and light chain constructs were used. Infiltrated N. benthamiana leaves were harvested after 4-5 days.
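The cell density of the infiltration suspension follows from the stated conversion (1.0 OD₆₀₀ corresponds to 5×10⁸ cells/mL). The following minimal sketch illustrates the arithmetic; the function names and the example culture values (OD₆₀₀ of 2.0, 5 mL) are hypothetical, and the sketch assumes that OD₆₀₀ scales linearly with cell density and that all cells are recovered on pelleting.

```python
# Illustrative sketch: cell density and resuspension volume for agroinfiltration.
# Only the conversion factor and the target OD600 of 0.2 are taken from the text.

CELLS_PER_ML_PER_OD600 = 5e8  # 1.0 OD600 corresponds to 5 x 10^8 cells/mL
TARGET_OD600 = 0.2


def cells_per_ml(od600: float) -> float:
    """Estimated cell density, assuming OD600 is linear in cell density."""
    return od600 * CELLS_PER_ML_PER_OD600


def resuspension_volume_ml(culture_od600: float, culture_volume_ml: float,
                           target_od600: float = TARGET_OD600) -> float:
    """Buffer volume in which the pelleted cells reach the target OD600."""
    return culture_od600 * culture_volume_ml / target_od600


print(cells_per_ml(TARGET_OD600))        # cells/mL in the infiltration suspension
print(resuspension_volume_ml(2.0, 5.0))  # hypothetical overnight culture
```

At an OD₆₀₀ of 0.2 the suspension thus contains about 1×10⁸ cells/mL.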

4. Monoclonal Antibody (mAb) Purification

Leaf material (see item 3) infiltrated with the transformed Agrobacterium tumefaciens strains described under item 1 was crushed under liquid nitrogen and extracted twice for 20 min on ice with 0.1 M Tris/HCl (pH 7.0) containing 0.5 M NaCl, 40 mM ascorbic acid and 1 mM EDTA (2 mL per gram leaf wet weight). After a centrifugation step (4° C., 30 min, 27500 g), the extract was clarified by a series of filtration steps with pore sizes ranging from 10 μm to 0.2 μm (Roth, AP27.1, Roth, AP51.1, Roth, CT92.1, Roth, KH54.1). Antibodies were then purified by affinity chromatography on a column packed with 1 mL rProtein A Sepharose 4 Fast Flow (GE Healthcare, 17-1279-01), using 0.1 M glycine/HCl (pH 3.0) for elution. Protein-containing eluate fractions were immediately neutralized by addition of 0.1 M Tris/HCl (pH 8.0), dialyzed against phosphate-buffered saline (PBS) containing 0.02% (v/v) NaN₃ and then concentrated by ultrafiltration using Amicon YM30 centrifugal filter units.

In the case of PG9, an additional size exclusion chromatography step was included to remove antibody aggregates. After dialysis against SEC buffer (10 mM Tris/HCl, 50 mM NaCl, pH 7.5), the affinity-purified sample (1 mL) was loaded onto a Superdex 200 16/600 column (GE Healthcare) operated at a flow rate of 1 mL min⁻¹. Fractions corresponding to the elution position of monomeric PG9 were combined, dialyzed against PBS containing 0.02% (v/v) NaN₃ and then concentrated by ultrafiltration as above.

5. Preparation of Apoplastic Fluid

For the recovery of apoplastic fluid, fully expanded leaves of Nicotiana benthamiana C105 plants (Strasser R et al. Plant Biotechnol J 6(2008):392-402) were submerged in 100 mM sodium acetate/40 mM ascorbic acid (pH 5.5) prior to vacuum exposure in a desiccator. Exogenous buffer was removed prior to centrifugation of the leaves for 15 min at 2000 g and 4° C. The recovered solution was then concentrated 10-20 fold using Microsep Advance centrifugal devices (10K MWCO; Pall Corporation). The total protein content of the concentrate was determined with the Bio-Rad Protein Assay kit (Bio-Rad), using bovine serum albumin (BSA) as a standard.

6. Identification of Subtilisin-Like Serine Proteases Present in Apoplastic Fluid

Concentrated apoplastic fluid prepared from Nicotiana benthamiana C105 plants (see item 5) was incubated for 1 h at 37° C. with 10 μM FP-biotin (Santa Cruz Biotechnology). The sample (2.5 mL) was then chromatographed on a PD-10 column (GE Healthcare) equilibrated in 100 mM sodium acetate (pH 5.5). The recovered eluate (3.5 mL) was incubated with 40 μL avidin-agarose beads (Sigma-Aldrich) for 16 h at 4° C. under constant agitation. The beads were washed four times with 4 mL 100 mM sodium acetate (pH 5.5) and once with 4 mL 10 mM Tris/HCl (pH 6.8) prior to elution of the bound proteins with 80 μL SDS-PAGE sample buffer (5 min, 95° C.). The samples were then subjected to 12.5% SDS-PAGE under reducing conditions prior to staining of the gel with Coomassie Brilliant Blue R-250. The 60-80 kDa bands were excised, S-carboxamidomethylated and then digested with trypsin (Promega). The peptides thus generated were fractionated on a Thermo BioBasic C18 separation column (5 μm particle size, 150×0.36 mm) operated using a Dionex Ultimate 3000 system (Waters). A gradient from 95% solvent A and 5% solvent B (solvent A: 65 mM ammonium formate buffer at pH 3.0, B: 100% acetonitrile) to 32% B in 45 min was applied, followed by a 15-min gradient from 32% B to 75% B, at a flow rate of 6 μL/min. Eluted peptides were analysed online on a maXis 4G ETD Q-TOF mass spectrometer (Bruker) equipped with an electrospray ionization source and operated in the positive ion mode (m/z range: 150-2200).
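The two-step gradient used for peptide fractionation can be written as a piecewise function of run time. The following minimal sketch assumes that both segments are linear and that the gradient starts at 0 min; the function name `percent_b` is hypothetical.

```python
# Illustrative sketch: solvent B content during the two-step gradient described above
# (5% to 32% B in 45 min, then 32% to 75% B in 15 min). Linear segments are assumed.

def percent_b(t_min: float) -> float:
    """Return % solvent B at run time t_min (0 to 60 min)."""
    if not 0 <= t_min <= 60:
        raise ValueError("time outside the 60-min gradient")
    if t_min <= 45:
        return 5 + (32 - 5) * t_min / 45
    return 32 + (75 - 32) * (t_min - 45) / 15


for t in (0, 22.5, 45, 60):
    print(t, percent_b(t))
```

Solvent A makes up the remainder, i.e. 100 minus the returned value at each time point.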

7. Purification of Recombinant Subtilisin-Like Serine Proteases

Leaves of Nicotiana benthamiana C105 plants (Strasser R et al. Plant Biotechnol J 6(2008):392-402) expressing a subtilisin-like serine protease construct were submerged in extraction buffer (50 mM sodium phosphate/200 mM KCl, pH 7.0) prior to vacuum exposure in a desiccator. Exogenous buffer was removed prior to centrifugation of the leaves for 15 min at 2000 g and 4° C. The recovered solution was then supplemented with 20 mM imidazole and loaded on a 1 mL column of Chelating Sepharose (GE Healthcare) charged with Ni²⁺ ions. After washing with the same buffer, the recombinant subtilisin-like serine protease was eluted with 250 mM imidazole in extraction buffer. Protein-containing eluate fractions were pooled, dialysed twice against 2 liters of extraction buffer and then concentrated by ultrafiltration using Microsep Advance centrifugal devices (10K MWCO; Pall Corporation).

8. In Vitro Degradation Assays

Monoclonal antibodies (200 μg/mL) produced in Nicotiana benthamiana C105 plants or CHO cells (2F5: Polymun Scientific, AB001; 2G12: Polymun Scientific, AB002; PG9: Polymun Scientific, AB015) were treated with concentrated apoplastic fluid of Nicotiana benthamiana C105 plants (100 μg/mL) or purified proteases (10-100 μg/mL) in 100 mM sodium acetate (pH 5.5). After incubation for up to 16 h, reactions were stopped by addition of SDS-PAGE sample buffer.

For isolation of mAb degradation products, antibodies (250 μg) were digested with apoplastic fluid or purified subtilisin-like serine proteases in 500 μL 100 mM sodium acetate (pH 5.5) to completion. After addition of 10 μL settled rProtein A Sepharose 4 Fast Flow beads, the samples were incubated for 2 h at 4° C. under constant agitation. The beads were then collected by centrifugation, washed four times with 500 μL PBS prior to elution of the bound antibody fragments with 100 μL 100 mM glycine (pH 3.0). The eluate was immediately neutralized with 10 μL 1 M Tris/HCl (pH 8.0). This elution/neutralization cycle was repeated four times. All eluate fractions were combined and concentrated by ultrafiltration after buffer exchange into PBS containing 0.02% (v/v) NaN₃.

9. Western Blotting

Samples were fractionated by 12.5% SDS-PAGE under reducing conditions and then blotted on nitrocellulose membranes (GE Healthcare). After blocking for 1 h in PBS containing 3% BSA, the membranes were incubated with a γ-chain-specific goat anti-human IgG peroxidase conjugate (Sigma-Aldrich, A8775) at a concentration of 0.03 μg/mL in PBS containing 0.05% Tween 20 (PBST) and 0.5% BSA for 90 min prior to development using chemiluminescence reagents (Bio-Rad).

10. Cleavage Site Analysis

N-terminal sequence analysis of bands blotted on polyvinylidene difluoride membranes (Bio-Rad) was performed by Edman degradation (Niemer M et al. Biotechnol J 9(2014):493-500). Alternatively, antibodies or their degradation products were digested with PNGase F (Roche) to release their N-glycans, reduced with 10 mM DTE (1 h, 56° C.) and then fractionated on a Thermo ProSwift RP-4H column (250×0.20 mm) using a Dionex UltiMate 3000 HPLC system (Thermo Scientific). After application of the sample (5 μl), elution was performed at 65° C. and a flow rate of 8 μl/min with a gradient of 20-95% solvent B (80% acetonitrile in 0.01% trifluoroacetic acid) in solvent A (0.05% trifluoroacetic acid) over 40 min as follows: 20-65% B (15 min), 65-95% B (5 min). Eluted polypeptides were analysed online on a maXis 4G ETD Q-TOF mass spectrometer (Bruker, Billerica, USA) equipped with an electrospray ionization source and operated in the positive ion mode (m/z range: 400-3800). The analysis files were deconvoluted (Maximum Entropy Method) using DataAnalysis 4.0 (Bruker) and manually annotated.

11. ELISA Assays

96-well enzyme-linked immunosorbent assay (ELISA) plates were coated with 100 ng per well of the respective antigen (2F5: gp41 peptide GGGLELDKWASL (Polymun Scientific); wild-type 2G12: HIV-1 UG37 gp140 (Polymun Scientific); PG9: HIV-1 ZM109 gp120, Loos A et al. Proc Natl Acad Sci USA 112(2015):12675-12680) or Fc-specific goat anti-human IgG F(ab′)2 fragments (Sigma-Aldrich, 13391) in 50 mM sodium carbonate/bicarbonate (pH 9.6) for 16 h at 4° C. The wells were then washed with PBST and subsequently incubated with mAb samples (starting concentration: 1-8 μg/mL) serially diluted (1:2) in PBST containing 1% BSA (dilution buffer) for 1 h at 22-24° C. After washing, bound antibodies were detected with 0.03 μg/mL γ-chain-specific goat anti-human IgG peroxidase conjugate (Sigma-Aldrich, A8775) in dilution buffer. After 1 h at 22-24° C., plates were washed and then developed with 0.1 mg/mL 3,3′,5,5′-tetramethylbenzidine (Sigma-Aldrich, T0440) and 0.006% H₂O₂ in 35 mM citric acid/65 mM sodium phosphate (pH 5.0) for 15 min at 22-24° C. Reactions were quenched by addition of 90 mM H₂SO₄ prior to analysis by spectrophotometry at 450 nm.

A slightly modified ELISA procedure was used for the analysis of 2G12 containing an I19R mutation in the heavy chain (I19R 2G12). Wells were coated with 500 ng HIV-1 UG37 gp140 in PBS. Antibodies (20 μg/mL) were precomplexed with Fc-specific goat anti-human IgG F(ab′)2 fragments (10 μg/mL) for 15 min at 4° C. (Doores KJ et al. J Virol 84(2010):10690-10699) prior to appropriate dilution and addition to the wells. Detection of antigen-bound mAb molecules was performed with 0.1 μg/mL goat anti-human kappa chain peroxidase conjugate (Sigma-Aldrich, A7164).

12. Surface Plasmon Resonance Spectroscopy

The antigen binding kinetics of mAb 2F5 and its degradation products were determined by surface plasmon resonance spectroscopy using a Biacore T200 instrument (GE Healthcare) operated at 25° C. The running buffer was PBST containing 0.01% BSA. For multiple-cycle kinetics, purified antibody samples were diluted with running buffer to 37.5 μg/mL and then applied to a Protein A series S sensor chip (GE Healthcare) at a flow rate of 10 μL min⁻¹. After mAb capture (20 sec), the chip was exposed to increasing concentrations of gp41 peptide GGGLELDKWASL (for intact 2F5 mAb: 3, 9, 27, 81, 243 nM; for 2F5 degradation product: 50, 100, 200, 400, 800 nM) in running buffer (contact time: 30 sec; dissociation phase: 60 sec). Regeneration was performed with 10 mM glycine/HCl (pH 1.5) for 30 sec at a flow rate of 30 μL min⁻¹. A reference cell was operated in parallel (running buffer instead of mAb) to correct for unspecific binding of the analyte to the sensor chip surface. The sensorgrams thus obtained were analyzed with Biacore T200 Evaluation software by global fitting of the data to a 1:1 binding model.
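By way of illustration only, the 1:1 binding model referred to above can be sketched as follows. In this model a global fit yields an association rate constant (ka) and a dissociation rate constant (kd), from which the equilibrium dissociation constant is obtained as K_D = kd/ka. The sketch below is not the implementation of the Biacore evaluation software; the function names and the rate constants are hypothetical placeholders and are not measured values. Only the analyte concentrations, the contact time and the dissociation time are taken from the procedure described above.

```python
import math


def equilibrium_kd(ka, kd):
    """Equilibrium dissociation constant K_D (M) from ka (M^-1 s^-1) and kd (s^-1)."""
    return kd / ka


def association(t, conc, ka, kd, rmax):
    """Response after t seconds of analyte injection for a 1:1 interaction."""
    k_obs = ka * conc + kd                # observed rate constant (s^-1)
    r_eq = rmax * ka * conc / k_obs       # steady-state response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r_start, kd):
    """Response t seconds after the end of the injection (buffer flow only)."""
    return r_start * math.exp(-kd * t)


# Hypothetical parameters chosen for illustration only (assumptions).
KA = 1.0e5         # M^-1 s^-1
KD_RATE = 1.0e-3   # s^-1
RMAX = 100.0       # response units

# Concentration series used for intact 2F5 (nM), 30 s contact, 60 s dissociation.
for conc_nm in (3, 9, 27, 81, 243):
    conc = conc_nm * 1e-9
    r_end = association(30.0, conc, KA, KD_RATE, RMAX)
    r_after = dissociation(60.0, r_end, KD_RATE)
    print(f"{conc_nm:>4} nM: end of injection {r_end:.2f} RU, after 60 s {r_after:.2f} RU")

print(f"K_D = {equilibrium_kd(KA, KD_RATE):.2e} M")
```

In a global fit, a single pair of ka and kd values is fitted simultaneously to the sensorgrams of all analyte concentrations.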

13. Kinetic Analysis of Protease-Inhibitor Interactions

These assays were performed as described (Hohl M et al., 2017, J. Biol. Chem. 292, 6389-6401). A fluorogenic peptide substrate, Abz-VILDAVRA-Tyr(NO₂), was used for measuring the enzymatic activity of the subtilisin-like serine proteases tested. Recombinant inhibitors (prepared as described in Hohl M et al., 2017, J. Biol. Chem. 292, 6389-6401) were added in serial dilutions. Substrate cleavage was monitored in an Infinite M200 Pro microplate reader as an increase in relative fluorescence (λ_ex=320 nm; λ_em=420 nm). Inhibition constants (K_i) were calculated using the Morrison equation.
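By way of illustration only, the Morrison equation for tight-binding inhibitors can be sketched as follows. It expresses the fractional activity v_i/v_0 as a function of the total enzyme concentration [E], the total inhibitor concentration [I] and the apparent inhibition constant K_i(app). The function name and the example concentrations are hypothetical; the fitting routine actually used is not specified herein. The relationship between K_i(app) and K_i depends on the inhibition mechanism and on the substrate concentration, which this sketch does not model.

```python
import math


def morrison_fractional_activity(e_total, i_total, ki_app):
    """Fractional activity v_i/v_0 according to the Morrison equation.

    All three arguments must be in the same concentration unit (e.g. M).
    """
    a = e_total + i_total + ki_app
    bound = (a - math.sqrt(a * a - 4.0 * e_total * i_total)) / (2.0 * e_total)
    return 1.0 - bound


# Hypothetical example (assumed values): 10 nM enzyme, K_i(app) of 1 nM,
# inhibitor added in a two-fold serial dilution starting at 160 nM.
E_TOTAL = 10e-9
KI_APP = 1e-9
for step in range(8):
    i_total = 160e-9 / (2 ** step)
    activity = morrison_fractional_activity(E_TOTAL, i_total, KI_APP)
    print(f"[I] = {i_total * 1e9:7.2f} nM -> v_i/v_0 = {activity:.3f}")
```

When [E] is much smaller than K_i(app), the expression approaches the simple relationship v_i/v_0 = 1/(1 + [I]/K_i(app)); the full equation is needed when the enzyme concentration is comparable to or exceeds K_i(app).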

Example 1
Apoplastic Fluid Proteases Generate Characteristic Fragments of the Monoclonal Antibodies 2F5, 2G12 and PG9

It was previously shown that the monoclonal anti-HIV antibody 2F5 can be cleaved within its CDR H3 loop when incubated with apoplastic proteases in vitro (Niemer M et al. Biotechnol J 9(2014):493-500). This has now also been observed for PG9 and a 2G12 variant (I19R 2G12) with a canonical domain architecture (FIG. 1). To identify the cleavage sites, the degradation products were affinity-purified and then subjected to N-terminal sequencing analysis. In each case, the main cleavage site was located in the CDR H3 loop. For 2G12, additional cleavage sites were identified in the segment adjacent to the C-terminus of the CDR H3 loop (Table 1).

TABLE 1

Sequences of 2F5, 2G12 and PG9 heavy chains. CDR H3 loops (underlined) and their tips (italic) are highlighted. Cleavage sites by apoplastic fluid proteases are indicated by arrows.

2F5 (heavy chain) (SEQ ID No. 30):

RITLKESGPPLVKPTQTLTLTCSFSGFSLSDFGVGVGWIRQPPGKALEWLAIIYSDDD

KRYSPSLNTRLTITKDTSKNQVVLVMTRVSPVDTATYFCAHRRGPTTLF ↓ GVPIARG

PVNAMDVWGQGITVTISSTSTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVS

WNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKV

EPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEV

KFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKAFPAP

IEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

I19R 2G12 (heavy chain) (SEQ ID No. 31):

EVQLVESGGGLVKAGGSLRLSCGVSNFRISAHTMNWVRRVPGGGLEWVASISTSSTYR

DYADAVKGRFTVSRDDLEDFVYLQMHKMRVEDTAIYYCARKGSD ↓ RLSDNDPFDAWG

PG ↓ TV ↓ VTVSPASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGAL

TSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCD

KTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYV

DGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKAFPAPIEKTIS

KAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTP

PVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

PG9 (heavy chain) (SEQ ID No. 32):

QRLVESGGGVVQPGSSLRLSCAASGFDFSRQGMHWVRQAPGQGLEWVAFIKYDGSEKY

HADSVWGRLSISRDNSKDTLYLQMNSLRVEDTATYFCVREAGGPDYRNGYNYY ↓ D ↓ F

YDGYYNYHYM ↓ DVWGKGTTVIVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFP

EPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNT

KVDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVS

HEDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSN

KALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWES

NGQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSL

SLSPGK
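By way of illustration only, the arrow notation used in the table above can be converted into numbered cleavage positions as sketched below. The function name is hypothetical. A position is counted as the number of residues N-terminal to the cleaved bond within the string supplied, so positions obtained from an excerpt are relative to that excerpt and not to the full-length heavy chain. The two excerpts are copied from the 2F5 and I19R 2G12 entries above.

```python
def cleavage_sites(annotated, arrow="↓"):
    """Return (sequence, sites) for an arrow-annotated sequence.

    Each site is a tuple (position, P1 residue, P1' residue), where position
    is the number of residues preceding the cleaved bond.
    """
    residues = []
    positions = []
    for ch in annotated:
        if ch == arrow:
            positions.append(len(residues))   # bond follows the residues read so far
        elif not ch.isspace():
            residues.append(ch)               # ignore spaces and line breaks
    sequence = "".join(residues)
    sites = [(p, sequence[p - 1], sequence[p])
             for p in positions if 0 < p < len(sequence)]
    return sequence, sites


# Excerpts around the CDR H3 loops, copied from the table above.
excerpt_2f5 = "AHRRGPTTLF ↓ GVPIARG"
excerpt_2g12 = "CARKGSD ↓ RLSDNDPFDAWGPG ↓ TV ↓ VTVSP"

for name, excerpt in (("2F5", excerpt_2f5), ("I19R 2G12", excerpt_2g12)):
    _, sites = cleavage_sites(excerpt)
    for position, p1, p1_prime in sites:
        print(f"{name}: cleavage after residue {position} ({p1}|{p1_prime})")
```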

Example 2
Proteolytic Cleavage Impairs Antigen Binding by 2F5, 2G12 and PG9

Previous studies have indicated that the CDR H3 loops are important for the interactions of 2F5, 2G12 and PG9 with their cognate antigens (Zwick M B et al. J Virol 78(2004):3155-3161; Doores K J et al. J Virol 84(2010):10690-10699; McLellan J S et al. Nature 480(2011):336-343). To investigate the antigen-binding properties of cleaved 2F5, 2G12 and PG9, suitable ligands and assay conditions had to be chosen. The antigen-binding activity of 2F5 can be tested with gp41-based peptides (Zwick M B et al. J Virol 78(2004):3155-3161). The non-domain-swapped variant of 2G12 (I19R 2G12) requires precomplexing with anti-Fc antibodies for detectable binding to HIV-1 envelope glycoproteins (Doores K J et al. J Virol 84(2010):10690-10699). PG9 has been described to bind to gp120 monomers of selected HIV strains including ZM109. Therefore, HIV-1 ZM109 gp120 containing a C-terminal hexahistidine tag was expressed in mammalian cells and purified by metal-chelate affinity chromatography (Loos A et al. Proc Natl Acad Sci USA 112(2015):12675-12680). For all three antibodies, limited proteolysis within the CDR H3 loop by apoplastic proteases severely reduced binding to the respective antigen (FIG. 2a-c). For intact and cleaved 2F5, the avidity of the antigen-antibody interaction was also determined by surface plasmon resonance spectroscopy. These experiments demonstrated that cleavage of 2F5 by apoplastic proteases leads to a more than 20-fold reduction of its affinity to gp41 (FIG. 3).

Example 3
SBT1 and SBT2 Cleave 2F5, 2G12 and PG9 at the Same Sites as Apoplastic Fluid

Previous studies have indicated that serine proteases participate in the proteolysis of 2F5 in N. benthamiana (Niemer M et al. Biotechnol J 9(2014):493-500). We have therefore utilized a biotin-tagged version of an activity-based probe for serine hydrolases for the selective labelling of serine proteases present in apoplastic fluid extracted from N. benthamiana leaves. After isolation with immobilized avidin, the captured proteins were identified by mass spectrometry. Exploiting the availability of a draft genome sequence for N. benthamiana (Bombarely A et al. Mol Plant Microbe Interact 25 (2012):1523-1530), the respective cDNAs could be cloned by RT-PCR using primers based on the peptide sequences obtained by mass spectrometry and then ectopically expressed in N. benthamiana by means of agroinfiltration. Two recombinant serine proteases, SBT1 and SBT2, were purified by nickel-chelate affinity chromatography exploiting hexahistidine tags exogenously added to their C-termini and then analysed for their capacity to degrade 2F5, PG9 and the non-domain-exchanged variant of 2G12 (I19R 2G12). SBT1 was found to degrade all three antibodies. SBT2 was capable of acting on PG9 and I19R 2G12 (FIG. 4a). Two other serine proteases, trypsin and SBT3 (Cedzich A et al. J. Biol. Chem. 284 (2009):14068-14078), did not display cleavage activity (FIG. 4b) when tested on the monoclonal antibody 2F5. The cleavage products obtained upon digestion of 2F5, I19R 2G12 and PG9 with SBT1 and SBT2 were then characterized by N-terminal sequencing and mass spectrometry. These studies revealed that the combined action of SBT1 and SBT2 can account for all cleavage sites generated by incubation of the antibodies with unfractionated apoplastic fluid (Tables 2 and 3).

TABLE 2

Sequences of 2F5, 2G12 and PG9 heavy chains. CDR H3 loops (underlined) and their tips (italic) are highlighted. Cleavage sites by SBT1 are indicated by arrows.

2F5 (heavy chain) (SEQ ID No. 30):

RITLKESGPPLVKPTQTLTLTCSFSGFSLSDFGVGVGWIRQPPGKALEWLAIIYSDDD

KRYSPSLNTRLTITKDTSKNQVVLVMTRVSPVDTATYFCAHRRGPTTLF ↓ GVPIARG

PVNAMDVWGQGITVTISSTSTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVS

WNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKV

EPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEV

KFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKAFPAP

IEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPEN

NYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

I19R 2G12 (heavy chain) (SEQ ID No. 31):

EVQLVESGGGLVKAGGSLRLSCGVSNFRISAHTMNWVRRVPGGGLEWVASISTSSTYR

DYADAVKGRFTVSRDDLEDFVYLQMHKMRVEDTAIYYCARKGSDRLSDNDPFDAWGPG

↓ TV ↓ VTVSPASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTS

GVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKT

HTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDG

VEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKAFPAPIEKTISKA

KGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPV

LDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

PG9 (heavy chain) (SEQ ID No. 32):

QRLVESGGGVVQPGSSLRLSCAASGFDFSRQGMHWVRQAPGQGLEWVAFIKYDGSEKY

HADSVWGRLSISRDNSKDTLYLQMNSLRVEDTATYFCVREAGGPDYRNGYNYY ↓ DFY

DGYYNYHYM ↓ DVWGKGTTVIVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPE

PVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTK

VDKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSH

EDPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNK

ALPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESN

GQPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLS

LSPGK

TABLE 3

Sequences of 2F5, 2G12 and PG9 heavy chains. CDR H3 loops (underlined) and their tips (italic) are highlighted. Cleavage sites by SBT2 are indicated by arrows.

2F5 (heavy chain) (SEQ ID No. 30):

RITLKESGPPLVKPTQTLTLTCSFSGFSLSDFGVGVGWIRQPPGKALEWLAIIYSDDD

KRYSPSLNTRLTITKDTSKNQVVLVMTRVSPVDTATYFCAHRRGPTTLFGVPIARGPV

NAMDVWGQGITVTISSTSTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWN

SGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEP

KSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKF

NWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKAFPAPIE

KTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNY

KTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

I19R 2G12 (heavy chain) (SEQ ID No. 31):

EVQLVESGGGLVKAGGSLRLSCGVSNFRISAHTMNWVRRVPGGGLEWVASISTSSTYR

DYADAVKGRFTVSRDDLEDFVYLQMHKMRVEDTAIYYCARKGSD ↓ RLSDNDPFDAWG

PGTVVTVSPASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEPVTVSWNSGALTSG

VHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKVDKKVEPKSCDKTH

TCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHEDPEVKFNWYVDGV

EVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKAFPAPIEKTISKAK

GQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNGQPENNYKTTPPVL

DSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSLSPGK

PG9 (heavy chain) (SEQ ID No. 32):

QRLVESGGGVVQPGSSLRLSCAASGFDFSRQGMHWVRQAPGQGLEWVAFIKYDGSEKY

HADSVWGRLSISRDNSKDTLYLQMNSLRVEDTATYFCVREAGGPDYRNGYNYYD ↓ FY

DGYYNYHYMDVWGKGTTVTVSSASTKGPSVFPLAPSSKSTSGGTAALGCLVKDYFPEP

VTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTKV

DKKVEPKSCDKTHTCPPCPAPELLGGPSVFLFPPKPKDTLMISRTPEVTCVVVDVSHE

DPEVKFNWYVDGVEVHNAKTKPREEQYNSTYRVVSVLTVLHQDWLNGKEYKCKVSNKA

LPAPIEKTISKAKGQPREPQVYTLPPSRDELTKNQVSLTCLVKGFYPSDIAVEWESNG

QPENNYKTTPPVLDSDGSFFLYSKLTVDKSRWQQGNVFSCSVMHEALHNHYTQKSLSL

SPGK
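By way of illustration only, the statement that the combined action of SBT1 and SBT2 accounts for all cleavage sites observed with unfractionated apoplastic fluid can be checked by comparing the arrow positions of Tables 1 to 3 as sets. In the sketch below each site is written as the pair of residues flanking the cleaved bond (P1|P1′), transcribed from the tables; the variable names are hypothetical, and this shorthand is sufficient here only because no flanking pair occurs at more than one marked position within a given antibody.

```python
# Cleavage sites transcribed from Tables 1-3 as "P1|P1'" residue pairs.
apoplastic_fluid = {
    "2F5": {"F|G"},
    "I19R 2G12": {"D|R", "G|T", "V|V"},
    "PG9": {"Y|D", "D|F", "M|D"},
}
sbt1 = {
    "2F5": {"F|G"},
    "I19R 2G12": {"G|T", "V|V"},
    "PG9": {"Y|D", "M|D"},
}
sbt2 = {
    "2F5": set(),            # no arrow is shown for 2F5 in Table 3
    "I19R 2G12": {"D|R"},
    "PG9": {"D|F"},
}


def combined_sites(antibody):
    """Union of the SBT1 and SBT2 cleavage sites for one antibody."""
    return sbt1[antibody] | sbt2[antibody]


for antibody, observed in apoplastic_fluid.items():
    explained = combined_sites(antibody) == observed
    print(f"{antibody}: all apoplastic fluid sites explained by SBT1 + SBT2: {explained}")
```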

Example 4
SBT1 and SBT2 Can Be Effectively Inhibited By Plant Inhibitors of Subtilisin-Like Serine Proteases

In a recent study (Hohl M et al., 2017, J. Biol. Chem. 292, 6389-6401), A. thaliana subtilisin propeptide-like inhibitor 1 (SPI-1) and three close relatives from A. thaliana (SPI-2) and S. lycopersicum (Sl06g065370, Sl11g018590) were described. Recombinant versions of these four inhibitors were now produced in E. coli following published procedures (Hohl M et al., 2017, J. Biol. Chem. 292, 6389-6401) and then tested for their potential to inhibit SBT1 (SEQ ID No. 2) and SBT2 (SEQ ID No. 14). All four inhibitors (At1, A. thaliana SPI-1; At2, A. thaliana SPI-2; Sl1, Sl06g065370; Sl2, Sl11g018590) were found to bind to SBT1 and SBT2 with high affinities, resulting in effective inhibition of the proteases (FIG. 5a-h).

Conclusion

The examples provided herein aimed at the identification of the proteases affecting the integrity and antigen-binding capacity of the three HIV-specific mAbs 2F5, 2G12 and PG9. Utilizing the activity-based probe FP-biotin, two subtilisin-like serine proteases could be isolated from apoplastic fluid of Nicotiana benthamiana C105 plants and identified by mass spectrometric analysis of their tryptic peptides. The coding sequences of these two proteases were cloned and then used for the expression of recombinant versions of the enzymes, which could be purified by affinity chromatography exploiting their C-terminal tags. Incubation of the mAbs 2F5, 2G12 and PG9 with the purified proteases led to cleavage of the antibodies at exactly the same positions as observed upon treatment with apoplastic fluid. These results indicate that the two identified subtilisin-like serine proteases are responsible for the fragmentation and inactivation of the mAbs 2F5, 2G12 and PG9 when expressed in plants. The two identified proteases are effectively inhibited by subtilisin propeptide-like inhibitors, which can be exploited for their targeted inactivation in plants.
