GENETICALLY MODIFIED MICROORGANISMS THAT CARRY OUT THE HETEROLOGOUS PRODUCTION OF MODIFIED VERSIONS OF THE SURFACTANT PROTEIN LV-RANASPUMIN-1 (LV-RSN-1), THE MODIFIED VERSIONS OF SAID SURFACTANT PROTEIN, THE SYNTHETIC GENES ENCODING SAID SURFACTANT PROTEIN, THE EXPRESSION CASSETTES CONTAINING SAID SYNTHETIC GENES, AND THE EXPRESSION VECTORS CONTAINING SAID SYNTHETIC GENES

Information

  • Patent Application
  • Publication Number
    20230029208
  • Date Filed
    December 14, 2020
  • Date Published
    January 26, 2023
Abstract
The present invention relates to the heterologous production in microorganisms of modified versions of a predicted isoform of the surfactant protein Lv-ranaspumin-1 (Lv-Rsn-1), whose sequence was inferred from analyses of the protein extract of the nest foam of the Northeastern Pepper Frog (Leptodactylus vastus). More specifically, it relates to two surfactant proteins that are modified versions of the predicted isoform of Lv-Rsn-1; to two synthetic genes, each encoding one of these modified versions; to two expression cassettes, each containing one of the synthetic genes; to two expression vectors, each containing one of the synthetic genes; and to two transgenic microorganisms, a bacterium and a yeast, each transformed with one of these synthetic genes and heterologously producing one of the modified versions of the predicted isoform of Lv-Rsn-1. Lv-Rsn-1 has surfactant, emulsifying and dispersant properties, among others, and its heterologous production allows it to be used in various industrial applications and products without the need to extract it from frog nest foam.
Description
LISTING OF BIOLOGICAL SEQUENCES







<110> Petróleo Brasileiro S.A. - Petrobras


<120> GENETICALLY MODIFIED MICROORGANISMS THAT CARRY OUT THE HETEROLOGOUS PRODUCTION OF MODIFIED VERSIONS OF THE SURFACTANT PROTEIN LV-RANASPUMIN-1 (LV-RSN-1), THE MODIFIED VERSIONS OF THIS SURFACTANT PROTEIN, THE SYNTHETIC GENES THAT ENCODE THIS SURFACTANT PROTEIN, THE EXPRESSION CASSETTES CONTAINING THESE SYNTHETIC GENES, AND THE EXPRESSION VECTORS CONTAINING THESE SYNTHETIC GENES


<160>10


<210>1


<211>216


<212> PRT


<213> Leptodactylus vastus


<223> predicted sequence for one of the isoforms of Lv-ranaspumin-1


<400>1


Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro Gly Thr


5 10 15


Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu Leu Val Glu


20 25 30


Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp Leu Gln Glu


35 40 45


Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr Leu Glu Arg


50 55 60


Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly Leu Glu Glu


65 70 75 80


Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile Gln Gly Leu


85 90 95


Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val Phe Asn Ala


100 105 110


Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys Leu Pro Lys


115 120 125


Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys Tyr Lys Val


130 135 140


Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp Leu Lys Ile


145 150 155 160


Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly Leu Asp Asp


165 170 175


Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser Phe Glu Gly


180 185 190


Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu Cys Pro Asn


195 200 205


Val Lys Asn Ile Asn Val Ser Ser


210 215


<210>2


<211>648


<212> DNA


<213> Artificial Sequence


<220>


<221> CDS


<222> (1) . . . (648)


<223> encoding sequence of one of the isoforms of Lv-ranaspumin-1 after reverse translation of the predicted amino acid sequence


<400>2


ctg ctg gaa ggc ttt ctg gtg ggc ggc ggc gtg ccg ggc ccg ggc acc 48


Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro Gly Thr


5 10 15


gcg tgc ctg acc aaa gcg ctg aaa gat agc ggc gat ctg ctg gtg gaa


96


Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu Leu Val Glu


20 25 30


ctg gcg gtg att att tgc gcg tat cag aac ggc aaa gat ctg cag gaa


144


Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp Leu Gln Glu


35 40 45


cag gat ttt aaa gaa ctg aaa gaa ctg ctg gaa cgc acc ctg gaa cgc


192


Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr Leu Glu Arg


50 55 60


gcg ggc tgc gcg ctg gat gat att gtg gcg gat ctg ggc ctg gaa gaa


240


Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly Leu Glu Glu


65 70 75 80


ctg ctg ggc agc att ggc gtg agc acc ggc gat att att cag ggc ctg 288


Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile Gln Gly Leu


85 90 95


tat aaa ctg ctg aaa gaa ctg aaa att gat gaa acc gtg ttt aac gcg


336


Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val Phe Asn Ala


100 105 110


gtg tgc gat gtg acc aaa aaa atg ctg gat aac aaa tgc ctg ccg aaa


384


Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys Leu Pro Lys


115 120 125


att ctg cag ggc gat ctg gtg aaa ttt ctg gat ctg aaa tat aaa gtg


432


Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys Tyr Lys Val


130 135 140


tgc att gaa ggc ggc gat ccg gaa ctg att att aaa gat ctg aaa att


480


Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp Leu Lys Ile


145 150 155 160


att ctg gaa cgc ctg ccg tgc gtg ctg ggc ggc gtg ggc ctg gat gat 528


Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly Leu Asp Asp


165 170 175


ctg ttt aaa aac att ttt gtg aaa gat ggc att ctg agc ttt gaa ggc


576


Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser Phe Glu Gly


180 185 190


att gcg aaa ccg ctg ggc gat ctg ctg att ctg gtg ctg tgc ccg aac 624


Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu Cys Pro Asn


195 200 205


gtg aaa aac att aac gtg agc agc


648


Val Lys Asn Ile Asn Val Ser Ser


210 215
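

The <223> note for SEQ ID NO:2 states that it was obtained by reverse translation of the predicted amino acid sequence (SEQ ID NO:1). That relationship can be spot-checked by translating the coding sequence back into residues. The following is a minimal sketch in Python, assuming the standard genetic code and using only the first 48 nucleotides of SEQ ID NO:2 as printed above; the function name `translate` and the deliberately partial codon table are illustrative and are not part of the listing.

```python
# Partial standard genetic code: only the codons that occur in the fragment below.
# (Assumption: standard code; extend the table to translate the full sequence.)
CODON_TO_AA = {
    "ctg": "Leu", "gaa": "Glu", "ggc": "Gly", "ttt": "Phe",
    "gtg": "Val", "ccg": "Pro", "acc": "Thr",
}

def translate(dna):
    """Translate a DNA string codon by codon into three-letter residue names."""
    dna = dna.replace(" ", "").lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return [CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3)]

# First 48 nucleotides of SEQ ID NO:2 and first 16 residues of SEQ ID NO:1.
seq2_start = "ctg ctg gaa ggc ttt ctg gtg ggc ggc ggc gtg ccg ggc ccg ggc acc"
seq1_start = "Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro Gly Thr".split()

# True when the translated fragment reproduces the protein fragment.
matches = translate(seq2_start) == seq1_start
print(matches)
```

The same comparison could be run against the codon-optimized sequences, which use different codons for the same residues, once the codon table is extended accordingly.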


<210>3


<211>651


<212> DNA


<213> Artificial Sequence


<220>


<221> CDS


<222> (1) . . . (651)


<223> codon frequency optimization of SEQ ID NO:2 for expression in bacteria and addition of the ATG start codon


<400>3


atg ctg ctg gaa ggt ttt ctg gtt ggg ggc ggt gtt ccg ggt cca ggc


48


Met Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro Gly


5 10 15


acg gcc tgc ttg acg aag gct ctg aaa gat agc ggt gac ctg ctg gtg


96


Thr Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu Leu Val


20 25 30


gag tta gcg gtt att att tgt gca tac cag aat ggc aaa gac ctt cag


144


Glu Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp Leu Gln


35 40 45


gag cag gac ttc aaa gaa ctg aag gaa ttg ctg gaa cgt aca ttg gaa


192


Glu Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr Leu Glu


50 55 60


cgt gcc ggt tgt gcc ctc gat gat att gtg gcc gat tta ggt ctg gaa


240


Arg Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly Leu Glu


65 70 75 80


gaa ctg ctg ggc tcc atc ggc gtt agt acc ggc gat att atc cag ggt


288


Glu Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile Gln Gly


85 90 95


ctg tat aaa ctg ttg aag gag tta aaa atc gac gag acc gtc ttt aat


336


Leu Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val Phe Asn


100 105 110


gcg gtc tgc gat gtg acc aaa aaa atg ctg gat aac aag tgc tta ccg


384


Ala Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys Leu Pro


115 120 125


aaa att ctg caa gga gat ctg gta aag ttc ctt gat ctg aag tat aaa


432


Lys Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys Tyr Lys


130 135 140


gtt tgt att gaa ggt ggc gat cca gaa ctg att att aag gat ctg aaa


480


Val Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp Leu Lys


145 150 155 160


atc atc ctg gaa cgg ctt ccg tgt gtg ttg ggt gga gtc ggt ttg gat


528


Ile Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly Leu Asp


165 170 175


gat ctc ttt aag aac att ttt gtt aag gat ggg att ctg tcc ttc gaa


576


Asp Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser Phe Glu


180 185 190


ggt att gcg aaa cct ctt ggt gac ctt ctc atc ctt gtc tta tgc ccg


624


Gly Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu Cys Pro


195 200 205


aac gtc aag aat atc aat gta tcc tct


651


Asn Val Lys Asn Ile Asn Val Ser Ser


210 215


<210>4


<211>697


<212> DNA


<213> Artificial Sequence


<220> CDS


<223> the SEQ ID NO:3 including the restriction site for NdeI, the encoding sequence of the polyhistidine tag, the restriction site for EcoRI, the sequence encoding the cleavage site for TEV, the restriction site for NdeI, and the restriction site for XhoI


<400>4


g aat tct gaa aac ttg tat ttc cag ggc agc cat atg atg ctg ctg


46


Asn Ser Glu Asn Leu Tyr Phe Gln Gly Ser His Met Met Leu Leu


5 10 15


gaa ggt ttt ctg gtt ggg ggc ggt gtt ccg ggt cca ggc acg gcc tgc


94


Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro Gly Thr Ala Cys


20 25 30


ttg acg aag gct ctg aaa gat agc ggt gac ctg ctg gtg gag tta gcg


142


Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu Leu Val Glu Leu Ala


35 40 45


gtt att att tgt gca tac cag aat ggc aaa gac ctt cag gag cag gac


190


Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp Leu Gln Glu Gln Asp


50 55 60


ttc aaa gaa ctg aag gaa ttg ctg gaa cgt aca ttg gaa cgt gcc ggt


238


Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr Leu Glu Arg Ala Gly


65 70 75


tgt gcc ctc gat gat att gtg gcc gat tta ggt ctg gaa gaa ctg ctg


286


Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly Leu Glu Glu Leu Leu


80 85 90 95


ggc tcc atc ggc gtt agt acc ggc gat att atc cag ggt ctg tat aaa


334


Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile Gln Gly Leu Tyr Lys


100 105 110


ctg ttg aag gag tta aaa atc gac gag acc gtc ttt aat gcg gtc tgc


382


Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val Phe Asn Ala Val Cys


115 120 125


gat gtg acc aaa aaa atg ctg gat aac aag tgc tta ccg aaa att ctg


430


Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys Leu Pro Lys Ile Leu


130 135 140


caa gga gat ctg gta aag ttc ctt gat ctg aag tat aaa gtt tgt att


478


Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys Tyr Lys Val Cys Ile


145 150 155


gaa ggt ggc gat cca gaa ctg att att aag gat ctg aaa atc atc ctg


526


Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp Leu Lys Ile Ile Leu


160 165 170 175


gaa cgg ctt ccg tgt gtg ttg ggt gga gtc ggt ttg gat gat ctc ttt


574


Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly Leu Asp Asp Leu Phe


180 185 190


aag aac att ttt gtt aag gat ggg att ctg tcc ttc gaa ggt att gcg


622


Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser Phe Glu Gly Ile Ala


195 200 205


aaa cct ctt ggt gac ctt ctc atc ctt gtc tta tgc ccg aac gtc aag


670


Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu Cys Pro Asn Val Lys


210 215 220


aat atc aat gta tcc tct taactcgag


697


Asn Ile Asn Val Ser Ser


225
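

The <223> note for SEQ ID NO:4 lists the restriction sites added around the coding sequence. Their positions can be located by a plain string search. The sketch below is illustrative only: it assumes the commonly cited recognition sequences GAATTC (EcoRI), CATATG (NdeI) and CTCGAG (XhoI), and it searches just the first and last lines of SEQ ID NO:4 as printed above; the helper name `find_sites` is hypothetical.

```python
# Recognition sequences of the three enzymes named in the <223> note
# (assumption: the commonly cited sites for EcoRI, NdeI and XhoI).
SITES = {"EcoRI": "gaattc", "NdeI": "catatg", "XhoI": "ctcgag"}

def find_sites(dna):
    """Return {enzyme: [0-based start positions]} for each recognition site."""
    dna = dna.replace(" ", "").lower()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            start = dna.find(site, start + 1)
        hits[enzyme] = positions
    return hits

# First and last printed lines of SEQ ID NO:4 (flanking regions only).
five_prime = "g aat tct gaa aac ttg tat ttc cag ggc agc cat atg atg ctg ctg"
three_prime = "aat atc aat gta tcc tct taactcgag"

print(find_sites(five_prime))
print(find_sites(three_prime))
```

Positions are reported relative to each fragment, not to the full 697-nucleotide sequence; passing the complete sequence to `find_sites` would give absolute coordinates.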


<210>5


<211>6395


<212> DNA


<213> Artificial Sequence


<220>


<223> pPBUFCBac-LvRsn1 expression vector resulting from the insertion of SEQ ID NO:4 into SEQ ID NO:5


<400>5


gatctcgatc ccgcgaaatt aatacgactc actatagggg aattgtgagc ggataacaat 60


tcccctctag aaataatttt gtttaaactt taagaaggag atatacatat g cat cat 117


His His


cat cat cat cac gtg aat tct gaa aac ttg tat ttc cag ggc agc cat 165


His His His His Val Asn Ser Glu Asn Leu Tyr Phe Gln Gly Ser His


5 10 15


atg atg ctg ctg gaa ggt ttt ctg gtt ggg ggc ggt gtt ccg ggt cca 213


Met Met Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro


20 25 30


ggc acg gcc tgc ttg acg aag gct ctg aaa gat agc ggt gac ctg ctg 261


Gly Thr Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu Leu


35 40 45 50


gtg gag tta gcg gtt att att tgt gca tac cag aat ggc aaa gac ctt 309


Val Glu Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp Leu


55 60 65


cag gag cag gac ttc aaa gaa ctg aag gaa ttg ctg gaa cgt aca ttg 357


Gln Glu Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr Leu


70 75 80


gaa cgt gcc ggt tgt gcc ctc gat gat att gtg gcc gat tta ggt ctg 405


Glu Arg Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly Leu


85 90 95


gaa gaa ctg ctg ggc tcc atc ggc gtt agt acc ggc gat att atc cag 453


Glu Glu Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile Gln


100 105 110


ggt ctg tat aaa ctg ttg aag gag tta aaa atc gac gag acc gtc ttt 501


Gly Leu Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val Phe


115 120 125 130


aat gcg gtc tgc gat gtg acc aaa aaa atg ctg gat aac aag tgc tta 549


Asn Ala Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys Leu


135 140 145


ccg aaa att ctg caa gga gat ctg gta aag ttc ctt gat ctg aag tat 597


Pro Lys Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys Tyr


150 155 160


aaa gtt tgt att gaa ggt ggc gat cca gaa ctg att att aag gat ctg 645


Lys Val Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp Leu


165 170 175


aaa atc atc ctg gaa cgg ctt ccg tgt gtg ttg ggt gga gtc ggt ttg 693


Lys Ile Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly Leu


180 185 190


gat gat ctc ttt aag aac att ttt gtt aag gat ggg att ctg tcc ttc 741


Asp Asp Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser Phe


195 200 205 210


gaa ggt att gcg aaa cct ctt ggt gac ctt ctc atc ctt gtc tta tgc 789


Glu Gly Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu Cys


215 220 225


ccg aac gtc aag aat atc aat gta tcc tct taactcgaga tcgatgatat 819


Pro Asn Val Lys Asn Ile Asn Val Ser Ser


230 235


tcgagcctag gtataatcgg atccggctgc taacaaagcc cgaaaggaag ctgagttggc 879


tgctgccacc gctgagcaat aactagcata accccttggg gcctctaaac gggtcttgag 939


gggttttttg ctgaaaggag gaactatatc cggatatccc gcaagaggcc cggcagtacc 999


ggcataacca agcctatgcc tacagcatcc agggtgacgg tgccgaggat gacgatgagc 1059


gcattgttag atttcataca cggtgcctga ctgcgttagc aatttaactg tgataaacta 1119


ccgcattaaa gctagcttat cgatgataag ctgtcaaaca tgagaattaa ttcttgaaga 1179


cgaaagggcc tcgtgatacg cctattttta taggttaatg tcatgataat aatggtttct 1239


tagacgtcag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc 1299


taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa 1359


tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt 1419


gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct 1479


gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc 1539


cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta 1599


tgtggcgcgg tattatcccg tgttgacgcc gggcaagagc aactcggtcg ccgcatacac 1659


tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct tacggatggc 1719


atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac tgcggccaac 1779


ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg 1839


gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac 1899


gagcgtgaca ccacgatgcc tgcagcaatg gcaacaacgt tgcgcaaact attaactggc 1959


gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt 2019


gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga 2079


gccggtgagc gtgggtctcg cggtatcatt gcagcactgg ggccagatgg taagccctcc 2139


cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag 2199


atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca 2259


tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc 2319


ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca 2379


gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc 2439


tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta 2499


ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt 2559


ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc 2619


gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg 2679


ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg 2739


tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag 2799


ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc 2859


agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat 2919


agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg 2979


gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc 3039


tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt 3099


accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca 3159


gtgagcgagg aagcggaaga gcgcctgatg cggtattttc tccttacgca tctgtgcggt 3219


atttcacacc gcaatggtgc actctcagta caatctgctc tgatgccgca tagttaagcc 3279


agtatacact ccgctatcgc tacgtgactg ggtcatggct gcgccccgac acccgccaac 3339


acccgctgac gcgccctgac gggcttgtct gctcccggca tccgcttaca gacaagctgt 3399


gaccgtctcc gggagctgca tgtgtcagag gttttcaccg tcatcaccga aacgcgcgag 3459


gcagctgcgg taaagctcat cagcgtggtc gtgaagcgat tcacagatgt ctgcctgttc 3519


atccgcgtcc agctcgttga gtttctccag aagcgttaat gtctggcttc tgataaagcg 3579


ggccatgtta agggcggttt tttcctgttt ggtcactgat gcctccgtgt aagggggatt 3639


tctgttcatg ggggtaatga taccgatgaa acgagagagg atgctcacga tacgggttac 3699


tgatgatgaa catgcccggt tactggaacg ttgtgagggt aaacaactgg cggtatggat 3759


gcggcgggac cagagaaaaa tcactcaggg tcaatgccag cgcttcgtta atacagatgt 3819


aggtgttcca cagggtagcc agcagcatcc tgcgatgcag atccggaaca taatggtgca 3879


gggcgctgac ttccgcgttt ccagacttta cgaaacacgg aaaccgaaga ccattcatgt 3939


tgttgctcag gtcgcagacg ttttgcagca gcagtcgctt cacgttcgct cgcgtatcgg 3999


tgattcattc tgctaaccag taaggcaacc ccgccagcct agccgggtcc tcaacgacag 4059


gagcacgatc atgcgcaccc gtggccagga cccaacgctg cccgagatgc gccgcgtgcg 4119


gctgctggag atggcggacg cgatggatat gttctgccaa gggttggttt gcgcattcac 4179


agttctccgc aagaattgat tggctccaat tcttggagtg gtgaatccgt tagcgaggtg 4239


ccgccggctt ccattcaggt cgaggtggcc cggctccatg caccgcgacg caacgcgggg 4299


aggcagacaa ggtatagggc ggcgcctaca atccatgcca acccgttcca tgtgctcgcc 4359


gaggcggcat aaatcgccgt gacgatcagc ggtccaatga tcgaagttag gctggtaaga 4419


gccgcgagcg atccttgaag ctgtccctga tggtcgtcat ctacctgcct ggacagcatg 4479


gcctgcaacg cgggcatccc gatgccgccg gaagcgagaa gaatcataat ggggaaggcc 4539


atccagcctc gcgtcgcgaa cgccagcaag acgtagccca gcgcgtcggc cgccatgccg 4599


gcgataatgg cctgcttctc gccgaaacgt ttggtggcgg gaccagtgac gaaggcttga 4659


gcgagggcgt gcaagattcc gaataccgca agcgacaggc cgatcatcgt cgcgctccag 4719


cgaaagcggt cctcgccgaa aatgacccag agcgctgccg gcacctgtcc tacgagttgc 4779


atgataaaga agacagtcat aagtgcggcg acgatagtca tgccccgcgc ccaccggaag 4839


gagctgactg ggttgaaggc tctcaagggc atcggtcgag atcccggtgc ctaatgagtg 4899


agctaactta cattaattgc gttgcgctca ctgcccgctt tccagtcggg aaacctgtcg 4959


tgccagctgc attaatgaat cggccaacgc gcggggagag gcggtttgcg tattgggcgc 5019


cagggtggtt tttcttttca ccagtgagac gggcaacagc tgattgccct tcaccgcctg 5079


gccctgagag agttgcagca agcggtccac gctggtttgc cccagcaggc gaaaatcctg 5139


tttgatggtg gttaacggcg ggatataaca tgagctgtct tcggtatcgt cgtatcccac 5199


taccgagata tccgcaccaa cgcgcagccc ggactcggta atggcgcgca ttgcgcccag 5259


cgccatctga tcgttggcaa ccagcatcgc agtgggaacg atgccctcat tcagcatttg 5319


catggtttgt tgaaaccgga catggcactc cagtcgcctt cccgttccgc tatcggctga 5379


atttgattgc gagtgagata tttatgccag ccagccagac gcagacgcgc cgagacagaa 5439


cttaatgggc ccgctaacag cgcgatttgc tggtgaccca atgcgaccag atgctccacg 5499


cccagtcgcg taccgtcttc atgggagaaa ataatactgt tgatgggtgt ctggtcagag 5559


acatcaagaa ataacgccgg aacattagtg caggcagctt ccacagcaat ggcatcctgg 5619


tcatccagcg gatagttaat gatcagccca ctgacgcgtt gcgcgagaag attgtgcacc 5679


gccgctttac aggcttcgac gccgcttcgt tctaccatcg acaccaccac gctggcaccc 5739


agttgatcgg cgcgagattt aatcgccgcg acaatttgcg acggcgcgtg cagggccaga 5799


ctggaggtgg caacgccaat cagcaacgac tgtttgcccg ccagttgttg tgccacgcgg 5859


ttgggaatgt aattcagctc cgccatcgcc gcttccactt tttcccgcgt tttcgcagaa 5919


acgtggctgg cctggttcac cacgcgggaa acggtctgat aagagacacc ggcatactct 5979


gcgacatcgt ataacgttac tggtttcaca ttcaccaccc tgaattgact ctcttccggg 6039


cgctatcatg ccataccgcg aaaggttttg cgccattcga tggtgtccgg gatctcgacg 6099


ctctccctta tgcgactcct gcattaggaa gcagcccagt agtaggttga ggccgttgag 6159


caccgccgcc gcaaggaatg gtgcatgcaa ggagatggcg cccaacagtc ccccggccac 6219


ggggcctgcc accataccca cgccgaaaca agcgctcatg agcccgaagt ggcgagcccg 6279


atcttcccca tcggtgatgt cggcgatata ggcgccagca accgcacctg tggcgccggt 6339


gatgccggcc acgatgcgtc cggcgtagag gatcga 6375


<210>6


<211>236


<212> PRT


<213> Artificial Sequence


<220>


<223> amino acid sequence of the modified version of the Lv-Rsn-1 surfactant protein encoded by the nucleotide sequence SEQ ID NO:4, which comprises the SEQ ID NO:3


<400>6


His His His His His His Val Asn Ser Glu Asn Leu Tyr Phe Gln Gly


5 10 15


Ser His Met Met Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro


20 25 30


Gly Pro Gly Thr Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp


35 40 45


Leu Leu Val Glu Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys


50 55 60


Asp Leu Gln Glu Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg


65 70 75 80


Thr Leu Glu Arg Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu


85 90 95


Gly Leu Glu Glu Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile


100 105 110


Ile Gln Gly Leu Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr


115 120 125


Val Phe Asn Ala Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys


130 135 140


Cys Leu Pro Lys Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu


145 150 155 160


Lys Tyr Lys Val Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys


165 170 175


Asp Leu Lys Ile Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val


180 185 190


Gly Leu Asp Asp Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu


195 200 205


Ser Phe Glu Gly Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val


210 215 220


Leu Cys Pro Asn Val Lys Asn Ile Asn Val Ser Ser


225 230 235


<210>7


<211>648


<212> DNA


<213> Artificial Sequence


<220>


<221> CDS


<222> (1) . . . (648)


<223> codon frequency optimization of the SEQ ID NO:2 for expression in yeasts


<400>7


ttg ttg gaa gga ttt ttg gtc gga ggt ggt gtc cct ggt cct ggt aca


48


Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly Pro Gly Thr


5 10 15


gca tgt ttg act aag gca ttg aaa gac agt gga gac ttg ttg gtt gag


96


Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu Leu Val Glu


20 25 30


ttg gct gtt att att tgt gct tac caa aac ggt aaa gat ttg caa gag


144


Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp Leu Gln Glu


35 40 45


caa gat ttc aag gaa ttg aag gag ttg ttg gaa aga act ttg gaa aga


192


Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr Leu Glu Arg


50 55 60


gct ggt tgt gct ttg gat gat att gtt gct gat ttg ggt ttg gaa gag


240


Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly Leu Glu Glu


65 70 75 80


ttg ttg ggt tct att ggt gtt tct act gga gat atc atc caa ggt ttg


288


Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile Gln Gly Leu


85 90 95


tac aag ttg ttg aag gag ttg aag atc gat gaa act gtt ttt aac gct


336


Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val Phe Asn Ala


100 105 110


gtt tgt gat gtt act aag aaa atg ttg gat aac aag tgt ttg cca aag


384


Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys Leu Pro Lys


115 120 125


atc ttg caa gga gat ttg gtt aag ttc ttg gat ttg aag tac aag gtt


432


Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys Tyr Lys Val


130 135 140


tgt atc gaa ggt gga gat cca gaa ttg att att aag gat ttg aag atc


480


Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp Leu Lys Ile


145 150 155 160


atc ttg gag aga ttg cct tgt gtt ttg ggt ggt gtt ggt ttg gat gat


528


Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly Leu Asp Asp


165 170 175


ttg ttt aaa aac atc ttc gtt aag gat ggt att ttg tct ttc gaa ggt


576


Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser Phe Glu Gly


180 185 190


att gct aag cct ttg gga gat ttg ttg att ttg gtt ttg tgt cct aat


624


Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu Cys Pro Asn


195 200 205


gtc aag aat atc aat gtt tca tca


648


Val Lys Asn Ile Asn Val Ser Ser


210 215


<210>8


<211>685


<212> DNA


<213> Artificial Sequence


<220>


<221> CDS


<222> (1) . . . (648)


<223> the SEQ ID NO:8 after addition of the restriction site for the PstI endonuclease, of two nucleotides to place the encoding sequence in the same frame of translation as the secretion factor alpha, and of the restriction site for the endonuclease NotI


<400>8


ct gca gga ttg ttg gaa gga ttt ttg gtc gga ggt ggt gtc cct ggt


47


Ala Gly Leu Leu Glu Gly Phe Leu Val Gly Gly Gly Val Pro Gly


5 10 15


cct ggt aca gca tgt ttg act aag gca ttg aaa gac agt gga gac ttg


95


Pro Gly Thr Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser Gly Asp Leu


20 25 30


ttg gtt gag ttg gct gtt att att tgt gct tac caa aac ggt aaa gat


143


Leu Val Glu Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn Gly Lys Asp


35 40 45


ttg caa gag caa gat ttc aag gaa ttg aag gag ttg ttg gaa aga act


191


Leu Gln Glu Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu Glu Arg Thr


50 55 60


ttg gaa aga gct ggt tgt gct ttg gat gat att gtt gct gat ttg ggt


239


Leu Glu Arg Ala Gly Cys Ala Leu Asp Asp Ile Val Ala Asp Leu Gly


65 70 75


ttg gaa gag ttg ttg ggt tct att ggt gtt tct act gga gat atc atc


287


Leu Glu Glu Leu Leu Gly Ser Ile Gly Val Ser Thr Gly Asp Ile Ile


80 85 90 95


caa ggt ttg tac aag ttg ttg aag gag ttg aag atc gat gaa act gtt


335


Gln Gly Leu Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp Glu Thr Val


100 105 110


ttt aac gct gtt tgt gat gtt act aag aaa atg ttg gat aac aag tgt


383


Phe Asn Ala Val Cys Asp Val Thr Lys Lys Met Leu Asp Asn Lys Cys


115 120 125


ttg cca aag atc ttg caa gga gat ttg gtt aag ttc ttg gat ttg aag


431


Leu Pro Lys Ile Leu Gln Gly Asp Leu Val Lys Phe Leu Asp Leu Lys


130 135 140


tac aag gtt tgt atc gaa ggt gga gat cca gaa ttg att att aag gat


479


Tyr Lys Val Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile Ile Lys Asp


145 150 155


ttg aag atc atc ttg gag aga ttg cct tgt gtt ttg ggt ggt gtt ggt


527


Leu Lys Ile Ile Leu Glu Arg Leu Pro Cys Val Leu Gly Gly Val Gly


160 165 170 175


ttg gat gat ttg ttt aaa aac atc ttc gtt aag gat ggt att ttg tct


575


Leu Asp Asp Leu Phe Lys Asn Ile Phe Val Lys Asp Gly Ile Leu Ser


180 185 190


ttc gaa ggt att gct aag cct ttg gga gat ttg ttg att ttg gtt ttg


623


Phe Glu Gly Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile Leu Val Leu


195 200 205


tgt cct aat gtc aag aat atc aat gtt tca tca gag aac ctt tac ttt


671


Cys Pro Asn Val Lys Asn Ile Asn Val Ser Ser Glu Asn Leu Tyr Phe


210 215 220


cag gga gcg gcc gc


685


Gln Gly Ala Ala


225


<210>9


<211>4219


<212> DNA


<213> Artificial Sequence


<220>


<223> pPBUFCYea-LvRsn1 expression vector resulting from the insertion of the SEQ ID NO:9 into the SEQ ID NO:10


<400>9


agatctaaca tccaaagacg aaaggttgaa tgaaaccttt ttgccatccg acatccacag 60


gtccattctc acacataagt gccaaacgca acaggagggg atacactagc agcagaccgt 120


tgcaaacgca ggacctccac tcctcttctc ctcaacaccc acttttgcca tcgaaaaacc 180


agcccagtta ttgggcttga ttggagctcg ctcattccaa ttccttctat taggctacta 240


acaccatgac tttattagcc tgtctatcct ggcccccctg gcgaggttca tgtttgttta 300


tttccgaatg caacaagctc cgcattacac ccgaacatca ctccagatga gggctttctg 360


agtgtggggt caaatagttt catgttcccc aaatggccca aaactgacag tttaaacgct 420


gtcttggaac ctaatatgac aaaagcgtga tctcatccaa gatgaactaa gtttggttcg 480


ttgaaatgct aacggccagt tggtcaaaaa gaaacttcca aaagtcggca taccgtttgt 540


cttgtttggt attgattgac gaatgctcaa aaataatctc attaatgctt agcgcagtct 600


ctctatcgct tctgaacccc ggtgcacctg tgccgaaacg caaatgggga aacacccgct 660


ttttggatga ttatgcattg tctccacatt gtatgcttcc aagattctgg tgggaatact 720


gctgatagcc taacgttcat gatcaaaatt taactgttct aacccctact tgacagcaat 780


atataaacag aaggaagctg ccctgtctta aacctttttt tttatcatca ttattagctt 840


actttcataa ttgcgactgg ttccaattga caagcttttg attttaacga cttttaacga 900


caacttgaga agatcaaaaa acaactaatt attcgaaacg atg aga ttt cct tca 955


Met Arg Phe Pro Ser


1 5


att ttt act gct gtt tta ttc gca gca tcc tcc gca tta gct gct cca 1003


Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser Ala Leu Ala Ala Pro


10 15 20


gtc aac act aca aca gaa gat gaa acg gca caa att ccg gct gaa gct 1051


Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln Ile Pro Ala Glu Ala


25 30 35


gtc atc ggt tac tca gat tta gaa ggg gat ttc gat gtt gct gtt ttg 1099


Val Ile Gly Tyr Ser Asp Leu Glu Gly Asp Phe Asp Val Ala Val Leu


40 45 50


cca ttt tcc aac agc aca aat aac ggg tta ttg ttt ata aat act act 1147


Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu Phe Ile Asn Thr Thr


55 60 65


att gcc agc att gct gct aaa gaa gaa ggg gta tct ctc gag aaa aga 1195


Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val Ser Leu Glu Lys Arg


70 75 80 85


gag gct gaa gct gca gga ttg ttg gaa gga ttt ttg gtc gga ggt ggt 1243


Glu Ala Glu Ala Ala Gly Leu Leu Glu Gly Phe Leu Val Gly Gly Gly


90 95 100


gtc cct ggt cct ggt aca gca tgt ttg act aag gca ttg aaa gac agt 1291


Val Pro Gly Pro Gly Thr Ala Cys Leu Thr Lys Ala Leu Lys Asp Ser


105 110 115


gga gac ttg ttg gtt gag ttg gct gtt att att tgt gct tac caa aac 1339


Gly Asp Leu Leu Val Glu Leu Ala Val Ile Ile Cys Ala Tyr Gln Asn


120 125 130


ggt aaa gat ttg caa gag caa gat ttc aag gaa ttg aag gag ttg ttg 1387


Gly Lys Asp Leu Gln Glu Gln Asp Phe Lys Glu Leu Lys Glu Leu Leu


135 140 145


gaa aga act ttg gaa aga gct ggt tgt gct ttg gat gat att gtt gct 1435


Glu Arg Thr Leu Glu Arg Ala Gly Cys Ala Leu Asp Asp Ile Val Ala


150 155 160 165


gat ttg ggt ttg gaa gag ttg ttg ggt tct att ggt gtt tct act gga 1483


Asp Leu Gly Leu Glu Glu Leu Leu Gly Ser Ile Gly Val Ser Thr Gly


170 175 180


gat atc atc caa ggt ttg tac aag ttg ttg aag gag ttg aag atc gat 1531


Asp Ile Ile Gln Gly Leu Tyr Lys Leu Leu Lys Glu Leu Lys Ile Asp


185 190 195


gaa act gtt ttt aac gct gtt tgt gat gtt act aag aaa atg ttg gat 1579


Glu Thr Val Phe Asn Ala Val Cys Asp Val Thr Lys Lys Met Leu Asp


200 205 210


aac aag tgt ttg cca aag atc ttg caa gga gat ttg gtt aag ttc ttg 1627


Asn Lys Cys Leu Pro Lys Ile Leu Gln Gly Asp Leu Val Lys Phe Leu


215 220 225


gat ttg aag tac aag gtt tgt atc gaa ggt gga gat cca gaa ttg att 1675


Asp Leu Lys Tyr Lys Val Cys Ile Glu Gly Gly Asp Pro Glu Leu Ile


230 235 240 245


att aag gat ttg aag atc atc ttg gag aga ttg cct tgt gtt ttg ggt 1723


Ile Lys Asp Leu Lys Ile Ile Leu Glu Arg Leu Pro Cys Val Leu Gly


250 255 260


ggt gtt ggt ttg gat gat ttg ttt aaa aac atc ttc gtt aag gat ggt 1771


Gly Val Gly Leu Asp Asp Leu Phe Lys Asn Ile Phe Val Lys Asp Gly


265 270 275


att ttg tct ttc gaa ggt att gct aag cct ttg gga gat ttg ttg att 1819


Ile Leu Ser Phe Glu Gly Ile Ala Lys Pro Leu Gly Asp Leu Leu Ile


280 285 290


ttg gtt ttg tgt cct aat gtc aag aat atc aat gtt tca tca gag aac 1867


Leu Val Leu Cys Pro Asn Val Lys Asn Ile Asn Val Ser Ser Glu Asn


295 300 305


ctt tac ttt cag gga gcg gcc gcc agc ttt cta gaa caa aaa ctc atc 1915


Leu Tyr Phe Gln Gly Ala Ala Ala Ser Phe Leu Glu Gln Lys Leu Ile


310 315 320 325


tca gaa gag gat ctg aat agc gcc gtc gac cat cat cat cat cat cat 1963


Ser Glu Glu Asp Leu Asn Ser Ala Val Asp His His His His His His


330 335 340


tgagtttgta gccttagaca tgactgttcc tcagttcaag ttgggcactt acgagaagac 2023


cggtcttgct agattctaat caagaggatg tcagaatgcc atttgcctga gagatgcagg 2083


cttcattttt gatacttttt tatttgtaac ctatatagta taggattttt tttgtcattt 2143


tgtttcttct cgtacgagct tgctcctgat cagcctatct cgcagctgat gaatatcttg 2203


tggtaggggt ttgggaaaat cattcgagtt tgatgttttt cttggtattt cccactcctc 2263


ttcagagtac agaagattaa gtgagacctt cgtttgtgcg gatcccccac acaccatagc 2323


ttcaaaatgt ttctactcct tttttactct tccagatttt ctcggactcc gcgcatcgcc 2383


gtaccacttc aaaacaccca agcacagcat actaaatttt ccctctttct tcctctaggg 2443


tgtcgttaat tacccgtact aaaggtttgg aaaagaaaaa agagaccgcc tcgtttcttt 2503


ttcttcgtcg aaaaaggcaa taaaaatttt tatcacgttt ctttttcttg aaattttttt 2563


ttttagtttt tttctctttc agtgacctcc attgatattt aagttaataa acggtcttca 2623


atttctcaag tttcagtttc atttttcttg ttctattaca acttttttta cttcttgttc 2683


attagaaaga aagcatagca atctaatcta aggggcggtg ttgacaatta atcatcggca 2743


tagtatatcg gcatagtata atacgacaag gtgaggaact aaaccatggc caagttgacc 2803


agtgccgttc cggtgctcac cgcgcgcgac gtcgccggag cggtcgagtt ctggaccgac 2863


cggctcgggt tctcccggga cttcgtggag gacgacttcg ccggtgtggt ccgggacgac 2923


gtgaccctgt tcatcagcgc ggtccaggac caggtggtgc cggacaacac cctggcctgg 2983


gtgtgggtgc gcggcctgga cgagctgtac gccgagtggt cggaggtcgt gtccacgaac 3043


ttccgggacg cctccgggcc ggccatgacc gagatcggcg agcagccgtg ggggcgggag 3103


ttcgccctgc gcgacccggc cggcaactgc gtgcacttcg tggccgagga gcaggactga 3163


cacgtccgac ggcggcccac gggtcccagg cctcggagat ccgtccccct tttcctttgt 3223


cgatatcatg taattagtta tgtcacgctt acattcacgc cctcccccca catccgctct 3283


aaccgaaaag gaaggagtta gacaacctga agtctaggtc cctatttatt tttttatagt 3343


tatgttagta ttaagaacgt tatttatatt tcaaattttt cttttttttc tgtacagacg 3403


cgtgtacgca tgtaacatta tactgaaaac cttgcttgag aaggttttgg gacgctcgaa 3463


ggctttaatt tgcaagctgg agaccaacat gtgagcaaaa ggccagcaaa aggccaggaa 3523


ccgtaaaaag gccgcgttgc tggcgttttt ccataggctc cgcccccctg acgagcatca 3583


caaaaatcga cgctcaagtc agaggtggcg aaacccgaca ggactataaa gataccaggc 3643


gtttccccct ggaagctccc tcgtgcgctc tcctgttccg accctgccgc ttaccggata 3703


cctgtccgcc tttctccctt cgggaagcgt ggcgctttct caatgctcac gctgtaggta 3763


tctcagttcg gtgtaggtcg ttcgctccaa gctgggctgt gtgcacgaac cccccgttca 3823


gcccgaccgc tgcgccttat ccggtaacta tcgtcttgag tccaacccgg taagacacga 3883


cttatcgcca ctggcagcag ccactggtaa caggattagc agagcgaggt atgtaggcgg 3943


tgctacagag ttcttgaagt ggtggcctaa ctacggctac actagaagga cagtatttgg 4003


tatctgcgct ctgctgaagc cagttacctt cggaaaaaga gttggtagct cttgatccgg 4063


caaacaaacc accgctggta gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag 4123


aaaaaaagga tctcaagaag atcctttgat cttttctacg gggtctgacg ctcagtggaa 4183


cgaaaactca cgttaaggga ttttggtcat gagatc 4219


<210> 10


<211> 341


<212> PRT


<213> Artificial Sequence


<220>


<223> Amino acid sequence of the modified version of the Lv-Rsn-1 surfactant protein encoded by the nucleotide sequence SEQ ID NO:9, which in turn is contained in SEQ ID NO:11.


<400> 10


Met Arg Phe Pro Ser Ile Phe Thr Ala Val Leu Phe Ala Ala Ser Ser


1 5 10 15


Ala Leu Ala Ala Pro Val Asn Thr Thr Thr Glu Asp Glu Thr Ala Gln


20 25 30


Ile Pro Ala Glu Ala Val Ile Gly Tyr Ser Asp Leu Glu Gly Asp Phe


35 40 45


Asp Val Ala Val Leu Pro Phe Ser Asn Ser Thr Asn Asn Gly Leu Leu


50 55 60


Phe Ile Asn Thr Thr Ile Ala Ser Ile Ala Ala Lys Glu Glu Gly Val


65 70 75 80


Ser Leu Glu Lys Arg Glu Ala Glu Ala Ala Gly Leu Leu Glu Gly Phe


85 90 95


Leu Val Gly Gly Gly Val Pro Gly Pro Gly Thr Ala Cys Leu Thr Lys


100 105 110


Ala Leu Lys Asp Ser Gly Asp Leu Leu Val Glu Leu Ala Val Ile Ile


115 120 125


Cys Ala Tyr Gln Asn Gly Lys Asp Leu Gln Glu Gln Asp Phe Lys Glu


130 135 140


Leu Lys Glu Leu Leu Glu Arg Thr Leu Glu Arg Ala Gly Cys Ala Leu


145 150 155 160


Asp Asp Ile Val Ala Asp Leu Gly Leu Glu Glu Leu Leu Gly Ser Ile


165 170 175


Gly Val Ser Thr Gly Asp Ile Ile Gln Gly Leu Tyr Lys Leu Leu Lys


180 185 190


Glu Leu Lys Ile Asp Glu Thr Val Phe Asn Ala Val Cys Asp Val Thr


195 200 205


Lys Lys Met Leu Asp Asn Lys Cys Leu Pro Lys Ile Leu Gln Gly Asp


210 215 220


Leu Val Lys Phe Leu Asp Leu Lys Tyr Lys Val Cys Ile Glu Gly Gly


225 230 235 240


Asp Pro Glu Leu Ile Ile Lys Asp Leu Lys Ile Ile Leu Glu Arg Leu


245 250 255


Pro Cys Val Leu Gly Gly Val Gly Leu Asp Asp Leu Phe Lys Asn Ile


260 265 270


Phe Val Lys Asp Gly Ile Leu Ser Phe Glu Gly Ile Ala Lys Pro Leu


275 280 285


Gly Asp Leu Leu Ile Leu Val Leu Cys Pro Asn Val Lys Asn Ile Asn


290 295 300


Val Ser Ser Glu Asn Leu Tyr Phe Gln Gly Ala Ala Ala Ser Phe Leu


305 310 315 320


Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn Ser Ala Val Asp His


325 330 335


His His His His His


340
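
The annotated codon rows earlier in this listing (the codons for residues 278 to 341, followed by the tga that opens the next row) can be checked against the end of SEQ ID NO:10 by translating them with the standard genetic code. The following is a minimal sketch of such a check, not part of the application: the names `translate`, `CDS_TAIL` and `EXPECTED_TAIL` are hypothetical, and the expected string is a one-letter transcription of the three-letter residues printed above.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in-frame DNA to one-letter amino acids, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks from the listing layout
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Codon rows copied from the listing (residues 278 to 341),
# plus the first three bases of the following row ("tga").
CDS_TAIL = """
att ttg tct ttc gaa ggt att gct aag cct ttg gga gat ttg ttg att
ttg gtt ttg tgt cct aat gtc aag aat atc aat gtt tca tca gag aac
ctt tac ttt cag gga gcg gcc gcc agc ttt cta gaa caa aaa ctc atc
tca gaa gag gat ctg aat agc gcc gtc gac cat cat cat cat cat cat
tga
"""

# Residues 278 to 341 of SEQ ID NO:10, transcribed to one-letter code (assumed transcription).
EXPECTED_TAIL = "ILSFEGIAKPLGDLLILVLCPNVKNINVSSENLYFQGAAASFLEQKLISEEDLNSAVDHHHHHH"

tail = translate(CDS_TAIL)
print(tail == EXPECTED_TAIL, len(tail))
```

The translated tail ends in the stretches ENLYFQG, EQKLISEEDL and six consecutive histidines. These resemble a TEV protease recognition site, a c-myc epitope and a polyhistidine tag as commonly used in expression vectors; that reading is an assumption here, since the listing itself does not label these features.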

Claims
  • 1. A POLYNUCLEOTIDE characterized in that it encodes the predicted sequence of an isoform of the surfactant protein Lv-ranaspumin-1 (Lv-Rsn-1) and consisting of SEQ ID NO:2.
  • 2. A POLYNUCLEOTIDE characterized in that it encodes the predicted sequence of an isoform of the surfactant protein Lv-ranaspumin-1 (Lv-Rsn-1) and in that it has the codon frequency optimized for expression in bacteria, consisting of SEQ ID NO:3.
  • 3. A POLYNUCLEOTIDE characterized in that it encodes the predicted sequence of an isoform of the surfactant protein Lv-ranaspumin-1 (Lv-Rsn-1) and in that it has the codon frequency optimized for expression in yeast, consisting of SEQ ID NO:7.
  • 4. A POLYPEPTIDE characterized in that it is a modified version of an isoform of the surfactant protein Lv-ranaspumin-1 (Lv-Rsn-1) and consisting of SEQ ID NO:6.
  • 5. A POLYPEPTIDE characterized in that it is a modified version of an isoform of the surfactant protein Lv-ranaspumin-1 (Lv-Rsn-1) and consisting of SEQ ID NO:10.
  • 6. AN EXPRESSION CASSETTE characterized in that it comprises a polynucleotide according to claim 2 operably linked to a promoter that directs expression in bacteria.
  • 7. AN EXPRESSION CASSETTE characterized in that it comprises a polynucleotide according to claim 3 operably linked to a promoter that directs expression in fungi, preferably in yeast.
  • 8. AN EXPRESSION VECTOR characterized in that it comprises an expression cassette according to claim 6.
  • 9. AN EXPRESSION AND TRANSFORMATION VECTOR characterized in that it comprises an expression cassette according to claim 7.
  • 10. A GENETICALLY MODIFIED MICRO-ORGANISM characterized in that it is a bacterium that produces a protein whose encoding sequence comprises the polynucleotide according to claim 6.
  • 11. A GENETICALLY MODIFIED MICRO-ORGANISM characterized in that it is a yeast that produces a protein whose encoding sequence comprises the polynucleotide according to claim 7.
  • 12. A PROCESS FOR PRODUCING A GENETICALLY MODIFIED ORGANISM characterized in that it results in a bacterium according to claim 10 and comprises: a) transforming a bacterial strain with the expression cassette according to claim 6; b) selecting the transformed bacteria.
  • 13. A PROCESS FOR PRODUCING A GENETICALLY MODIFIED ORGANISM characterized in that it results in a yeast according to claim 11 and comprises: a) transforming a yeast strain with the expression cassette according to claim 7; b) selecting the transformed yeasts.
  • 14. A PRODUCT characterized in that it comprises a polypeptide according to claim 4.
  • 15. A PRODUCT characterized in that it comprises a polypeptide according to claim 5.
  • 16. AN ADVANCED OIL RECOVERY PROCESS AND IMPROVEMENT OF RESERVOIR FLUID DYNAMICS, using a biosurfactant protein obtained from a genetically modified organism, according to claims 1 to 15, characterized in that the genetically modified organism is capable of synthesizing the biosurfactant protein Lv-ranaspumin-1.
  • 17. AN OIL BIOREMEDIATION PROCESS, using a biosurfactant protein obtained from a genetically modified organism, according to claims 1 to 15, characterized in that the genetically modified organism is capable of synthesizing the biosurfactant protein Lv-ranaspumin-1.
  • 18. A TANK CLEANING PROCESS IN THE OIL AND GAS INDUSTRY, using a biosurfactant protein obtained from a genetically modified organism, according to claims 1 to 15, characterized in that the genetically modified organism is capable of synthesizing the biosurfactant protein Lv-ranaspumin-1.
Priority Claims (1)
Number | Date | Country | Kind
10 2019 026867 0 | Dec 2019 | BR | national
PCT Information
Filing Document | Filing Date | Country | Kind
PCT/BR2020/050541 | 12/14/2020 | WO |