Plasmid system

Information

  • Patent Grant
  • 11965173
  • Patent Number
    11,965,173
  • Date Filed
    Friday, December 14, 2018
  • Date Issued
    Tuesday, April 23, 2024
Abstract
There is provided a plasmid system for transfection into a cell to create a producer cell, the system comprising: a. a helper plasmid comprising a first nucleotide sequence encoding Murine leukemia virus (MLV)-derived Gag and Pol poly-proteins; b. an envelope plasmid comprising a second nucleotide sequence encoding an Env protein; c. a genome plasmid comprising a third nucleotide sequence encoding a retroviral genome, wherein the first nucleotide sequence is codon-shuffled to remove any significant regions of homology with the third nucleotide sequence; and wherein the second nucleotide sequence is codon-optimised for expression in the producer cell.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This is a U.S. National Phase of International Application No. PCT/GB2018/053638, filed Dec. 14, 2018, which claims priority to Great Britain Application No. 1720948.7, filed Dec. 15, 2017.


FIELD OF INVENTION

The present invention relates to a plasmid system which may be used to create a producer cell for producing retroviral vectors. The invention also relates to methods for increasing the efficiency of retroviral vectors using such a plasmid system.


BACKGROUND

Retroviral vectors are used for several applications including ex vivo modification of T cells for cancer immunotherapies.


Retroviral vectors may be made by transient transfection using a three plasmid system:


(i) a helper plasmid containing the viral sequences encoding Gag and Pol polyproteins; (ii) an envelope plasmid coding for the envelope protein (Env); and (iii) a genome plasmid containing the transgene flanked by viral Long Terminal Repeats (LTRs) and a packaging signal required for the incorporation of the viral vector RNA into virions.


It is desirable to maximise the efficiency of viral vector production in order to obtain high retroviral titres and to reduce the total amount of plasmid needed for transient transfection.





DESCRIPTION OF THE FIGURES


FIG. 1: Map of a helper plasmid of the plasmid system of the present invention



FIG. 2: Map of an envelope plasmid of the plasmid system of the present invention



FIG. 3: Map of a genome plasmid of the plasmid system of the present invention encoding a CD19/CD22 CAR



FIG. 4: Original maps of a helper plasmid and envelope plasmid used in standard vector manufacture by transient transfection.



FIG. 5: Titration results of five independent vector preparations made using the different helper plasmids. All 10 helper plasmids were used either at the standard or low amount (¼ of standard). Sample 21 represents the unmodified, standard LTR-driven helper plasmid. The conditions chosen to bring forward for further optimisation experiments are highlighted.



FIG. 6: Titration results of five independent vector preparations made using different envelope plasmids, all comprising the RD114 Envelope protein. All 10 envelope plasmids were used either at a standard or low amount (¼ of standard). Sample 21 represents the unmodified, standard LTR-driven RD114 envelope plasmid used at the standard amount. The plasmid chosen to bring forward for further optimisation experiments is highlighted.



FIG. 7: Titration results of two independent vector preparations made using the different helper plasmids, comprising the GALV Envelope protein. All 6 helper plasmids were used at the standard, low (¼ of standard) or high (2× of standard) amount. The helper plasmid with the highest titre is highlighted.



FIG. 8: Titration results of independent vector preparations in 10 cm plates made using different helper and envelope plasmids at different ratios. The ratios are designated “standard” or “low” (¼ of standard). “SS”=Standard helper and envelope, “SL”=standard helper, low envelope; “LS”=Low helper, standard envelope; “LL”=Low helper and envelope. pGEN1, genome plasmid 1.



FIG. 9: Titration results of independent vector preparations using the different helper and Envelope plasmids at different ratios. The plasmids are used in 3 amounts, designated “standard” (S), “low” (L) (¼ of standard) or “minimal” (M) (1/16 of standard). The first letter refers to the amount of helper plasmid used and the second letter to the amount of Envelope plasmid used. The total amount of DNA per transfection was kept constant by the addition of genome vector plasmid. pGEN2, genome plasmid 2. Samples 25 and 26 are controls made using the original unmodified plasmids.



FIG. 10: Titration results of independent vector preparations in T175 flasks using the optimised helper and envelope plasmids and ratio, and with different genome plasmids at different amounts ranging from 20-100% of the standard amount.





SUMMARY OF ASPECTS OF THE INVENTION

The present invention is based on the finding that it is possible to improve the system for creating a producer cell for producing retroviral vectors with plasmids comprising novel modifications at the nucleic acid level. These modifications have been found to improve the titre and safety profile of the viral vectors as well as increasing the efficiency of the producer cell itself.


Thus in a first aspect, the present invention provides a plasmid system for transfection into a cell to create a producer cell, the system comprising:

    • a. a helper plasmid comprising a first nucleotide sequence encoding Murine leukemia virus (MLV)-derived Gag and Pol poly-proteins;
    • b. an envelope plasmid comprising a second nucleotide sequence encoding an Env protein;
    • c. a genome plasmid comprising a third nucleotide sequence encoding a retroviral genome,


      wherein the first nucleotide sequence is codon-shuffled to remove any significant regions of homology with the third nucleotide sequence; and


      wherein the second nucleotide sequence is codon-optimised for expression in the producer cell.


The first nucleotide sequence may be codon optimised for expression in the producer cell.


The codon adaptation index (CAI) of the first nucleotide sequence may be at least 0.75.


The first nucleotide sequence may comprise a sequence selected from SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID NO: 3.


The helper plasmid may comprise a promoter selected from: CMV early enhancer/chicken β actin (CAG) or cytomegalovirus (CMV).


The helper plasmid may comprise a rabbit β-globin polyA site.


The helper plasmid may further comprise an intron sequence in the 5′ untranslated region. The intron may be a human β-globin intron.


The helper plasmid may lack a long terminal repeat (LTR) sequence.


The second nucleotide sequence of the first aspect may comprise a sequence selected from SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11 or SEQ ID NO: 12.


The Env protein may be the RD114 Envelope protein or the GALV Envelope protein.


For example, when the Env protein is the RD114 Envelope protein, the codon adaptation index (CAI) of the second nucleotide sequence is at least 0.75.


Alternatively, when the Env protein is GALV Envelope protein, the codon adaptation index (CAI) of the second nucleotide sequence is at least 0.65.


The envelope plasmid may comprise a promoter selected from: Ferritin and cytomegalovirus (CMV) promoters.


The envelope plasmid may comprise a SV40 polyA site.


The envelope plasmid may further comprise an intron sequence in the 5′ untranslated region. The intron of the envelope plasmid may be a BG (human β-globin), RD114 or mEf1 (murine elongation factor 1) intron.


The envelope plasmid may lack a long terminal repeat (LTR) sequence.


The helper plasmid and the envelope plasmid may comprise different promoters.


The helper plasmid and the envelope plasmid may comprise different polyA sites.


The helper plasmid and envelope plasmid may comprise different introns.


The genome plasmid of the first aspect of the invention may further comprise a nucleotide of interest (NOI).


The genome plasmid may comprise a packaging signal, which has homology with a portion of the wildtype MLV nucleotide sequence encoding Gag and/or Pol polyprotein(s).


In a second aspect, the present invention provides a method for making a packaging cell, which packages retroviral vectors, which comprises the step of transfecting a cell with a helper plasmid and an envelope plasmid as defined in the first aspect of the invention.


In a third aspect, the present invention provides a method for making a producer cell, which produces retroviral vectors, which comprises the step of transfecting a cell with a helper plasmid, an envelope plasmid and a genome plasmid, as defined in the first or second aspect of the invention.


In a fourth aspect, the present invention provides a packaging cell capable of packaging retroviral vectors, comprising a helper plasmid and an envelope plasmid, as defined in the first aspect of the invention.


In a fifth aspect, the present invention provides a producer cell capable of producing retroviral vectors, comprising a helper plasmid, an envelope plasmid and a genome plasmid, as defined in the first or second aspect of the invention.


In a sixth aspect, the present invention provides a method for making a retroviral vector using a packaging cell as defined in the fourth aspect of the invention, or a producer cell as defined in the fifth aspect of the invention.


In a seventh aspect, the present invention provides a method to increase efficiency of a producer cell of a plasmid system, the system comprising:

    • a. a helper plasmid comprising a first nucleotide sequence encoding MLV-derived Gag and Pol poly-proteins;
    • b. an envelope plasmid comprising a second nucleotide sequence encoding an Env protein;
    • c. a genome plasmid comprising a third nucleotide sequence encoding a retroviral genome,


      characterised in that the method comprises the steps of:
    • (I) codon-shuffling the first nucleotide sequence to remove significant regions of homology with the third nucleotide sequence;
    • (II) codon-optimising the second nucleotide sequence for expression in the producer cell.


In an eighth aspect, the present invention provides a nucleotide sequence encoding MLV-derived Gag and Pol poly-proteins comprising the sequence selected from: SEQ ID NO: 1 to SEQ ID NO: 3.


In a ninth aspect, the present invention provides a nucleotide sequence encoding Env protein comprising the sequence selected from: SEQ ID NO: 5 to SEQ ID NO: 7 and SEQ ID NO: 9 to 11.


The inventors found that codon shuffling the Gag and Pol polyprotein encoding sequence and codon-optimising the Env sequence not only improves safety but has a synergistic effect in improving vector titre.


It has previously been reported that codon optimisation of Env-encoding sequences leads to a non-functional Env due to impaired glycosylation of the precursor protein, so it is surprising that codon optimised Env-encoding sequences a) encode a functional protein and b) in combination with a codon-shuffled GagPol-encoding sequence lead to increased efficiency of gene expression.


DETAILED DESCRIPTION

Retroviruses


Retroviruses are enveloped single-stranded RNA viruses mainly characterized by the ability to “reverse-transcribe” their genome from RNA to DNA. Virions measure 100-120 nm in diameter and contain a dimeric genome of identical positive RNA strands complexed with the nucleocapsid proteins. The genome is enclosed in a protein capsid that also contains enzymatic proteins, namely the reverse transcriptase, the integrase and proteases, required for viral infection. Matrix proteins form a layer outside the capsid core that interacts with the envelope, a lipid bilayer derived from the host cellular membrane, which surrounds the viral core particle. Anchored on this bilayer are the viral envelope glycoproteins responsible for recognizing specific receptors on a host cell and initiating the infection process. These envelope proteins are formed by two subunits: the transmembrane (TM) subunit that anchors the protein into the lipid membrane and the surface (SU) subunit which binds to the cellular receptors.


Based on the genome structure, retroviruses are classified into simple retrovirus such as MLV (murine leukemia virus); or complex retrovirus such as HIV or EIAV.


Retroviruses encode three genes: gag-pro (group specific antigen-protease); gag-pro-pol (group specific antigen-protease-polymerase), which is expressed by read-through of a stop codon; and env (envelope). The gag sequence encodes the three main structural proteins: the matrix protein, nucleocapsid proteins, and capsid protein. The pro sequence encodes proteases responsible for cleaving Gag and Gag-Pol during particle assembly, budding and maturation. The pol sequence encodes the enzymes reverse transcriptase and integrase, the former catalyzing the reverse transcription of the viral genome from RNA to DNA during the infection process and the latter responsible for integrating the proviral DNA into the host cell genome.


In addition to gag, pol and env, complex retroviruses, such as lentiviruses, have accessory genes including vif, vpr, vpu, nef, tat and rev that regulate viral gene expression, assembly of infectious particles and modulate viral replication in infected cells.


During the process of infection, a retrovirus initially attaches to a specific cell surface receptor via the envelope glycoprotein. On entry into the susceptible host cell, the retroviral RNA genome is then copied to DNA by the virally encoded reverse transcriptase in the host cell cytoplasm. This DNA is transported to the host cell nucleus where it subsequently integrates into the host genome. At this stage, it is typically referred to as the provirus. The provirus is stable in the host chromosome during cell division and is transcribed like other cellular genes. The provirus encodes the proteins and packaging machinery required to make more virus, which can leave the cell by a process known as “budding”.


When enveloped viruses, such as retrovirus and lentivirus, bud out of the host cells, they take part of the host cell lipid bilayer membrane. In this way, host-cell derived membrane proteins become part of the retroviral particle.


The present invention utilises this process of infection in order to introduce nucleotide sequences encoding proteins of interest into the genome of the host cell.


Retroviral Vectors


Retroviruses and lentiviruses may be used as a carrier, vector or delivery system for the transfer of a nucleotide sequence of interest (NOI), or a plurality of NOIs, to a target cell. The transfer can occur in vitro, ex vivo or in vivo. When used in this fashion, the viruses are typically called viral vectors. Viral vectors of the present invention may comprise a NOI(s), which may encode a T cell receptor or a chimeric antigen receptor and/or a suicide gene.


Gamma-retroviral vectors, commonly designated retroviral vectors, were the first viral vector employed in gene therapy clinical trials in 1990 and are still one of the most used. More recently, the interest in a sub-family of retroviral vectors, i.e. lentiviral vectors, derived from complex retroviruses such as the human immunodeficiency virus (HIV), has grown due to their ability to transduce non-dividing cells. The most attractive features of retroviral and lentiviral vectors as gene transfer tools include the capacity for large genetic payload (up to 9 kb), minimal patient immune response, high transducing efficiency in vivo and in vitro, and the ability to permanently modify the genetic content of the target cell, sustaining a long-term expression of the delivered gene. While both lentiviruses and gamma-retroviruses may use the same gene products for packaging (i.e., Gag, Pol, and Env), the isoforms of these proteins differ so they are not interchangeable. General envelope plasmids, such as VSV-G, however, may be used across both systems.


The retroviral vector may be based on any suitable retrovirus which is able to deliver genetic information to eukaryotic cells. For example, the retroviral vector may be an alpharetroviral vector, a gammaretroviral vector, a lentiviral vector or a spumaretroviral vector. Such vectors have been used extensively in gene therapy treatments and other gene delivery applications. Retroviral vectors are commonly produced by transfection of a packaging target cell line such as Human Embryonic Kidney 293 (HEK293) cells, using a three-plasmid system.


Described herein is such a plasmid system for triple transfection into a cell to create a producer cell for the manufacture of a retroviral vector. Also described is a plasmid system for double transfection into a cell to create a stable packaging cell.


Triple Transfection


The transient three-plasmid system for the production of high titre retroviral vectors has previously been described (see, for example, Soneoka et al.; Nucleic Acids Res. 1995; 23(4); 628-633). This system is used to create a producer cell in the first aspect of the invention and involves three separate plasmids:

    • a) a helper plasmid comprising a first nucleotide sequence encoding MLV-derived Gag and Pol poly-proteins;
    • b) an envelope plasmid comprising a second nucleotide sequence encoding an Env protein; and
    • c) a genome plasmid comprising a third nucleotide sequence encoding a retroviral genome.


This system is a convenient method for rapid analysis of encoding components within each plasmid (such as the MLV-derived Gag and Pol or Env proteins), and is simple and reproducible in the hands of different operators compared to a single plasmid transfection method. This is because in addition to being a helper virus-free method, the triple transfection protocol offers the flexibility to adapt the packaging components and exogenous regulatory elements in the plasmid. Triple transfection is widely used in research grade vector core facilities and has been used for the manufacturing of clinical grade preparations for phase one trials.


Replication Competent Retrovirus


The three plasmids in the plasmid system of the present invention are used for vector production by transient transfection, and thus are designed to be present in the nucleus of the producer cells at the same time. Sequence homology between different DNA molecules that are present in the nucleus at the same time may result in homologous recombination between the DNA molecules, and the generation of recombinant DNA. For example, replication-competent retrovirus (RCR) has been reported in murine producer cell lines PA317, AM12 and Psi-CRIP.


Homologous recombination is a type of genetic recombination in which nucleotide sequences are exchanged between two similar or identical molecules of DNA. If DNA recombines in such a way as to bring together the viral helper packaging components (Gag and Pol polyproteins), the envelope (Env) and the LTRs/packaging signal of the three separate plasmids, production of partially or fully replication competent recombinant retroviruses may result. Infection with these types of retroviruses can cause malignant diseases as well as a range of other unsafe pathogenic states.


The most efficient way to avoid the risk of generation of recombinant retrovirus, potentially capable of replication, is to ensure that no sequence homology between the plasmids exists.


The nucleic acid sequence comprised within the genome plasmid that encodes the wild type retroviral genome may have portion(s) of sequence homology with the nucleic acid sequence of the helper plasmid that encodes the wild type Gag and Pol polyproteins. For example, the packaging signal of the retroviral genome may have a portion of sequence which shows homology with the nucleic acid sequence encoding the Gag and Pol polyproteins. The helper plasmid may also share sequence homology with other exogenous elements in the envelope or genome plasmids, such as the promoter, the intron, the polyA sequence and/or the LTR sequences. This further increases the risk of generation of replication-competent retrovirus (RCR) via homologous recombination during transfection.


The standard 3-plasmid system therefore has limitations relating to the safety as well as the efficiency of the system used to create a producer cell for producing retroviral vectors. To overcome these issues, the present inventors have optimised the plasmid system by re-designing the standard plasmids to improve gene expression by a combination of several approaches:

    • I. Sequence modification of the Gag and Pol polyprotein by codon-shuffling to improve the safety profile of the plasmid combination by removal of the homologous regions with the packaging signal sequence in the genome plasmid; and
    • II. Codon optimisation of the Env coding sequences to improve translation efficiency and mRNA stability; and/or
    • III. Introduction of new efficient promoters to enhance transcription; and/or
    • IV. Introduction of new strong polyadenylation sites to improve mRNA stability; and/or
    • V. Utilisation of introns at the 5′ untranslated region to enhance mRNA export from the nucleus to the cytoplasm.


Further additions to the new plasmids may include the SV40 origin of replication, which results in increased plasmid copy number during transfection if carried out in 293T cells which express the SV40 large T antigen. The antibiotic resistance marker may also be changed from ampicillin to kanamycin to improve the clinical safety profile of the plasmids.


The inventors have designed a three-plasmid system in which none of the regulatory elements is shared between the separate plasmids described herein. For example, the helper plasmid, the envelope plasmid and the genome plasmid all comprise different promoters.


Re-Designing Plasmids


Codon-Shuffling


The present inventors have discovered that it is possible to create a helper plasmid comprising the packaging components with completely novel sequences and reduced homology to the viral genome plasmid comprising a packaging signal.


Codon shuffling, also known as codon-wobbling, is a process whereby the amino acid sequence is preserved for a given coding sequence, but alternative codons are used due to the degeneracy of the genetic code. Importantly, this type of synthetic manipulation to alter codon distribution in a viral gene effectively changes the nucleotide sequence in a gene of interest without altering the amino acid sequence. In addition, the genetic deck can be manipulated to change any number of variables to potentially increase or decrease protein levels, a consideration for viral gene products that are toxic to the host cell.
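The principle described above can be illustrated with a minimal Python sketch. It is not the algorithm used to produce the sequences of the invention: the function names, the random choice of synonymous codon and the short input sequence are all assumptions made for illustration only. The sketch shows that every codon with a synonym can be exchanged while the encoded amino acid sequence stays the same.

```python
import random
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse lookup: amino acid -> list of its synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna):
    """Translate an in-frame DNA string into a one-letter protein string."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_shuffle(dna, seed=0):
    """Swap each codon for a different synonymous codon where one exists.

    Hypothetical helper: a real design tool would also weigh codon usage,
    codon-pair bias, CpG content and homology to the genome plasmid.
    """
    rng = random.Random(seed)
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        alternatives = [c for c in SYNONYMS[CODON_TABLE[codon]] if c != codon]
        # Met (ATG) and Trp (TGG) have no synonyms and are left unchanged.
        out.append(rng.choice(alternatives) if alternatives else codon)
    return "".join(out)


original = "ATGGGCCAGACTGTTACCACTCCCTTAAGTTTGACC"  # short illustrative input
shuffled = codon_shuffle(original)
print(translate(original) == translate(shuffled))  # protein is preserved
```

In this sketch the nucleotide sequence changes at every codon except the initiating ATG, yet the translated protein is identical, which is the property that codon shuffling relies on.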


Retrovirus genomes can undergo homologous recombination (HR) during replication, which may lead to wild-type (WT) reversion during complementation of replication-defective and attenuated viruses via HR with the helper gene provided in trans. To avoid this safety-risk, the present invention employs a synthetic-biology approach based on a technique known as codon shuffling. Computer-assisted algorithms may be used to redistribute codons in a helper plasmid, thereby eliminating regions of homology with the genome plasmid, while enabling manipulation of factors such as codon-pair bias and CpG content to effectively titrate helper-gene protein levels. The present inventors apply this technique to avoid homologous recombination between the retroviral genome and the sequences encoding MLV-derived Gag and Pol polyproteins.
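One simple way to check whether a redesigned helper sequence still shares a stretch of identity with a genome plasmid is to measure the longest contiguous run common to both. The Python sketch below is only an illustration under stated assumptions: the specification does not say which algorithm or which length threshold defines a "significant" region of homology, and the function name and toy sequences are hypothetical.

```python
def longest_shared_run(seq_a, seq_b):
    """Length of the longest contiguous stretch identical in both sequences.

    Classic dynamic-programming longest-common-substring, used here as a
    stand-in homology metric (an assumption, not the patent's method).
    """
    best = 0
    prev = [0] * (len(seq_b) + 1)
    for a in seq_a:
        curr = [0] * (len(seq_b) + 1)
        for j, b in enumerate(seq_b, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


# Toy sequences, invented for illustration only.
genome_fragment = "ATGGGCCAGACTGTTACC"       # region present in the genome plasmid
helper_wild_type = "ATGGGCCAGACTGTTACCACT"   # helper containing the same region
helper_shuffled = "ATGGGACAAACAGTCACGACC"    # synonymous codons substituted

print(longest_shared_run(helper_wild_type, genome_fragment))
print(longest_shared_run(helper_shuffled, genome_fragment))
```

The wild-type toy helper shares the whole 18-nucleotide fragment with the genome, whereas the shuffled version shares only a shorter run. A production workflow would more likely use an alignment tool over the full plasmids, but the comparison expresses the same idea.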


Codon-Optimisation


There may be species-specific codon usage bias between the host and the target cell which may affect the efficiency of the system. One powerful strategy to overcome codon-usage bias is codon-optimisation.


Codon optimisation is a method of enhancing gene expression by making the translation process more efficient. Due to the redundant nature of the DNA code, many amino acids can be coded for by several different triplets. These triplets are transcribed into messenger RNA (mRNA) and then translated into amino acids to make polypeptides. The translation process features the binding of a transfer RNA (tRNA) containing the complementary triplet to the one present on the mRNA molecule, and by doing that the amino acid associated with that tRNA is added onto the growing polypeptide.


The abundance of the different tRNAs varies between organisms. Therefore, the speed at which an amino acid is added to the polypeptide depends on whether the triplet coding for it is abundant or rare in that particular species. The abundance of different codons for several different organisms is documented on publicly available databases such as the codon usage database, and therefore it is possible to change the DNA sequence of a gene by utilising abundant codons without changing the amino acid sequence, resulting in more efficient translation and thus enhanced gene expression.
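As a rough illustration of the substitution step described above, the following Python sketch replaces each codon with the synonymous codon that is most frequent in a reference set. The function names and the two reference sequences are invented for illustration; a real optimisation would draw frequencies from a codon usage database for the producer cell species and would typically balance further constraints.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_counts(reference_genes):
    """Count codon occurrences across in-frame reference sequences."""
    counts = Counter()
    for gene in reference_genes:
        counts.update(gene[i:i + 3] for i in range(0, len(gene), 3))
    return counts


def preferred_codons(counts):
    """Map each amino acid to its most frequent codon in the reference set."""
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def codon_optimise(dna, counts):
    """Rewrite a coding sequence using only the preferred codons."""
    best = preferred_codons(counts)
    return "".join(best[CODON_TABLE[dna[i:i + 3]]] for i in range(0, len(dna), 3))


# Toy reference set (hypothetical; stands in for highly expressed host genes).
reference = ["CTGCTGCTGAAGAAGGCC", "CTGAAGGCCGCCCTC"]
counts = codon_counts(reference)
print(codon_optimise("TTAAAAGCT", counts))  # Leu-Lys-Ala with preferred codons
```

Always choosing the single most frequent codon is the simplest possible strategy and is shown only to make the mechanism concrete.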


Recently, one group attempted to codon-optimise a sequence of an Env protein (RD114) but showed that it led to functional impairment of the protein (Zucchelli et al.; Mol Ther Methods Clin Dev. 2017; 4: 102-114). However, the present inventors found that not only did the Env protein of the present invention remain functional, but the combination of such a modified envelope plasmid with the codon-shuffled GagPol-encoding sequence of the helper plasmid showed a synergistic effect with increased efficiency of gene expression. Notably, the CAI of the codon-optimised RD114 Env protein significantly increased from 0.66 to 0.75.


In one embodiment described herein, the nucleotide sequence encoding the Gag and Pol polyprotein may be codon-optimised.


Codon Adaptation Index (CAI)


The codon adaptation index (CAI) is a widespread technique for analysing codon usage bias. CAI measures the deviation of a given protein coding gene sequence with respect to a reference set of genes.


The reference set in CAI is composed of highly expressed genes, so that CAI provides an indication of gene expression level under the assumption that there is translational selection to optimise gene sequences according to their expression levels.


CAI is defined as the geometric mean of the weights associated with each codon over the length of the gene sequence (measured in codons). For each amino acid, the weight of each of its codons is computed from the reference set as the ratio between the observed frequency of that codon and the frequency of the most frequent synonymous codon for that amino acid.
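The definition above can be written out as a short Python sketch. The function names and the toy reference sequence are assumptions for illustration, not the tool used to obtain the CAI values reported herein. For simplicity the sketch scores every codon, whereas some implementations exclude amino acids that have only one codon, and it assumes that every codon in the query also occurs in the reference set.

```python
import math
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ('*' marks a stop).
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_weights(reference_genes):
    """Weight of a codon = its count / count of the most used synonym."""
    counts = Counter()
    for gene in reference_genes:
        counts.update(gene[i:i + 3] for i in range(0, len(gene), 3))
    max_per_aa = {}
    for codon, aa in CODON_TABLE.items():
        max_per_aa[aa] = max(max_per_aa.get(aa, 0), counts[codon])
    # Codons never seen in the reference get no weight in this sketch.
    return {codon: counts[codon] / max_per_aa[aa]
            for codon, aa in CODON_TABLE.items() if counts[codon] > 0}


def cai(dna, weights):
    """Geometric mean of the codon weights over the sequence."""
    logs = [math.log(weights[dna[i:i + 3]]) for i in range(0, len(dna), 3)]
    return math.exp(sum(logs) / len(logs))


# Toy reference (hypothetical): CTG x3, CTC x1, AAG x3, AAA x1.
weights = codon_weights(["CTGCTGCTGCTCAAGAAGAAGAAA"])
print(cai("CTGAAG", weights))  # only the most frequent codons
print(cai("CTCAAA", weights))  # only the rarer synonyms
```

A sequence built entirely from the most frequent synonymous codons scores 1, and the score falls as rarer codons are used, which is why an increase in CAI is taken as an indication of improved codon usage.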


The present invention may provide an increase in CAI for both the first nucleotide sequence comprised within the helper plasmid and the second nucleotide sequence comprised within the envelope plasmid. In one aspect of the invention, the CAI of the first nucleotide sequence, which encodes the packaging components Gag and Pol polyproteins, is increased to over 0.75.


The CAI of the second nucleotide sequence, which encodes the Env protein, is similarly increased to over 0.75. In another embodiment, the CAI of the second nucleotide sequence, which encodes a differently derived Env protein, is increased to over 0.65.


i) Helper Plasmid


The helper plasmid (FIG. 1), also known as the packaging plasmid, comprises a nucleotide sequence encoding Gag and Pol polyproteins described herein. Notably, in the triple plasmid system, the Gag and Pol polyproteins packaging components are on a separate plasmid to the Env protein. Placing the packaging genes on separate plasmids helps to reduce the formation of replication-competent virus.


The helper plasmid may also comprise exogenous regulatory elements such as, for example at least one stop codon, a promoter sequence, an intron sequence and/or a polyA sequence.


Examples of promoter sequences of the helper plasmid described herein include, but are not limited to, CMV early enhancer/chicken β actin (CAG), cytomegalovirus (CMV) or p565Prom sequence.


An example of an intron sequence of the helper plasmid described herein includes, but is not limited to, human β-globin intron in the 5′ untranslated region of the plasmid.


An example of a polyA sequence of the helper plasmid described herein includes, but is not limited to, the rabbit β-globin polyA sequence. This polyA sequence is selected due to its relative strength. The inclusion of a strong polyadenylation signal at the end of the expression cassette is important for the efficient termination of transcription. It increases gene expression levels by allowing the RNA polymerase to detach from the completed mRNA and begin again to transcribe a new mRNA molecule. Expression levels are further increased as the polyA tail promotes mRNA export from the nucleus, initiation of translation, and protects the mRNA from degradation.


The helper plasmid may further comprise a nucleotide sequence encoding a SV40 origin of replication, which results in increased copy number and increased plasmid retention in transfected cells and thus improved vector titre. The helper plasmid may further comprise an antibiotic resistance marker. For example, the helper plasmid may comprise a kanamycin resistance marker, which gives the plasmid an improved clinical safety profile compared to the standard ampicillin resistance marker.


In one embodiment, the LTR sequence of the helper plasmid may be removed to reduce portions of shared sequence homology with the LTR sequences present in any one of the other plasmids of the three-plasmid system, such as the envelope plasmid and/or the genome plasmid.


In particular, provided herein is a helper plasmid comprising MLV-derived Gag and Pol polyproteins, the nucleic acid sequence of which has been codon shuffled, as described herein.


MLV-Derived GAG and POL Polyproteins


The present invention provides Gag and Pol polyproteins derived from Murine Leukemia virus (MLV). MLVs (or MuLVs) are retroviruses with an ability to cause cancer in murine hosts. The murine leukemia viruses are type VI retroviruses and belong to the gammaretroviral genus of the Retroviridae family.


Moloney, Rauscher, Abelson and Friend strains of MLVs are commonly used in cancer research. In one embodiment, the Gag and Pol polyproteins of the helper plasmid are Moloney-MLV derived Gag and Pol polyproteins.


Importantly, the nucleotide sequence encoding the MLV-derived Gag and Pol polyproteins of the helper plasmid of the present invention has been codon-shuffled to remove any significant regions of homology with the packaging signal in the genome plasmid.


The nucleotide sequence of the MLV-derived Gag and Pol polyproteins packaging component has a relatively good codon-usage, compared to, for example, HIV-derived Gag and Pol polyproteins. However, the present inventors have shown that it is possible to codon shuffle the nucleotide sequence encoding the MLV-derived Gag and Pol polyproteins in order to avoid homologous recombination but still retain good translation efficiency.


ii) Envelope Plasmid


The envelope plasmid comprises a nucleotide sequence encoding the Env protein as well as regulatory exogenous elements, such as a promoter, an intron sequence and/or a polyA sequence. Other exogenous elements may include a SV40 origin of replication and a marker such as an antibiotic resistance marker.


Examples of promoter sequences of the envelope plasmid described herein include, but are not limited to, Ferritin, cytomegalovirus (CMV) or PDXG3 promoters.


An example of an intron sequence of the envelope plasmid described herein includes, but is not limited to, BGIntron, RD114 or mEf1 intron in the 5′ untranslated region of the plasmid.


An example of a polyA sequence of the envelope plasmid described herein includes, but is not limited to, the SV40 polyA sequence. FIG. 2 shows a map of an envelope plasmid construct of the present invention.


ENV Protein


A heterologous envelope gene may be used to pseudotype a retroviral vector to alter infectivity to the host. For example, the VSV-G envelope protein confers a broad tropism, that is, a wide range of cell types that the virus can infect. Despite its efficacy, VSV-G is cytotoxic, a feature that prohibits the development of stable cell lines that constitutively express this envelope. The RD114 Envelope protein is an example of a much less toxic pantropic viral envelope, also capable of infection of a variety of cell types, including stem cells and T cells.


The Env (Envelope) protein is a bipartite transmembrane protein and is cleaved by furin within the Golgi apparatus while being trafficked to the cell surface membrane. The viral Env protein determines viral tropism, or which cell types a virus can infect. The classes of viral envelopes are: i) ecotropic: narrow host range, can infect only one or a small group of species (usually mouse and rat); ii) amphotropic: broader host range, usually refers to viruses infecting only mammalian cells; iii) pantropic: broadest host range; both mammalian and non-mammalian cells can be infected. The vesicular stomatitis viral envelope (VSV-G) commonly used to package lentivirus and retrovirus is an example of a pantropic envelope.


RD114 Envelope


The Env protein in the envelope plasmid described herein may be derived from the feline endogenous retrovirus RD114 (UniProt entry: P31791), as shown in FIG. 2. This envelope confers on the virions increased stability in the presence of human serum, compared with the wild-type MoMuLV envelope.


The RD114 Env provides increased particle stability and its receptor is widely expressed on hematopoietic stem cells (HSCs). Both transient transfection as well as generation of a stable cell line may be done with a RD114 envelope.


The present inventors have developed an improved envelope plasmid compared to the original envelope plasmid pLTR-RD114. It has previously been reported that codon optimisation of Env-encoding sequences leads to a non-functional Env due to impaired glycosylation of the precursor protein, so it was not expected that codon-optimised Env-encoding sequences would a) encode a functional protein and b) in combination with a codon-shuffled GagPol-encoding sequence, lead to increased efficiency of gene expression.


The CAI of the wild type RD114 is 0.66 and this was increased to 0.78 in the codon optimised sequence described in the Examples. Both wild type and codon-optimised versions were cloned into each of the plasmid designs. The resulting 10 envelope plasmids are listed in the Examples section herein.
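For readers unfamiliar with the Codon Adaptation Index, the following sketch shows how such a value is commonly computed: each codon is given a relative adaptiveness (its usage divided by the usage of the most frequent synonymous codon), and the CAI is the geometric mean of those weights over the sequence. The usage figures below are invented placeholders, not real human codon usage, and the sketch is not the tool used to obtain the values quoted above.

```python
import math

# Hypothetical codon usage counts for three amino acids (illustration only).
TOY_USAGE = {
    "L": {"CTG": 40.0, "CTC": 20.0, "TTA": 8.0},
    "G": {"GGC": 34.0, "GGT": 16.0},
    "K": {"AAG": 32.0, "AAA": 24.0},
}


def relative_adaptiveness(usage_by_amino_acid):
    """Weight of each codon = its usage / usage of the best synonymous codon."""
    weights = {}
    for codons in usage_by_amino_acid.values():
        best = max(codons.values())
        for codon, count in codons.items():
            weights[codon] = count / best
    return weights


def cai(seq, weights):
    """Geometric mean of codon weights; codons without a weight are skipped."""
    logs = [
        math.log(weights[seq[i:i + 3]])
        for i in range(0, len(seq) - len(seq) % 3, 3)
        if seq[i:i + 3] in weights
    ]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


WEIGHTS = relative_adaptiveness(TOY_USAGE)
rare_codons = "TTAGGTAAA"       # L G K encoded with less frequent codons
preferred_codons = "CTGGGCAAG"  # L G K encoded with the most frequent codons
```

Here `cai(preferred_codons, WEIGHTS)` is 1.0 by construction and `cai(rare_codons, WEIGHTS)` is lower, which is the direction of change reported when the RD114 sequence was codon optimised.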


GALV Envelope


GALV (Gibbon ape leukemia virus) is a highly oncogenic C-type retrovirus capable of inducing myeloid leukemia in juvenile gibbons. GALV (UniProt: 070653) is antigenically most closely related to a new world monkey virus, simian sarcoma associated virus (SSAV), and less to the murine and feline C-type leukemia viruses.


As with RD114 Env protein, GALV Env proteins are able to form stable producer cell lines, making them potentially useful for small and large-scale production of pseudotyped vectors. GALV and RD114 are closely related type C mammalian retroviruses, and their entry pathways have common features. Their Env proteins contain two subunits, SU and TM, which are cleaved from a common precursor protein during transport to the cell surface. An additional feature of these Env proteins is that the C-terminal region of the cytoplasmic tail, the R peptide, is cleaved by the viral protease at, or shortly after, viral budding. R peptide cleavage is necessary to confer full activity to the Env protein, although not all of the Env proteins in a virion are cleaved. The processing of the cytoplasmic tail of Env is not unique to the mammalian type C retroviruses and has also been reported for more-distantly related retroviruses such as the Mason-Pfizer monkey virus and equine infectious anemia virus.


Described herein is an envelope plasmid comprising a codon-optimised GALV Env protein or a codon-optimised RD114 Env protein in combination with a helper plasmid comprising MLV-derived codon shuffled Gag and Pol polyproteins.


iii) Genome Plasmid


A genome plasmid (sometimes referred to as the transfer plasmid) comprises the exogenous NOI, also known as a transgene sequence. The NOI sequence is flanked by long terminal repeat (LTR) sequences, which facilitate integration of the transfer plasmid sequences into the host genome. The sequence between LTRs also contains the packaging signal of the virus and may contain a polypurine tract (PPT) and, for lentiviral vectors, also a central polypurine tract (cPPT).


Typically, it is the sequence between and including the LTRs that is integrated into the host genome upon transfection into a cell to make a producer cell.


The genome plasmid (FIG. 3) described herein may derive from MoMLV. Since the expression of a transgene is driven by the MoMLV-LTR after integration, rather than an exogenous promoter, the step of selecting a promoter that differs from that of helper plasmid and envelope plasmid is relatively simple.


Examples of other elements featured in a genome plasmid described herein include but are not limited to: an intron sequence, a polyA sequence and a stop codon. As with the other two plasmids of the three plasmid system described herein, the regulatory exogenous elements of the genome plasmid may not be the same as the regulatory exogenous elements of the helper or envelope plasmids.


Since the size of the genome plasmid may be of consideration, depending on the length of the NOI, the introns present in the genome plasmid may also be removed.


Synergistic Effect of the Plasmids


The present inventors have surprisingly found that the combination of specific plasmids of the plasmid system of the present invention provides improved efficiency of the producer cell in terms of vector titre. Each plasmid of the system has been modified at the nucleotide level and/or with the presence or absence of certain regulatory elements. Notably, none of the exogenous regulatory elements between the helper plasmid and the envelope plasmid are the same.


Table 1 below shows two example preferred combinations of exogenous regulatory elements of modified helper and Envelope plasmid.










TABLE 1

Helper (GagPol) Plasmid                            Envelope (Env) Plasmid
-------------------------------------------------  ----------------------------------
CMV promoter/β-globin intron/Rabbit globin polyA   Ferritin promoter/mEF1α/SV40 polyA
CMV promoter/Rabbit globin polyA                   Ferritin promoter/mEF1α/SV40 polyA
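The requirement that the helper and envelope plasmids share no exogenous regulatory element can be expressed as a simple set comparison. The sketch below encodes the two combinations of Table 1 and checks them by element name. Comparing names is only a stand-in for comparing sequences, and the variable and function names are hypothetical.

```python
# Regulatory elements of the two preferred combinations listed in Table 1.
TABLE_1_COMBINATIONS = [
    (
        {"CMV promoter", "β-globin intron", "Rabbit globin polyA"},
        {"Ferritin promoter", "mEF1α", "SV40 polyA"},
    ),
    (
        {"CMV promoter", "Rabbit globin polyA"},
        {"Ferritin promoter", "mEF1α", "SV40 polyA"},
    ),
]


def shared_elements(helper_elements, envelope_elements):
    """Return the regulatory elements present on both plasmids."""
    return helper_elements & envelope_elements


def is_allowed(helper_elements, envelope_elements):
    """A combination is allowed only if the two plasmids share no element."""
    return not shared_elements(helper_elements, envelope_elements)


all_table_1_allowed = all(is_allowed(h, e) for h, e in TABLE_1_COMBINATIONS)

# A counter-example: the same promoter on both plasmids is rejected.
clash = shared_elements({"CMV promoter", "Rabbit globin polyA"},
                        {"CMV promoter", "SV40 polyA"})
```

Both Table 1 combinations pass this check, whereas the counter-example reports the CMV promoter as shared.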









Nucleotide of Interest (NOI)/Polypeptides of Interest (POI)


The viral vector of the present invention is capable of delivering a nucleotide of interest (NOI) to a target cell, such as a T cell or a natural killer (NK) cell.


The NOI may encode all or part of a T-cell receptor (TCR) or a chimeric antigen receptor (CAR) and/or a suicide gene.


The viral vector of the present invention is capable of delivering a polypeptide of interest (POI) to a target cell, such as a T cell or a natural killer (NK) cell.


The POI may be any polypeptide that is desired to be expressed in the transduced cell population. The POI may, for example, be a chimeric antigen receptor (CAR) or an engineered T cell receptor (TCR). The POI may be a polypeptide encoded by a suicide gene.


CARs are chimeric type I trans-membrane proteins which connect an extracellular antigen-recognizing domain (binder) to an intracellular signalling domain (endodomain). The binder is typically a single-chain variable fragment (scFv) derived from a monoclonal antibody (mAb), but it can be based on other formats which comprise an antibody-like antigen binding site. A spacer domain is usually necessary to isolate the binder from the membrane and to allow it a suitable orientation. A trans-membrane domain anchors the protein in the cell membrane. A CAR may comprise or associate with an intracellular T-cell signalling domain or endodomain.


CAR-encoding nucleic acids may be transferred to cells, such as T cells, using the retroviral or lentiviral vector of the present invention. In this way, a large number of cancer-specific T cells can be generated for adoptive cell transfer. When the CAR binds the target antigen, this results in the transmission of an activating signal to the T cell on which it is expressed. Thus the CAR directs the specificity and cytotoxicity of the T cell towards tumour cells expressing the targeted antigen.


A suicide gene encodes a polypeptide which enables the cells expressing such a polypeptide to be deleted, for example by triggering apoptosis. An example of a suicide gene is described in WO2013/153391.


Making a Retroviral Vector: Transient and Stable Transfection of a Producer or Packaging Cell


In a transient transfection, the transfected material enters the cell but is not integrated into the cellular genome. This method is generally useful for shorter-term expression of genes or gene products, or small-scale protein production. In contrast, material is integrated into the cellular genome via stable transfection but this method is a longer and more complex process, mainly for large scale protein production.


Components used to generate retroviral/lentiviral vectors include a helper plasmid encoding the Gag/Pol proteins, an envelope plasmid encoding the Env protein (and, in the case of lentiviral vectors, the rev protein), and the retroviral/lentiviral vector genome, as described herein. Vector production involves transient transfection of one or more of these components into cells containing the other required components.


The packaging cells of one aspect of the invention may be any mammalian cell type capable of producing retroviral/lentiviral vector particles. The packaging cells may be 293T-cells, or variants of 293T-cells which have been adapted to grow in suspension and grow without serum.


In the case of lentiviral vector, transient transfection with a rev vector is also performed.


The invention provides a plasmid system for transfection into a cell to create a producer cell, according to the first aspect of the invention. Also described herein are other methods of transferring exogenous material into a cell. Such methods may include transduction, transposition or site-specific integration for creating stable cell lines.


A producer cell is a cell that produces a retroviral vector by transient transfection or with a stable cell line.


A producer cell for a retroviral vector may comprise gag, pol and env genes. A producer cell for a lentiviral vector may comprise gag, pol, env and rev genes.


The producer cell may comprise gag, pol, env and optionally rev genes and a retroviral or lentiviral vector genome.


In a recombinant retroviral or lentiviral vector for use in gene therapy, the gag-pol and env protein coding regions are provided by the packaging cell. This makes the viral vector replication-defective as the virus is capable of integrating its genome into a host genome but the modified viral genome is unable to propagate itself due to a lack of structural proteins.


In the plasmid system of the invention, the gag, pol and env (and, in the case of lentiviral vectors, rev) viral coding regions are carried on separate expression plasmids that are independently transfected into a packaging cell line, so that three recombination events are required for viral production. Packaging cells are used to propagate and isolate quantities of viral vectors, i.e. to prepare suitable titres of the retroviral vector for transduction of a target cell.


Cell


There is provided a cell transfected with a plasmid system of the first aspect of the present invention to create a producer cell. There is also provided a cell transfected with the helper and envelope plasmids from the plasmid system of the first aspect of the invention to create a packaging cell. The cell which is to be transfected may be referred to as the parent cell. It may be any suitable cell type such as a 293T cell or a HeLa cell.


The cell of the invention may be an ex vivo cell from a subject. The cell may be from a peripheral blood mononuclear cell (PBMC) sample. Such cells may be activated and/or expanded prior to being transduced with nucleic acid encoding the molecules providing the plasmid system according to the first aspect of the invention.


The cell may be made by: (i) isolation of a cell-containing sample from a subject or other sources listed above; and (ii) transducing or transfecting the cell with the plasmid system according to the first aspect of the invention.


The cells of the present invention may be capable of producing retroviral vectors or packaging retroviral vectors for transducing cells, such as T cells. These transduced cells, such as T cells, may be for use in the treatment and/or prevention of diseases. The disease to be treated and/or prevented may be a cancerous disease.


Nucleic Acid


The present invention relates to a nucleic acid encoding MLV-derived Gag and Pol poly-proteins. Described herein are nucleic acid sequences encoding MLV-derived Gag and Pol polyproteins, which comprise a sequence selected from: SEQ ID NO: 1 to SEQ ID NO: 3.


Additionally the present invention also relates to a second nucleic acid sequence encoding an Env protein. Described here are nucleic acid sequences encoding Env protein, which comprise a sequence selected from: SEQ ID NO: 5 to SEQ ID NO: 7 and SEQ ID NO: 9 to 11.


As used herein, the terms “polynucleotide”, “nucleotide”, and “nucleic acid” are intended to be synonymous with each other.


It will be understood by a skilled person that numerous different polynucleotides and nucleic acids can encode the same polypeptide as a result of the degeneracy of the genetic code. In addition, it is to be understood that skilled persons may, using routine techniques, make nucleotide substitutions that do not affect the polypeptide sequence encoded by the polynucleotides described here to reflect the codon usage of any particular host organism in which the polypeptides are to be expressed.


Nucleic acids may comprise DNA or RNA. They may be single-stranded or double-stranded. They may also be polynucleotides which include within them synthetic or modified nucleotides. A number of different types of modification to oligonucleotides are known in the art. These include methylphosphonate and phosphorothioate backbones, addition of acridine or polylysine chains at the 3′ and/or 5′ ends of the molecule. For the purposes of the use as described herein, it is to be understood that the polynucleotides may be modified by any method available in the art. Such modifications may be carried out in order to enhance the in vivo activity or life span of polynucleotides of interest.


The terms “variant”, “homologue” or “derivative” in relation to a nucleotide sequence include any substitution of, variation of, modification of, replacement of, deletion of or addition of one (or more) nucleic acid from or to the sequence.


The invention will now be further described by way of Examples, which are meant to serve to assist one of ordinary skill in the art in carrying out the invention and are not intended in any way to limit the scope of the invention.


EXAMPLES

The initial round of experiments consisted of small-scale vector production and substituting one of the three plasmids with a range of new designs while using the standard plasmids for the other two functions. The best performing plasmids for each component were then taken to combination testing experiments to come to the optimal combination and relative ratio of the plasmids. Some larger scale experiments with different genomes were then conducted to assess the robustness of the plasmid system.


Example 1: Retroviral Vector Manufacture and Titration

Retroviral vectors pseudotyped with the RD114 envelope were produced by transient transfection of HEK (human embryonic kidney) 293T cells using Lipofectamine® 2000. HEK293T cells were seeded to be 80-90% confluent on the day of transfection. Transfection was carried out using 48 μl Lipofectamine per 10 cm dish; the amount of transfection reagent remained the same regardless of the plasmid amounts. The medium was replaced 16-18 hours after transfection with fresh medium containing 1 mM sodium butyrate (NaBu). The vector was harvested 48 hours post-transfection, filtered through a 0.45 μm filter, and stored at −80° C.


For functional titration, HEK293T cells were transduced with serial dilutions of vector stock in the presence of 8 μg/ml polybrene. The cells were harvested four days post-transduction and stained for expression of the transgene. Transgene positive cells were scored using flow cytometry and titres calculated as transducing units/ml.
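The titre calculation itself is not spelled out above. A commonly used form, given here as an assumption rather than as the inventors' exact formula, multiplies the number of cells present at transduction by the fraction of transgene-positive cells and the dilution factor, and divides by the volume of diluted vector applied. All numbers in the sketch are hypothetical.

```python
def transducing_units_per_ml(cells_at_transduction, fraction_positive,
                             vector_volume_ml, dilution_factor):
    """Functional titre in transducing units (TU) per ml of undiluted stock.

    fraction_positive is the flow cytometry result expressed as 0..1.
    dilution_factor is 10 for a 1:10 dilution of the vector stock.
    """
    if not 0.0 <= fraction_positive <= 1.0:
        raise ValueError("fraction_positive must lie between 0 and 1")
    if vector_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return (cells_at_transduction * fraction_positive * dilution_factor
            / vector_volume_ml)


# Hypothetical well: 1e5 cells, 10% transgene positive, 0.5 ml of a 1:10 dilution.
example_titre = transducing_units_per_ml(
    cells_at_transduction=1e5,
    fraction_positive=0.10,
    vector_volume_ml=0.5,
    dilution_factor=10,
)
```

In practice a dilution giving a low percentage of positive cells is usually chosen for the calculation, so that most transduced cells carry a single integration; that choice is also an assumption of this sketch.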


Example 2: Design of the MLV-Derived Gag and Pol Polyprotein Helper Plasmid

Five different versions of the helper plasmid were designed; each contained a strong promoter and an efficient polyadenylation site. Each of these versions was constructed using either the wild-type or a codon-shuffled version of the MLV-derived GagPol coding sequence, resulting in a total of 10 different plasmids.


Care was taken to maintain the stop codon between Gag and Pol in order to maintain the read-through mechanism that results in the majority of translation products being Gag and a minority being GagPol. Additional stop codons were also introduced downstream of the coding sequence to ensure efficient translation termination.


Rabbit β-globin polyadenylation site was used for all of the versions as it confers high gene expression. The versions are summarised in Table 2.


The codon usage in the GagPol coding sequence was shuffled for expression in human cells. Interestingly, the Codon Adaptation Index (CAI) increased from 0.72 in the wild type to 0.77 in the shuffled sequence.









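TABLE 2
10 different helper plasmids

No.  Plasmid                        GagPol CDS      Promoter                                        Intron
---  -----------------------------  --------------  ----------------------------------------------  --------------
 1   pSF_CAG_GagPol_RabpA           Wild-type       CAG (CMV enhancer/chicken β-actin promoter)     None
 2   pSF_CAG_GagPol_RabpA           Codon-shuffled  CAG (CMV enhancer/chicken β-actin promoter)     None
 3   pSF_CMV_GagPol_RabpA           Wild-type       CMV                                             None
 4   pSF_CMV_GagPol_RabpA           Codon-shuffled  CMV                                             None
 5   pSF_CMV_BGIntron_GagPol_RabpA  Wild-type       CMV promoter & intron from human β-globin gene  Human β-globin
 6   pSF_CMV_BGIntron_GagPol_RabpA  Codon-shuffled  CMV promoter & intron from human β-globin gene  Human β-globin
 7   pSF_POXG1_GagPol_RabpA         Wild-type       POXG1                                           None
 8   pSF_POXG1_GagPol_RabpA         Codon-shuffled  POXG1                                           None
 9   pSF_Prom565_GagPol_RabpA       Wild-type       Prom565                                         None
10   pSF_Prom565_GagPol_RabpA       Codon-shuffled  Prom565                                         None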









All 10 different versions of the helper plasmid were tested for small scale vector manufacture using transient transfection. The standard envelope plasmid pLTR-RD114 and genome plasmid pLTR-APRIL were kept the same throughout the experiments. The amount of envelope plasmid was kept constant, whilst two different amounts of the helper plasmid were tested: the standard amount normally used for transfection (designated as “standard”) and ¼ of the standard amount (designated as “low”). This was done because expression levels from the new plasmids were expected to be higher, which may lead to the generation of excessive empty vector particles. The total amount of DNA per transfection was kept the same so that transfection efficiency and the DNA:Lipofectamine ratio would not be affected; that is, when a reduced amount of GagPol plasmid was used, the amount of genome plasmid pLTR-APRIL was increased accordingly.
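The bookkeeping behind keeping the total DNA constant can be sketched as follows. The microgram amounts are hypothetical placeholders, since the actual plasmid amounts are not stated here; only the rule that the genome plasmid makes up the difference is taken from the text.

```python
def plasmid_amounts(total_dna, helper_standard, envelope_standard,
                    helper_fraction=1.0, envelope_fraction=1.0):
    """Return (helper, envelope, genome) amounts for one transfection.

    The helper and envelope plasmids are scaled by the given fractions of
    their standard amounts, and the genome plasmid fills the remainder so
    that the total DNA per transfection stays the same.
    """
    helper = helper_standard * helper_fraction
    envelope = envelope_standard * envelope_fraction
    genome = total_dna - helper - envelope
    if genome < 0:
        raise ValueError("helper and envelope amounts exceed the total DNA")
    return helper, envelope, genome


# Hypothetical standard amounts in micrograms per dish.
TOTAL_DNA, HELPER_STANDARD, ENVELOPE_STANDARD = 12.0, 4.0, 2.0

standard_mix = plasmid_amounts(TOTAL_DNA, HELPER_STANDARD, ENVELOPE_STANDARD)
low_helper_mix = plasmid_amounts(TOTAL_DNA, HELPER_STANDARD, ENVELOPE_STANDARD,
                                 helper_fraction=0.25)
```

With these placeholder values the "low" condition reduces the helper plasmid from 4 to 1 and raises the genome plasmid from 6 to 9, leaving the total at 12.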


Retroviral vectors were produced by three plasmid co-transfection in HEK293T cells over four independent experiments and were titrated by transducing HEK293T cells and assessing transgene expression by flow cytometry after four days. The results are shown in FIG. 5.


Several of the new helper plasmids increased the vector titre compared to the standard pLTR-GagPol helper plasmid by up to 10-fold. Moreover, increased titres were obtained using the decreased (“low”) amount of plasmid, suggesting the feasibility of reducing the amount of helper plasmid required.


The best performing helper plasmids were brought forward for further optimisation experiments, for testing in combination with a range of new envelope plasmids. Only the codon-shuffled versions of the coding sequence were selected, due to their increased safety profile (see Example 7): as the shuffled DNA sequence is different from the wild-type gene, there will be no sequence homology with the extended packaging signal present in the genome plasmid which spans the beginning of Gag and the end of Pol.
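One way to see why shuffling removes homology is to measure the longest stretch of identical nucleotides shared by two sequences. The sketch below does this with a standard longest-common-substring calculation. The three toy sequences are invented for illustration (the "packaging overlap" is simply a slice of the toy wild-type fragment), and no threshold for "significant" homology is implied.

```python
def longest_shared_run(a, b):
    """Length of the longest contiguous stretch present in both sequences."""
    best = 0
    previous = [0] * (len(b) + 1)
    for base_a in a:
        current = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                # Extend the run that ended at the previous base of each sequence.
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Hypothetical sequences; both coding fragments encode M G Q T V T.
wild_type_fragment = "ATGGGCCAGACTGTTACC"
shuffled_fragment = "ATGGGACAAACAGTCACA"
packaging_overlap = "GGCCAGACTGTT"  # toy stand-in for the extended packaging signal

run_wild_type = longest_shared_run(wild_type_fragment, packaging_overlap)
run_shuffled = longest_shared_run(shuffled_fragment, packaging_overlap)
```

In this toy case the wild-type fragment shares the whole 12-nucleotide overlap, while the shuffled fragment shares only short runs, so there is far less contiguous identity available for homologous recombination.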


Based on the data, three plasmids were chosen for further optimisation experiments:

    • a) pSF_CAG_shufGAGPOL (4),
    • b) pSF_CMV_shufGAGPOL (8),
    • c) pSF_p565Prom_shufGAGPOL (20).


Example 3: Design of RD114 Envelope Plasmid

Five different versions of the Envelope plasmid comprising the RD114 envelope protein were designed; each also comprising a strong promoter and a highly efficient SV40 polyadenylation site. All but one of the designs also contain an intron in the 5′UTR.


The CAI of the wild type RD114 is 0.66 and this increased to between 0.75 and 0.78 in the codon optimised sequence. Both wild type and codon-optimised versions were cloned into each of the plasmid designs. The resulting 10 envelope plasmids are summarised in Table 3 below.









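TABLE 3
10 different envelope plasmids

No.  Plasmid                            Env CDS          Promoter  Intron
---  ---------------------------------  ---------------  --------  ----------------------------------
 1   pSF_CMV_RD114_SV40pA               Wild-type        CMV       None
 2   pSF_CMV_RD114_SV40pA               Codon-optimised  CMV       None
 3   pSF_CMV_BGIntron_RD114_SV40pA      Wild-type        CMV       Human β-globin
 4   pSF_CMV_BGIntron_RD114_SV40pA      Codon-optimised  CMV       Human β-globin
 5   pSF_CMV_EnvNatUTR_RD114_SV40pA     Wild-type        CMV       RD114 intron
 6   pSF_CMV_EnvNatUTR_RD114_SV40pA     Codon-optimised  CMV       RD114 intron
 7   pSF_POXG3_RD114_SV40pA             Wild-type        POXG3     None
 8   pSF_POXG3_RD114_SV40pA             Codon-optimised  POXG3     None
 9   pSF_Ferritin_mEf1UTR_RD114_SV40pA  Wild-type        Ferritin  Murine elongation factor 1α intron
10   pSF_Ferritin_mEf1UTR_RD114_SV40pA  Codon-optimised  Ferritin  Murine elongation factor 1α intron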









Codon optimisation has previously been reported to lead to non-functional RD114 due to impaired glycosylation of the precursor protein. However, and surprisingly, this finding was not replicated and the present inventors were able to generate a functional, codon optimised RD114 coding sequence that outperformed its non-optimised standard counterpart.


The ten different versions of the envelope plasmid were tested by small scale vector production using transient transfection. The standard helper plasmid pLTR-GagPol and genome plasmid pLTR-APRIL were kept the same throughout the experiments. Whilst the amount of packaging plasmid was kept constant, two different amounts of the envelope plasmid were tested: the standard amount normally used for transfection (designated as “standard”) and ¼ of the standard amount (designated as “low”). This was done to avoid a potential reduction in vector titre resulting from excessive RD114 protein presence in producer cells due to more efficient Env gene expression. As with the helper plasmid, the total amount of DNA per transfection was kept the same so that when a reduced amount of envelope plasmid was used the amount of genome plasmid could be increased accordingly.


The vectors were produced over three independent experiments and titrated by transducing HEK293T cells and assessing transgene expression by flow cytometry after 4 days. The results are shown in FIG. 6.


The best performing plasmid was taken forward for further optimisation experiments, to be tested in combination with a range of new helper plasmids. There is no sequence homology between the envelope plasmids and the other plasmids regardless of whether the wild-type or the codon optimised version of the RD114 sequence is used, and therefore the main consideration was the efficiency of gene expression and hence the titre achieved using the minimum amount of plasmid. The plasmid that outperformed others was chosen for further optimisation experiments: pSF_Ferritin_mEF1α_optRD114 (16).


Example 4: Helper Plasmid and Envelope Plasmid Design Providing Synergistic Effect

Following the selection of the most promising helper and envelope plasmids by the single plasmid substitution experiments described in Examples 2 and 3, a set of experiments was designed to assess the synergistic effects of combining different helper and envelope plasmids in different ratios. As decreasing the amount of plasmid required is desirable and ¼ of the standard amount produced vector of equivalent titre in the first set of experiments, a further plasmid amount of 1/16 was introduced. The two plasmids were therefore used in three amounts, designated “standard” (S), “low” (L) (¼ of standard) or “minimal” (M) (1/16 of standard).


As changing the helper plasmid resulted in a more significant increase in the vector titre compared to the envelope plasmid, the strong CMV promoter was prioritised for use in the helper plasmid. The same promoter cannot be used in both plasmids due to the need to avoid sequence homology during transfection, and as three of the best-performing helper plasmids contained the CMV promoter, the ferritin promoter presented an attractive option for inclusion in the envelope plasmid. Thus, the allowed helper/envelope plasmid combinations to be tested were:

    • 1) pSF_CMV_shufGAGPOL/pSF_Ferritin_mEF1α_optRD114-UCL
    • 2) pSF_CAG_shufGAGPOL/pSF_Ferritin_mEF1α_optRD114-UCL
    • 3) pSF_p565Prom_ShufGAGPOL/pSF_CMV_BGIntron_wtRD114 (or optRD114)


The plasmid combinations were tested by transient transfection using two different genome plasmids, pGEN1 and pGEN2, to ensure that any titre differences would be applicable to all vector types. The first small scale transient transfection was done using 10 cm plates and the pGEN1 genome plasmid in two independent experiments. The vectors were titrated by transducing HEK293T cells and assessing transgene expression by flow cytometry after four days. The results are shown in FIG. 8.


The testing of different plasmid combinations and ratios was repeated by larger scale vector production in two independent experiments using transient transfection in T175 flasks and the pGEN2 genome plasmid. The vectors were titrated by transducing HEK293T cells and assessing CAR expression by flow cytometry after four days. The results are shown in FIG. 9.


Example 5: Larger Scale Vector Manufacture with Multiple Plasmid Combinations and Ratios

Based on the titrations, the plasmid combination that most consistently produced the highest titre of vector with the lowest plasmid input was pSF_CMV_shufGAGPOL/pSF_Ferritin_mEF1α_optRD114. Further vector preparations were done in T175 flasks using the minimal amount (1/16 of standard) of each plasmid and using three different genome plasmids. As a control, vectors were produced alongside using the original helper and envelope plasmids in the standard amounts. The titres obtained with the shuffled/optimised versus original plasmids using different genome plasmids (GEN1 and GEN2) are summarised in Table 4 below.









TABLE 4
Comparison of vector titres achieved with original versus improved plasmids

GEN1 vector titres                                                      Prep3     Prep2     Prep4     Prep5     Average
----------------------------------------------------------------------  --------  --------  --------  --------  --------
Shuffled GagPol in helper plasmid & optimised Env in Envelope plasmids  1.36E+06  1.85E+06  2.96E+06  3.49E+06  2.41E+06
Original GagPol & Env plasmids                                          6.29E+05  4.28E+05  4.18E+05  1.06E+06  6.34E+05

GEN2 vector titres                                                      Prep1     Prep2     Prep4     Average
----------------------------------------------------------------------  --------  --------  --------  --------
Shuffled GagPol in helper plasmid & optimised Env in Envelope plasmids  1.55E+06  1.39E+06  2.32E+06  1.75E+06
Original GagPol & Env plasmids                                          1.36E+05  6.94E+04  4.18E+05  2.08E+05









Comparison of vector titres achieved with original versus the shuffled helper and optimised envelope plasmids. Both genome plasmids GEN1 and GEN2 showed increased average titre by about 1 log with the modified plasmids (shuffled and optimised) compared to the original plasmids.
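The averages and fold differences in Table 4 can be recomputed directly from the individual preparations. The sketch below does so with arithmetic means; the dictionary layout and function names are illustrative choices, while the titre values are those listed in Table 4.

```python
import math
from statistics import mean

# Titres (transducing units/ml) as listed in Table 4.
TABLE_4_TITRES = {
    "GEN1": {
        "modified": [1.36e6, 1.85e6, 2.96e6, 3.49e6],
        "original": [6.29e5, 4.28e5, 4.18e5, 1.06e6],
    },
    "GEN2": {
        "modified": [1.55e6, 1.39e6, 2.32e6],
        "original": [1.36e5, 6.94e4, 4.18e5],
    },
}


def fold_increase(modified, original):
    """Ratio of the mean titre with modified plasmids to the mean with originals."""
    return mean(modified) / mean(original)


fold_by_genome = {
    genome: fold_increase(titres["modified"], titres["original"])
    for genome, titres in TABLE_4_TITRES.items()
}
log10_by_genome = {genome: math.log10(fold) for genome, fold in fold_by_genome.items()}
```

On these figures the mean titre rises roughly 3.8-fold for GEN1 and roughly 8.4-fold for GEN2, that is about 0.6 and 0.9 log10 respectively.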


The experiment showed that high retroviral titres can be obtained with the shuffled GagPol and optimised Env sequences in the helper and envelope plasmids respectively, with transient transfection, even when the amount of the packaging plasmids was reduced. However, this resulted in an increase in the amount of genome plasmid required in order to maintain the same total amount of DNA used per transfection.


A lower genome plasmid requirement would benefit the production process by reducing the plasmid contamination at the end and reducing the cost of production. Therefore, further experiments were carried out to assess the effect of genome plasmid amount reduction on vector titre.




The standard genome plasmid amount to complement the low amounts of the optimised packaging plasmids was designated as 100% per T175 flask, and a range of genome plasmid amounts from 20-80% was tested. Vectors were produced in two independent experiments using GEN1, GEN2 and GEN3 genome plasmids and titrated by transducing HEK293T cells and assessing transgene expression by flow cytometry after four days. The results are summarised in FIG. 10.


The results show that the amount of genome plasmid per flask can be reduced from 100% to 60% without a marked drop in titre. The trend is consistent when using different CAR genomes and can therefore be applied for use in vector production.


The plasmid system described herein utilises plasmid combinations and amounts with the codon-shuffled Gag-pol packaging plasmid pSF_CMV_shufGAGPOL_RabpA at 1/16th of the standard amount, codon-optimised Env plasmid pSF_Ferritin_mEF1α_optRD114 at 1/16th of the standard amount, and the vector genome plasmid at 60% of the standard amount.


Example 6: Large Scale Vector Production, Titre and Safety Data

The optimised plasmids were used for large scale vector production under Good Manufacturing Practice (GMP) conditions. The titres achieved were higher than those with the original plasmids, as shown in Table 5.









TABLE 5
Comparison of GMP grade vector titres

                    Original GagPol & Env plasmids  Optimised GagPol & Env plasmids
------------------  ------------------------------  -------------------------------
GEN1 vector titres  7.4E+05                         3.28E+06










Example 7: Testing for Presence of Replication Competent Retrovirus (RCR)

Two assays were carried out which detect not only true RCR but also potential replication-deficient, RT (reverse transcriptase)-positive particle-producing virus or virus-like particles, and which can hence also detect replication-deficient recombination events.


The first was a cell-based assay (PG4 S+/L− end point co-culture), which quantifies the number of replication competent virions by counting the distinct foci formed by the cytopathic effects of the RCR on the PG4 cells. PG4 cells can be used to detect and assay retroviruses and are currently recommended for the detection of replication competent retroviruses in gene therapy products.


The second assay was a molecular assay, QFPERT (quantitative fluorescent product enhanced reverse transcriptase), which quantifies the level of reverse transcriptase (RT) activity in the supernatant of transduced and serially passaged detector cells.


The GMP GEN1 vector batch produced using the original helper plasmids produced a negative signal for the PG4 co-culture assay but a positive signal for the QFPERT assay, albeit at 3-4 logs less RT activity than the positive control. The results of the QFPERT assays (original and re-test assay for the same vector batch sample) are shown in Table 6.









TABLE 6
Results of QFPERT assay for GEN1 vector using original plasmids

                                   Average CT value
Sample                             PP5, Assay 1  PP5, Assay 2  PP6, Assay 2  Result
---------------------------------  ------------  ------------  ------------  ------
Gen1_Original Test article 1       29.82         26.44         27.75         +
Gen1_Original Test article 2       30.9          26.52         28.85         +
Negative control A                 38.66         37.43         36.48
Negative control B                 38.15         35.74         36.12
Cultured positive control A        16.26         14.18         14.91         +
Cultured positive control B        19.77         14.79         16.14         +
Cultured spiked Test article       18.24         17.74         16.19         +

---------------------------------  ------  ------  ------
10E-2 units RT +ve control enzyme  14.04   14.26   +
10E-3 units RT +ve control enzyme  17.43   17.74   +
10E-4 units RT +ve control enzyme  20.85   22.13   +
10E-5 units RT +ve control enzyme  25.79   25.13   +
10E-6 units RT +ve control enzyme  27.87   28.26   +
10E-7 units RT +ve control enzyme  31.94   32.61   +
NTC                                38.97   UD
DMEM                               37.39   35.37
MLV recovery control 10 IU         27.84   21.55










Test articles 1 and 2 are derived from duplicate cultures; PP5 and PP6 refer to the passage point at which supernatant was harvested. The positive control is derived from a culture transduced with 100 IU of positive control virus.
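As a rough cross-check of the "3-4 logs" comparison, the RT enzyme standards in Table 6 can be treated as a standard curve. The sketch below is illustrative only and is not part of the described assay. It assumes that the labels 10E-2 to 10E-7 denote successive ten-fold dilutions, that the first CT value listed for each standard belongs to Assay 1, and that CT is approximately linear in log10 of RT units. The function names are hypothetical.

```python
# Illustrative sketch: estimate how many logs of RT activity separate
# two samples, using the RT enzyme standards of Table 6 as a standard curve.
# Assumption: each tuple is (log10 label of the standard, first listed CT value).
standards = [
    (-2, 14.04),
    (-3, 17.43),
    (-4, 20.85),
    (-5, 25.79),
    (-6, 27.87),
    (-7, 31.94),
]


def fit_line(points):
    """Ordinary least-squares fit of CT = intercept + slope * log10(units)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def logs_less_activity(ct_sample, ct_reference, slope):
    """Logs of RT activity by which the sample falls below the reference.

    The slope is negative (higher CT means less activity), so dividing the
    CT difference by -slope gives a positive number when the sample is weaker.
    """
    return (ct_sample - ct_reference) / -slope


slope, intercept = fit_line(standards)

# Gen1_Original Test article 1 versus Cultured positive control A (Assay 1, PP5).
logs_below = logs_less_activity(29.82, 16.26, slope)

print(f"cycles per ten-fold dilution: {-slope:.2f}")
print(f"test article 1 is about {logs_below:.1f} logs below positive control A")
```

Under these assumptions the fitted slope is a little over three and a half cycles per ten-fold dilution, and test article 1 falls between 3 and 4 logs below the cultured positive control, in line with the range stated above.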


GEN1 vector was then manufactured using the new optimised plasmids, and the results of the QFPERT assay are shown in Table 7. Test articles 1-9 are derived from replicate cultures. The four test article repeats that initially produced a low positive signal are shown in italics. The same samples did not generate a positive signal in two repeat assays.


These data show that the risk of recombination events between plasmids in the three-plasmid system is reduced in the GEN1-optimised plasmid system described herein, versus the GEN1-original plasmid system.









TABLE 7

Results of QFPERT assay for GEN1 vector using new optimised plasmids

Sample                                PP5        PP5        PP6        Result
                                      Average    Average    Average
                                      CT value   CT value   CT value
                                      Assay 1    Assay 2    Assay 2

Gen1_Opt Test article 1               35.68
Gen1_Opt Test article 2               32.97
Gen1_Opt Test article 3 *             28.90      35.76      36.81
Gen1_Opt Test article 4 *             31.20      37.85      UD
Gen1_Opt Test article 5 *             30.83      35.23      39.93
Gen1_Opt Test article 6               36.96
Gen1_Opt Test article 7 *             30.70      36.00      35.83
Gen1_Opt Test article 8               36.10
Gen1_Opt Test article 9               38.14
Negative control A                    38.08      UD         UD
Negative control B                    36.91      UD         UD
Cultured spiked Test article          15.49      21.16      23.89      +

10E-2 units RT +ve control enzyme     14.8       15.44                 +
10E-3 units RT +ve control enzyme     18.3       17.95                 +
10E-4 units RT +ve control enzyme     21.8       22.83                 +
10E-5 units RT +ve control enzyme     25.0       24.97                 +
10E-6 units RT +ve control enzyme     28.9       28.24                 +
10E-7 units RT +ve control enzyme     32.2       32.38                 +
NTC                                   36.09      37.9
DMEM                                  36.42      UD
MLV recovery control 10 IU            24.3       30.82                 +

* Rows set in italics in the original table.

Methods


The plasmids were produced over three independent experiments and titrated by transducing HEK293T cells and assessing transgene expression by flow cytometry after 4 days.


The helper and envelope plasmids were synthesised using the pSF molecular cloning plasmid vector. The various genomic component sequences were retrieved from public databases, synthesised, and cloned into the plasmids.


The helper plasmid pSF_CMV_shufGagPol_RabpA comprised the CMV promoter, the codon shuffled Gag and Pol coding sequences, and the rabbit globin polyadenylation site.
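The effect of codon shuffling can be illustrated on a short stretch of sequence. The sketch below is a hedged example, not the method used to design the plasmids: it takes the first 54 nucleotides of SEQ ID NO: 1 (a codon-shuffled variant) and of SEQ ID NO: 4 (wild type) from the sequence listing, translates both with the standard genetic code, and reports the fraction of identical nucleotide positions. The helper names `translate` and `identity` are hypothetical, and comparing against the wild-type coding sequence is used here only as a stand-in for homology with the genome plasmid.

```python
# Illustrative sketch: synonymous codon changes keep the encoded protein
# the same while lowering nucleotide identity between two sequences.

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(seq_a, seq_b):
    """Fraction of positions carrying the same nucleotide in two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return matches / len(seq_a)


# First 54 nucleotides (18 codons) of SEQ ID NO: 1 and SEQ ID NO: 4.
SHUFFLED = "ATGGGCCAAACCGTGACCACCCCGCTGTCGCTGACTCTGGGGCATTGGAAGGAT"
WILD_TYPE = "ATGGGCCAGACTGTTACCACTCCCTTAAGTTTGACCTTAGGTCACTGGAAAGAT"

print(translate(SHUFFLED))
print(translate(WILD_TYPE))
print(f"nucleotide identity: {identity(SHUFFLED, WILD_TYPE):.0%}")
```

Both fragments translate to the same 18 amino acids, while only about two thirds of the nucleotide positions match. A whole-sequence comparison would need an alignment tool and is beyond this sketch.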


The Envelope plasmid pSF_Ferritin_mEF1α_optRD114_SV40pA comprised the promoter from the human ferritin gene (GenBank AP003733.5, nt 18029 . . . 17850), the murine EF1α 5′UTR (GenBank AC158987.3, nt 121232 . . . 120236), the codon optimised RD114 coding sequence (strain UCL) and the Simian Polyoma Virus 40 polyadenylation site.



FIGS. 1 and 2 show schematic representations of the shuffled GagPol plasmid pSF_CMV_shufGagPol_RabpA and the optimised Env plasmid pSF_Ferritin_mEF1α_optRD114_SV40pA, respectively. As demonstrated in Example 3, this combination of plasmids provided a synergistic effect.


A list of the nucleotide and amino acid sequences described herein is provided:










(Gag and Pol polyproteins codon-shuffled variant)



SEQ ID NO: 1



ATGGGCCAAACCGTGACCACCCCGCTGTCGCTGACTCTGGGGCATTGGAAGGATG






TGGAACGCATCGCCCACAACCAGAGCGTGGACGTGAAGAAGCGCCGCTGGGTGA





CCTTCTGCTCCGCAGAATGGCCTACCTTTAACGTGGGGTGGCCTCGGGACGGCAC





CTTCAATCGGGACCTGATCACCCAGGTGAAAATCAAGGTGTTCAGCCCGGGTCCG





CACGGCCATCCAGATCAAGTCCCGTACATCGTGACTTGGGAAGCCCTGGCGTTCG





ACCCCCCACCGTGGGTCAAACCATTCGTCCACCCGAAGCCACCGCCACCCCTGCC





GCCGTCGGCGCCCTCACTGCCGCTGGAACCTCCGAGATCGACTCCTCCGAGATCA





TCGCTCTACCCGGCGCTCACTCCGAGCCTGGGCGCAAAGCCAAAGCCGCAAGTGC





TGTCCGATTCGGGAGGACCTCTCATCGACCTGCTCACCGAGGACCCTCCACCCTA





CAGAGATCCGCGCCCTCCCCCGAGCGACAGGGACGGGAACGGCGGGGAGGCCA





CCCCGGCAGGAGAAGCCCCGGACCCAAGCCCTATGGCGTCAAGACTCAGAGGCA





GAAGAGAACCTCCGGTGGCAGACTCGACTACTTCGCAGGCATTCCCACTGCGCGC





CGGGGGAAATGGCCAGCTGCAGTACTGGCCGTTCAGCTCATCGGACCTCTACAAT





TGGAAGAACAACAATCCCTCGTTCTCGGAGGACCCTGGTAAACTAACCGCTTTGAT





CGAATCGGTCCTGATTACCCACCAGCCGACCTGGGACGACTGCCAGCAGCTCCTG





GGCACTCTGCTGACCGGAGAGGAAAAACAAAGAGTGCTGCTGGAAGCACGGAAGG





CAGTGCGCGGGGATGATGGCAGGCCGACCCAGCTCCCGAACGAGGTGGACGCTG





CCTTCCCACTGGAACGCCCAGATTGGGACTACACCACCCAAGCTGGAAGAAACCA





CCTGGTCCATTACCGCCAACTGCTGCTGGCAGGACTCCAAAACGCAGGACGGTCC





CCTACTAACCTGGCCAAGGTGAAAGGGATTACTCAAGGCCCGAACGAGTCGCCGA





GCGCGTTCCTAGAGCGCCTAAAAGAGGCCTACCGGCGCTACACCCCATATGACCC





AGAGGACCCAGGACAGGAAACCAATGTGAGCATGTCATTCATCTGGCAGTCAGCC





CCCGACATCGGACGCAAGCTGGAACGCCTGGAAGACCTGAAGAATAAAACGCTCG





GCGATCTGGTGCGGGAAGCAGAGAAGATTTTCAATAAACGGGAAACCCCGGAAGA





GCGGGAGGAACGCATCCGGCGCGAGACCGAAGAAAAGGAGGAACGCAGACGCAC





CGAGGATGAACAGAAGGAGAAGGAGAGAGACCGCCGCCGGCACCGCGAAATGTC





GAAACTGCTGGCCACGGTGGTCAGCGGTCAGAAGCAGGATCGCCAAGGAGGCGA





GCGCAGAAGATCGCAACTGGATCGCGACCAGTGCGCCTACTGCAAGGAGAAGGG





GCACTGGGCGAAAGATTGTCCCAAGAAACCACGAGGACCTCGGGGACCAAGACCC





CAGACCTCCCTCCTGACCCTAGATGACTAGGGAGGTCAGGGTCAGGAGCCCCCCC





CTGAACCCAGGATAACCCTCAAAGTCGGGGGGCAACCCGTCACCTTCCTGGTGGA





CACCGGCGCGCAGCACAGCGTGCTGACCCAAAACCCGGGACCTCTGTCAGACAAG





TCCGCCTGGGTGCAGGGCGCAACTGGAGGGAAGCGGTATCGGTGGACCACTGAT





CGCAAAGTGCACCTGGCAACGGGAAAAGTGACCCATTCATTTCTGCACGTGCCGG





ACTGCCCGTACCCGCTTCTGGGACGCGACCTCCTGACTAAGCTCAAGGCACAGAT





CCACTTCGAGGGATCAGGAGCGCAGGTCATGGGACCTATGGGACAACCATTGCAG





GTCCTGACCTTGAACATCGAAGACGAGTACAGGCTGCACGAGACTAGCAAGGAAC





CTGACGTGTCGCTGGGGAGCACCTGGCTGTCGGACTTTCCCCAAGCCTGGGCAGA





GACCGGAGGAATGGGGCTCGCGGTCAGACAGGCACCACTCATCATCCCACTCAAG





GCCACCTCCACCCCGGTCTCAATTAAGCAATACCCGATGTCGCAGGAAGCCCGCC





TCGGAATCAAGCCGCATATTCAACGCCTCCTGGACCAAGGGATTCTGGTGCCGTG





CCAGTCGCCGTGGAACACCCCACTATTGCCGGTCAAGAAGCCTGGAACTAACGATT





ACAGGCCGGTGCAGGACCTGCGGGAAGTGAACAAACGGGTGGAGGACATCCACC





CGACCGTGCCGAATCCGTACAACCTTCTGTCCGGACTCCCTCCCTCACATCAGTGG





TACACTGTGCTCGACCTTAAGGACGCGTTCTTCTGCCTGCGCCTGCATCCGACGTC





ACAGCCGTTGTTCGCTTTCGAGTGGCGCGATCCCGAAATGGGTATCTCGGGCCAA





CTGACTTGGACTCGGCTGCCACAAGGATTCAAGAACTCGCCAACTCTGTTTGATGA





AGCTCTACACCGCGACCTGGCCGACTTCAGAATCCAACACCCGGACCTGATCCTG





CTTCAATACGTGGATGACCTGCTGCTCGCCGCGACTTCCGAGCTGGACTGTCAGC





AGGGCACTAGAGCACTGCTACAGACCTTGGGTAATCTGGGATACAGAGCAAGCGC





CAAGAAAGCTCAGATTTGCCAAAAGCAAGTGAAGTACCTGGGCTACCTTCTCAAAG





AAGGCCAGAGATGGCTGACCGAAGCCAGAAAGGAGACCGTGATGGGACAACCGA





CCCCTAAAACCCCTCGGCAGCTGCGCGAGTTCCTGGGAACCGCAGGCTTCTGCCG





CCTGTGGATTCCCGGATTCGCAGAGATGGCCGCCCCGCTATACCCTCTGACCAAG





ACCGGAACCCTGTTTAATTGGGGACCTGACCAGCAGAAGGCGTACCAAGAGATCA





AGCAAGCCCTGCTGACCGCCCCTGCCCTCGGACTGCCGGACCTGACTAAGCCCTT





TGAGCTGTTCGTGGACGAGAAGCAAGGATACGCAAAGGGCGTCCTGACTCAGAAG





CTGGGACCGTGGAGAAGACCGGTCGCGTACCTGTCCAAGAAGCTGGACCCGGTG





GCCGCTGGATGGCCACCGTGCCTGCGGATGGTGGCTGCCATTGCTGTGCTCACCA





AGGACGCAGGCAAGCTGACTATGGGACAGCCACTGGTGATCCTCGCACCGCACGC





CGTGGAGGCTCTGGTGAAACAGCCTCCTGACCGGTGGCTGTCCAATGCGCGCATG





ACTCATTACCAGGCCCTGCTCCTAGACACCGATCGGGTGCAGTTCGGACCAGTGG





TGGCACTGAACCCAGCAACTCTGCTGCCGCTGCCGGAAGAGGGGTTGCAGCACGA





CTGCCTGGACATCCTCGCAGAAGCTCACGGAACGCGGTCCGACCTTACCGACCAA





CCACTGCCCGATGCTGATCACACTTGGTACACTGATGGGTCATCATTCCTGCAAGA





AGGCCAGCGCAAAGCAGGGGCTGCAGTGACTACCGAAACTGAAGTCATTTGGGCT





CGGGCACTGCCGGCGGGGACGTCGGCACAGCGGGCGGAACTCATCGCACTCACC





CAGGCGCTGAAGATGGCCGAGGGCAAAAAGCTGAACGTGTACACCGACTCAAGAT





ACGCGTTCGCAACTGCACATATCCACGGGGAGATTTACAGACGGCGCGGTCTGCT





GACTTCGGAGGGCAAGGAAATCAAAAACAAGGACGAGATCCTGGCGCTCCTGAAA





GCCCTGTTCCTGCCAAAGCGGCTGTCAATCATCCACTGCCCTGGCCATCAGAAGG





GTAACTCCGCTGAAGCCAGGGGAAACCGCATGGCCGATCAAGCCGCGCGCGAGG





TCGCTACCAGAGAGACCCCCGGAACTTCGACGCTGCTTATCGAGAACTCCACGCC





ATACACCCACGAGCACTTTCACTACACTGTCACCGACACTAAGGATCTAACTAAGCT





GGGTGCCACTTATGATAGCGCAAAGAAGTACTGGGTGTACCAGGGGAAGCCTGTG





ATGCCCGATCAGTTCACCTTCGAGCTGCTGGATTTCCTGCATCAACTGACGCACCT





GAGCTTCTCAAAGACCAAGGCTCTGCTGGAACGCAGCCCTTCGCCGTACTATATGT





TGAATAGGGATCGCACCCTGAAGAATATCACCGAAACCTGCAAGGCCTGCGCCCA





GGTGAATGCTTCCAAGTCCGCCGTGAAGCAGGGCACCCGCGTCCGCGGACACCG





CCCTGGAACTCACTGGGAGATCGACTTCACTGAGGTGAAACCGGGCCTTTACGGC





TACAAATACCTGCTGGTGTTCGTGGACACTTTCTCGGGATGGATCGAGGCCTTCCC





GACTAAAAAGGAAACTGCAAAAGTGGTGACTAAGAAGCTGCTGGAGGAGATTTTCC





CCCGCTTTGGCATGCCGCAGGTATTGGGAACTGACAATGGGCCTGCCTTCGTCTC





CAAGGTGAGTCAGACAGTGGCCGATCTGTTGGGGATTGATTGGAAATTACATTGTG





CATACAGACCCCAAAGCTCAGGTCAGGTAGAAAGAATGAATAGGACCATCAAGGAG





ACTTTAACTAAATTAACGCTTGCAACTGGCTCTAGAGACTGGGTGCTCCTACTCCCC





TTAGCCCTGTACCGAGCCCGCAACACGCCGGGCCCCCATGGCCTCACCCCATATG





AGATCTTATATGGGGCACCCCCGCCCCTTGTAAACTTCCCTGACCCTGACATGACC





AGAGTTACTAACAGCCCCTCTCTCCAAGCTCACTTACAGGCTCTCTACTTAGTCCAG





CACGAAGTTTGGAGACCACTGGCGGCAGCTTACCAAGAACAACTGGACCGGCCGG





TGGTGCCTCACCCTTACCGGGTCGGCGACACAGTGTGGGTCCGCCGACATCAAAC





CAAGAACCTAGAACCTCGCTGGAAAGGACCTTACACAGTCCTGCTGACCACCCCCA





CCGCCCTCAAAGTAGACGGTATCGCAGCTTGGATACACGCAGCCCACGTAAAGGC





GGCCGACACCGAGAGTGGACCATCCTCTGGACGGACATGGCGCGTTCAACGCTCT





CAAAACCCCCTCAAGATAAGATTAACCCGTGGAAGCCCTTAG





(Gag and Pol polyproteins codon-shuffled variant)


SEQ ID NO: 2



ATGGGTCAGACGGTGACTACTCCGCTTTCACTCACACTCGGTCATTGGAAAGACGT






TGAGCGAATCGCGCACAATCAGAGTGTGGACGTAAAAAAGCGCCGCTGGGTTACG





TTCTGTTCTGCTGAGTGGCCTACGTTTAATGTAGGGTGGCCAAGGGATGGCACTTT





CAACAGGGATCTGATAACACAGGTAAAGATAAAAGTTTTTAGTCCAGGCCCACACG





GTCATCCTGATCAGGTGCCTTACATTGTAACATGGGAAGCACTGGCGTTTGATCCT





CCGCCGTGGGTTAAACCCTTTGTACACCCCAAACCCCCTCCACCACTCCCACCCTC





TGCCCCATCATTGCCGTTGGAACCGCCCAGGTCTACGCCCCCCCGCTCATCCCTTT





ACCCTGCTCTGACACCTAGCCTTGGTGCCAAACCCAAGCCACAAGTGCTCTCAGAC





AGCGGCGGCCCTCTGATAGATTTGCTGACTGAAGATCCGCCTCCTTATCGCGACCC





GCGGCCTCCACCGTCAGATAGGGATGGCAATGGCGGCGAAGCCACACCCGCAGG





TGAGGCCCCTGATCCAAGTCCCATGGCTTCTCGACTTCGAGGCCGACGGGAGCCG





CCTGTCGCTGATAGTACGACTTCACAAGCATTCCCTTTGAGAGCGGGGGGGAATG





GGCAATTGCAATATTGGCCCTTTAGCAGCAGTGACCTGTACAATTGGAAAAACAATA





ACCCTTCTTTTAGTGAGGATCCTGGTAAGCTTACGGCTTTGATAGAATCCGTGCTTA





TTACACATCAGCCGACATGGGATGACTGCCAACAACTCTTGGGTACATTGCTGACG





GGTGAAGAAAAACAGCGCGTGCTCTTGGAAGCCAGGAAAGCTGTACGCGGCGACG





ACGGTCGGCCCACACAGCTTCCTAACGAAGTCGACGCCGCTTTTCCTCTCGAGCG





GCCAGATTGGGATTACACAACCCAGGCCGGCCGGAACCATTTGGTACATTACCGG





CAACTCTTGTTGGCAGGGTTGCAAAACGCTGGTCGGAGCCCCACGAACTTGGCGA





AAGTGAAGGGTATCACCCAAGGCCCAAACGAGTCACCTTCAGCTTTTCTCGAACGA





CTTAAAGAAGCCTACAGACGATACACTCCGTACGATCCAGAGGACCCGGGCCAGG





AAACCAACGTATCTATGTCTTTCATTTGGCAGAGCGCTCCAGACATCGGGCGAAAA





CTGGAACGCCTCGAAGACCTGAAGAATAAAACTCTCGGTGACCTCGTTCGCGAAGC





CGAGAAAATTTTTAATAAAAGAGAAACTCCGGAAGAGCGCGAGGAAAGAATTAGGC





GCGAGACGGAGGAAAAAGAAGAACGGAGGAGAACCGAGGACGAACAAAAGGAGA





AAGAGCGAGACCGACGGCGCCACAGAGAAATGAGCAAACTGCTTGCCACCGTGGT





GAGCGGTCAAAAGCAAGACCGACAGGGAGGGGAGCGGAGACGAAGTCAGCTCGA





CAGGGACCAGTGTGCTTATTGTAAAGAAAAGGGCCACTGGGCTAAAGACTGCCCC





AAAAAACCGAGAGGCCCCAGGGGTCCGAGACCGCAGACCTCTTTGTTGACTTTGG





ATGATTAAGGCGGACAGGGTCAAGAGCCTCCACCGGAACCACGCATAACTCTCAAA





GTGGGAGGCCAGCCAGTAACGTTTCTCGTCGACACAGGAGCACAACATTCAGTTCT





TACTCAAAACCCAGGGCCGCTGAGTGACAAGTCTGCTTGGGTGCAGGGAGCTACT





GGAGGGAAGCGGTACCGGTGGACGACGGACCGGAAAGTGCATCTGGCGACGGGT





AAAGTAACACACTCTTTCTTGCATGTACCGGATTGCCCCTACCCACTTCTCGGCCG





CGACTTGCTTACAAAACTTAAAGCTCAGATCCATTTCGAGGGAAGCGGGGCTCAGG





TAATGGGCCCGATGGGGCAGCCTCTTCAGGTCCTGACCTTGAATATCGAAGACGA





GTATCGCTTGCATGAAACCTCTAAGGAACCTGATGTGTCTCTGGGGTCAACGTGGC





TGTCCGACTTTCCTCAGGCATGGGCTGAAACCGGAGGCATGGGTTTGGCGGTCAG





ACAGGCACCGCTTATTATTCCCCTTAAGGCGACGTCTACGCCCGTCTCAATAAAAC





AATACCCAATGTCTCAAGAAGCCCGGCTGGGAATCAAGCCTCACATTCAAAGACTG





CTCGATCAGGGCATCCTCGTCCCTTGCCAGAGCCCGTGGAATACGCCTCTGTTGC





CGGTGAAGAAGCCCGGCACGAATGACTATCGGCCTGTCCAGGACCTCCGGGAAGT





GAACAAGAGAGTGGAGGACATACACCCTACAGTGCCCAATCCCTATAATCTGCTGT





CCGGTCTCCCTCCTTCCCATCAATGGTATACGGTCCTCGATCTGAAGGATGCCTTT





TTTTGTCTTAGGCTTCACCCTACGTCTCAACCCCTCTTcGCCTTCGAGTGGCGCGA





TCCCGAAATGGGGATCAGCGGACAACTTACTTGGACTAGGCTTCCCCAGGGGTTC





AAAAATAGTCCCACACTGTTCGATGAGGCTCTGCACAGGGACTTGGCGGATTTCCG





GATACAACACCCTGACCTCATTTTGCTTCAATATGTCGACGATCTTCTCCTGGCGGC





CACATCTGAACTCGATTGCCAACAAGGAACTAGGGCTCTTCTGCAAACTCTCGGAA





ACTTGGGTTATCGGGCTAGTGCAAAAAAGGCTCAGATATGCCAGAAACAAGTAAAG





TACCTCGGCTATCTCCTGAAAGAAGGGCAACGGTGGCTCACAGAAGCAAGGAAGG





AAACGGTGATGGGCCAGCCAACTCCGAAAACGCCCCGACAGTTGAGAGAGTTCCT





GGGTACAGCGGGGTTTTGCCGACTCTGGATCCCGGGCTTTGCGGAAATGGCCGCC





CCACTGTATccGCTTACCAAGACGGGAACGCTTTTTAACTGGGGGCCTGACCAACA





AAAGGCATACCAGGAAATCAAGCAAGCACTGCTCACAGCTCCAGCGCTCGGTCTC





CCGGACTTGACTAAACCCTTTGAACTTTTTGTTGATGAGAAGCAAGGCTATGCAAAG





GGCGTGCTTACACAGAAGTTGGGTCCATGGAGAAGGCCGGTTGCCTATTTGTCCAA





AAAACTGGACCCTGTGGCAGCTGGCTGGCCCCCATGCTTGAGGATGGTAGCTGCC





ATAGCTGTGCTGACCAAGGACGCAGGGAAACTTACCATGGGCCAACCTCTTGTGAT





ACTTGCACCGCATGCTGTTGAAGCCCTGGTCAAGCAACCGCCGGACCGCTGGCTC





TCTAACGCGAGGATGACGCACTACCAAGCTTTGCTCCTCGACACGGACCGGGTCC





AATTCGGTCCTGTCGTCGCGCTCAATCCCGCGACACTCCTCCCCCTTCCTGAGGAA





GGGCTGCAACATGACTGTCTCGACATACTTGCAGAAGCACACGGCACGCGGTCAG





ACTTGACAGACCAGCCTCTCCCTGATGCCGACCACACTTGGTATACCGATGGCAGT





AGTTTTTTGCAGGAAGGTCAGCGAAAGGCTGGCGCCGCAGTCACCACAGAAACTG





AGGTAATTTGGGCGAGGGCTCTCCCAGCTGGGACATCTGCTCAACGCGCGGAACT





CATTGCACTCACCCAAGCCCTGAAGATGGCAGAAGGAAAAAAATTGAATGTCTACA





CTGATTCCCGGTATGCTTTTGCCACGGCGCATATCCATGGGGAGATATATCGACGC





CGAGGTCTGCTTACGTCTGAAGGTAAGGAGATTAAAAACAAAGACGAGATCCTCGC





CCTTCTGAAGGCACTGTTCTTGCCAAAAAGACTGAGTATCATACACTGTCCTGGACA





CCAGAAAGGTAATTCAGCCGAAGCGAGGGGTAACCGGATGGCAGATCAAGCAGCA





CGGGAAGTCGCTACCCGAGAAACCCCCGGAACCTCCACCCTTTTGATCGAGAACA





GTACTCCTTACACTCACGAGCATTTCCATTATACAGTGACGGACACGAAAGATTTGA





CGAAACTGGGTGCAACGTACGATAGTGCAAAAAAATACTGGGTATATCAGGGCAAA





CCCGTGATGCCTGACCAGTTCACGTTCGAGCTTCTGGATTTCCTCCACCAGCTTAC





GCATTTGTCTTTTTCCAAGACGAAAGCGCTTCTGGAACGGTCTCCGTCCCCATATTA





TATGTTGAATAGAGATAGGACCTTGAAAAATATAACAGAAACCTGCAAGGCTTGTGC





TCAAGTGAATGCTTCCAAGAGCGCAGTCAAACAAGGTACGAGGGTCAGAGGCCAC





AGGCCAGGAACCCATTGGGAGATCGACTTCACTGAGGTGAAACCAGGCCTTTACG





GCTACAAGTACCTTCTTGTTTTTGTTGATACGTTCTCCGGCTGGATCGAGGCCTTTC





CAACTAAGAAGGAGACTGCGAAAGTGGTCACAAAGAAACTCCTGGAAGAAATCTTC





CCGCGCTTTGGGATGCCTCAGGTCCTTGGGACCGATAACGGGCCTGCTTTTGTATC





CAAAGTCAGCCAAACAGTCGCCGACCTCTTGGGAATCGATTGGAAACTGCACTGTG





CCTATCGCCCCCAGTCAAGCGGCCAAGTAGAAAGGATGAACAGGACAATCAAAGA





AACTCTCACCAAGCTGACTTTGGCAACTGGGTCACGCGACTGGGTCTTGCTTTTGC





CACTTGCTCTTTACCGCGCTCGCAACACACCCGGTCCCCACGGTCTCACTCCATAT





GAGATTTTGTATGGCGCACCACCCCCTCTCGTGAATTTTCCCGATCCTGACATGAC





GAGGGTCACCAACTCTCCCTCTTTGCAGGCTCATCTTCAGGCGCTTTATCTTGTGC





AGCACGAGGTTTGGAGACCTCTTGCAGCTGCATACCAAGAACAGCTTGACAGGCCT





GTCGTGCCACATCCGTACCGGGTCGGAGATACGGTATGGGTAAGGAGACACCAAA





CTAAAAACCTGGAGCCAAGATGGAAAGGGCCTTATACTGTTCTCCTGACTACGCCT





ACTGCTCTCAAGGTTGATGGCATAGCAGCCTGGATTCATGCGGCCCATGTTAAGGC





TGCAGATACAGAATCCGGTCCCTCATCCGGAAGGACATGGCGGGTTCAAAGGTCC





CAAAACCCCCTCAAAATTCGACTCACACGCGGCTCCCCGTAA





(Gag and Pol polyprotein codon-shuffled variant)


SEQ ID NO: 3



ATGGGCCAGACCGTGACCACCCCCCTGAGCCTGACCCTGGGCCACTGGAAGGAC






GTGGAGCGCATCGCCCACAACCAGAGCGTGGACGTGAAGAAGCGCCGCTGGGTG





ACCTTCTGCAGCGCCGAGTGGCCCACCTTCAACGTGGGCTGGCCCCGCGACGGC





ACCTTCAACCGCGACCTGATCACCCAGGTGAAGATCAAGGTGTTCAGCCCCGGCC





CCCACGGCCACCCCGACCAGGTGCCCTACATCGTGACCTGGGAGGCCCTGGCCTT





CGACCCCCCCCCCTGGGTGAAGCCCTTCGTGCACCCCAAGCCCCCCCCCCCCCT





GCCCCCCAGCGCCCCCAGCCTGCCCCTGGAGCCCCCCCGCAGCACCCCCCCCCG





CAGCAGCCTGTACCCCGCCCTGACCCCCAGCCTGGGCGCCAAGCCCAAGCCCCA





GGTGCTGAGCGACAGCGGCGGCCCCCTGATCGACCTGCTGACCGAGGACCCCCC





CCCCTACCGCGACCCCCGCCCCCCCCCCAGCGACCGCGACGGCAACGGCGGCGA





GGCCACCCCCGCCGGCGAGGCCCCCGACCCCAGCCCCATGGCCAGCCGCCTGC





GCGGCCGCCGCGAGCCCCCCGTGGCCGACAGCACCACCAGCCAGGCCTTCCCCC





TGCGCGCCGGCGGCAACGGCCAGCTGCAGTACTGGCCCTTCAGCAGCAGCGACC





TGTACAACTGGAAGAACAACAACCCCAGCTTCAGCGAGGACCCCGGCAAGCTGAC





CGCCCTGATCGAGAGCGTGCTGATCACCCACCAGCCCACCTGGGACGACTGCCAG





CAGCTGCTGGGCACCCTGCTGACCGGCGAGGAGAAGCAGCGCGTGCTGCTGGAG





GCCCGCAAGGCCGTGCGCGGCGACGACGGCCGCCCCACCCAGCTGCCCAACGA





GGTGGACGCCGCCTTCCCCCTGGAGCGCCCCGACTGGGACTACACCACCCAGGC





CGGCCGCAACCACCTGGTGCACTACCGCCAGCTGCTGCTGGCCGGCCTGCAGAA





CGCCGGCCGCAGCCCCACCAACCTGGCCAAGGTGAAGGGCATCACCCAGGGCCC





CAACGAGAGCCCCAGCGCCTTCCTGGAGCGCCTGAAGGAGGCCTACCGCCGCTA





CACCCCCTACGACCCCGAGGACCCCGGCCAGGAGACCAACGTGAGCATGAGCTTC





ATCTGGCAGAGCGCCCCCGACATCGGCCGCAAGCTGGAGCGCCTGGAGGACCTG





AAGAACAAGACCCTGGGCGACCTGGTGCGCGAGGCCGAGAAGATCTTCAACAAGC





GCGAGACCCCCGAGGAGCGCGAGGAGCGCATCCGCCGCGAGACCGAGGAGAAG





GAGGAGCGCCGCCGCACCGAGGACGAGCAGAAGGAGAAGGAGCGCGACCGCCG





CCGCCACCGCGAGATGAGCAAGCTGCTGGCCACCGTGGTGAGCGGCCAGAAGCA





GGACCGCCAGGGCGGCGAGCGCCGCCGCAGCCAGCTGGACCGCGACCAGTGCG





CCTACTGCAAGGAGAAGGGCCACTGGGCCAAGGACTGCCCCAAGAAGCCCCGCG





GCCCCCGCGGCCCCCGCCCCCAGACCAGCCTGCTGACCCTGGACGACTAAGGCG





GCCAGGGCCAGGAGCCCCCCCCCGAGCCCCGCATCACCCTGAAGGTGGGCGGCC





AGCCCGTGACCTTCCTGGTGGACACCGGCGCCCAGCACAGCGTGCTGACCCAGAA





CCCCGGCCCCCTGAGCGACAAGAGCGCCTGGGTGCAGGGCGCCACCGGCGGCA





AGCGCTACCGCTGGACCACCGACCGCAAGGTGCACCTGGCCACCGGCAAGGTGA





CCCACAGCTTCCTGCACGTGCCCGACTGCCCCTACCCCCTGCTGGGCCGCGACCT





GCTGACCAAGCTGAAGGCCCAGATCCACTTCGAGGGCAGCGGCGCCCAGGTGAT





GGGCCCCATGGGCCAGCCCCTGCAGGTGCTGACCCTGAACATCGAGGACGAGTA





CCGCCTGCACGAGACCAGCAAGGAGCCCGACGTGAGCCTGGGCAGCACCTGGCT





GAGCGACTTCCCCCAGGCCTGGGCCGAGACCGGCGGCATGGGCCTGGCCGTGCG





CCAGGCCCCCCTGATCATCCCCCTGAAGGCCACCAGCACCCCCGTGAGCATCAAG





CAGTACCCCATGAGCCAGGAGGCCCGCCTGGGCATCAAGCCCCACATCCAGCGC





CTGCTGGACCAGGGCATCCTGGTGCCCTGCCAGAGCCCCTGGAACACCCCCCTGC





TGCCCGTGAAGAAGCCCGGCACCAACGACTACCGCCCCGTGCAGGACCTGCGCG





AGGTGAACAAGCGCGTGGAGGACATCCACCCCACCGTGCCCAACCCCTACAACCT





GCTGAGCGGCCTGCCCCCCAGCCACCAGTGGTACACCGTGCTGGACCTGAAGGA





CGCCTTCTTCTGCCTGCGCCTGCACCCCACCAGCCAGCCCCTGTTCGCCTTCGAG





TGGCGCGACCCCGAGATGGGCATCAGCGGCCAGCTGACCTGGACCCGCCTGCCC





CAGGGCTTCAAGAACAGCCCCACCCTGTTCGACGAGGCCCTGCACCGCGACCTGG





CCGACTTCCGCATCCAGCACCCCGACCTGATCCTGCTGCAGTACGTGGACGACCT





GCTGCTGGCCGCCACCAGCGAGCTGGACTGCCAGCAGGGCACCCGCGCCCTGCT





GCAGACCCTGGGCAACCTGGGCTACCGCGCCAGCGCCAAGAAGGCCCAGATCTG





CCAGAAGCAGGTGAAGTACCTGGGCTACCTGCTGAAGGAGGGCCAGCGCTGGCT





GACCGAGGCCCGCAAGGAGACCGTGATGGGCCAGCCCACCCCCAAGACCCCCCG





CCAGCTGCGCGAGTTCCTGGGCACCGCCGGCTTCTGCCGCCTGTGGATCCCCGG





CTTCGCCGAGATGGCCGCCCCCCTGTACCCCCTGACCAAGACCGGCACCCTGTTC





AACTGGGGCCCCGACCAGCAGAAGGCCTACCAGGAGATCAAGCAGGCCCTGCTG





ACCGCCCCCGCCCTGGGCCTGCCCGACCTGACCAAGCCCTTCGAGCTGTTCGTGG





ACGAGAAGCAGGGCTACGCCAAGGGCGTGCTGACCCAGAAGCTGGGCCCCTGGC





GCCGCCCCGTGGCCTACCTGAGCAAGAAGCTGGACCCCGTGGCCGCCGGCTGGC





CCCCCTGCCTGCGCATGGTGGCCGCCATCGCCGTGCTGACCAAGGACGCCGGCA





AGCTGACCATGGGCCAGCCCCTGGTGATCCTGGCCCCCCACGCCGTGGAGGCCC





TGGTGAAGCAGCCCCCCGACCGCTGGCTGAGCAACGCCCGCATGACCCACTACCA





GGCCCTGCTGCTGGACACCGACCGCGTGCAGTTCGGCCCCGTGGTGGCCCTGAA





CCCCGCCACCCTGCTGCCCCTGCCCGAGGAGGGCCTGCAGCACGACTGCCTGGA





CATCCTGGCCGAGGCCCACGGCACCCGCAGCGACCTGACCGACCAGCCCCTGCC





CGACGCCGACCACACCTGGTACACCGACGGCAGCAGCTTCCTGCAGGAGGGCCA





GCGCAAGGCCGGCGCCGCCGTGACCACCGAGACCGAGGTGATCTGGGCCCGCG





CCCTGCCCGCCGGCACCAGCGCCCAGCGCGCCGAGCTGATCGCCCTGACCCAGG





CCCTGAAGATGGCCGAGGGCAAGAAGCTGAACGTGTACACCGACAGCCGCTACGC





CTTCGCCACCGCCCACATCCACGGCGAGATCTACCGCCGCCGCGGCCTGCTGACC





AGCGAGGGCAAGGAGATCAAGAACAAGGACGAGATCCTGGCCCTGCTGAAGGCC





CTGTTCCTGCCCAAGCGCCTGAGCATCATCCACTGCCCCGGCCACCAGAAGGGCA





ACAGCGCCGAGGCCCGCGGCAACCGCATGGCCGACCAGGCCGCCCGCGAGGTG





GCCACCCGCGAGACCCCCGGCACCAGCACCCTGCTGATCGAGAACAGCACCCCC





TACACCCACGAGCACTTCCACTACACCGTGACCGACACCAAGGACCTGACCAAGCT





GGGCGCCACCTACGACAGCGCCAAGAAGTACTGGGTGTACCAGGGCAAGCCCGT





GATGCCCGACCAGTTCACCTTCGAGCTGCTGGACTTCCTGCACCAGCTGACCCAC





CTGAGCTTCAGCAAGACCAAGGCCCTGCTGGAGCGCAGCCCCAGCCCCTACTACA





TGCTGAACCGCGACCGCACCCTGAAGAACATCACCGAGACCTGCAAGGCCTGCGC





CCAGGTGAACGCCAGCAAGAGCGCCGTGAAGCAGGGCACCCGCGTGCGCGGCCA





CCGCCCCGGCACCCACTGGGAGATCGACTTCACCGAGGTGAAGCCCGGCCTGTA





CGGCTACAAGTACCTGCTGGTGTTCGTGGACACCTTCAGCGGCTGGATCGAGGCC





TTCCCCACCAAGAAGGAGACCGCCAAGGTGGTGACCAAGAAGCTGCTGGAGGAGA





TCTTCCCCCGCTTCGGCATGCCCCAGGTGCTGGGCACCGACAACGGCCCCGCCTT





CGTGAGCAAGGTGAGCCAGACCGTGGCCGACCTGCTGGGCATCGACTGGAAGCT





GCACTGCGCCTACCGCCCCCAGAGCAGCGGCCAGGTGGAGCGCATGAACCGCAC





CATCAAGGAGACCCTGACCAAGCTGACCCTGGCCACCGGCAGCCGCGACTGGGT





GCTGCTGCTGCCCCTGGCCCTGTACCGCGCCCGCAACACCCCCGGCCCCCACGG





CCTGACCCCCTACGAGATCCTGTACGGCGCCCCCCCCCCCCTGGTGAACTTCCCC





GACCCCGACATGACCCGCGTGACCAACAGCCCCAGCCTGCAGGCCCACCTGCAG





GCCCTGTACCTGGTGCAGCACGAGGTGTGGCGCCCCCTGGCCGCCGCCTACCAG





GAGCAGCTGGACCGCCCCGTGGTGCCCCACCCCTACCGCGTGGGCGACACCGTG





TGGGTGCGCCGCCACCAGACCAAGAACCTGGAGCCCCGCTGGAAGGGCCCCTAC





ACCGTGCTGCTGACCACCCCCACCGCCCTGAAGGTGGACGGCATCGCCGCCTGG





ATCCACGCCGCCCACGTGAAGGCCGCCGACACCGAGAGCGGCCCCAGCAGCGGC





CGCACCTGGCGCGTGCAGCGCAGCCAGAACCCCCTGAAGATCCGCCTGACCCGC





GGCAGCCCCTAA





SEQ ID NO: 4: Wildtype Gag and Pol polyproteins


ATGGGCCAGACTGTTACCACTCCCTTAAGTTTGACCTTAGGTCACTGGAAAGATGT





CGAGCGGATCGCTCACAACCAGTCGGTAGATGTCAAGAAGAGACGTTGGGTTACC





TTCTGCTCTGCAGAATGGCCAACCTTTAACGTCGGATGGCCGCGAGACGGCACCTT





TAACCGAGACCTCATCACCCAGGTTAAGATCAAGGTCTTTTCACCTGGccCGCATG





GACACCCAGACCAGGTCCCCTACATCGTGACCTGGGAAGCCTTGGCTTTTGACCC





CCCTCCCTGGGTCAAGCCCTTTGTACACCCTAAGCCTCCGCCTCCTCTTCCTCCAT





CCGCCCCGTCTCTCCCCCTTGAACCTCCTCGTTCGACCCCGCCTCGATCCTCCCTT





TATCCAGCCCTCACTCCTTCTCTAGGCGCCAAACCTAAACCTCAAGTTCTTTCTGAC





AGTGGGGGGCCGCTCATCGACCTACTTACAGAAGACCCCCCGCCTTATAGGGACC





CAAGACCACCCCCTTCCGACAGGGACGGAAATGGTGGAGAAGCGACCCCTGCGG





GAGAGGCACCGGACCCCTCCCCAATGGCATCTCGCCTACGTGGGAGACGGGAGC





CCCCTGTGGCCGACTCCACTACCTCGCAGGCATTCCCCCTCCGCGCAGGAGGAAA





CGGACAGCTTCAATACTGGCCGTTCTCCTCTTCTGACCTTTACAACTGGAAAAATAA





TAACCCTTCTTTTTCTGAAGATCCAGGTAAACTGACAGCTCTGATCGAGTCTGTTCT





CATCACCCATCAGCCCACCTGGGACGACTGTCAGCAGCTGTTGGGGACTCTGCTG





ACCGGAGAAGAAAAACAACGGGTGCTCTTAGAGGCTAGAAAGGCGGTGCGGGGC





GATGATGGGCGCCCCACTCAACTGCCCAATGAAGTCGATGCCGCTTTTCCCCTCGA





GCGCCCAGACTGGGATTACACCACCCAGGCAGGTAGGAACCACCTAGTCCACTAT





CGCCAGTTGCTCCTAGCGGGTCTCCAAAACGCGGGCAGAAGCCCCACCAATTTGG





CCAAGGTAAAAGGAATAACACAAGGGCCCAATGAGTCTCCCTCGGCCTTCCTAGAG





AGACTTAAGGAAGCCTATCGCAGGTACACTCCTTATGACCCTGAGGACCCAGGGCA





AGAAACTAATGTGTCTATGTCTTTCATTTGGCAGTCTGCCCCAGACATTGGGAGAAA





GTTAGAGAGGTTAGAAGATTTAAAAAACAAGACGCTTGGAGATTTGGTTAGAGAGG





CAGAAAAGATCTTTAATAAACGAGAAACCCCGGAAGAAAGAGAGGAACGTATCAGG





AGAGAAACAGAGGAAAAAGAAGAACGCCGTAGGACAGAGGATGAGCAGAAAGAGA





AAGAAAGAGATCGTAGGAGACATAGAGAGATGAGCAAGCTATTGGCCACTGTCGTT





AGTGGACAGAAACAGGATAGACAGGGAGGAGAACGAAGGAGGTCCCAACTCGATC





GCGACCAGTGTGCCTACTGCAAAGAAAAGGGGCACTGGGCTAAAGATTGTCCCAA





GAAACCACGAGGACCTCGGGGACCAAGACCCCAGACCTCCCTCCTGACCCTAGAT





GACTAGGGAGGTCAGGGTCAGGAGCCCCCCCCTGAACCCAGGATAACCCTCAAAG





TCGGGGGGCAACCCGTCACCTTCCTGGTAGATACTGGGGCCCAACACTCCGTGCT





GACCCAAAATCCTGGACCCCTAAGTGATAAGTCTGCCTGGGTCCAAGGGGCTACT





GGAGGAAAGCGGTATCGCTGGACCACGGATCGCAAAGTACATCTAGCTACCGGTA





AGGTCACCCACTCTTTCCTCCATGTACCAGACTGTCCCTATCCTCTGTTAGGAAGA





GATTTGCTGACTAAACTAAAAGCCCAAATCCACTTTGAGGGATCAGGAGCTCAGGT





TATGGGACCAATGGGGCAGCCCCTGCAAGTGTTGACCCTAAATATAGAAGATGAGT





ATCGGCTACATGAGACCTCAAAAGAGCCAGATGTTTCTCTAGGGTCCACATGGCTG





TCTGATTTTCCTCAGGCCTGGGCGGAAACCGGGGGCATGGGACTGGCAGTTCGCC





AAGCTCCTCTGATCATACCTCTGAAAGCAACCTCTACCCCCGTGTCCATAAAACAAT





ACCCCATGTCACAAGAAGCCAGACTGGGGATCAAGCCCCACATACAGAGACTGTT





GGACCAGGGAATACTGGTACCCTGCCAGTCCCCCTGGAACACGCCCCTGCTACCC





GTTAAGAAACCAGGGACTAATGATTATAGGCCTGTCCAGGATCTGAGAGAAGTCAA





CAAGCGGGTGGAAGACATCCACCCCACCGTGCCCAACCCTTACAACCTCTTGAGC





GGGCTCCCACCGTCCCACCAGTGGTACACTGTGCTTGATTTAAAGGATGCCTTTTT





CTGCCTGAGACTCCACCCCACCAGTCAGCCTCTCTTCGCCTTTGAGTGGAGAGATC





CAGAGATGGGAATCTCAGGACAATTGACCTGGACCAGACTCCCACAGGGTTTCAAA





AACAGTCCCACCCTGTTTGATGAGGCACTGCACAGAGACCTAGCAGACTTCCGGAT





CCAGCACCCAGACTTGATCCTGCTACAGTACGTGGATGACTTACTGCTGGCCGCCA





CTTCTGAGCTAGACTGCCAACAAGGTACTCGGGCCCTGTTACAAACCCTAGGGAAC





CTCGGGTATCGGGCCTCGGCCAAGAAAGCCCAAATTTGCCAGAAACAGGTCAAGT





ATCTGGGGTATCTTCTAAAAGAGGGTCAGAGATGGCTGACTGAGGCCAGAAAAGA





GACTGTGATGGGGCAGCCTACTCCGAAGACCCCTCGACAACTAAGGGAGTTCCTA





GGGACGGCAGGCTTCTGTCGCCTCTGGATCCCTGGGTTTGCAGAAATGGCAGCCC





CCTTGTACCCTCTCACCAAAACGGGGACTCTGTTTAATTGGGGCCCAGACCAACAA





AAGGCCTATCAAGAAATCAAGCAAGCTCTTCTAACTGCCCCAGCCCTGGGGTTGCC





AGATTTGACTAAGCCCTTTGAACTCTTTGTCGACGAGAAGCAGGGCTACGCCAAAG





GCGTCCTAACGCAAAAGCTGGGACCTTGGCGTCGGCCGGTGGCCTACCTGTCTAA





AAAGCTAGACCCAGTGGCAGCTGGCTGGCCCCCCTGCCTACGGATGGTGGCAGC





CATTGCAGTTCTGACAAAAGATGCTGGCAAGCTCACTATGGGACAGCCGTTGGTCA





TTCTGGCCCCCCATGCCGTAGAGGCACTAGTTAAGCAACCCCCTGATCGCTGGCT





CTCCAATGCCCGGATGACCCATTACCAAGCCCTGCTCCTGGACACGGACCGGGTC





CAGTTCGGGCCAGTAGTGGCCCTAAATCCAGCTACGCTGCTCCCTCTGCCTGAGG





AGGGGCTGCAACATGACTGCCTTGACATCTTGGCTGAAGCCCACGGAACTAGATCA





GATCTTACGGACCAGCCCCTCCCAGACGCCGACCACACCTGGTACACGGATGGGA





GCAGCTTCCTGCAAGAAGGGCAGCGTAAGGCCGGAGCAGCGGTGACCACTGAGA





CTGAGGTAATCTGGGCCAGGGCATTGCCAGCCGGGACATCGGCCCAAAGAGCTGA





ACTGATAGCGCTCACCCAAGCCCTAAAGATGGCAGAAGGTAAGAAGCTAAATGTTT





ATACTGATAGCCGTTACGCTTTTGCCACCGCCCATATTCATGGAGAAATATACAGAA





GGCGCGGGTTGCTCACATCAGAAGGAAAAGAGATCAAGAACAAGGACGAGATCTT





AGCCCTACTAAAGGCTCTCTTCTTGCCCAAAAGACTTAGCATAATTCATTGCCCGGG





ACATCAAAAAGGAAACAGCGCAGAGGCCAGGGGCAACCGGATGGCCGACCAAGC





GGCCCGAGAAGTAGCCACTAGAGAAACTCCAGGAACTTCCACACTTCTGATAGAAA





ACTCAACCCCCTATACCCATGAACACTTTCACTATACAGTAACTGACACAAAGGATT





TGACCAAACTAGGAGCCACTTATGACAGTGCGAAGAAATATTGGGTCTATCAAGGA





AAGCCTGTTATGCCTGATCAATTCACCTTTGAGTTACTAGACTTTCTTCACCAATTGA





CCCACCTCAGCTTCTCAAAAACAAAGGCTCTCCTAGAGAGAAGCCCCAGTCCCTAC





TACATGCTGAACCGGGATCGAACACTCAAAAATATCACTGAGACCTGCAAAGCTTG





TGCACAAGTCAATGCCAGCAAGTCTGCCGTTAAGCAAGGAACTAGGGTCCGCGGG





CATCGGCCTGGCACACACTGGGAGATCGATTTCACCGAGGTAAAACCTGGATTGTA





TGGCTATAAGTATCTTTTAGTTTTTGTAGATACTTTTTCTGGCTGGATAGAAGCTTTC





CCAACTAAGAAAGAAACCGCCAAGGTCGTGACCAAGAAACTGCTAGAAGAGATCTT





CCCTAGGTTCGGCATGCCGCAGGTATTGGGAACTGACAATGGGCCTGCCTTCGTC





TCCAAGGTGAGTCAGACAGTGGCCGATCTGTTGGGGATTGATTGGAAATTACATTG





TGCATACAGACCCCAAAGCTCAGGTCAGGTAGAAAGAATGAATAGGACCATCAAGG





AGACTTTAACTAAATTAACGCTTGCAACTGGCTCTAGAGACTGGGTGCTCCTACTCC





CCTTAGCCCTGTACCGAGCCCGCAACACGCCGGGCCCCCATGGCCTCACCCCATA





TGAGATCTTATATGGGGCACCCCCGCCCCTTGTAAACTTCCCTGACCCTGACATGA





CCAGAGTTACTAACAGCCCCTCTCTCCAAGCTCACTTACAGGCTCTCTACTTAGTCC





AGCACGAAGTTTGGAGACCACTGGCGGCAGCTTACCAAGAACAACTGGACCGGCC





GGTGGTGCCTCACCCTTACCGGGTCGGCGACACAGTGTGGGTCCGCCGACATCAA





ACCAAGAACCTAGAACCTCGCTGGAAAGGACCTTACACAGTCCTGCTGACCACCCC





CACCGCCCTCAAAGTAGACGGTATCGCAGCTTGGATACACGCAGCCCACGTAAAG





GCGGCCGACACCGAGAGTGGACCATCCTCTGGACGGACATGGCGCGTTCAACGCT





CTCAAAACCCCCTCAAGATAAGATTAACCCGTGGAAGCCCTTAA





SEQ ID NO: 5: (RD114 ENV protein codon-optimised variant)


ATGAAGCTGCCGACGGGAATGGTGATCCTGTGCAGCCTGATTATCGTGCGCGCGG





GGTTCGACGACCCGAGAAAGGCTATCGCTATCGTGCAAAAGCAGCACGGGAAACC





ATGCGAATGCAGCGGTGGCCAGGTGTCAGAGGCCCCACCGAACTCAATCCAGCAG





GTCACCTGCCCCGGTAAAACCGCATACCTGATGACCAATCAAAAGTGGAAGTGCCG





GGTGACCCCGAAGAATCTGACTCCAAGCGGGGGAGAACTGCAGAACTGCCCCTGC





AATACTTTCCAGGATTCAATGCACTCCTCCTGTTACACCGAATACCGCCAGTGCAG





GGCAAATAACAAAACGTACTACACTGCGACCCTGCTGAAGATCCGCTCCGGCTCCC





TAAATGAAGTGCAGATCCTGCAGAATCCAAACCAACTGTTGCAGAGCCCGTGCAGA





GGCAGCATCAATCAGCCGGTCTGCTGGAGCGCCACCGCACCTATCCACATCTCAG





ACGGAGGGGGACCGCTCGATACCAAGCGCGTGTGGACCGTGCAAAAGCGGCTAG





AGCAGATCCACAAAGCTATGCACCCGGAACTGCAATACCACCCGCTGGCGCTTCC





AAAGGTCCGCGACGATCTGTCGCTGGACGCGCGGACCTTCGACATCTTGAATACTA





CCTTCCGCCTGCTGCAGATGTCGAATTTCAGCCTGGCACAGGATTGTTGGCTGTGC





CTGAAGCTGGGTACTCCGACCCCGCTGGCCATCCCCACCCCGTCACTGACTTACT





CACTCGCAGACTCGTTGGCAAACGCCTCCTGCCAGATTATCCCACCTCTGCTGGTG





CAGCCGATGCAGTTCTCGAACTCCAGCTGCCTGTCATCACCATTCATCAACGACAC





TGAACAGATTGATCTGGGAGCAGTGACTTTCACCAACTGCACTTCAGTGGCCAACG





TCTCCTCGCCACTGTGCGCTCTGAACGGGTCCGTGTTCCTGTGTGGAAACAATATG





GCGTACACTTACCTGCCGCAAAACTGGACTGGCCTGTGCGTGCAAGCGTCACTGC





TGCCTGACATCGACATTATCCCAGGAGACGAGCCCGTCCCGATCCCGGCAATCGA





CCACTACATTCACCGCCCGAAACGGGCAGTCCAGTTCATCCCGCTCCTGGCTGGA





CTGGGGATCACCGCTGCTTTCACTACCGGAGCCACTGGCTTGGGTGTCTCCGTGA





CCCAGTACACGAAGCTGTCCCACCAACTGATTTCGGACGTCCAAGTCCTATCGGGA





ACCATCCAGGACCTCCAGGATCAGGTCGATTCCCTCGCAGAGGTGGTGCTCCAGA





ACCGCAGAGGACTGGATCTGCTGACCGCTGAACAGGGAGGCATCTGCCTTGCACT





CCAGGAGAAGTGCTGCTTCTACGCCAATAAGTCGGGGATCGTGCGGAACAAAATC





AGAACTCTGCAGGAAGAACTGCAGAAGCGCCGGGAAAGCCTCGCCAGCAATCCGC





TGTGGACCGGACTCCAAGGATTTCTCCCGTATCTTCTCCCGCTGCTGGGGCCTCTG





CTCACTCTGCTGCTGATCCTGACCATCGGACCGTGCGTCTTTAGCAGACTGATGGC





ATTTATCAACGACAGACTGAACGTGGTGCATGCAATGGTCCTGGCACAGCAGTACC





AGGCCCTGAAGGCCGAGGAGGAAGCACAGGACTAG





(RD114 ENV protein codon-optimised variant)


SEQ ID NO: 6



ATGAAGCTGCCCACCGGCATGGTGATCCTGTGCAGCCTGATCATCGTGCGCGCCG






GCTTCGACGACCCCCGCAAGGCCATCGCCCTGGTGCAGAAGCAGCACGGCAAGC





CCTGCGAGTGCAGCGGCGGCCAGGTGAGCGAGGCCCCCCCCAACAGCATCCAGC





AGGTGACCTGCCCCGGCAAGACCGCCTACCTGATGACCAACCAGAAGTGGAAGTG





CCGCGTGACCCCCAAGAACCTGACCCCCAGCGGCGGCGAGCTGCAGAACTGCCC





CTGCAACACCTTCCAGGACAGCATGCACAGCAGCTGCTACACCGAGTACCGCCAG





TGCCGCGCCAACAACAAGACCTACTACACCGCCACCCTGCTGAAGATCCGCAGCG





GCAGCCTGAACGAGGTGCAGATCCTGCAGAACCCCAACCAGCTGCTGCAGAGCCC





CTGCCGCGGCAGCATCAACCAGCCCGTGTGCTGGAGCGCCACCGCCCCCATCCA





CATCAGCGACGGCGGCGGCCCCCTGGACACCAAGCGCGTGTGGACCGTGCAGAA





GCGCCTGGAGCAGATCCACAAGGCCATGCACCCCGAGCTGCAGTACCACCCCCTG





GCCCTGCCCAAGGTGCGCGACGACCTGAGCCTGGACGCCCGCACCTTCGACATC





CTGAACACCACCTTCCGCCTGCTGCAGATGAGCAACTTCAGCCTGGCCCAGGACT





GCTGGCTGTGCCTGAAGCTGGGCACCCCCACCCCCCTGGCCATCCCCACCCCCA





GCCTGACCTACAGCCTGGCCGACAGCCTGGCCAACGCCAGCTGCCAGATCATCCC





CCCCCTGCTGGTGCAGCCCATGCAGTTCAGCAACAGCAGCTGCCTGAGCAGCCCC





TTCATCAACGACACCGAGCAGATCGACCTGGGCGCCGTGACCTTCACCAACTGCA





CCAGCGTGGCCAACGTGAGCAGCCCCCTGTGCGCCCTGAACGGCAGCGTGTTCCT





GTGCGGCAACAACATGGCCTACACCTACCTGCCCCAGAACTGGACCGGCCTGTGC





GTGCAGGCCAGCCTGCTGCCCGACATCGACATCATCCCCGGCGACGAGCCCGTG





CCCATCCCCGCCATCGACCACTACATCCACCGCCCCAAGCGCGCCGTGCAGTTCA





TCCCCCTGCTGGCCGGCCTGGGCATCACCGCCGCCTTCACCACCGGCGCCACCG





GCCTGGGCGTGAGCGTGACCCAGTACACCAAGCTGAGCCACCAGCTGATCAGCGA





CGTGCAGGTGCTGAGCGGCACCATCCAGGACCTGCAGGACCAGGTGGACAGCCT





GGCCGAGGTGGTGCTGCAGAACCGCCGCGGCCTGGACCTGCTGACCGCCGAGCA





GGGCGGCATCTGCCTGGCCCTGCAGGAGAAGTGCTGCTTCTACGCCAACAAGAGC





GGCATCGTGCGCAACAAGATCCGCACCCTGCAGGAGGAGCTGCAGAAGCGCCGC





GAGAGCCTGGCCAGCAACCCCCTGTGGACCGGCCTGCAGGGCTTCCTGCCCTAC





CTGCTGCCCCTGCTGGGCCCCCTGCTGACCCTGCTGCTGATCCTGACCATCGGCC





CCTGCGTGTTCAGCCGCCTGATGGCCTTCATCAACGACCGCCTGAACGTGGTGCA





CGCCATGGTGCTGGCCCAGCAGTACCAGGCCCTGAAGGCCGAGGAGGAGGCCCA





GGACTAA





(RD114 ENV protein codon-optimised variant)


SEQ ID NO: 7



ATGAAACTTCCTACGGGCATGGTCATTCTGTGTAGTTTGATAATAGTCCGGGCCGG






GTTTGATGATCCTAGGAAGGCCATCGCATTGGTTCAGAAACAGCACGGGAAGCCCT





GTGAGTGCAGTGGTGGGCAAGTTAGTGAAGCCCCGCCTAACAGCATTCAGCAAGT





CACTTGTCCGGGTAAAACTGCATACCTGATGACTAACCAGAAATGGAAATGTAGAG





TTACTCCTAAAAATTTGACACCTTCAGGCGGAGAGCTCCAAAACTGCCCTTGTAATA





CTTTTCAGGACTCTATGCATAGCTCCTGTTACACAGAGTACAGGCAATGCAGAGCG





AATAACAAGACTTACTATACTGCGACCCTTCTGAAGATCCGGTCAGGCTCACTCAAC





GAAGTGCAAATTCTGCAGAACCCAAACCAACTGCTCCAAAGTCCATGTCGGGGCAG





TATCAATCAACCAGTATGCTGGTCAGCCACGGCACCTATTCACATATCTGATGGCG





GCGGACCCTTGGACACAAAGCGAGTCTGGACCGTTCAAAAGCGACTTGAGCAAAT





ACACAAAGCCATGCATCCTGAACTCCAGTATCACCCCTTGGCATTGCCAAAAGTAC





GGGACGATCTCAGTCTTGATGCAAGGACCTTTGACATACTTAACACTACATTCAGAC





TGCTCCAGATGAGTAATTTCAGCCTCGCACAGGACTGTTGGCTTTGTCTCAAGCTG





GGCACCCCCACCCCGCTCGCGATCCCGACACCGAGTCTGACATACTCACTCGCCG





ACTCATTGGCAAATGCAAGTTGCCAGATAATCCCGCCCTTGCTCGTCCAGCCGATG





CAGTTCAGTAACTCATCCTGTCTCTCAAGTCCGTTCATTAACGACACAGAACAAATC





GACTTGGGCGCAGTCACCTTCACCAACTGCACAAGTGTGGCAAATGTCAGTAGCCC





ACTTTGCGCCCTGAACGGGAGCGTATTTCTCTGTGGAAATAATATGGCGTACACGT





ATTTGCCGCAAAACTGGACCGGCCTTTGTGTTCAAGCCTCACTCCTGCCGGATATC





GACATAATCCCTGGCGACGAACCTGTACCAATCCCCGCAATCGACCACTACATTCA





CAGACCAAAGAGAGCAGTCCAGTTTATCCCCCTTCTTGCGGGCCTTGGTATCACTG





CTGCATTCACTACGGGCGCAACGGGGCTTGGGGTATCTGTAACACAATATACAAAG





CTTTCTCATCAGCTCATTTCTGACGTACAGGTGCTTTCTGGAACTATCCAAGATTTG





CAAGATCAAGTAGATTCCCTCGCAGAAGTGGTCCTCCAGAACCGGAGGGGTCTCG





ATCTTCTGACTGCCGAACAAGGGGGTATCTGCCTTGCACTCCAAGAGAAATGCTGC





TTTTACGCAAACAAAAGTGGTATTGTACGCAACAAGATACGCACGCTGCAAGAGGA





GCTTCAGAAGCGACGGGAGAGCTTGGCTAGTAACCCCCTTTGGACCGGACTTCAA





GGTTTCTTGCCCTACCTTCTTCCTCTTTTGGGCCCACTCCTGACTTTGTTGCTGATT





CTCACAATAGGTCCCTGTGTTTTCTCTCGCCTTATGGCTTTCATCAACGACAGGTTG





AATGTCGTGCATGCTATGGTTTTGGCACAGCAATACCAAGCCCTTAAAGCAGAAGA





GGAAGCACAGGACTGA





(Wild type RD114 ENV protein)


SEQ ID NO: 8



ATGAAACTCCCAACAGGAATGGTCATTTTATGTAGCCTAATAATAGTTCGGGCAGG






GTTTGACGACCCCCGCAAGGCTATCGCATTAGTACAAAAACAACATGGTAAACCAT





GCGAATGCAGCGGAGGGCAGGTATCCGAGGCCCCACCGAACTCCATCCAACAGGT





AACTTGCCCAGGCAAGACGGCCTACTTAATGACCAACCAAAAATGGAAATGCAGAG





TCACTCCAAAAAATCTCACCCCTAGCGGGGGAGAACTCCAGAACTGCCCCTGTAAC





ACTTTCCAGGACTCGATGCACAGTTCTTGTTATACTGAATACCGGCAATGCAGGGC





GAATAATAAGACATACTACACGGCCACCTTGCTTAAAATACGGTCTGGGAGCCTCA





ACGAGGTACAGATATTACAAAACCCCAATCAGCTCCTACAGTCCCCTTGTAGGGGC





TCTATAAATCAGCCCGTTTGCTGGAGTGCCACAGCCCCCATCCATATCTCCGATGG





TGGAGGACCCCTCGATACTAAGAGAGTGTGGACAGTCCAAAAAAGGCTAGAACAAA





TTCATAAGGCTATGCATCCTGAACTTCAATACCACCCCTTAGCCCTGCCCAAAGTCA





GAGATGACCTTAGCCTTGATGCACGGACTTTTGATATCCTGAATACCACTTTTAGGT





TACTCCAGATGTCCAATTTTAGCCTTGCCCAAGATTGTTGGCTCTGTTTAAAACTAG





GTACCCCTACCCCTCTTGCGATACCCACTCCCTCTTTAACCTACTCCCTAGCAGACT





CCCTAGCGAATGCCTCCTGTCAGATTATACCTCCCCTCTTGGTTCAACCGATGCAG





TTCTCCAACTCGTCCTGTTTATCTTCCCCTTTCATTAACGATACGGAACAAATAGACT





TAGGTGCAGTCACCTTTACTAACTGCACCTCTGTAGCCAATGTCAGTAGTCCTTTAT





GTGCCCTAAACGGGTCAGTCTTCCTCTGTGGAAATAACATGGCATACACCTATTTAC





CCCAAAACTGGACAGGACTTTGCGTCCAAGCCTCCCTCCTCCCCGACATTGACATC





ATCCCGGGGGATGAGCCAGTCCCCATTCCTGCCATTGATCATTATATACATAGACC





TAAACGAGCTGTACAGTTCATCCCTTTACTAGCTGGACTGGGAATCACCGCAGCAT





TCACCACCGGAGCTACAGGCCTAGGTGTCTCCGTCACCCAGTATACAAAATTATCC





CATCAGTTAATATCTGATGTCCAAGTCTTATCCGGTACCATACAAGATTTACAAGAC





CAGGTAGACTCGTTAGCTGAAGTAGTTCTCCAAAATAGGAGGGGACTGGACCTACT





AACGGCAGAACAAGGAGGAATTTGTTTAGCCTTACAAGAAAAATGCTGTTTTTATGC





TAACAAGTCAGGAATTGTGAGAAACAAAATAAGAACCCTACAAGAAGAATTACAAAA





ACGCAGGGAAAGCCTGGCATCCAACCCTCTCTGGACCGGGCTGCAGGGCTTTCTT





CCGTACCTCCTACCTCTCCTGGGACCCCTACTCACCCTCCTACTCATACTAACCATT





GGGCCATGCGTTTTCAGTCGCCTCATGGCCTTCATTAATGATAGACTTAATGTTGTA





CATGCCATGGTGCTGGCCCAGCAATACCAAGCACTCAAAGCTGAGGAAGAAGCTC





AGGATTGA





(GALV ENV protein codon-optimised variant)


SEQ ID NO: 9



ATGGTGTTGCTTCCTGGGTCTATGCTGCTGACATCTAATCTCCACCATCTCAGACAC






CAGATGTCACCCGGCAGTTGGAAGCGGCTGATCATACTGTTGAGCTGCGTATTCG





GCGGAGGGGGCACCTCTCTCCAGAACAAAAATCCTCATCAACCGATGACGCTTAC





GTGGCAGGTATTGTCCCAGACGGGTGACGTGGTATGGGACACTAAGGCTGTTCAA





CCGCCTTGGACGTGGTGGCCGACGCTGAAGCCAGATGTCTGTGCCCTGGCGGCG





TCCCTTGAGAGCTGGGATATCCCGGGGACCGACGTATCCTCCAGCAAGAGAGTTC





GCCCCCCTGACTCAGACTACACAGCCGCCTATAAGCAAATCACTTGGGGCGCGATT





GGGTGTTCATATCCCCGAGCACGCACCAGAATGGCAAGCTCTACATTCTATGTTTG





TCCCCGCGATGGCCGGACGCTGTCCGAAGCGAGGCGATGCGGAGGTCTCGAAAG





CCTCTACTGCAAGGAATGGGACTGTGAGACTACGGGCACGGGTTATTGGCTTTCTA





AATCAAGCAAAGACTTGATCACTCTTAAGTGGGACCAGAACTCAGAGTGGACACAA





AAGTTTCAACAATGCCACCAGACTGGATGGTGCAACCCCTTGAAGATAGACTTTACT





GACAAAGGTAAGCTGAGCAAGGACTGGATAACAGGGAAAACTTGGGGGTTGCGCT





TTTATGTCTCAGGCCATCCGGGGGTACAATTTACGATTCGCCTCAAAATCACGAACA





TGCCGGCGGTCGCTGTAGGTCCGGACTTGGTTTTGGTAGAACAAGGCCCTCCTCG





GACTAGCCTCGCACTGCCTCCCCCACTCCCGCCTCGAGAGGCACCACCGCCGAGC





CTGCCGGATTCCAATTCAACGGCTCTGGCCACCTCCGCACAAACACCAACAGTGC





GGAAGACTATCGTGACCCTCAACACTCCGCCCCCGACCACGGGCGACAGATTGTT





TGACCTGGTTCAAGGGGCCTTCTTGACGCTCAATGCAACGAACCCTGGAGCAACA





GAGTCTTGTTGGCTTTGTCTGGCCATGGGTCCCCCTTATTATGAAGCCATCGCGTC





ATCTGGTGAAGTGGCTTACTCAACCGACCTCGATCGCTGTAGGTGGGGCACGCAA





GGAAAGCTTACTTTGACCGAGGTCTCAGGTCATGGGTTGTGCATTGGGAAGGTCCC





CTTTACACACCAACATCTTTGTAACCAGACTCTGAGTATAAATTCTTCTGGAGATCAT





CAGTATTTGCTGCCGAGTAACCATTCATGGTGGGCGTGCTCCACGGGACTCACCC





CTTGCCTTTCAACTTCCGTTTTTAATCAAACGAGAGATTTCTGTATCCAAGTGCAACT





CATTCCGAGGATCTACTACTATCCGGAAGAAGTACTCCTGCAGGCGTATGACAATT





CCCACCCTAGGACCAAACGCGAAGCAGTGAGCCTGACCCTTGCAGTATTGTTGGG





TTTGGGGATTACTGCGGGTATCGGCACTGGTTCCACCGCGCTGATTAAGGGACCG





ATCGATTTGCAACAAGGATTGACTTCACTCCAGATAGCCATAGACGCCGACCTTCG





CGCGTTGCAGGATTCTGTGTCTAAGCTGGAGGATAGTTTGACAAGCCTCTCAGAGG





TGGTGCTGCAAAACAGACGAGGCCTTGATCTCTTGTTTCTTAAGGAGGGAGGCCTT





TGCGCTGCTCTGAAGGAAGAGTGTTGTTTCTACATCGATCATAGCGGAGCGGTCAG





AGATTCTATGAAGAAGCTTAAGGAGAAGCTTGACAAGCGACAGCTCGAACGCCAAA





AGAGCCAGAATTGGTACGAAGGATGGTTTAATAATTCTCCATGGTTCACTACACTGC





TTTCCACCATCGCTGGTCCGCTGCTGCTCCTGCTGCTCCTGTTGATACTCGGTCCG





TGCATAATTAATAAGCTCGTTCAATTCATAAACGACCGGATCTCTGCGTGCTAA





(GALV ENV protein codon-optimised variant)


SEQ ID NO: 10



ATGGTGCTTCTCCCTGGTAGCATGCTTTTGACCTCAAACCTCCATCATCTGCGACAC






CAGATGTCACCTGGCTCTTGGAAACGCCTTATTATATTGCTGAGCTGTGTTTTTGGA





GGCGGAGGTACATCATTGCAGAACAAAAACCCTCATCAGCCAATGACGTTGACCTG





GCAAGTATTGTCCCAGACCGGAGATGTCGTTTGGGACACGAAAGCGGTACAACCT





CCCTGGACTTGGTGGCCGACCCTCAAGCCCGACGTTTGCGCTCTTGCGGCGTCTT





TGGAGTCTTGGGACATACCGGGGACGGATGTCTCATCTTCAAAGAGGGTTCGACC





GCCGGATTCAGACTACACCGCTGCATATAAGCAGATTACGTGGGGAGCCATTGGCT





GTAGTTATCCGCGGGCGAGGACGCGGATGGCTTCCAGTACTTTTTATGTGTGTCCG





AGAGACGGCCGCACCCTGTCTGAGGCTCGGCGCTGCGGGGGGCTCGAAAGCCTG





TACTGCAAAGAATGGGATTGTGAGACTACAGGGACTGGTTATTGGCTCTCAAAATC





TAGCAAAGATCTGATTACGCTCAAATGGGATCAAAATTCAGAATGGACCCAAAAGTT





CCAGCAATGTCATCAGACCGGGTGGTGTAATCCGCTGAAGATAGACTTTACAGACA





AAGGCAAACTGTCAAAAGACTGGATTACGGGTAAGACTTGGGGCCTCCGCTTTTAC





GTAAGCGGTCATCCTGGGGTACAGTTTACTATAAGGCTGAAAATAACGAACATGCC





GGCGGTCGCTGTCGGGCCGGATTTGGTGCTCGTGGAACAAGGGCCACCTAGGAC





CTCTCTCGCTCTTCCCCCGCCATTGCCACCACGGGAAGCACCGCCACCAAGTCTTC





CAGATTCCAACTCTACCGCACTGGCTACGAGTGCGCAGACACCAACGGTTAGAAAA





ACCATTGTCACGCTTAACACCCCCCCTCCGACAACCGGAGATCGCCTTTTCGATCT





CGTACAGGGCGCGTTTCTTACGCTTAACGCCACAAATCCTGGGGCCACTGAGAGC





TGTTGGCTTTGCCTTGCTATGGGCCCACCATACTATGAGGCCATCGCCTCCTCCGG





CGAAGTAGCCTACTCCACGGACCTTGACCGATGCAGGTGGGGAACGCAAGGCAAA





TTGACTTTGACTGAGGTGAGCGGGCATGGTCTCTGCATCGGAAAAGTTCCGTTCAC





TCATCAGCACCTTTGTAACCAGACCCTCAGCATTAATTCTTCCGGGGATCATCAGTA





CCTCCTGCCGTCAAACCACTCTTGGTGGGCCTGCTCCACAGGTCTTACTCCCTGCT





TGAGCACATCCGTATTTAATCAGACCCGAGACTTCTGTATCCAGGTACAATTGATAC





CGAGAATTTATTACTACCCCGAGGAAGTGTTGCTCCAAGCATACGATAACTCACAC





CCTAGAACGAAGAGAGAAGCAGTCTCCCTGACGTTGGCCGTCCTTCTGGGACTGG





GAATCACCGCGGGTATAGGCACTGGATCTACGGCACTGATCAAGGGGCCTATAGA





TTTGCAGCAGGGGCTTACTTCACTTCAAATTGCCATAGACGCGGATCTTCGGGCGC





TCCAGGACTCCGTTTCCAAGTTGGAAGACTCTCTGACTAGCCTGTCCGAAGTTGTG





TTGCAGAACAGACGAGGACTTGACTTGTTGTTTCTCAAGGAAGGGGGTCTCTGTGC





TGCGCTTAAGGAGGAATGTTGCTTCTATATAGATCATTCCGGCGCGGTACGGGACT





CCATGAAAAAACTTAAAGAAAAGTTGGACAAGAGACAGTTGGAGAGGCAAAAGTCC





CAGAACTGGTATGAGGGCTGGTTTAATAACTCCCCATGGTTTACAACCCTTTTGTCT





ACCATTGCTGGGCCGCTCCTTCTTCTTCTGTTGCTGCTCATATTGGGGCCTTGTATT





ATTAACAAGCTTGTGCAATTCATTAATGACCGAATTTCTGCATGCTAA





SEQ ID NO: 11: (GALV ENV protein codon-optimised variant)


ATGGTGCTGCTGCCCGGCAGCATGCTGCTGACCAGCAACCTGCACCACCTGCGCC





ACCAGATGAGCCCCGGCAGCTGGAAGCGCCTGATCATCCTGCTGAGCTGCGTGTT





CGGCGGCGGCGGCACCAGCCTGCAGAACAAGAACCCCCACCAGCCCATGACCCT





GACCTGGCAGGTGCTGAGCCAGACCGGCGACGTGGTGTGGGACACCAAGGCCGT





GCAGCCCCCCTGGACCTGGTGGCCCACCCTGAAGCCCGACGTGTGCGCCCTGGC





CGCCAGCCTGGAGAGCTGGGACATCCCCGGCACCGACGTGAGCAGCAGCAAGCG





CGTGCGCCCCCCCGACAGCGACTACACCGCCGCCTACAAGCAGATCACCTGGGG





CGCCATCGGCTGCAGCTACCCCCGCGCCCGCACCCGCATGGCCAGCAGCACCTT





CTACGTGTGCCCCCGCGACGGCCGCACCCTGAGCGAGGCCCGCCGCTGCGGCG





GCCTGGAGAGCCTGTACTGCAAGGAGTGGGACTGCGAGACCACCGGCACCGGCT





ACTGGCTGAGCAAGAGCAGCAAGGACCTGATCACCCTGAAGTGGGACCAGAACAG





CGAGTGGACCCAGAAGTTCCAGCAGTGCCACCAGACCGGCTGGTGCAACCCCCTG





AAGATCGACTTCACCGACAAGGGCAAGCTGAGCAAGGACTGGATCACCGGCAAGA





CCTGGGGCCTGCGCTTCTACGTGAGCGGCCACCCCGGCGTGCAGTTCACCATCCG





CCTGAAGATCACCAACATGCCCGCCGTGGCCGTGGGCCCCGACCTGGTGCTGGT





GGAGCAGGGCCCCCCCCGCACCAGCCTGGCCCTGCCCCCCCCCCTGCCCCCCCG





CGAGGCCCCCCCCCCCAGCCTGCCCGACAGCAACAGCACCGCCCTGGCCACCAG





CGCCCAGACCCCCACCGTGCGCAAGACCATCGTGACCCTGAACACCCCCCCCCCC





ACCACCGGCGACCGCCTGTTCGACCTGGTGCAGGGCGCCTTCCTGACCCTGAACG





CCACCAACCCCGGCGCCACCGAGAGCTGCTGGCTGTGCCTGGCCATGGGCCCCC





CCTACTACGAGGCCATCGCCAGCAGCGGCGAGGTGGCCTACAGCACCGACCTGG





ACCGCTGCCGCTGGGGCACCCAGGGCAAGCTGACCCTGACCGAGGTGAGCGGCC





ACGGCCTGTGCATCGGCAAGGTGCCCTTCACCCACCAGCACCTGTGCAACCAGAC





CCTGAGCATCAACAGCAGCGGCGACCACCAGTACCTGCTGCCCAGCAACCACAGC





TGGTGGGCCTGCAGCACCGGCCTGACCCCCTGCCTGAGCACCAGCGTGTTCAACC





AGACCCGCGACTTCTGCATCCAGGTGCAGCTGATCCCCCGCATCTACTACTACCCC





GAGGAGGTGCTGCTGCAGGCCTACGACAACAGCCACCCCCGCACCAAGCGCGAG





GCCGTGAGCCTGACCCTGGCCGTGCTGCTGGGCCTGGGCATCACCGCCGGCATC





GGCACCGGCAGCACCGCCCTGATCAAGGGCCCCATCGACCTGCAGCAGGGCCTG





ACCAGCCTGCAGATCGCCATCGACGCCGACCTGCGCGCCCTGCAGGACAGCGTG





AGCAAGCTGGAGGACAGCCTGACCAGCCTGAGCGAGGTGGTGCTGCAGAACCGC





CGCGGCCTGGACCTGCTGTTCCTGAAGGAGGGCGGCCTGTGCGCCGCCCTGAAG





GAGGAGTGCTGCTTCTACATCGACCACAGCGGCGCCGTGCGCGACAGCATGAAGA





AGCTGAAGGAGAAGCTGGACAAGCGCCAGCTGGAGCGCCAGAAGAGCCAGAACT





GGTACGAGGGCTGGTTCAACAACAGCCCCTGGTTCACCACCCTGCTGAGCACCAT





CGCCGGCCCCCTGCTGCTGCTGCTGCTGCTGCTGATCCTGGGCCCCTGCATCATC





AACAAGCTGGTGCAGTTCATCAACGACCGCATCAGCGCCTGCTAA





(Wild type GALV ENV protein)


SEQ ID NO: 12



ATGGTATTGCTGCCTGGGTCCATGCTTCTCACCTCAAACCTGCACCACCTTCGGCA






CCAGATGAGTCCTGGGAGCTGGAAAAGACTGATCATCCTCCTAAGCTGCGTATTCG





GCGGCGGCGGTACCAGTCTGCAAAATAAGAACCCCCACCAGCCCATGACCCTCAC





TTGGCAGGTACTGTCCCAAACTGGAGACGTTGTCTGGGATACAAAGGCAGTCCAG





CCCCCTTGGACTTGGTGGCCCACACTTAAACCTGATGTATGTGCCTTGGCGGCTAG





TCTTGAGTCCTGGGATATCCCGGGAACCGATGTCTCGTCCTCTAAACGAGTCAGAC





CTCCGGACTCAGACTATACTGCCGCTTATAAGCAAATCACCTGGGGAGCCATAGGG





TGCAGCTACCCTCGGGCTAGGACTAGAATGGCAAGCTCTACCTTCTACGTATGTCC





CCGGGATGGCCGGACCCTTTCAGAAGCTAGAAGGTGCGGGGGGCTAGAATCCCTA





TACTGTAAAGAATGGGATTGTGAGACCACGGGGACCGGTTATTGGCTATCTAAATC





CTCAAAAGACCTCATAACTCTTAAGTGGGACCAAAATAGCGAATGGACTCAAAAATT





TCAACAGTGTCACCAGACCGGCTGGTGTAACCCCCTTAAAATAGATTTCACAGACA





AAGGAAAATTATCCAAGGACTGGATAACGGGAAAAACCTGGGGATTAAGATTCTAT





GTGTCTGGACATCCAGGCGTACAGTTCACCATTCGCTTAAAAATCACCAACATGCC





AGCTGTGGCAGTAGGTCCTGACCTCGTCCTTGTGGAACAAGGACCTCCTAGAACGT





CCCTCGCTCTCCCACCTCCTCTTCCCCCAAGGGAAGCGCCACCGCCATCTCTCCC





CGACTCTAACTCCACAGCCCTGGCGACTAGTGCACAAACTCCCACGGTGAGAAAAA





CAATTGTTACCCTAAACACTCCGCCTCCCACCACAGGCGACAGACTTTTTGATCTTG





TGCAGGGGGCCTTCCTAACCTTAAATGCTACCAACCCAGGGGCCACTGAGTCTTGC





TGGCTTTGTTTGGCCATGGGCCCCCCTTATTATGAAGCAATAGCCTCATCAGGAGA





GGTCGCCTACTCCACCGACCTTGACCGGTGCCGCTGGGGGACCCAAGGAAAGCTC





ACCCTCACTGAGGTCTCAGGACACGGGTTGTGCATAGGAAAGGTGCCCTTTACCCA





TCAGCATCTCTGCAATCAGACCCTATCCATCAATTCCTCCGGAGACCATCAGTATCT





GCTCCCCTCCAACCATAGCTGGTGGGCTTGCAGCACTGGCCTCACCCCTTGCCTC





TCCACCTCAGTTTTTAATCAGACTAGAGATTTCTGTATCCAGGTCCAGCTGATTCCT





CGCATCTATTACTATCCTGAAGAAGTTCGTTACAGGCCTATGACAATTCTCACCCC





AGGACTAAAAGAGAGGCTGTCTCACTTACCCTAGCTGTTTTACTGGGGTTGGGAAT





CACGGCGGGAATAGGTACTGGTTCAACTGCCTTAATTAAAGGACCTATAGACCTCC





AGCAAGGCCTGACAAGCCTCCAGATCGCCATAGATGCTGACCTCCGGGCCCTCCA





AGACTCAGTCAGCAAGTTAGAGGACTCACTGACTTCCCTGTCCGAGGTAGTGCTCC





AAAATAGGAGAGGCCTTGACTTGCTGTTTCTAAAAGAAGGTGGCCTCTGTGCGGCC





CTAAAGGAAGAGTGCTGTTTTTACATAGACCACTCAGGTGCAGTACGGGACTCCAT





GAAAAAACTCAAAGAAAAACTGGATAAAAGACAGTTAGAGCGCCAGAAAAGCCAAA





ACTGGTATGAAGGATGGTTCAATAACTCCCCTTGGTTCACTACCCTGCTATCAACCA





TCGCTGGGCCCCTATTACTCCTCCTTCTGTTGCTCATCCTCGGGCCATGCATCATC





AATAAGTTAGTTCAATTCATCAATGATAGGATAAGTGCATGTTAA
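As a rough illustration of how recoding reduces nucleotide-level homology while keeping the encoded protein unchanged, the sketch below compares the first 54 nucleotides of SEQ ID NO: 11 (codon-optimised GALV Env) with the first 54 nucleotides of SEQ ID NO: 12 (wild-type GALV Env). It reports the fraction of identical positions and the longest uninterrupted identical stretch. The function name `identity_stats` is hypothetical, and the comparison assumes the two fragments are the same length and already in register; a real homology assessment (for example between a Gag-Pol sequence and a packaging signal) would normally use a proper alignment tool.

```python
def identity_stats(a: str, b: str) -> tuple[float, int]:
    """Return (fraction of identical positions, longest identical run)
    for two equal-length sequences that are already aligned."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    matches = 0
    longest = current = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y:
            matches += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a mismatch breaks the identical stretch
    return matches / len(a), longest


# First 54 nt (18 codons) copied from the listings above.
SEQ11_START = "ATGGTGCTGCTGCCCGGCAGCATGCTGCTGACCAGCAACCTGCACCACCTGCGC"
SEQ12_START = "ATGGTATTGCTGCCTGGGTCCATGCTTCTCACCTCAAACCTGCACCACCTTCGG"

fraction, longest_run = identity_stats(SEQ11_START, SEQ12_START)
print(f"identity: {fraction:.2f}, longest identical run: {longest_run} nt")
```

The longest identical run is the more relevant figure when the concern is recombination between plasmids, since short interrupted matches offer less opportunity for homologous exchange than a long unbroken stretch.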





(Gag Polyprotein Amino Acid Sequence)


SEQ ID NO: 13



MGQTVTTPLSLTLGHWKDVERIAHNQSVDVKKRRWVTFCSAEWPTFNVGWPRDGTF






NRDLITQVKIKVFSPGPHGHPDQVPYIVTWEALAFDPPPWVKPFVHPKPPPPLPPSAPS





LPLEPPRSTPPRSSLYPALTPSLGAKPKPQVLSDSGGPLIDLLTEDPPPYRDPRPPPSD





RDGNGGEATPAGEAPDPSPMASRLRGRREPPVADSTTSQAFPLRAGGNGQLQYWPF





SSSDLYNWKNNNPSFSEDPGKLTALIESVLITHQPTWDDCQQLLGTLLTGEEKQRVLLE





ARKAVRGDDGRPTQLPNEVDAAFPLERPDWDYTTQAGRNHLVHYRQLLLAGLQNAGR





SPTNLAKVKGITQGPNESPSAFLERLKEAYRRYTPYDPEDPGQETNVSMSFIWQSAPDI





GRKLERLEDLKNKTLGDLVREAEKIFNKRETPEEREERIRRETEEKEERRRTEDEQKEK





ERDRRRHREMSKLLATVVSGQKQDRQGGERRRSQLDRDQCAYCKEKGHWAKDCPK





KPRGPRGPRPQTSLLTLDD*





(Pol Polyprotein Amino Acid Sequence)


SEQ ID NO: 14



MGPMGQPLQVLTLNIEDEYRLHETSKEPDVSLGSTWLSDFPQAWAETGGMGLAVRQA






PLHPLKATSTPVSIKQYPMSQEARLGIKPHIQRLLDQGILVPCQSPWNTPLLPVKKPGTN





DYRPVQDLREVNKRVEDIHPTVPNPYNLLSGLPPSHQWYTVLDLKDAFFCLRLHPTSQ





PLFAFEWRDPEMGISGQLTWTRLPQGFKNSPTLFDEALHRDLADFRIQHPDLILLQYVD





DLLLAATSELDCQQGTRALLQTLGNLGYRASAKKAQICQKQVKYLGYLLKEGQRWLTE





ARKETVMGQPTPKTPRQLREFLGTAGFCRLWIPGFAEMAAPLYPLTKTGTLFNWGPD





QQKAYQEIKQALLTAPALGLPDLTKPFELFVDEKQGYAKGVLTQKLGPWRRPVAYLSK





KLDPVAAGWPPCLRMVAAIAVLTKDAGKLTMGQPLVILAPHAVEALVKQPPDRWLSNA





RMTHYQALLLDTDRVQFGPVVALNPATLLPLPEEGLQHDCLDILAEAHGTRSDLTDQPL





PDADHTWYTDGSSFLQEGQRKAGAAVTTETEVIWARALPAGTSAQRAELIALTQALKM





AEGKKLNVYTDSRYAFATAHIHGEIYRRRGLLTSEGKEIKNKDEILALLKALFLPKRLSIIH





CPGHQKGNSAEARGNRMADQAAREVATRETPGTSTLLIENSTPYTHEHFHYTVTDTKD





LTKLGATYDSAKKYWVYQGKPVMPDQFTFELLDFLHQLTHLSFSKTKALLERSPSPYY





MLNRDRTLKNITETCKACAQVNASKSAVKQGTRVRGHRPGTHWEIDFTEVKPGLYGYK





YLLVFVDTFSGWIEAFPTKKETAKVVTKKLLEEIFPRFGMPQVLGTDNGPAFVSKVSQT





VADLLGIDWKLHCAYRPQSSGQVERMNRTIKETLTKLTLATGSRDWVLLLPLALYRARN





TPGPHGLTPYEILYGAPPPLVNFPDPDMTRVTNSPSLQAHLQALYLVQHEVWRPLAAA





YQEQLDRPVVPHPYRVGDTVWVRRHQTKNLEPRWKGPYTVLLTTPTALKVDGIAAWIH





AAHVKAADTESGPSSGRTWRVQRSQNPLKIRLTRGSP*





SEQ ID NO: 15: RD114 Envelope Amino Acid Sequence


MKLPTGMVILCSLIIVRAGFDDPRKAIALVQKQHGKPCECSGGQVSEAPPNSIQQVTCP





GKTAYLMTNQKWKCRVTPKNLTPSGGELQNCPCNTFQDSMHSSCYTEYRQCRANNK





TYYTATLLKIRSGSLNEVQILQNPNQLLQSPCRGSINQPVCWSATAPIHISDGGGPLDTK





RVWTVQKRLEQIHKAMHPELQYHPLALPKVRDDLSLDARTFDILNTTFRLLQMSNFSLA





QDCWLCLKLGTPTPLAIPTPSLTYSLADSLANASCQIIPPLLVQPMQFSNSSCLSSPFIND





TEQIDLGAVTFTNCTSVANVSSPLCALNGSVFLCGNNMAYTYLPQNWTGLCVQASLLP





DIDIIPGDEPVPIPAIDHYIHRPKRAVQFIPLLAGLGITAAFTTGATGLGVSVTQYTKLSHQ





LISDVQVLSGTIQDLQDQVDSLAEVVLQNRRGLDLLTAEQGGICLALQEKCCFYANKSGI





VRNKIRTLQEELQKRRESLASNPLWTGLQGFLPYLLPLLGPLLTLLLILTIGPCVFSRLMA





FINDRLNVVHAMVLAQQYQALKAEEEAQD*
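As an illustrative check that is not part of the specification, the following Python sketch translates the first 54 nucleotides of SEQ ID NO: 7 and of SEQ ID NO: 8 using the standard genetic code and compares the result with the first 18 residues of SEQ ID NO: 15. It shows that sequences differing at the nucleotide level can encode the same RD114 envelope protein. The names `translate`, `SEQ7_START`, `SEQ8_START` and `SEQ15_START` are hypothetical labels introduced here.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str) -> str:
    """Translate an in-frame DNA string; a trailing partial codon is ignored."""
    seq = nucleotides.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 54 nt (18 codons) copied from the listings above.
SEQ7_START = "ATGAAACTTCCTACGGGCATGGTCATTCTGTGTAGTTTGATAATAGTCCGGGCC"
SEQ8_START = "ATGAAACTCCCAACAGGAATGGTCATTTTATGTAGCCTAATAATAGTTCGGGCA"
SEQ15_START = "MKLPTGMVILCSLIIVRA"

# True when both nucleotide fragments encode the listed protein start.
print(translate(SEQ7_START) == translate(SEQ8_START) == SEQ15_START)
```

The same function can be applied to full-length sequences once the line breaks in the listing are removed, which gives a simple way to confirm that a recoded variant still encodes the intended protein.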





(GALV Envelope Amino Acid Sequence)


SEQ ID NO: 16



MVLLPGSMLLTSNLHHLRHQMSPGSWKRLIILLSCVFGGGGTSLQNKNPHQPMTLTW






QVLSQTGDVVWDTKAVQPPWTWWPTLKPDVCALAASLESWDIPGTDVSSSKRVRPP





DSDYTAAYKQITWGAIGCSYPRARTRMASSTFYVCPRDGRTLSEARRCGGLESLYCKE





WDCETTGTGYWLSKSSKDLITLKWDQNSEWTQKFQQCHQTGWCNPLKIDFTDKGKLS





KDWITGKTWGLRFYVSGHPGVQFTIRLKITNMPAVAVGPDLVLVEQGPPRTSLALPPPL





PPREAPPPSLPDSNSTALATSAQTPTVRKTIVTLNTPPPTTGDRLFDLVQGAFLTLNATN





PGATESCWLCLAMGPPYYEAIASSGEVAYSTDLDRCRWGTQGKLTLTEVSGHGLCIGK





VPFTHQHLCNQTLSINSSGDHQYLLPSNHSWWACSTGLTPCLSTSVFNQTRDFCIQVQ





LIPRIYYYPEEVLLQAYDNSHPRTKREAVSLTLAVLLGLGITAGIGTGSTALIKGPIDLQQG





LTSLQIAIDADLRALQDSVSKLEDSLTSLSEVVLQNRRGLDLLFLKEGGLCAALKEECCF





YIDHSGAVRDSMKKLKEKLDKRQLERQKSQNWYEGWFNNSPWFTTLLSTIAGPLLLLL





LLLILGPCIINKLVQFINDRISAC*





(Nucleotide Sequence encoding the packaging signal in the genome plasmid)


SEQ ID NO: 17



AAGCTGGCCAGCAACTTATCTGTGTCTGTCCGATTGTCTAGTGTCTATGACTGATTT






TATGCGCCTGCGTCGGTACTAGTTAGCTAACTAGCTCTGTATCTGGCGGACCCGTG





GTGGAACTGACGAGTTCGGAACACCCGGCCGCAACCCTGGGAGACGTCCCAGGG





ACTTCGGGGGCCGTTTTTGTGGCCCGACCTGAGTCCTAAAATCCCGATCGTTTAGG





ACTCTTTGGTGCACCCCCCTTAGAGGAGGGATATGTGGTTCTGGTAGGAGACGAG





AACCTAAAACAGTTCCCGCCTCCGTCTGAATTTTTGCTTTCGGTTTGGGACCGAAG





CCGCGCCGCGCGTCTTGTCTGCTGCAGCATCGTTCTGTGTTGTCTCTGTCTGACTG





TGTTTCTGTATTTGTCTGAAAATATGGGCCCGGGCTAGCCTGTTACCACTCCCTTAA





GTTTGACCTTAGGTCACTGGAAAGATGTCGAGCGGATCGCTCACAACCAGTCGGTA





GATGTCAAGAAGAGACGTTGGGTTACCTTCTGCTCTGCAGAATGGCCAACCTTTAA





CGTCGGATGGCCGCGAGACGGCACCTTTAACCGAGACCTCATCACCCAGGTTAAG





ATCAAGGTCTTTTCACCTGGCCCGCATGGACACCCAGACCAGGTCCCCTACATCGT





GACCTGGGAAGCCTTGGCTTTTGACCCCCCTCCCTGGGTCAAGCCCTTTGTACACC





CTAAGCCTCCGCCTCCTCTTCCTCCATCCGCCCCGTCTCTCCCCCTTGAACCTCCT





CGTTCGACCCCGCCTCGATCCTCCCTTTATCCAGCCCTCACTCCTTCTCTAGGCGC





CCCCATATGGCCATATGAGATCTTATATGGGGCACCCCCGCCCCTTGTAAACTTCC





CTGACCCTGACATGACAAGAGTTACTAACAGCCCCTCTCTCCAAGCTCACTTACAG





GCTCTCTACTTAGTCCAGCACGAAGTCTGGAGACCTCTGGCGGCAGCCTACCAAG





AACAACTGGACCGA





(Nucleotide sequence of CMV promoter)


SEQ ID NO: 18



AGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACAT






AACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGAC





GTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCA





ATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATAT





GCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATG





CCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCA





TCGCTATTACCATGCTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGG





TTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTT





TGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGAC





GCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTA





GTGAACCGTC





(Nucleotide sequence of CAG promoter)


SEQ ID NO: 19



ATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTAC






ATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTG





ACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGT





CAATGGGTGGACTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCAT





ATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTA





TGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGT





CATCGCTATTACCATGCGTCGAGGTGAGCCCCACGTTCTGCTTCACTCTCCCCATC





TCCCCCCCCTCCCCACCCCCAATTTTGTATTTATTTATTTTTTAATTATTTTATGCAG





CGATGGGGGCGGGGGGGGGGGGGGCGCGCGCCAGGCGGGGCGGGGCGGGGC





GAGGGGCGGGGCGGGGCGAGGCGGAGAGGTGCGGCGGCAGCCAATCAGAGCG





GCGCGCTCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCTATAA





AAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGTTGCCTTCGCCCCGTGCCCC





GCTCCGCGCCGCCTCGCGCCGCCCGCCCCGGCTCTGACTGACCGCGTTACTCCC





ACAGGTGAGCGGGCGGGACGGCCCTTCTCCCTCCGGGCTGTAATTAGCGCTTGGT





TTAATGACGGCTCGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTAAAGGGCTCCGGG





AGGGCCTTTGTGCGGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGTGTGTGC





GTGGGGAGCGCCGCGTGCGGCCCGCGCTGCCCGGCGGCTGTGAGCGCTGCGGG





CGCGGCGCGGGGCTTTGTGCGCTCCGCGTGTGCGCGAGGGGAGCGCGGGCCGG





GGGCGGTGCCCCGCGGTGCGGGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCG





GGGTGTGTGCGTGGGGGGGTGAGCAGGGGGTGTGGGCGCGGCGGTCGGGCTGT





AACCCCCCCCTGGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTTCGGG





TGCGGGGCTCCGTGCGGGGCGTGGCGCGGGGCTCGCCGTGCCGGGCGGGGGGT





GGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGGCCGCCTCGGGCCGGGGAGGG





CTCGGGGGAGGGGCGCGGCGGCCCCGGAGCGCCGGCGGCTGTCGAGGCGCGG





CGAGCCGCAGCCATTGCCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCC





TTTGTCCCAAATCTGGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCCTCTA





GCGGGCGCGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGAAATGGGCGGGGAG





GGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCTTCTCCATCTCCAGCCTCGGGGCT





GCCGCAGGGGGACGGCTGCCTTCGGGGGGGACGGGGCAGGGCGGGGTTCGGCT





TCTGGCGTGTGACCGGCGGCTTTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTC





TTTTTCCTACAGCTCCTGGGCAACGTGCTGGTTGTTGTGCTGTCTCATCATTTTGGC





AA





(Nucleotide sequence of Rabbit β-globin polyA)


SEQ ID NO: 20



CTGGTGTGGCCAATGCCCTGGCTCACAAATACCACTGACGATCTTTTTCCCTCTGC






CAAAAATTATGGGGACATCATGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAG





GAAATTTATTTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGG





(Nucleotide sequence of BGIntron)


SEQ ID NO: 21



AGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGA






CCGATCCAGCCTCCCCTCGAAGCTTACATGTGGTACCGAGCTCGGATCCTGAGAA





CTTCAGGGTGAGTCTATGGGACCCTTGATGTTTTCTTTCCCCTTCTTTTCTATGGTT





AAGTTCATGTCATAGGAAGGGGAGAAGTAACAGGGTACACATATTGACCAAATCAG





GGTAATTTTGCATTTGTAATTTTAAAAAATGCTTTCTTCTTTTAATATACTTTTTTGTTT





ATCTTATTTCTAATACTTTCCCTAATCTCTTTCTTTCAGGGCAATAATGATACAATGTA





TCATGCCTCTTTGCACCATTCTAAAGAATAACAGTGATAATTTCTGGGTTAAGGCAA





TAGCAATATTTCTGCATATAAATATTTCTGCATATAAATTGTAACTGATGTAAGAGGT





TTCATATTGCTAATAGCAGCTACAATCCAGCTACCATTCTGCTTTTATTTTATGGTTG





GGATAAGGCTGGATTATTCTGAGTCCAAGCTAGGCCCTTTTGCTAATCATGTTCATA





CCTCTTATCTTCCTCCCACAGCTCCTGGGCAACGTGCTGGTCTGTGTGCTGGCCCA





TCACTTTGGCAAAG





(Nucleotide sequence of Ferritin promoter)


SEQ ID NO: 22



AGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACAT






AACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGAC





GTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCA





ATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATAT





GCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATG





CCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCA





TCGCTATTACCATGCTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGG





TTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTT





TGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGAC





GCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTA





GTGAACCGGATCCCCCGGGCTGCAGGAATTTATGAAATCCTTTATGGGGGACCCC





CCCCTTTGTCAACCTTGCTCAATTCCTTCTCCCCCTCCGATCCTAAGACTGATTTAC





AAGCCCGACTAAAAGGGCTGCAAGCGGTGCAGGCCCAAATCTGGACACCCCTGGC





CGAATTGTACCGGCCAGGACATCCACAAACTAGCCACCCATTTCAGGTGGGAGACT





CCGTGTACGTCCGGCGGCACGCCTCTCAAGGATTGGAGCCTCGTTGGAAGGGACC





TTACATCGTCCTGCTGACCACGCCCACCGCCATAAAGGTTGACGGGATCGCCGCC





TGGATTCACGCATCGCACGCCAAGGCAGCCCCAAAAACCCCTGGACCAGAAACTC





CCAAAACCTGGAAGCTCCGCCGTTCGGAGAACCCTCTTAAGATAAGACTCTCCCGT





GTCTGACTGCTAATCCACCTTGTCCCTGTACTAACCCAAA





(Nucleotide sequence of CMV-RD114UTR)


SEQ ID NO: 23



ACTAGTTCCGCCAGAGCGCGCGAGGGCCTCCAGCGGCCGCCCCTCCCCCACAGC






AGGGGCGGGGTCCCGCGCCCACCGGAAGGAGCGGGCTCGGGGCGGGCGGCGCT





GATTGGCCGGGGCGGGCCTGACGCCGACGCGGCTATAAGAGACCACAAGCGACC





CGCAGGGCCAGACGTTCTTCGCCGAAGCTT





(SV40 PolyA)


SEQ ID NO: 24



CAGACATGATAAGATACATTGATGAGTTTGGACAAACCACAACTAGAATGCAGTGAA






AAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTATTTGTAACCATTATAAGC





TGCAATAAACAAGTTAACAACAACAATTGCATTCATTTTATGTTTCAGGTTCAGGGG





GAGGTGTGGGAGGTTTTTT





(RD114 Intron)


SEQ ID NO: 25



GATCCCCCGGGCTGCAGGAATTTATGAAATCCTTTATGGGGGACCCCCCCCTTTGT






CAACCTTGCTCAATTCCTTCTCCCCCTCCGATCCTAAGACTGATTTACAAGCCCGAC





TAAAAGGGCTGCAAGCGGTGCAGGCCCAAATCTGGACACCCCTGGCCGAATTGTA





CCGGCCAGGACATCCACAAACTAGCCACCCATTTCAGGTGGGAGACTCCGTGTAC





GTCCGGCGGCACCGCTCTCAAGGATTGGAGCCTCGTTGGAAGGGACCTTACATCG





TCCTGCTGACCACGCCCACCGCCATAAAGGTTGACGGGATCGCCGCCTGGATTCA





CGCATCGCACGCCAAGGCAGCCCCAAAAACCCCTGGACCAGAAACTCCCAAAACC





TGGAAGCTCCGCCGTTCGGAGAACCCTCTTAAGATAAGACTCTCCCGTGTCTGACT





GCTAATCCACCTTGTCCCTGTACTAACCCAAA





(MEF1 Intron)


SEQ ID NO: 26



GCCGTCAGAACGCAGGTGAGGGGCGGGTGTGGCTTCCGCGGGCCGCCGAGCTG






GAGGTCCTGCTCCGAGCGGGCCGGGCCCCGCTGTCGTCGGCGGGGATTAGCTGC





GAGCATTCCCGCTTCGAGTTGCGGGCGGCGCGGGAGGCAGAGTGCGAGGCCTAG





CGGCAACCCCGTAGCCTCGCCTCGTGTCCGGCTTGAGGCCTAGCGTGGTGTCCGC





GCCGCCGCCGCGTGCTACTCCGGCCGCACTCTGGTCTTTTTTTTTTTTGTTGTTGTT





GCCCTGCTGCCTTCGATTGCCGTTCAGCAATAGGGGCTAACAAAGGGAGGGTGCG





GGGCTTGCTCGCCCGGAGCCCGGAGAGGTCATGGTTGGGGAGGAATGGAGGGAC





AGGAGTGGCGGCTGGGGCCCGCCCGCCTTCGGAGCACATGTCCGACGCCACCTG





GATGGGGCGAGGCCTGGGGTTTTTCCCGAAGCAACCAGGCTGGGGTTAGCGTGC





CGAGGCCATGTGGCCCCAGCACCCGGCACGATCTGGCTTGGCGGCGCCGCGTTG





CCCTGCCTCCCTAACTAGGGTGAGGCCATCCCGTCCGGCACCAGTTGCGTGCGTG





GAAAGATGGCCGCTCCCGGGCCCTGTTGCAAGGAGCTCAAAATGGAGGACGCGG





CAGCCCGGTGGAGCGGGCGGGTGAGTCACCCACACAAAGGAAGAGGGCCTGGTC





CCTCACCGGCTGCTGCTTCCTGTGACCCCGTGGTCCTATCGGCCGCAATAGTCAC





CTCGGGCTTTTGAGCACGGCTAGTCGCGGCGGGGGGAGGGGATGTAATGGCGTT





GGAGTTTGTTCACATTTGGTGGGTGGAGACTAGTCAGGCCAGCCTGGCGCTGGAA





GTCATTTTTGGAATTTGTCCCCTTGAGTTTTGAGCGGAGCTAATTCTCGGGCTTCTT





AGCGGTTCAAAGGTATCTTTTAAACCCTTTTTTAGGTGTTGTGAAAACCACCGCTAA





TTCAAAGCAACCGG





Claims
  • 1. A plasmid system for transfection into a cell to create a producer cell, the system comprising: a) a helper plasmid comprising a first nucleotide sequence encoding Murine leukemia virus (MLV)-derived Gag and Pol poly-proteins, b) an envelope plasmid comprising a second nucleotide sequence encoding an Env protein, c) a genome plasmid comprising a third nucleotide sequence comprising a retroviral genome, wherein the first nucleotide sequence comprises the sequence selected from SEQ ID NO: 1, SEQ ID NO: 2 or SEQ ID NO: 3, and wherein the second nucleotide sequence is codon-optimized for expression in the producer cell.
  • 2. The system of claim 1, wherein the second nucleotide sequence comprises the sequence selected from SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10 or SEQ ID NO: 11.
  • 3. The system of claim 1, wherein the Env protein is RD114 Envelope protein.
  • 4. The system of claim 3, wherein the codon adaptation index (CAI) of the second nucleotide sequence is at least 0.75.
  • 5. The system of claim 1, wherein the genome plasmid further comprises a nucleotide of interest (NOI).
  • 6. The system of claim 1, wherein the genome plasmid comprises a packaging signal, which has homology with a portion of the wildtype MLV nucleotide sequence encoding Gag and/or Pol polyprotein(s).
  • 7. A producer cell capable of producing retroviral vectors, comprising the plasmid system of claim 1.
  • 8. A nucleotide sequence encoding MLV-derived Gag and Pol poly-proteins comprising the sequence selected from: SEQ ID NO: 1 to SEQ ID NO: 3.
  • 9. A nucleotide sequence encoding Env protein comprising the sequence selected from: SEQ ID NO: 5 to SEQ ID NO: 7 and SEQ ID NO: 9 to 11.
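For readers unfamiliar with the codon adaptation index (CAI) threshold recited in claim 4, the sketch below shows one common way the index is computed: as the geometric mean of the relative adaptiveness of each codon in a coding sequence. This is a minimal illustration under stated assumptions, not the method used by the applicant. The function name `cai` is hypothetical, and the `TOY_WEIGHTS` values are placeholders covering a few leucine codons only; they are not real codon-usage data for any organism or producer cell.

```python
import math


def cai(coding_sequence: str, weights: dict[str, float]) -> float:
    """Codon adaptation index: geometric mean of per-codon relative adaptiveness.

    `weights` maps codon -> w, where w is the usage of that codon divided by
    the usage of the most frequent synonymous codon in a reference gene set.
    Codons absent from `weights` (e.g. ATG, TGG, stop codons) are skipped.
    """
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    log_w = [math.log(weights[c]) for c in codons if c in weights]
    if not log_w:
        raise ValueError("no scorable codons")
    return math.exp(sum(log_w) / len(log_w))


# Placeholder weights for illustration only; NOT real codon-usage data.
TOY_WEIGHTS = {"CTG": 1.0, "CTC": 0.5, "CTT": 0.25, "TTG": 0.25}

# Scores the codons CTG, CTG and CTC; ATG is skipped because it has no weight.
score = cai("ATGCTGCTGCTC", TOY_WEIGHTS)
print(score >= 0.75)
```

In practice the weights would be derived from a codon-usage table for the producer cell species, and the result depends on which reference table is chosen.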
Priority Claims (1)
Number Date Country Kind
1720948 Dec 2017 GB national
PCT Information
Filing Document Filing Date Country Kind
PCT/GB2018/053638 12/14/2018 WO
Publishing Document Publishing Date Country Kind
WO2019/116051 6/20/2019 WO A
Foreign Referenced Citations (4)
Number Date Country
WO-9531566 Nov 1995 WO
WO-0179518 Oct 2001 WO
WO-2013153391 Oct 2013 WO
WO-2014066700 May 2014 WO
Non-Patent Literature Citations (6)
Entry
International Search Report and Written Opinion from International Application No. PCT/GB2018/053638 dated Apr. 10, 2019.
Li et al., “A codon-shuffling method to prevent reversion during production of replication-defective herpesvirus stocks: Implications for herpesvirus vaccines,” Scientific Reports 7(1), 9 pages (2017).
Sheridan et al., “Generation of Retroviral Packaging and Producer Cell Lines for Large-Scale Vector Production and Clinical Application: Improved Safety and High Titer,” Molecular Therapy 2(3):262-275 (2000).
Shin et al., “Construction of a retroviral vector production system with the minimum possibility of a homologous recombination,” Gene Therapy 10(8):706-711 (2003).
Soneoka et al., “A transient three-plasmid expression system for the production of high titer retroviral vectors,” Nucleic Acids Res. 23(4):628-633 (1995).
Zucchelli et al., “Codon Optimization Leads to Functional Impairment of RD114-TR Envelope Glycoprotein,” Mol Ther Methods Clin Dev. 4:102-114 (2017).
Related Publications (1)
Number Date Country
20230242937 A1 Aug 2023 US