COMPOSITIONS AND METHODS FOR MODULATING HEPATOCYTE NUCLEAR FACTOR 4-ALPHA (HNF4α) GENE EXPRESSION

Abstract
The present invention provides agents and compositions for modulating expression (e.g., enhancing or reducing expression) of a hepatocyte nuclear factor 4 alpha (HNF4α) gene by targeting an HNF4α expression control region, and methods of use thereof for treating an HNF4α-associated disorder, e.g., cirrhosis.
Description
SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Dec. 10, 2020, is named 131717-00103_SL.txt and is 4,343,647 bytes in size.


BACKGROUND OF THE INVENTION

Master regulators are proteins, such as transcription factors, with the ability to influence the expression of a network of genes related by cell type, organ system, or response to a stimulus.


One example of a master regulator is the transcription factor Hepatocyte Nuclear Factor 4-alpha (HNF4α). HNF4α is expressed for the first time in terminally differentiated liver cells, late in embryonic development. HNF4α controls the expression of proteins necessary for the normal function of hepatocytes and other cell types in the liver (Li, et al. (2000) Genes and Devel 14:464-474). In addition, many of these proteins are secreted by the liver cells and contribute to health systemically. For example, proteins such as albumin are required to transport nutrients, hormones, lipids, and small molecule drugs in the circulation. In fibrotic liver disease, HNF4α is dysregulated and, as a result, gene expression in its network declines significantly or stops (Guzman-Lepe, et al. (2018) Hepatol Comm 2(5):582). This dysregulation of the network contributes to the pathology of liver failure in the organ itself, and to co-morbidities throughout the patient.


Recently, Nishikawa et al. (2015) demonstrated that transgenic expression of HNF4α in a rat model of cirrhosis led to the restoration of gene expression throughout the HNF4α network, restored hepatocyte function, and improved health of the animal. The transgene was delivered with an adeno-associated virus (AAV). However, transgene expression from AAV delivery does not allow subtle control or temporary modification of the expression of genes already in the genome. Once modified with AAV, the affected cells lose the ability to respond to changing conditions in the organ and body in a nimble, physiologically meaningful way.


Accordingly, there is a need in the art for temporary and labile effectors that offer greater control and the ability to restore cells and tissues to a nascent “normal” state and, thus, treat HNF4α-associated disease, such as fibrotic liver disease, e.g., cirrhosis.


SUMMARY OF THE INVENTION

The present invention provides agents and compositions for modulating the expression (e.g., enhancing or reducing expression) of a hepatocyte nuclear factor 4 alpha (HNF4α) gene by targeting an HNF4α expression control region. The HNF4α gene may be in a cell, e.g., a mammalian cell, such as a mammalian somatic cell, e.g., a human somatic cell. The present invention also provides methods of using the agents and compositions of the invention for modulating the expression of an HNF4α gene or for treating a subject who would benefit from modulating the expression of an HNF4α gene, e.g., a subject suffering or prone to suffering from an HNF4α-associated disease.


Accordingly, in one aspect, the present invention provides a site-specific disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region.


In some embodiments, the site-specific HNF4α targeting moiety comprises a polymeric molecule.


The polymeric molecule may comprise a polyamide, a polynucleotide, a polynucleotide encoding a DNA-binding domain, or fragment thereof, that specifically binds to the HNF4α expression control region, or a peptide nucleic acid (PNA).


In some embodiments, the expression control region comprises an HNF4α-specific transcriptional control element.


In some embodiments, the transcriptional control element comprises an HNF4α promoter, such as the nucleotide sequence of HNF4α promoter 1, or a fragment thereof, or the nucleotide sequence of HNF4α promoter 2, or a fragment thereof.


In some embodiments, the transcriptional control element comprises a transcriptional enhancer.


In some embodiments, the transcriptional control element comprises a transcriptional repressor.


In some embodiments, the site-specific HNF4α disrupting agent comprises a nucleotide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% nucleotide identity to the entire nucleotide sequence of any of the nucleotide sequences in any one of Tables 2, 3, 4, and 9.
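By way of illustration only, percent identity to an entire reference sequence may be estimated from a global alignment. The following is a minimal sketch, assuming that identity is counted as identical aligned positions divided by the alignment length of a Needleman-Wunsch alignment with unit scores. The scoring values, the identity definition, and the names `global_identity` and `meets_identity` are assumptions made for this example and are not taken from the present disclosure; other alignment tools and identity definitions may give different values.

```python
def global_identity(query: str, reference: str,
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity from a Needleman-Wunsch global alignment.

    Assumed definition: identical aligned positions / alignment length
    (gap columns count toward the alignment length).
    """
    n, m = len(query), len(reference)
    # score[i][j] = best score for aligning query[:i] with reference[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if query[i - 1] == reference[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back along one optimal path, counting columns and identities.
    i, j = n, m
    identical = 0
    columns = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if query[i - 1] == reference[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                identical += int(query[i - 1] == reference[j - 1])
                i -= 1
                j -= 1
                columns += 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in the reference
        else:
            j -= 1  # gap in the query
        columns += 1
    return 100.0 * identical / columns if columns else 100.0


def meets_identity(query: str, reference: str, threshold: float = 85.0) -> bool:
    """True if the query reaches the given percent identity to the reference."""
    return global_identity(query, reference) >= threshold
```

For two 21-nucleotide sequences that differ by a single substitution, this sketch reports 20 identical positions out of 21 aligned columns, which exceeds an 85% threshold.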


In some embodiments, the site-specific HNF4α disrupting agent comprises a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically binds to the HNF4α expression control region.


In one embodiment, the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of any one of the amino acid sequences listed in column 5 of Table 6A or column 4 of Table 10. In one embodiment, the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of ZF5.3. In one embodiment, the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of ZF7. In one embodiment, the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of ZF14. In one embodiment, the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of ZF15.


In some embodiments, the expression control region comprises one or more HNF4α-associated anchor sequences within an anchor sequence-mediated conjunction comprising a first and a second HNF4α-associated anchor sequence.


In some embodiments, the anchor sequence comprises a CCCTC-binding factor (CTCF) binding motif.


In some embodiments, the anchor sequence-mediated conjunction comprises one or more transcriptional control elements internal to the conjunction.


In some embodiments, the anchor sequence-mediated conjunction comprises one or more transcriptional control elements external to the conjunction.


In some embodiments, the first and/or the second anchor sequence is located within about 500 kb of the transcriptional control element.


In some embodiments, the first and/or the second anchor sequence is located within 300 kb of the transcriptional control element.
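By way of illustration only, whether an anchor sequence lies within a stated distance of a transcriptional control element may be checked from genomic coordinates. The following is a minimal sketch, assuming half-open, 0-based coordinates on a single reference assembly and measuring distance as the gap between the nearest edges of the two features. The `Interval` type, the function names, and every coordinate used with them are hypothetical and are not taken from the present disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    """A genomic feature with 0-based, half-open coordinates [start, end)."""
    chrom: str
    start: int
    end: int


def distance_bp(a: Interval, b: Interval) -> Optional[int]:
    """Gap in base pairs between two features; 0 if they overlap or touch.

    Returns None when the features are on different chromosomes.
    """
    if a.chrom != b.chrom:
        return None
    return max(0, a.start - b.end, b.start - a.end)


def within(anchor: Interval, element: Interval, limit_bp: int) -> bool:
    """True if the anchor lies within limit_bp of the control element."""
    d = distance_bp(anchor, element)
    return d is not None and d <= limit_bp


# Invented coordinates, for illustration only.
anchor = Interval("chr_example", 1_000_000, 1_000_020)
promoter = Interval("chr_example", 1_350_000, 1_351_000)
print(within(anchor, promoter, 500_000), within(anchor, promoter, 300_000))
```

With the invented coordinates above, the gap is 349,980 bp, so the anchor falls within 500 kb but not within 300 kb of the element.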


In some embodiments, the site-specific HNF4α disrupting agent comprises a nucleotide modification.


The present invention also provides vectors, such as viral expression vectors, and cells comprising the site-specific HNF4α disrupting agents of the invention as well as the vectors of the invention.


In some embodiments, the site-specific HNF4α disrupting agents of the invention are present in a composition, such as a pharmaceutical composition.


In some embodiments, the pharmaceutical composition comprises a lipid formulation.


In some embodiments, the lipid formulation comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids, or one or more PEG-modified lipids, or combinations of any of the foregoing.


In some embodiments, the pharmaceutical composition comprises a lipid nanoparticle.


In another aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a nucleic acid molecule encoding a fusion protein, the fusion protein comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region and an effector molecule.


In some embodiments, the site-specific HNF4α targeting moiety comprises a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically binds to the HNF4α expression control region.


In one embodiment, the DNA-binding domain of the TALE comprises an amino acid sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of an amino acid sequence selected from the amino acid sequences listed in column 5 of Table 6A or column 4 of Table 10.


In some embodiments, the effector molecule comprises a polypeptide or a nucleic acid molecule encoding a polypeptide.


In some embodiments, the fusion protein comprises a peptide-nucleic acid fusion.


In some embodiments, the effector is selected from the group consisting of a nuclease, a physical blocker, an epigenetic recruiter, and an epigenetic CpG modifier, and combinations of any of the foregoing.


In some embodiments, the effector comprises a CRISPR associated protein (Cas) polypeptide or nucleic acid molecule encoding the Cas polypeptide.


In some embodiments, the Cas polypeptide is an enzymatically inactive Cas polypeptide.


In some embodiments, the Cas polypeptide comprises a catalytically active domain of human exonuclease 1 (hEXO1).


In some embodiments, the epigenetic recruiter comprises a transcriptional enhancer or a transcriptional repressor.


In one embodiment, the transcriptional enhancer is a VPR (VP64-p65-Rta).


In one embodiment, the VPR comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of

(SEQ ID NO: 66)
DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDD
FDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSI
MKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLS
TINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMV
SALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQ
FDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHT
TEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDE
DFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEG
REVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLT
PAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVI
PQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLT
PELNEILDTFLNDECLLHAMHISTGLSIFDTSLF.






In one embodiment, the transcriptional enhancer comprises two, three, four, or five VPRs.


In one embodiment, the transcriptional enhancer is a p300.


In one embodiment, the p300 comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of

(SEQ ID NO: 67)
IFKPEELRQALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYF
DIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNR
KTSRVYKYCSKLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLC
CYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDP
SQPQTTINKEQFSKRKNDTLDPELFVECTECGRKMHQICVLHH
EIIVVPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLEN
RVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSG
EMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPPPNQ
RRVYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTG
HIVVACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKMLDKAV
SERIVHDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIK
ELEQEEEERKREENTSNESTDVTKGDSKNAKKKNNKKTSKNKS
SLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGP
PAANSLPPIVDPDPLIPCDLMDGRDAFLTLARDKHLEFSSLRR
AQWSTMCMLVELHTQSQD.






In some embodiments, the epigenetic CpG modifier comprises a DNA methylase, a DNA demethylase, a histone modifying agent, or a histone deacetylase.


In some embodiments, the effector molecule comprises a zinc finger polypeptide.


In some embodiments, the effector molecule comprises a Transcription activator-like effector nuclease (TALEN) polypeptide.


In some embodiments, the site-specific HNF4α disrupting agent further comprises a second nucleic acid molecule encoding a second fusion protein, wherein the second fusion comprises a second site-specific HNF4α targeting moiety which targets a second HNF4α expression control region and a second effector molecule, wherein the second HNF4α expression control region is different than the HNF4α expression control region.


In one embodiment, the HNF4α expression control region comprises a ZF5 target sequence GGCGGGGGACCGATTAACCAT (SEQ ID NO: 118) and the second HNF4α expression control region comprises a ZF7 target sequence ACTGAACATCGGTGAGTTAGG (SEQ ID NO: 126).
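By way of illustration only, the presence of the recited ZF5 and ZF7 target sequences in a candidate expression control region may be checked by a simple string search on both strands. The following is a minimal sketch, assuming the region is supplied as a plain DNA string in 5' to 3' orientation and that only exact matches are of interest. The function names are hypothetical, and any region passed to them would be supplied by the user; no genomic sequence is asserted here.

```python
ZF5_TARGET = "GGCGGGGGACCGATTAACCAT"  # SEQ ID NO: 118
ZF7_TARGET = "ACTGAACATCGGTGAGTTAGG"  # SEQ ID NO: 126

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(region: str, target: str) -> list:
    """Exact matches of target in region on either strand.

    Returns (start, strand) tuples, where start is the 0-based index on the
    given (plus) strand and strand is "+" or "-".
    """
    region = region.upper()
    target = target.upper()
    patterns = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid reporting a palindromic site twice
        patterns.append(("-", rc))
    hits = []
    for strand, pattern in patterns:
        start = region.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = region.find(pattern, start + 1)
    return sorted(hits)
```

A region in which both `find_sites(region, ZF5_TARGET)` and `find_sites(region, ZF7_TARGET)` return at least one hit would contain both recited target sequences; the spacing and strand of the hits can then be read from the returned tuples.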


In one embodiment, the second effector is different than the first effector.


In one embodiment, the second effector is the same as the first effector.


In one embodiment, the fusion protein and the second fusion protein are operably linked.


In one embodiment, the fusion protein and the second fusion protein comprise an amino acid sequence that has at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid sequence identity to the entire amino acid sequence of a polypeptide selected from the group consisting of ZF5.3-VPR-tPT2a-ZF7-VPR; ZF7-VPR-tPT2a-ZF5.3-VPR; ZF5.3-VPR-tPT2a-ZF7-p300; and ZF7-p300-tPT2a-ZF5.3-VPR.


In one embodiment, the fusion protein comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid sequence identity to the entire amino acid sequence of ZF5.3-VPR.


In one embodiment, the fusion protein is encoded by a polynucleotide comprising a nucleotide sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% nucleotide sequence identity to the entire nucleotide sequence of a polynucleotide selected from the group consisting of ZF5-VPR mRNA, ZF5.1-VPR mRNA, ZF5.2-VPR mRNA, ZF5.3-VPR mRNA, ZF5.4-VPR mRNA, ZF5.5-VPR mRNA, and ZF5.6-VPR mRNA.


In one aspect, the present invention provides a site-specific HNF4α disrupting agent. The disrupting agent includes a nucleic acid molecule encoding a fusion protein, wherein the fusion protein comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of a polypeptide selected from the group consisting of ZF1-VPR, ZF2-VPR, ZF3-VPR, ZF4-VPR, ZF5-VPR, ZF5.3-VPR, ZF6-VPR, ZF7-VPR, ZF8-VPR, ZF9-VPR, ZF10-VPR, ZF11-VPR, ZF12-VPR, ZF13-VPR, ZF14-VPR, and ZF15-VPR.


In one embodiment, the polypeptide is selected from the group consisting of ZF5-VPR, ZF5.3-VPR, ZF7-VPR, ZF10-VPR, ZF14-VPR, and ZF15-VPR.


In one embodiment, the polypeptide is ZF5.3-VPR.


The present invention also provides vectors, such as viral expression vectors, and cells comprising the site-specific HNF4α disrupting agents of the invention as well as the vectors of the invention.


In some embodiments, the site-specific HNF4α disrupting agents of the invention are present in a composition, such as a pharmaceutical composition.


In some embodiments, the pharmaceutical composition comprises a lipid formulation.


In some embodiments, the lipid formulation comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids, or one or more PEG-modified lipids, or combinations of any of the foregoing.


In some embodiments, the pharmaceutical composition comprises a lipid nanoparticle.


In one aspect, the present invention provides a method of modulating expression of hepatocyte nuclear factor 4 alpha (HNF4α) in a cell. The method includes contacting the cell with a site-specific HNF4α disrupting agent, the disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, and an effector molecule, thereby modulating expression of HNF4α in the cell.


The modulation of expression may be enhanced expression of HNF4α in the cell or reduced expression of HNF4α in the cell.


In some embodiments, the site-specific HNF4α targeting moiety comprises a polymeric molecule.


The polymeric molecule may comprise a polyamide, a polynucleotide, or a peptide nucleic acid (PNA).


In some embodiments, the expression control region comprises an HNF4α-specific transcriptional control element.


In some embodiments, the transcriptional control element comprises an HNF4α promoter, such as the nucleotide sequence of HNF4α promoter 1, or a fragment thereof, or the nucleotide sequence of HNF4α promoter 2, or a fragment thereof.


In some embodiments, the transcriptional control element comprises a transcriptional enhancer.


In some embodiments, the transcriptional control element comprises a transcriptional repressor.


In some embodiments, the site-specific HNF4α disrupting agent comprises a nucleotide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% nucleotide identity to the entire nucleotide sequence of any of the nucleotide sequences in any one of Tables 2, 3, 4, and 9.


In some embodiments, the site-specific HNF4α disrupting agent comprises a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically binds to the HNF4α expression control region.


In some embodiments, the DNA-binding domain of the TALE or ZNF comprises an amino acid sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of an amino acid sequence selected from the amino acid sequences listed in column 5 of Table 6A or column 4 of Table 10.


In some embodiments, the expression control region comprises one or more HNF4α-associated anchor sequences within an anchor sequence-mediated conjunction comprising a first and a second HNF4α-associated anchor sequence.


In some embodiments, the anchor sequence comprises a CCCTC-binding factor (CTCF) binding motif.


In some embodiments, the anchor sequence-mediated conjunction comprises one or more transcriptional control elements internal to the conjunction.


In some embodiments, the anchor sequence-mediated conjunction comprises one or more transcriptional control elements external to the conjunction.


In some embodiments, the first and/or the second anchor sequence is located within about 500 kb of the transcriptional control element.


In some embodiments, the first and/or the second anchor sequence is located within 300 kb of the transcriptional control element.


In some embodiments, the site-specific HNF4α disrupting agent comprises a nucleotide modification.


In some embodiments, the effector molecule comprises a polypeptide.


In some embodiments, the polypeptide comprises a nucleic acid molecule encoding a fusion protein comprising the site-specific HNF4α targeting moiety which targets an HNF4α expression regulatory region, and the effector molecule.


In some embodiments, the fusion protein comprises a peptide-nucleic acid fusion molecule.


In some embodiments, the effector is selected from the group consisting of a nuclease, a physical blocker, an epigenetic recruiter, and an epigenetic CpG modifier, and combinations of any of the foregoing.


In some embodiments, the effector comprises a CRISPR associated protein (Cas) polypeptide or nucleic acid molecule encoding the Cas polypeptide.


In some embodiments, the Cas polypeptide is an enzymatically inactive Cas polypeptide.


In some embodiments, the Cas polypeptide further comprises a catalytically active domain of human exonuclease 1 (hEXO1).


In some embodiments, the epigenetic recruiter comprises a transcriptional enhancer or a transcriptional repressor.


In some embodiments, the transcriptional enhancer is a VPR.


In some embodiments, the VPR comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of

(SEQ ID NO: 66)
DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALD
DFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFK
SIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTS
SLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPA
PAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEA
LLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGI
PVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPN
GLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGS
AISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGP
VHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVK
ALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESM
TEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTSLF.






In some embodiments, the transcriptional enhancer comprises two, three, four, or five VPRs.


In some embodiments, the transcriptional enhancer is a p300.


In some embodiments, the p300 has an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of

(SEQ ID NO: 67)
IFKPEELRQALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPM
DLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRVYKYCSKLSE
VFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNR
YHFCEKCFNEIQGESVSLGDDPSQPQTTINKEQFSKRKNDTLDPELFVEC
TECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTR
LGTFLENRVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSG
EMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPPPNQRRVYISY
LDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYI
FHCHPPDQKIPKPKRLQEWYKKMLDKAVSERIVHDYKDIFKQATEDRLTS
AKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKN
AKKKNNKKTSKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFV
IRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARDKHLEFSSLRRA
QWSTMCMLVELHTQSQD.






In some embodiments, the epigenetic CpG modifier comprises a DNA methylase, a DNA demethylase, a histone modifying agent, or a histone deacetylase.


In some embodiments, the effector molecule comprises a zinc finger polypeptide.


In some embodiments, the effector molecule comprises a Transcription activator-like effector nuclease (TALEN) polypeptide.


In some embodiments, the fusion protein comprises an enzymatically inactive Cas polypeptide and an epigenetic recruiter polypeptide.


In some embodiments, the fusion protein comprises an enzymatically inactive Cas polypeptide and an epigenetic CpG modifier polypeptide.


In some embodiments, the site-specific HNF4α disrupting agent comprises a second nucleic acid molecule encoding a second fusion protein, wherein the second fusion protein comprises a second site-specific HNF4α targeting moiety which targets a second HNF4α expression control region and a second effector molecule, wherein the second HNF4α expression control region is different than the HNF4α expression control region.


In some embodiments, the HNF4α expression control region comprises a ZF5 target sequence GGCGGGGGACCGATTAACCAT (SEQ ID NO: 118) and the second HNF4α expression control region comprises a ZF7 target sequence ACTGAACATCGGTGAGTTAGG (SEQ ID NO: 126).


In some embodiments, the second effector is different than the effector.


In some embodiments, the second effector is the same as the effector.


In some embodiments, the fusion protein and the second fusion protein are operably linked.


In some embodiments, the fusion protein and the second fusion protein comprise an amino acid sequence that has at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the entire amino acid sequence of a polypeptide selected from ZF5.3-VPR-tPT2a-ZF7-VPR protein, ZF7-VPR-tPT2a-ZF5.3-VPR protein, ZF5.3-VPR-tPT2a-ZF7-p300 protein, and ZF7-p300-tPT2a-ZF5.3-VPR protein.


In some embodiments, the fusion protein comprises an amino acid sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the entire amino acid sequence of ZF5.3-VPR protein.


In some embodiments, the fusion protein is encoded by a polynucleotide having a sequence that has at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the entire nucleotide sequence of a polynucleotide selected from the group consisting of ZF5-VPR mRNA, ZF5.1-VPR mRNA, ZF5.2-VPR mRNA, ZF5.3-VPR mRNA, ZF5.4-VPR mRNA, ZF5.5-VPR mRNA, and ZF5.6-VPR mRNA.


In some embodiments, the administration of the site-specific HNF4α disrupting agent and the second site-specific HNF4α disrupting agent has a synergistic effect in modulating the expression of HNF4α.


In some embodiments, the HNF4α expression control region comprises a ZF5 target sequence GGCGGGGGACCGATTAACCAT (SEQ ID NO: 118) and the second HNF4α expression control region comprises a sequence selected from ZF7 target sequence ACTGAACATCGGTGAGTTAGG (SEQ ID NO: 126), ZF10 target sequence CCTGCAGCCCCGCCCAGCCTA (SEQ ID NO: 138), ZF14 target sequence GGAGGGGTGGGGGTTAATGGT (SEQ ID NO: 154), and ZF15 target sequence GAAGGGGTGGAGGCTCTGCCG (SEQ ID NO: 158).


In some embodiments, the HNF4α expression control region comprises a ZF5 target sequence GGCGGGGGACCGATTAACCAT (SEQ ID NO: 118) and the second HNF4α expression control region comprises a ZF7 target sequence ACTGAACATCGGTGAGTTAGG (SEQ ID NO: 126).


In some embodiments, the fusion protein comprises a sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the entire amino acid sequence of ZF5-VPR, and the second fusion protein comprises a sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% sequence identity to the entire amino acid sequence of a polypeptide selected from ZF7-VPR, ZF10-VPR, ZF14-VPR, and ZF15-VPR.


In some embodiments, the fusion protein comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% identity to the entire amino acid sequence of a polypeptide selected from the group consisting of ZF1-VPR, ZF2-VPR, ZF3-VPR, ZF4-VPR, ZF5-VPR, ZF5.3-VPR, ZF6-VPR, ZF7-VPR, ZF8-VPR, ZF9-VPR, ZF10-VPR, ZF11-VPR, ZF12-VPR, ZF13-VPR, ZF14-VPR, and ZF15-VPR.


In some embodiments, the polypeptide is selected from the group consisting of ZF5-VPR, ZF5.3-VPR, ZF7-VPR, ZF10-VPR, ZF14-VPR, and ZF15-VPR.


In some embodiments, the polypeptide is ZF5.3-VPR.


In some embodiments, the site-specific disrupting agent, the effector, or both the site-specific disrupting agent and the effector are present in a vector, such as a viral expression vector.


In some embodiments, the site-specific disrupting agent and the effector are present in the same vector.


In some embodiments, the site-specific disrupting agent and the effector are present in different vectors.


In some embodiments, the site-specific disrupting agent, the effector, or both the site-specific disrupting agent and the effector are present in a composition.


In some embodiments, the site-specific disrupting agent and the effector are present in the same composition.


In some embodiments, the site-specific disrupting agent and the effector are present in different compositions.


In some embodiments, the composition comprises a pharmaceutical composition.


In some embodiments, the pharmaceutical composition comprises a lipid formulation.


In some embodiments, the lipid formulation comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids, or one or more PEG-modified lipids, or combinations of any of the foregoing.


In some embodiments, the pharmaceutical composition comprises a lipid nanoparticle.


In some embodiments, the cell is a mammalian cell, such as a somatic cell or a primary cell.


In some embodiments, the contacting is performed in vitro.


In some embodiments, the contacting is performed in vivo.


In some embodiments, the contacting is performed ex vivo.


In some embodiments, the methods of the invention further comprise administering the cell to a subject.


In some embodiments, the cell is within a subject.


In some embodiments, the subject has an HNF4α-associated disease.


In some embodiments, the HNF4α-associated disease is selected from the group consisting of fatty liver (steatosis), nonalcoholic steatohepatitis (NASH), cirrhosis of the liver, accumulation of fat in the liver, inflammation of the liver, hepatocellular necrosis, liver fibrosis, nonalcoholic fatty liver disease (NAFLD), polycystic kidney disease, inflammatory bowel disease (IBD), and MODY I.


In another aspect, the present invention provides a method for treating a subject having an HNF4α-associated disease. The method includes administering to the subject a therapeutically effective amount of the site-specific HNF4α disrupting agent, the disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, and an effector molecule, thereby treating the subject.


In some embodiments, the HNF4α-associated disease is hepatocellular cancer and the site-specific HNF4α disrupting agent reduces expression of HNF4α in the subject.


In some embodiments, the HNF4α-associated disease is selected from the group consisting of fatty liver (steatosis), nonalcoholic steatohepatitis (NASH), cirrhosis of the liver, accumulation of fat in the liver, inflammation of the liver, hepatocellular necrosis, liver fibrosis, and nonalcoholic fatty liver disease (NAFLD), and the site-specific HNF4α disrupting agent enhances expression of HNF4α in the subject.


In some embodiments, the site-specific HNF4α disrupting agent and the effector molecule are administered to the subject concurrently.


In some embodiments, the site-specific HNF4α disrupting agent and the effector molecule are administered to the subject sequentially.


In some embodiments, the effector molecule is administered to the subject prior to administration of the site-specific HNF4α disrupting agent.


In some embodiments, the site-specific HNF4α disrupting agent is administered to the subject prior to administration of the effector molecule.


In one aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, wherein the HNF4α expression control region comprises the nucleotide sequence of any one of the nucleotide sequences listed in column 3 of Table 1 or column 4 of Table 10.


In one embodiment, the site-specific HNF4α targeting moiety comprises a nucleotide sequence having at least 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% nucleotide identity to the entire nucleotide sequence of any of the nucleotide sequences in any one of Tables 2, 3, 4, and 9.


In one embodiment, the site-specific HNF4α targeting moiety comprises a polymeric molecule comprising a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically binds to the HNF4α expression control region.


In one embodiment, the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% amino acid identity to the entire amino acid sequence of any one of the amino acid sequences listed in column 5 of Table 6A or column 4 of Table 10.


In one embodiment, the HNF4α expression control region comprises a nucleotide sequence of a ZF5 target sequence GGCGGGGGACCGATTAACCAT (SEQ ID NO: 118).


In one embodiment, the HNF4α expression control region comprises a nucleotide sequence of a ZF7 target sequence ACTGAACATCGGTGAGTTAGG (SEQ ID NO: 126).
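As a non-limiting illustration, the presence of a recited target sequence within a candidate expression control region can be checked on both strands by simple string matching. In the Python sketch below, the helper names and the toy region are hypothetical; only the two 21-nucleotide target sequences are taken from the text above.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(region: str, target: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact occurrence of `target`."""
    region = region.upper()
    target = target.upper()
    hits = []
    for strand, probe in (("+", target), ("-", reverse_complement(target))):
        start = region.find(probe)
        while start != -1:
            hits.append((start, strand))
            start = region.find(probe, start + 1)
    return sorted(hits)


ZF5_TARGET = "GGCGGGGGACCGATTAACCAT"  # SEQ ID NO: 118
ZF7_TARGET = "ACTGAACATCGGTGAGTTAGG"  # SEQ ID NO: 126

# Hypothetical toy region: ZF5 site on the top strand, ZF7 site on the bottom strand.
toy_region = "TTTT" + ZF5_TARGET + "CCCC" + reverse_complement(ZF7_TARGET) + "AAAA"

print(find_target_sites(toy_region, ZF5_TARGET))  # [(4, '+')]
print(find_target_sites(toy_region, ZF7_TARGET))  # [(29, '-')]
```

The sketch reports exact matches only; tolerating mismatches would require a different search strategy.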


In one aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF5-VPR comprising the amino acid sequence of









(SEQ ID NO: 301)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQR





THTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSHKNA





LQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKS





FSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKC





PECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA.






In another aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF5.3-VPR comprising the amino acid sequence of









(SEQ ID NO: 301)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQR





THTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSHKNA





LQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKS





FSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKC





PECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA.






In yet another aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF7-VPR comprising the amino acid sequence of









(SEQ ID NO: 302)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQR





THTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKCPECGKSFSQAGH





LASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKS





FSTSGNLTEHQRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKC





PECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA.






In one aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF10-VPR comprising the amino acid sequence of









(SEQ ID NO: 303)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSQNSTLTEHQR





THTGEKPYKCPECGKSFSERSHLREHQRTHTGEKPYKCPECGKSFSSKKH





LAEHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKS





FSDCRDLARHQRTHTGEKPYKCPECGKSFSQSGDLRRHQRTHTGEKPYKC





PECGKSFSTKNSLTEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA.






In another aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF14-VPR comprising the amino acid sequence of









(SEQ ID NO: 304)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGHLVRHQR





THTGEKPYKCPECGKSFSTTGNLTVHQRTHTGEKPYKCPECGKSFSTSGS





LVRHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKS





FSRSDELVRHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKC





PECGKSFSQRAHLERHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA.






In one aspect, the present invention provides a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF15-VPR comprising the amino acid sequence of









(SEQ ID NO: 305)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRNDTLTEHQR





THTGEKPYKCPECGKSFSRNDALTEHQRTHTGEKPYKCPECGKSFSTSGE





LVRHQRTHTGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKS





FSRSDELVRHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKC





PECGKSFSQSSNLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA.






In one aspect, the present invention provides a bicistronic site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of a bicistronic ZF5.3-VPR-tPT2a-ZF7-VPR comprising the amino acid sequence of









(SEQ ID NO: 306)


MAPKKKRKVGIHGVPAAGSSGSSGSLEPGEKPYKCPECGKSFSTSGNLTE





HQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSH





KNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPEC





GKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKP





YKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDF





DLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKK





KRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRR





IAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALA





PAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQ





AGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQ





GIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSG





DEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVC





QPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDP





APAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSH





PPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLK





QAGDVEENPGPTSAGKLGSGEGRGSLLTCGDVEENPGPLEGSSGSGSLEP





GEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVR





HQRTHTGEKPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSR





SDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPEC





GKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKP





TGKKTSASGSGGGSGGDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDL





DMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYET





FKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTI





NYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPA





PVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNST





DPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQ





RPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREG





MFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTP





TGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALRE





MADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLT





PELNEILDTFLNDECLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAK





KKKGSYPYDVPDYA.






In one aspect, the present invention provides a bicistronic site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of a bicistronic ZF7-VPR-tPT2a-ZF5.3-VPR comprising the amino acid sequence of









(SEQ ID NO: 178)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQR





THTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKCPECGKSFSQAGH





LASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKS





FSTSGNLTEHQRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKC





PECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLKQAG





DVEENPGPTSAGKLGSGEGRGSLLTCGDVEENPGPLEGSSGSLEPGEKPY





KCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTH





TGEKPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLT





EHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFS





RSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKT





SASGSGGGSGGDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGS





DALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIM





KKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEF





PTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVL





APGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVF





TDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDP





APAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPK





PEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVH





EPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTV





IPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNE





ILDTFLNDECLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGS





YPYDVPDYA.






In one aspect, the present invention provides a bicistronic site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of a bicistronic ZF5.3-VPR-tPT2a-ZF7-p300 comprising the amino acid sequence of









(SEQ ID NO: 307)


MAPKKKRKVGIHGVPAAGSSGSSGSLEPGEKPYKCPECGKSFSTSGNLTE





HQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSH





KNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPEC





GKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKP





YKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDF





DLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKK





KRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRR





IAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALA





PAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAfPPAPKPT





QAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLN





QGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLS





GDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREV





CQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLD





PAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLS





HPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMH





ISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLL





KQAGDVEENPGPTSAGKLGSGEGRGSLLTCGDVEENPGPLEGSSGSGSLE





PGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLV





RHQRTHTGEKPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFS





RSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPE





CGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEK





PTGKKTSASGSGGGSGGIFKPEELRQALMPTLEALYRQDPESLPFRQPVD





PQLLGIPDYFDIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWL





YNRKTSRVYKYCSKLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGK





QLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDPSQPQTTINKEQ





FSKRKNDTLDPELFVECTECGRKMHQICVLHHEIIWPAGFVCDGCLKKSA





RTRKENKFSAKRLPSTRLGTFLENRVNDFLRRQNHPESGEVTVRVVHASD





KTVEVKPGMKARFVDSGEMAESFPYRTKALFAFEEIDGVDLCFFGMHVQE





YGSDCPPPNQRRVYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGY





TTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKMLDKAVSERIV





HDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIKELEQEEEERKRE





ENTSNESTDVTKGDSKNAKKKNNKKTSKNKSSLSRGNKKKPGMPNVSNDL





SQKLYATMEKHKEVFFVIRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAF





LTLARDKHLEFSSLRRAQWSTMCMLVELHTQSQDSGGKRPAATKKAGQAK





KKKGSYPYDVPDYA.






In one aspect, the present invention provides a bicistronic site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of a bicistronic ZF7-p300-tPT2a-ZF5.3-VPR comprising the amino acid sequence of









(SEQ ID NO: 308)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQR





THTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKCPECGKSFSQAGH





LASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKS





FSTSGNLTEHQRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKC





PECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGIFKPEELRQ





ALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKL





DTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRVYKYCSKLSEVFEQEIDPV





MQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFN





EIQGESVSLGDDPSQPQTTINKEQFSKRKNDTLDPELFVECTECGRKMHQ





ICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLENRV





NDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYR





TKALFAFEEIDGVDLCFFGMHVQEYGSDCPPPNQRRVYISYLDSVHFFRP





KCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQK





IPKPKRLQEWYKKMLDKAVSERIVHDYKDIFKQATEDRLTSAKELPYFEG





DFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKNAKKKNNKKT





SKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGPAA





NSLPPIVDPDPLIPCDLMDGRDAFLTLARDKHLEFSSLRRAQWSTMCMLV





ELHTQSQDSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLKQAGD





VEENPGPTSAGKLGSGEGRGSLLTCGDVEENPGPLEGSSGSLEPGEKPYK





CPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHT





GEKPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTE





HQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSR





SDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTS





ASGSGGGSGGDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSD





ALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMK





KSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFP





TMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLA





PGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFT





DLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPA





PAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHE





PVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVI





PQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEI





LDTFLNDECLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSY





PYDVPDYA.






The present invention also provides vectors, such as viral expression vectors, and cells comprising the site-specific HNF4α disrupting agents of the invention as well as the vectors of the invention.


In some embodiments, the site-specific HNF4α disrupting agents of the invention are present in a composition, such as a pharmaceutical composition.


In some embodiments, the pharmaceutical composition comprises a lipid formulation.


In some embodiments, the lipid formulation comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids, or one or more PEG-modified lipids, or combinations of any of the foregoing.


In some embodiments, the pharmaceutical composition comprises a lipid nanoparticle.


In one aspect, the present invention provides a pharmaceutical composition comprising a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF5.3-VPR comprising the amino acid sequence of









(SEQ ID NO: 301)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQR





THTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSHKNA





LQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKS





FSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKC





PECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA,







and a lipid nanoparticle.


In one aspect, the present invention provides a pharmaceutical composition comprising a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF5.3-VPR comprising the amino acid sequence of









(SEQ ID NO: 301)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQR





THTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSHKNA





LQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKS





FSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKC





PECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA;







and


a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of ZF7-VPR comprising the amino acid sequence of









(SEQ ID NO: 302)


MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQR





THTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKCPECGKSFSQAGH





LASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKS





FSTSGNLTEHQRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKC





PECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLD





MLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRK





VGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAV





PSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAP





PQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGE





GTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIP





VAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDED





FSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPK





RLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPP





RGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTG





LSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA;







and a lipid nanoparticle.


In one aspect, the present invention provides a pharmaceutical composition, comprising a site-specific HNF4α disrupting agent, comprising a polynucleotide encoding the amino acid sequence of a bicistronic ZF5.3-VPR-tPT2a-ZF7-VPR comprising the amino acid sequence of









(SEQ ID NO: 306)


MAPKKKRKVGIHGVPAAGSSGSSGSLEPGEKPYKCPECGKSFSTSGNLTE





HQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSH





KNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPEC





GKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKP





YKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDF





DLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKK





KRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRR





IAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALA





PAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQ





AGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQ





GIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSG





DEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVC





QPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDP





APAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSH





PPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLK





QAGDVEENPGPTSAGKLGSGEGRGSLLTCGDVEENPGPLEGSSGSGSLEP





GEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVR





HQRTHTGEKPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSR





SDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPEC





GKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKP





TGKKTSASGSGGGSGGDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDL





DMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYET





FKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTI





NYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPA





PVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNST





DPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQ





RPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREG





MFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTP





TGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALRE





MADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLT





PELNEILDTFLNDECLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAK





KKKGSYPYDVPDYA;







and a lipid nanoparticle.


In one aspect, the present invention provides a method of modulating expression of hepatocyte nuclear factor 4 alpha (HNF4α) in a cell. The method includes contacting the cell with a site-specific HNF4α disrupting agent, the disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, and an effector molecule, thereby modulating expression of HNF4α in the cell.


The modulation of expression may be enhanced expression of HNF4α in the cell or reduced expression of HNF4α in the cell.





BRIEF DESCRIPTION OF THE DRAWINGS


FIG. 1 depicts a chromosomal view of a portion of the upstream region and coding sequence of HNF4α and the positions of guide RNAs and pools of guide RNAs described herein.



FIG. 2 is a graph depicting the percent of HNF4α mRNA remaining in HepG2 cells at 72 hours after contacting the cells with the indicated pools of site-specific HNF4α targeting moieties and an effector molecule comprising dCas and KRAB at the indicated doses.



FIG. 3 is a graph depicting the percent of HNF4α mRNA remaining in HepG2 cells at 72 hours after contacting the cells with either the indicated site-specific HNF4α targeting moieties and an effector molecule comprising dCas and KRAB, or the indicated pools of HNF4α targeting moieties and an effector molecule comprising dCas and KRAB, at the indicated doses. Untreated cells and cells contacted with dCas9 alone were used as controls.



FIG. 4A is a graph depicting the percent of HNF4α mRNA measured in A549 cells at 48 hours after contacting the cells with the indicated pools of HNF4α targeting moieties and an effector molecule comprising dCas and VPR, at the indicated doses. SH is the abbreviation for “Safe Harbor” which is a non-target guide control.



FIG. 4B is a graph depicting the percent of HNF4α mRNA measured in LX-2 cells at 48 hours after contacting the cells with the indicated pools of HNF4α targeting moieties and an effector molecule comprising dCas and VPR at the indicated doses.



FIG. 5 is a graph depicting activation of HNF4α by dCas9-VPR in LX-2 cells with transfection using LNP formulations.



FIG. 6 is a graph depicting activation of HNF4α by dCas9-VPR in HepG2 cells with transfection using LNP formulations.



FIG. 7 shows immunohistological images depicting that HNF4α protein induced by dCas9-VPR Pool 1 is localized to the nucleus.



FIG. 8 is a schematic depicting the structure of the human HNF4α promoter region and the isoforms of HNF4α.



FIG. 9 schematically depicts the localization to the HNF4α promoter region of the various site-specific HNF4α targeting moieties. FIG. 9 discloses SEQ ID NOS 2519-2525, respectively, in order of appearance.



FIG. 10 schematically depicts the structure of an exemplary zinc finger-VPR fusion protein of the invention.



FIG. 11 is a graph depicting activation of HNF4α in LX-2 cells using ZF-VPR mRNAs.



FIG. 12 is a graph depicting activation of HNF4α in LX-2 cells with TAL-VPRs and ZF-VPRs.



FIG. 13 is a graph depicting activation of HNF4α in LX-2 cells with ATUM-Codon Optimized ZF5-VPR variants.



FIG. 14 is a graph depicting the durability of VPR activation of HNF4α in K562 cells with ZF5-VPR, ZF5-p300, ZF5-VPR and ZF5-p300, and ZF5-VPR and ZF7-VPR.



FIG. 15 is a graph depicting the durability of VPR activation of HNF4α in K562 cells with ZF5-VPR and ZF7-VPR.



FIG. 16 is a graph depicting the durability of VPR activation of HNF4α in K562 cells with ZF5-VPR and ZF7-VPR, and ZF5.3-VPR and ZF7-VPR.



FIG. 17 is a graph depicting the change in expression level of biomarkers downstream of HNF4α following activation of HNF4α by dCas9-VPR in LX-2 cells.



FIG. 18 is a graph depicting the change in expression level of biomarkers of HNF4α following activation of HNF4α by ZF5-VPR, ZF5.3-VPR, ZF5-VPR and ZF7-VPR, and ZF5.3-VPR and ZF7-VPR in LX-2 cells.



FIG. 19 is a graph depicting screening of ZF11, ZF13, and ZF14 in LX-2 cells and synergy in activating HNF4α by ZF5-VPR and ZF7-VPR, and ZF5.3-VPR and ZF7-VPR.



FIG. 20 is a graph depicting activation of HNF4α using dCas9-VPR3-Pool 1 and screening of ZF-VPR combinations.



FIG. 21 is a graph depicting HNF4α activation in FRG-KO mouse liver humanized hepatocytes (Yecuris human hepatocytes).



FIG. 22 is a graph depicting activation of HNF4α in LX-2 cells with bicistronic ZF5.3-VPR and ZF7-VPR.



FIG. 23 is a graph depicting the effect of repeat dosing of Yecuris hepatocytes with various ZF-VPR and ZF-p300 on HNF4α gene expression.



FIG. 24 is a graph depicting activation of HNF4α in Yecuris hepatocytes with bicistronic ZF-effector constructs.



FIG. 25 is a graph depicting the 10-day durability of VPR activation of HNF4α in K562 cells with bicistronic ZF-effector constructs.





DETAILED DESCRIPTION OF THE INVENTION

The present invention provides agents and compositions for modulating expression (e.g., enhanced or reduced expression) of a hepatocyte nuclear factor 4 alpha (HNF4α) gene by targeting an HNF4α expression control region. The HNF4α gene may be in a cell, e.g., a mammalian cell, such as a mammalian somatic cell, e.g., a human somatic cell. The present invention also provides methods of using the agents and compositions of the invention for modulating the expression of an HNF4α gene or for treating a subject who would benefit from modulating the expression of an HNF4α gene, e.g., a subject suffering or prone to suffering from an HNF4α-associated disease.


The agents of the invention are referred to herein as site-specific HNF4α disrupting agents and are described in Section II, below.


I. Definitions

In order that the present invention may be more readily understood, certain terms are first defined. In addition, whenever a value or range of values of a parameter is recited, values and ranges intermediate to the recited values are also intended to be part of this invention.


The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element, e.g., a plurality of elements.


The term “including” is used herein to mean, and is used interchangeably with, the phrase “including but not limited to”. The term “or” is used herein to mean, and is used interchangeably with, the term “and/or,” unless context clearly indicates otherwise.


The term “about” is used herein to mean within the typical ranges of tolerances in the art. For example, “about” can be understood as about 2 standard deviations from the mean. In certain embodiments, “about” means ±10%. In certain embodiments, “about” means ±5%. When “about” is present before a series of numbers or a range, it is understood that “about” can modify each of the numbers in the series or range.
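For illustration only, the percentage-based readings of “about” can be expressed as a simple tolerance check. The Python sketch below is a minimal example under the assumption that the tolerance is applied relative to the recited value; the function name and the example numbers are hypothetical.

```python
def within_about(value: float, recited: float, tolerance: float = 0.10) -> bool:
    """True if `value` lies within +/- `tolerance` (as a fraction) of the recited value."""
    return abs(value - recited) <= tolerance * abs(recited)


# A measured value of 92 against a recited value of "about 100".
print(within_about(92, 100, tolerance=0.10))  # True  (within +/-10%)
print(within_about(92, 100, tolerance=0.05))  # False (outside +/-5%)
```

The standard-deviation reading of “about” would instead require the spread of the underlying measurements, which this sketch does not model.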


The term “at least” prior to a number or series of numbers is understood to include the number adjacent to the term “at least”, and all subsequent numbers or integers that could logically be included, as clear from context. For example, the number of nucleotides in a nucleic acid molecule must be an integer. For example, “at least 18 nucleotides of a 21 nucleotide nucleic acid molecule” means that 18, 19, 20, or 21 nucleotides have the indicated property. When “at least” is present before a series of numbers or a range, it is understood that “at least” can modify each of the numbers in the series or range.


As used herein, “no more than” or “less than” is understood as the value adjacent to the phrase and logical lower values or integers, as logical from context, to zero. When “no more than” is present before a series of numbers or a range, it is understood that “no more than” can modify each of the numbers in the series or range.
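The integer readings of “at least” and “no more than” given above can be enumerated directly. The following Python sketch is illustrative only; the function names are hypothetical, and the first example reproduces the 18-of-21 nucleotide case from the definition of “at least”.

```python
def at_least(minimum: int, total: int) -> list[int]:
    """Integer counts covered by "at least `minimum`" out of `total` positions."""
    if not 0 <= minimum <= total:
        raise ValueError("expected 0 <= minimum <= total")
    return list(range(minimum, total + 1))


def no_more_than(maximum: int) -> list[int]:
    """Integer counts covered by "no more than `maximum`", down to zero."""
    if maximum < 0:
        raise ValueError("expected maximum >= 0")
    return list(range(0, maximum + 1))


print(at_least(18, 21))   # [18, 19, 20, 21]
print(no_more_than(3))    # [0, 1, 2, 3]
```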


As used herein, the term “substantially” refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. One of ordinary skill in the art will understand that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. The term “substantially” may therefore be used in some embodiments herein to capture potential lack of completeness inherent in many biological and chemical phenomena.


As used herein, the terms “hepatocyte nuclear factor 4 alpha,” “HNF4α,” and “HNF4A” are used interchangeably and refer to the gene as well as the well-known encoded protein, which is a nuclear transcription factor that binds DNA as a homodimer. The encoded protein controls the expression of several genes, including hepatocyte nuclear factor 1 alpha, a transcription factor which regulates the expression of several hepatic genes. HNF4α also plays a role in development of the liver, kidney, and intestines. Decreased expression of this gene has been associated with monogenic autosomal dominant non-insulin-dependent diabetes mellitus type I (MODY I) and liver disease, e.g., cirrhosis. Dysregulated, e.g., increased, expression of this gene has been associated with hepatocellular carcinoma. The nucleotide and amino acid sequences of HNF4α are known and may be found in, for example, GenBank Accession Nos. NM_000457; NM_175914; NM_178849; NM_178850; NM_001030003; NM_001030004; NM_001258355; NM_001287182; NM_001287183; NM_001287184; XM_005260407; NP_000448.3; NP_787110.2; NP_849180.1; NP_849181.1; NP_001025174.1; NP_001025175.1; NP_001245284.1; NP_001274111.1; NP_001274112.1; NP_001274113.1; and XP_005260464.1, the entire contents of each of which are incorporated herein by reference. The nucleotide sequence of the genomic region of Chromosome 20 which includes the endogenous promoters of HNF4α and the HNF4α coding sequence is also known and may be found in GenBank Accession No. NC_000020.10 (42984441..43061485).
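As a worked illustration of the genomic coordinates recited above, the span of the Chromosome 20 region can be computed from its start and end positions. The Python sketch below assumes that the recited GenBank positions are 1-based and inclusive; the class name and the half-open conversion helper are hypothetical conveniences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicInterval:
    accession: str
    start: int  # assumed 1-based, inclusive
    end: int    # assumed 1-based, inclusive

    def __post_init__(self) -> None:
        if self.start < 1 or self.end < self.start:
            raise ValueError("expected 1 <= start <= end")

    @property
    def length(self) -> int:
        """Number of nucleotides covered by the interval."""
        return self.end - self.start + 1

    def to_zero_based_half_open(self) -> tuple[int, int]:
        """Convert to the 0-based, end-exclusive convention used for string slicing."""
        return (self.start - 1, self.end)


hnf4a_region = GenomicInterval("NC_000020.10", 42984441, 43061485)

print(hnf4a_region.length)                     # 77045
print(hnf4a_region.to_zero_based_half_open())  # (42984440, 43061485)
```

Under that assumption the recited region spans 77,045 nucleotides.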


The HNF4α gene is located on chromosome 20, with transcription regulated by two promoters (P1 and P2) and alternative splicing variants, resulting in nine distinct isoforms (α1-α9) (FIG. 8). The HNF4α locus is transcriptionally regulated through the use of two distinct promoters that are physically separated by more than 45 kb. Isoforms produced by the activity of the closer promoter are designated P1 whereas isoforms produced by the second and more distant promoter are designated P2. Isoforms most common in the liver are expressed from promoter 1 (P1), with isoforms from P2 most commonly found in fetal tissues, and in the adult kidney and small intestine.


HNF4α controls the expression of proteins necessary for the normal function of hepatocytes and other cell types in the liver (Argemi J, et al. Defective HNF4alpha-dependent gene expression as a driver of hepatocellular failure in alcoholic hepatitis. Nat Commun. 2019; 10(1):3126. doi:10.1038/s41467-019-11004-3; Nishikawa T, et al. Resetting the transcription factor network reverses terminal chronic hepatic failure. J Clin Invest. 2015; 125(4):1533-1544. doi:10.1172/JCI73137). In addition, many of these proteins are secreted by liver cells and contribute to health systemically. For example, proteins such as albumin are required to transport nutrients, hormones, lipids, and small molecule drugs in the circulation.


The term “site-specific HNF4α disrupting agent,” as used herein, refers to any agent that specifically binds to a target HNF4α expression control region and, e.g., modulates expression of an HNF4α gene. The modulation of expression may be permanent or transient. Site-specific HNF4α disrupting agents of the invention may comprise a “site-specific HNF4α targeting moiety.”


As used herein, the term “site-specific HNF4α targeting moiety” refers to a moiety that specifically binds to an HNF4α expression control region, e.g., a transcriptional control region of an HNF4α gene, such as a promoter, an enhancer, or a repressor; or an HNF4α-associated anchor sequence, such as, for example within an HNF4α-associated anchor sequence-mediated conjunction. Exemplary “site-specific HNF4α targeting moieties” include, but are not limited to, polyamides, nucleic acid molecules, such as RNA, DNA, or modified RNA or DNA, polypeptides, protein nucleic acid molecules, and fusion proteins.


As used herein, the terms “specific binding” or “specifically binds” refer to an ability to discriminate between possible binding partners in the environment in which binding is to occur. In some embodiments, a disrupting agent that interacts, e.g., preferentially interacts, with one particular target when other potential disrupting agents are present is said to “bind specifically” to the target (i.e., the expression control region) with which it interacts. In some embodiments, specific binding is assessed by detecting or determining the degree of association between the disrupting agent and its target; in some embodiments, specific binding is assessed by detecting or determining degree of dissociation of a disrupting agent-target complex. In some embodiments, specific binding is assessed by detecting or determining ability of the disrupting agent to compete with an alternative interaction between its target and another entity. In some embodiments, specific binding is assessed by performing such detections or determinations across a range of concentrations.


As used herein, the term “expression control region” or “expression control domain” refers to a region or domain present in a genomic DNA that modulates the expression of a target gene in a cell. A functionality associated with an expression control region may directly affect expression of a target gene, e.g., by recruiting or blocking recruitment of a transcription factor that would stimulate expression of the gene. A functionality associated with an expression control region may indirectly affect expression of a target gene, e.g., by introducing epigenetic modifications or recruiting other factors that introduce epigenetic modifications that induce a change in chromosomal topology that modulates expression of a target gene. Expression control regions may be upstream and/or downstream of the protein coding sequence of a gene and include, for example, transcriptional control elements, e.g., promoters, enhancers, or repressors; anchor sequences; and anchor sequence-mediated conjunctions.


The term “transcriptional control element,” as used herein, refers to a nucleic acid sequence that controls transcription of a gene. Transcriptional control elements include, for example, anchor sequences, anchor sequence-mediated conjunctions, promoters, transcriptional enhancers, and transcriptional repressors.


A promoter is a region of DNA recognized by an RNA polymerase to initiate transcription of a particular gene and is generally located upstream of the 5′-end of the transcription start site of the gene.


A “transcriptional enhancer” increases gene transcription. A “transcriptional silencer” or “transcriptional repressor” decreases gene transcription. Enhancing and silencing sequences may be about 50-3500 base pairs in length and may influence gene transcription up to about 1 megabase away.


The term “gene,” as used herein, refers to a sequence of nucleotides that encodes a molecule, such as a protein, that has a function. A gene contains sequences that are transcribed (e.g., a 3′UTR), sequences that are not transcribed (e.g., a promoter), sequences that are translated (e.g., an exon), and sequences that are not translated (e.g., an intron).


As used herein, the term “target gene” means an HNF4α gene that is targeted for modulation, e.g., increase or decrease, of expression. In some embodiments, an HNF4α target gene is part of a targeted genomic complex (e.g. an HNF4α gene that has at least part of its genomic sequence as part of a target genomic complex, e.g. inside an anchor sequence-mediated conjunction), which genomic complex is targeted by one or more site-specific disrupting agents as described herein. In some embodiments, modulation comprises inhibition of expression of the target gene. In some embodiments, an HNF4α gene is modulated by contacting the HNF4α gene or a transcription control element operably linked to the HNF4α gene with one or more site-specific disrupting agents as described herein. In some embodiments, an HNF4α gene is aberrantly expressed (e.g., over-expressed) in a cell, e.g., a cell in a subject (e.g., a subject having an HNF4α-associated disease). In some embodiments, an HNF4α gene is aberrantly expressed (e.g., under-expressed) in a cell, e.g., a cell in a subject (e.g., a subject having an HNF4α-associated disease).


The term “anchor sequence” as used herein, refers to a nucleic acid sequence recognized by a nucleating agent that binds sufficiently to form an anchor sequence-mediated conjunction, e.g., a complex. In some embodiments, an anchor sequence comprises one or more CTCF binding motifs. In some embodiments, an anchor sequence is not located within a gene coding region. In some embodiments, an anchor sequence is located within an intergenic region. In some embodiments, an anchor sequence is not located within either of an enhancer or a promoter. In some embodiments, an anchor sequence is located at least 400 bp, at least 450 bp, at least 500 bp, at least 550 bp, at least 600 bp, at least 650 bp, at least 700 bp, at least 750 bp, at least 800 bp, at least 850 bp, at least 900 bp, at least 950 bp, or at least 1 kb away from any transcription start site. In some embodiments, an anchor sequence is located within a region that is not associated with genomic imprinting, monoallelic expression, and/or monoallelic epigenetic marks. In some embodiments, the anchor sequence has one or more functions selected from binding an endogenous nucleating polypeptide (e.g., CTCF), interacting with a second anchor sequence to form an anchor sequence mediated conjunction, or insulating against an enhancer that is outside the anchor sequence mediated conjunction. In some embodiments of the present invention, technologies are provided that may specifically target a particular anchor sequence or anchor sequences, without targeting other anchor sequences (e.g., sequences that may contain a nucleating agent (e.g., CTCF) binding motif in a different context); such targeted anchor sequences may be referred to as the “target anchor sequence”. 
In some embodiments, sequence and/or activity of a target anchor sequence is modulated while sequence and/or activity of one or more other anchor sequences that may be present in the same system (e.g., in the same cell and/or in some embodiments on the same nucleic acid molecule, e.g., the same chromosome) as the target anchor sequence is not modulated. In some embodiments, the anchor sequence comprises or is a nucleating polypeptide binding motif. In some embodiments, the anchor sequence is adjacent to a nucleating polypeptide binding motif.
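By way of non-limiting illustration, the positional criterion recited above (an anchor sequence located at least a given distance, e.g., at least 400 bp, from any transcription start site) may be sketched in Python as follows. The function names, the coordinate convention (inclusive integer positions on a single chromosome), and all coordinates in the usage note are assumptions made for illustration only and are not taken from the Sequence Listing.

```python
def distance_to_nearest_tss(anchor_start, anchor_end, tss_positions):
    """Return the smallest gap, in bp, between the anchor interval and any TSS.

    The anchor occupies the inclusive interval [anchor_start, anchor_end].
    A TSS that falls inside the anchor gives a distance of 0.
    """
    if not tss_positions:
        raise ValueError("at least one transcription start site is required")

    def gap(tss):
        if tss < anchor_start:
            return anchor_start - tss
        if tss > anchor_end:
            return tss - anchor_end
        return 0

    return min(gap(tss) for tss in tss_positions)


def is_far_from_tss(anchor_start, anchor_end, tss_positions, min_distance_bp=400):
    """True if the anchor lies at least min_distance_bp from every TSS."""
    return distance_to_nearest_tss(anchor_start, anchor_end, tss_positions) >= min_distance_bp
```

For a hypothetical anchor at positions 1000 to 1020 with transcription start sites at 500 and 1500, the nearest site is 480 bp away, so the anchor satisfies an "at least 400 bp" criterion but not an "at least 500 bp" criterion.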


The term “anchor sequence-mediated conjunction” as used herein, refers to a DNA structure, in some cases, a complex, that occurs and/or is maintained via physical interaction or binding of at least two anchor sequences in the DNA by one or more polypeptides, such as nucleating polypeptides, or one or more proteins and/or a nucleic acid entity (such as RNA or DNA), that bind the anchor sequences to enable spatial proximity and functional linkage between the anchor sequences.


As used herein, the term “genomic complex” is a complex that brings together two genomic sequence elements that are spaced apart from one another on one or more chromosomes, via interactions between and among a plurality of protein and/or other components (potentially including, the genomic sequence elements). In some embodiments, the genomic sequence elements are anchor sequences to which one or more protein components of the complex bind. In some embodiments, a genomic complex may comprise an anchor sequence-mediated conjunction. In some embodiments, a genomic sequence element may be or comprise a CTCF binding motif, a promoter and/or an enhancer.


In some embodiments, a genomic sequence element includes at least one or both of a promoter and/or regulatory region (e.g., an enhancer). In some embodiments, complex formation is nucleated at the genomic sequence element(s) and/or by binding of one or more of the protein component(s) to the genomic sequence element(s). As will be understood by those skilled in the art, in some embodiments, co-localization (e.g., conjunction) of the genomic sites via formation of the complex alters DNA topology at or near the genomic sequence element(s), including, in some embodiments, between them. In some embodiments, a genomic complex comprises an anchor sequence-mediated conjunction, which comprises one or more loops. In some embodiments, a genomic complex as described herein is nucleated by a nucleating polypeptide such as, for example, CTCF and/or Cohesin. In some embodiments, a genomic complex as described herein may include, for example, one or more of CTCF, Cohesin, non-coding RNA (e.g., eRNA), transcriptional machinery proteins (e.g., RNA polymerase, one or more transcription factors, for example selected from the group consisting of TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, etc.), transcriptional regulators (e.g., Mediator, P300, enhancer-binding proteins, repressor-binding proteins, histone modifiers, etc.), etc. In some embodiments, a genomic complex as described herein includes one or more polypeptide components and/or one or more nucleic acid components (e.g., one or more RNA components), which may, in some embodiments, be interacting with one another and/or with one or more genomic sequence elements (e.g., anchor sequences, promoter sequences, regulatory sequences (e.g., enhancer sequences)) so as to constrain a stretch of genomic DNA into a topological configuration (e.g., a loop) that the stretch of genomic DNA does not adopt when the complex is not formed.


An “effector molecule,” as used herein, refers to a molecule that is able to regulate a biological activity, such as enzymatic activity, gene expression, anchor sequence-mediated conjunction or cell signaling. Exemplary effectors are described in Section II, below, and in some embodiments include, for example, nucleases, physical blockers, epigenetic recruiters, e.g., a transcriptional enhancer or a transcriptional repressor, and epigenetic CpG modifiers, e.g., a DNA methylase, a DNA demethylase, a histone modifying agent, or a histone deacetylase, and combinations of any of the foregoing.


II. Site-Specific HNF4α Disrupting Agents of the Invention

The present invention provides site-specific HNF4α disrupting agents which, in one aspect of the invention, include a site-specific HNF4α targeting moiety which targets an HNF4α expression control region. In another aspect, the site-specific disrupting agents of the invention include a site-specific HNF4α targeting moiety which targets an HNF4α expression control region and an effector molecule. As will be appreciated by one of ordinary skill in the art, such disrupting agents are site-specific and, thus, specifically bind to an HNF4α expression control region (e.g., one or more transcriptional control elements and/or one or more target anchor sequences), e.g., within a cell, and not to non-targeted expression control regions (e.g., within the same cell).


The site-specific HNF4α disrupting agents of the invention comprise a site-specific HNF4α targeting moiety targeting an HNF4α expression control region. The expression control region targeted by the site-specific targeting moiety may be, for example, a transcriptional control element or an anchor sequence, such as an anchor sequence within an anchor-mediated conjunction.


Thus, site-specific HNF4α disrupting agents of the invention may modulate expression of a gene, i.e., HNF4α, e.g., by modulating expression of the gene from an endogenous promoter, an enhancer, or a repressor; may alter methylation of the control region; may alter at least one anchor sequence; may alter at least one conjunction nucleating molecule binding site, such as by altering binding affinity for the conjunction nucleating molecule; or may alter an orientation of at least one common nucleotide sequence, such as a CTCF binding motif, by, e.g., substitution, addition or deletion in at least one anchor sequence, such as a CTCF binding motif.


In certain embodiments, the site-specific disrupting agents and compositions described herein target an expression control region comprising one or more HNF4α-specific transcriptional control elements to modulate expression in a cell. HNF4α-specific transcriptional control elements that can be targeted include HNF4α-specific promoters, HNF4α-specific enhancers, and HNF4α-specific repressors. In one embodiment, an HNF4α-specific promoter substantially drives expression in cells of the liver, i.e., promoter 1. In one embodiment, an HNF4α-specific promoter substantially drives expression in cells of the pancreas, i.e., promoter 2. The nucleotide sequences of HNF4α promoter 1 and promoter 2 are known and may be found in, for example, GenBank Accession No. NC_000020.10 (42984441 . . . 43061485).


For example, a site-specific disrupting agent may include a site-specific targeting moiety, e.g., a nucleic acid molecule encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically targets and binds to an HNF4α expression control region, such as an HNF4α endogenous promoter region, e.g., promoter 1, and an effector molecule, such as an effector molecule that includes a transcriptional enhancer or transcriptional repressor that modulates, e.g., enhances or represses, expression of a target gene from an endogenous promoter to modulate gene expression. In one embodiment, the disrupting agent is a “bicistronic nucleic acid molecule,” i.e., capable of making two fusion proteins from a single messenger RNA molecule, and includes a first and a second site-specific targeting moiety, e.g., a nucleic acid molecule encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically targets and binds to an HNF4α expression control region, such as an HNF4α endogenous promoter region, e.g., promoter 1, and an effector molecule, such as an effector molecule that includes a transcriptional enhancer or transcriptional repressor that modulates, e.g., enhances or represses, expression of a target gene from an endogenous promoter to modulate gene expression.


In some embodiments of the invention, a site-specific disrupting agent may include a site-specific targeting moiety, e.g., a nucleic acid molecule such as a guide RNA targeting an HNF4α endogenous promoter region, e.g., promoter 1, and an effector molecule, such as an effector molecule that includes a transcriptional enhancer or transcriptional repressor that modulates, e.g., enhances or represses, expression of a target gene from an endogenous promoter to modulate gene expression.


In certain embodiments of the invention, the site-specific disrupting agents and compositions described herein target an expression control region comprising one or more HNF4α-associated anchor sequences, e.g., within an anchor sequence-mediated conjunction, comprising a first and a second HNF4α-associated anchor sequence to alter a two-dimensional chromatin structure (e.g., anchor sequence-mediated conjunctions) in order to modulate expression in a cell, e.g., a cell within a subject, e.g., by modifying anchor sequence-mediated conjunctions in DNA, e.g., genomic DNA.


In one aspect, the invention includes a site-specific HNF4α disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region comprising one or more HNF4α-associated anchor sequences within an anchor sequence-mediated conjunction. The disrupting agent binds, e.g., specifically binds, a specific anchor sequence-mediated conjunction to alter a topology of the anchor sequence-mediated conjunction, e.g., an anchor sequence-mediated conjunction having a physical interaction of two or more DNA loci bound by a conjunction nucleating molecule.


The formation of an anchor sequence-mediated conjunction may force transcriptional control elements to interact with an HNF4α gene or spatially constrain the activity of the transcriptional control elements. Altering anchor sequence-mediated conjunctions, therefore, allows for modulating HNF4α expression without altering the coding sequences of the HNF4α gene being modulated.


In some embodiments, the site-specific disrupting agents and compositions of the invention modulate expression of an HNF4α gene associated with an anchor sequence-mediated conjunction by physically interfering with the interaction between one or more anchor sequences and a conjunction nucleating molecule. For example, a DNA binding small molecule (e.g., minor or major groove binders), peptide (e.g., zinc finger, TALE, novel or modified peptide), protein (e.g., CTCF, modified CTCF with impaired CTCF binding and/or cohesin binding affinity), or nucleic acid (e.g., ssDNA, modified DNA or RNA, peptide oligonucleotide conjugates, locked nucleic acids, bridged nucleic acids, polyamides, and/or triplex forming oligonucleotides) may physically prevent a conjunction nucleating molecule from interacting with one or more anchor sequences to modulate HNF4α gene expression.


In some embodiments, the site-specific disrupting agents and compositions of the invention modulate expression of an HNF4α gene associated with an anchor sequence-mediated conjunction by modification of an anchor sequence, e.g., epigenetic modifications, e.g., histone protein modifications, or genomic editing modifications. For example, one or more anchor sequences associated with an anchor sequence-mediated conjunction comprising an HNF4α gene may be targeted for methylation modification by a DNA methyltransferase, e.g., dCas9-methyltransferase fusion, e.g., antisense oligonucleotide-enzyme fusion, to modulate expression of the gene.


In some embodiments, the site-specific disrupting agents and compositions of the invention modulate expression of an HNF4α gene associated with an anchor sequence-mediated conjunction, e.g., activate or repress transcription, e.g., induce epigenetic changes to chromatin.


In some embodiments, an anchor sequence-mediated conjunction includes one or more anchor sequences, an HNF4α gene, and one or more transcriptional control elements, such as an enhancing or silencing element. In some embodiments, the transcriptional control element is within, partially within, or outside the anchor sequence-mediated conjunction.


In one embodiment, the anchor sequence-mediated conjunction comprises a loop, such as an intra-chromosomal loop. In certain embodiments, the anchor sequence-mediated conjunction has a plurality of loops. One or more loops may include a first anchor sequence, a nucleic acid sequence, a transcriptional control element, and a second anchor sequence. In another embodiment, at least one loop includes, in order, a first anchor sequence, a transcriptional control element, and a second anchor sequence; or a first anchor sequence, a nucleic acid sequence, and a second anchor sequence. In yet another embodiment, either one or both of the nucleic acid sequences and the transcriptional control element is located within or outside the loop. In still another embodiment, one or more of the loops comprises a transcriptional control element.


In some embodiments, the anchor sequence-mediated conjunction includes a TATA box, a CAAT box, a GC box, or a CAP site.


In some embodiments, the anchor sequence-mediated conjunction comprises a plurality of loops, wherein the anchor sequence-mediated conjunction comprises at least one of an anchor sequence, a nucleic acid sequence, and a transcriptional control element in one or more of the loops.


In one aspect, the site-specific disrupting agents and compositions of the invention may introduce a targeted alteration to an anchor sequence-mediated conjunction to modulate expression of a nucleic acid sequence with a disrupting agent that binds the anchor sequence. In some embodiments, the anchor sequence-mediated conjunction is altered by targeting one or more nucleotides within the anchor sequence-mediated conjunction for substitution, addition or deletion.


In some embodiments, expression, e.g., transcription, is activated by inclusion of an activating loop or exclusion of a repressive loop. In one such embodiment, the anchor sequence-mediated conjunction comprises a transcriptional control sequence that increases transcription of a nucleic acid sequence, e.g., such as an HNF4α encoding nucleic acid. In another such embodiment, the anchor sequence-mediated conjunction excludes a transcriptional control element that decreases expression, e.g., transcription, of a nucleic acid sequence, e.g., such as an HNF4α encoding nucleic acid.


In some embodiments, expression, e.g., transcription, is repressed by inclusion of a repressive loop or exclusion of an activating loop. In one such embodiment, the anchor sequence-mediated conjunction includes a transcriptional control element that decreases expression, e.g., transcription, of a nucleic acid sequence, e.g., such as an HNF4α encoding nucleic acid sequence. In another such embodiment, the anchor sequence-mediated conjunction excludes a transcriptional control sequence that increases transcription of a nucleic acid sequence, e.g., such as an HNF4α encoding nucleic acid.


Each anchor sequence-mediated conjunction comprises one or more anchor sequences, e.g., a plurality. Anchor sequences can be manipulated or altered to disrupt naturally occurring loops or form new loops (e.g., to form exogenous loops or to form non-naturally occurring loops with exogenous or altered anchor sequences). Such alterations modulate HNF4α gene expression by changing the two-dimensional structure of DNA containing all or a portion of an HNF4α gene, e.g., by modulating the ability of the HNF4α gene to interact with transcriptional control elements (e.g., enhancing and silencing/repressive sequences). In some embodiments, the chromatin structure is modified by substituting, adding or deleting one or more nucleotides within an anchor sequence of the anchor sequence-mediated conjunction.


The anchor sequences may be non-contiguous with one another. In embodiments with noncontiguous anchor sequences, the first anchor sequence may be separated from the second anchor sequence by about 500 bp to about 500 Mb, about 750 bp to about 200 Mb, about 1 kb to about 100 Mb, about 25 kb to about 50 Mb, about 50 kb to about 1 Mb, about 100 kb to about 750 kb, about 150 kb to about 500 kb, or about 175 kb to about 500 kb. In some embodiments, the first anchor sequence is separated from the second anchor sequence by about 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kb, 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 55 kb, 60 kb, 65 kb, 70 kb, 75 kb, 80 kb, 85 kb, 90 kb, 95 kb, 100 kb, 125 kb, 150 kb, 175 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 350 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 2 Mb, 3 Mb, 4 Mb, 5 Mb, 6 Mb, 7 Mb, 8 Mb, 9 Mb, 10 Mb, 15 Mb, 20 Mb, 25 Mb, 50 Mb, 75 Mb, 100 Mb, 200 Mb, 300 Mb, 400 Mb, 500 Mb, or any size therebetween.


In one embodiment, the anchor sequence comprises a common nucleotide sequence, e.g., a CTCF-binding motif:

(SEQ ID NO: 64)
N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C),

where N is any nucleotide.


A CTCF-binding motif may also be in the opposite orientation, e.g.,

(SEQ ID NO: 65)
(G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N.


In one embodiment, the anchor sequence comprises SEQ ID NO: 64 or SEQ ID NO: 65 or a nucleotide sequence at least 75%, at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to either SEQ ID NO: 64 or SEQ ID NO: 65.
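By way of non-limiting illustration, the degenerate motif notation of SEQ ID NO: 64 and SEQ ID NO: 65 may be translated into a regular expression and used to scan a nucleotide sequence for exact matches. The following Python sketch assumes the motifs as transcribed from the text above; the function names and the test sequence in the usage note are hypothetical. The sketch does not assess percent identity, and it is not intended to represent how candidate anchor sequences were identified herein.

```python
import re

# Degenerate motifs as written in the text (SEQ ID NO: 64 and SEQ ID NO: 65).
SEQ_ID_NO_64 = "N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C)"
SEQ_ID_NO_65 = "(G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N"


def motif_to_regex(motif):
    """Translate the parenthesised notation into a regular expression.

    "(A/T/G)" becomes the character class "[ATG]"; "N" becomes "[ACGT]".
    """
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return pattern.replace("N", "[ACGT]")


def find_motif(sequence, motif):
    """Return 0-based start positions of every exact match, overlaps included."""
    # A lookahead lets the scan report matches that overlap one another.
    regex = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in regex.finditer(sequence.upper())]
```

For example, the hypothetical 20-nucleotide sequence ATAGCCACCAGGGGGCGCCG satisfies every position of SEQ ID NO: 64, so scanning that sequence with two flanking thymines on the 5′ side reports a single match starting at offset 2.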


In some embodiments, the anchor sequence-mediated conjunction comprises at least a first anchor sequence and a second anchor sequence. The first anchor sequence and second anchor sequence may each comprise a common nucleotide sequence, e.g., each comprises a CTCF binding motif. In some embodiments, the first anchor sequence and second anchor sequence comprise different sequences, e.g., the first anchor sequence comprises a CTCF binding motif and the second anchor sequence comprises an anchor sequence other than a CTCF binding motif. In some embodiments, each anchor sequence comprises a common nucleotide sequence and one or more flanking nucleotides on one or both sides of the common nucleotide sequence.


Two CTCF-binding motifs (e.g., contiguous or non-contiguous CTCF binding motifs) that can form a conjunction may be present in the genome in any orientation, e.g., in the same orientation (tandem), either 5′->3′ (left tandem, e.g., the two CTCF-binding motifs comprise SEQ ID NO: 64) or 3′->5′ (right tandem, e.g., the two CTCF-binding motifs comprise SEQ ID NO: 65), or in convergent orientation, where one CTCF-binding motif comprises SEQ ID NO: 64 and the other comprises SEQ ID NO: 65. CTCFBSDB 2.0: Database For CTCF binding motifs And Genome Organization (http://insulatordb.uthsc.edu/) can be used to identify CTCF binding motifs associated with a target gene, e.g., HNF4α.
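By way of non-limiting illustration, the orientation categories named above may be expressed as a small Python function. The labels "SEQ64" and "SEQ65" and the function name are hypothetical and are used only to indicate which of the two motifs a given site comprises. Consistent with the description above, any pair comprising one motif of each kind is reported as convergent; the sketch does not attempt to distinguish the genomic order of the two sites.

```python
def classify_motif_pair(first, second):
    """Classify two CTCF-binding motifs by relative orientation.

    Each argument is "SEQ64" (site comprises SEQ ID NO: 64) or
    "SEQ65" (site comprises SEQ ID NO: 65).
    """
    valid = {"SEQ64", "SEQ65"}
    if first not in valid or second not in valid:
        raise ValueError("each motif label must be 'SEQ64' or 'SEQ65'")
    if first == "SEQ64" and second == "SEQ64":
        return "left tandem"   # both motifs read 5'->3'
    if first == "SEQ65" and second == "SEQ65":
        return "right tandem"  # both motifs read 3'->5'
    return "convergent"        # one motif of each kind
```

Under these assumptions, classify_motif_pair("SEQ64", "SEQ65") returns "convergent", whereas classify_motif_pair("SEQ64", "SEQ64") returns "left tandem".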


In some embodiments, the anchor sequence-mediated conjunction is altered by changing an orientation of at least one common nucleotide sequence, e.g., a conjunction nucleating molecule binding site.


In some embodiments, the anchor sequence comprises a conjunction nucleating molecule binding site, e.g., a CTCF binding motif, and a site-specific disrupting agent of the invention introduces an alteration in at least one conjunction nucleating molecule binding site, e.g., altering binding affinity for the conjunction nucleating molecule.


In some embodiments, the anchor sequence-mediated conjunction is altered by introducing an exogenous anchor sequence. Addition of a non-naturally occurring or exogenous anchor sequence may form or disrupt a naturally occurring anchor sequence-mediated conjunction, e.g., by inducing a non-naturally occurring loop to form that alters transcription of the nucleic acid sequence.


In some embodiments, the anchor sequence-mediated conjunction comprises an HNF4α gene, and one or more, e.g., 2, 3, 4, 5, or more, genes other than the HNF4α gene.


In some embodiments, the anchor sequence-mediated conjunction is associated with one or more, e.g., 2, 3, 4, 5, or more, transcriptional control elements. In some embodiments, the HNF4α gene is noncontiguous with one or more of the transcriptional control elements. In some embodiments where the HNF4α gene is non-contiguous with the transcriptional control element, the gene may be separated from one or more transcriptional control elements by about 100 bp to about 500 Mb, about 500 bp to about 200 Mb, about 1 kb to about 100 Mb, about 25 kb to about 50 Mb, about 50 kb to about 1 Mb, about 100 kb to about 750 kb, about 150 kb to about 500 kb, or about 175 kb to about 500 kb. In some embodiments, the gene is separated from the transcriptional control element by about 100 bp, 300 bp, 500 bp, 600 bp, 700 bp, 800 bp, 900 bp, 1 kb, 5 kb, 10 kb, 15 kb, 20 kb, 25 kb, 30 kb, 35 kb, 40 kb, 45 kb, 50 kb, 55 kb, 60 kb, 65 kb, 70 kb, 75 kb, 80 kb, 85 kb, 90 kb, 95 kb, 100 kb, 125 kb, 150 kb, 175 kb, 200 kb, 225 kb, 250 kb, 275 kb, 300 kb, 350 kb, 400 kb, 500 kb, 600 kb, 700 kb, 800 kb, 900 kb, 1 Mb, 2 Mb, 3 Mb, 4 Mb, 5 Mb, 6 Mb, 7 Mb, 8 Mb, 9 Mb, 10 Mb, 15 Mb, 20 Mb, 25 Mb, 50 Mb, 75 Mb, 100 Mb, 200 Mb, 300 Mb, 400 Mb, 500 Mb, or any size therebetween.


In some embodiments, the type of anchor sequence-mediated conjunction may help to determine how to modulate gene expression, e.g., choice of site-specific targeting moiety, by altering the anchor sequence-mediated conjunction. For example, some types of anchor sequence-mediated conjunctions comprise one or more transcription control elements within the anchor sequence-mediated conjunction. Disruption of such an anchor sequence-mediated conjunction by disrupting the formation of the anchor sequence-mediated conjunction, e.g., altering one or more anchor sequences, is likely to decrease transcription of an HNF4α gene within the anchor sequence-mediated conjunction.


In some embodiments, expression of the HNF4α gene is regulated, modulated, or influenced by one or more transcriptional control elements associated with the anchor sequence-mediated conjunction. In some embodiments, the anchor sequence-mediated conjunction comprises an HNF4α gene and one or more transcriptional control elements. For example, the HNF4α gene and one or more transcriptional control sequences are located within, at least partially, an anchor sequence-mediated conjunction, e.g., a Type 1 anchor sequence-mediated conjunction. The anchor sequence-mediated conjunction may also be referred to as a “Type 1, EP subtype.” In some embodiments, the HNF4α gene has a defined state of expression, e.g., in its native state, e.g., in a diseased state. For example, the HNF4α gene may have a high level of expression. By disrupting the anchor sequence-mediated conjunction, expression of the HNF4α gene may be decreased, e.g., decreased transcription due to conformational changes of the DNA previously open to transcription within the anchor sequence-mediated conjunction, e.g., decreased transcription due to conformational changes of the DNA creating additional distance between the HNF4α gene and the enhancing sequences. In one embodiment, both the HNF4α gene and one or more transcriptional control sequences, e.g., enhancing sequences, reside inside the anchor sequence-mediated conjunction. Disruption of the anchor sequence-mediated conjunction decreases expression of the HNF4α gene. In one embodiment, the HNF4α gene associated with the anchor sequence-mediated conjunction is accessible to one or more transcriptional control elements that reside inside, at least partially, the anchor sequence-mediated conjunction.


In some embodiments, expression of the HNF4α gene is regulated, modulated, or influenced by one or more transcriptional control elements associated with, but inaccessible due to the anchor sequence-mediated conjunction. For example, the anchor sequence-mediated conjunction associated with an HNF4α gene disrupts the ability of one or more transcriptional control elements to regulate, modulate, or influence expression of the HNF4α gene. The transcriptional control sequences may be separated from the HNF4α gene, e.g., reside on the opposite side, at least partially, e.g., inside or outside, of the anchor sequence-mediated conjunction as the HNF4α gene, e.g., the HNF4α gene is inaccessible to the transcriptional control elements due to proximity of the anchor sequence-mediated conjunction. In some embodiments, one or more enhancing sequences are separated from the HNF4α gene by the anchor sequence-mediated conjunction, e.g., a Type 2 anchor sequence-mediated conjunction.


In some embodiments, the HNF4α gene is inaccessible to one or more transcriptional control elements due to the anchor sequence-mediated conjunction, and disruption of the anchor sequence-mediated conjunction allows the transcriptional control element to regulate, modulate, or influence expression of the HNF4α gene. In one embodiment, the HNF4α gene is inside and outside the anchor sequence-mediated conjunction and inaccessible to the one or more transcriptional control elements. Disruption of the anchor sequence-mediated conjunction increases access of the transcriptional control elements to regulate, modulate, or influence expression of the HNF4α gene, e.g., the transcriptional control elements increase expression of the HNF4α gene. In one embodiment, the HNF4α gene is inside the anchor sequence-mediated conjunction and inaccessible to the one or more transcriptional control elements residing outside, at least partially, the anchor sequence-mediated conjunction. Disruption of the anchor sequence-mediated conjunction increases expression of the HNF4α gene. In one embodiment, the HNF4α gene is outside, at least partially, the anchor sequence-mediated conjunction and inaccessible to the one or more transcriptional control elements residing inside the anchor sequence-mediated conjunction. Disruption of the anchor sequence-mediated conjunction increases expression of the HNF4α gene.
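By way of illustration only, the relationship described above between the location of the HNF4α gene, the location of the transcriptional control elements, and the expected effect of disrupting the anchor sequence-mediated conjunction can be sketched in Python. The names Location and predicted_effect_of_disruption are hypothetical and are not part of the invention; the sketch ignores the "at least partially" qualifier, treats the control elements as enhancing sequences, and returns "undetermined" for arrangements the text above does not address.

```python
from enum import Enum


class Location(Enum):
    """Position of an element relative to the anchor sequence-mediated conjunction."""
    INSIDE = "inside"
    OUTSIDE = "outside"


def predicted_effect_of_disruption(gene: Location, enhancer: Location) -> str:
    """Return the expected change in HNF4α expression when the conjunction is disrupted.

    Simplified reading of the description:
      - gene and enhancing sequences both inside (Type 1): expression decreases
      - gene and enhancing sequences on opposite sides (Type 2): expression increases
    """
    if gene is Location.INSIDE and enhancer is Location.INSIDE:
        return "decrease"
    if gene is not enhancer:
        return "increase"
    # Both outside the conjunction: not addressed by the passage above.
    return "undetermined"


if __name__ == "__main__":
    print(predicted_effect_of_disruption(Location.INSIDE, Location.INSIDE))
    print(predicted_effect_of_disruption(Location.INSIDE, Location.OUTSIDE))
```

Such a lookup may, for example, inform the choice between an agent intended to reduce expression and one intended to enhance expression; it is a qualitative aid and not a predictive model.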


A. HNF4α Site-Specific Targeting Moieties


The site-specific HNF4α targeting moieties of the invention target an HNF4α expression control region and may comprise a polymer or polymeric molecule, such as a polyamide (i.e., a molecule of repeating units linked by amide bonds, e.g., a polypeptide), a polymer of nucleotides (such as a guide RNA, or a nucleic acid molecule encoding a TALE polypeptide or a zinc finger polypeptide), a peptide nucleic acid (PNA), or a polymer of amino acids, such as a peptide or polypeptide, e.g., a fusion protein, etc. Suitable site-specific HNF4α targeting moieties, compositions, and methods of use of such agents and compositions are described below and in PCT Publication WO 2018/049073, the entire contents of which are expressly incorporated herein by reference.


In one embodiment, a site-specific disrupting agent of the invention comprises a site-specific HNF4α targeting moiety comprising a nucleic acid molecule encoding a polypeptide, such as a DNA-binding domain, of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that is engineered to specifically target an HNF4α expression control region to modulate expression of an HNF4α gene.


In another embodiment, a site-specific disrupting agent of the invention comprises a site-specific HNF4α targeting moiety comprising a nucleic acid molecule, such as a guide RNA (or gRNA) or a guide RNA and an effector, or fragment thereof, or nucleic acid molecule encoding an effector, or fragment thereof.


In another embodiment, a site-specific disrupting agent of the invention comprises a site-specific HNF4α targeting moiety comprising a polynucleotide, such as a PNA, e.g., a nucleic acid gRNA linked to an effector polypeptide, or fragment thereof.


In another embodiment, a site-specific disrupting agent of the invention comprises a site-specific HNF4α targeting moiety comprising a fusion molecule, such as a nucleic acid molecule encoding a DNA-binding domain, of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, and an effector.


In one embodiment, such site-specific disrupting agents comprise a second fusion protein, wherein the second fusion protein comprises a second site-specific HNF4α targeting moiety which targets a second HNF4α expression control region and a second effector molecule, wherein the second HNF4α expression control region is different than the HNF4α expression control region.


In another embodiment, a site-specific disrupting agent of the invention comprises a site-specific HNF4α targeting moiety comprising a fusion molecule, such as a nucleic acid molecule encoding a protein comprising a Cas polypeptide and, e.g., an epigenetic recruiter or an epigenetic CpG modifier.


In yet another embodiment, a site-specific disrupting agent of the invention comprises a site-specific HNF4α targeting moiety comprising a fusion molecule, such as a fusion protein comprising a Cas polypeptide and, e.g., an epigenetic recruiter or an epigenetic CpG modifier.


As used herein, in its broadest sense, the term “nucleic acid” refers to any compound and/or substance that is or can be incorporated into an oligonucleotide chain. In some embodiments, a nucleic acid is a compound and/or substance that is or can be incorporated into a polynucleotide chain via a phosphodiester linkage. As will be clear from context, in some embodiments, “nucleic acid” refers to an individual nucleic acid residue (e.g., a nucleotide and/or nucleoside); in some embodiments, “nucleic acid” refers to a polynucleotide chain comprising individual nucleic acid residues. In some embodiments, a “nucleic acid” is or comprises RNA; in some embodiments, a “nucleic acid” is or comprises DNA. In some embodiments, a “nucleic acid” is a “mixmer” comprising locked nucleic acid molecules and deoxynucleic acid molecules. In some embodiments, a nucleic acid is, comprises, or consists of one or more natural nucleic acid residues. In some embodiments, a nucleic acid is, comprises, or consists of one or more nucleic acid analogs. In some embodiments, a nucleic acid analog differs from a nucleic acid in that it does not utilize a phosphodiester backbone. For example, in some embodiments, a nucleic acid is, comprises, or consists of one or more “peptide nucleic acids,” which are known in the art, have peptide bonds instead of phosphodiester bonds in the backbone, and are considered within the scope of the present invention. Alternatively or additionally, in some embodiments, a nucleic acid has one or more phosphorothioate and/or 5′-N-phosphoramidite linkages rather than phosphodiester bonds. In some embodiments, a nucleic acid is, comprises, or consists of one or more natural nucleosides (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine). In some embodiments, a nucleic acid is, comprises, or consists of one or more nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, 2-thiocytidine, methylated bases, intercalated bases, and combinations thereof). In some embodiments, a nucleic acid comprises one or more modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose) as compared with those in natural nucleic acids. In some embodiments, a nucleic acid has a nucleotide sequence that encodes a functional gene product such as an RNA or protein. In some embodiments, a nucleic acid includes one or more introns. In some embodiments, nucleic acids are prepared by one or more of isolation from a natural source, enzymatic synthesis by polymerization based on a complementary template (in vivo or in vitro), reproduction in a recombinant cell or system, and chemical synthesis. In some embodiments, a nucleic acid is at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000 or more residues long. In some embodiments, a nucleic acid is partly or wholly single stranded; in some embodiments, a nucleic acid is partly or wholly double stranded. In some embodiments, a nucleic acid has a nucleotide sequence comprising at least one element that encodes, or is the complement of a sequence that encodes, a polypeptide. In some embodiments, a nucleic acid has enzymatic activity.


As used herein, the terms “peptide,” “polypeptide,” and “protein” refer to a compound comprised of amino acid residues covalently linked by peptide bonds, or by means other than peptide bonds. A protein or peptide must contain at least two amino acids, and no limitation is placed on the maximum number of amino acids that can comprise a protein's or peptide's sequence. Polypeptides include any peptide or protein comprising two or more amino acids joined to each other by peptide bonds or by means other than peptide bonds. As used herein, the term refers to both short chains, which also commonly are referred to in the art as peptides, oligopeptides and oligomers, for example, and to longer chains, which generally are referred to in the art as proteins, of which there are many types.


In certain embodiments, a polypeptide is or may comprise a chimeric or “fusion protein.” As used herein, a “chimeric protein” or “fusion protein” comprises all or part (preferably a biologically active part) of a first protein operably linked to a heterologous second polypeptide (i.e., a polypeptide other than the first protein). Within the fusion protein, the term “operably linked” is intended to indicate that the first protein or segment thereof and the heterologous polypeptide are fused in-frame to each other. The heterologous polypeptide can be fused to the amino-terminus or the carboxyl-terminus of the first protein or segment.


A “polyamide” is a polymeric molecule with repeating units linked by amide bonds. Proteins are examples of naturally occurring polyamides. In some embodiments, a polyamide comprises a peptide nucleic acid (PNA).


A “peptide nucleic acid” (“PNA”) is a molecule in which one or more amino acid units in the PNA have an amide containing backbone, e.g., aminoethyl-glycine, similar to a peptide backbone, with a nucleic acid side chain in place of the amino acid side chain. Peptide nucleic acids (PNA) are known to hybridize complementary DNA and RNA with higher affinity than their oligonucleotide counterparts. This character of PNA not only makes them a stable hybrid with the nucleic acid side chains, but at the same time, the neutral backbone and hydrophobic side chains result in a hydrophobic unit within the polypeptide. The nucleic acid side chain includes, but is not limited to, a purine or a pyrimidine side chain such as adenine, cytosine, guanine, thymine and uracil. In one embodiment, the nucleic acid side chain includes a nucleoside analog as described herein.


In one embodiment, a site-specific HNF4α targeting moiety of the invention comprises a polyamide. Suitable polyamides for use in the agents and compositions of the invention are known in the art.


In one embodiment, a site-specific HNF4α targeting moiety of the invention comprises a polynucleotide. In some embodiments, the nucleotide sequence of the polynucleotide encodes an HNF4α gene or an HNF4α expression product. In some embodiments, the nucleotide sequence of the polynucleotide does not include an HNF4α coding sequence or an HNF4α expression product. For example, in some embodiments, a site-specific HNF4α targeting moiety of the invention comprises a polynucleotide that hybridizes to a target expression control region, e.g., a promoter or an anchor sequence. In some embodiments, the nucleotide sequence of the polynucleotide is a complement of a target anchor sequence, or has a sequence that is at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to a complement of the target sequence.
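For illustration, the identity criterion recited above may be checked with a short Python sketch. The function names are hypothetical. The sketch assumes that "complement" is read as the reverse complement written 5′ to 3′, and that the candidate polynucleotide and the complement have the same length so that no alignment step is needed; sequences of unequal length would require an alignment method not shown here.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return dna.translate(_COMPLEMENT)[::-1]


def percent_identity(a: str, b: str) -> float:
    """Percent of positions with identical residues in two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_identity_threshold(candidate: str, target: str, threshold: float = 80.0) -> bool:
    """True if the candidate is at least `threshold` percent identical to the
    reverse complement of the target sequence."""
    return percent_identity(candidate, reverse_complement(target)) >= threshold


if __name__ == "__main__":
    target = "ACGTTGCAAC"  # made-up sequence, not an HNF4α anchor sequence
    print(reverse_complement(target))
    print(meets_identity_threshold("ATTGCAACGA", target))
```

The 80% default corresponds to the lowest threshold listed above; any of the higher thresholds may be passed as the threshold argument.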


The polynucleotides of the invention may include deoxynucleotides, ribonucleotides, modified deoxynucleotides, modified ribonucleotides (e.g., chemical modifications, such as modifications that alter the backbone linkages, sugar molecules, and/or nucleic acid bases), and artificial nucleic acids. In some embodiments, the polynucleotide includes, but is not limited to, genomic DNA, cDNA, peptide nucleic acids (PNA) or peptide oligonucleotide conjugates, locked nucleic acids (LNA), bridged nucleic acids (BNA), polyamides, triplex forming oligonucleotides, modified DNA, antisense DNA oligonucleotides, tRNA, mRNA, rRNA, modified RNA, miRNA, gRNA, and siRNA or other RNA or DNA molecules.


In some embodiments, the polynucleotides of the invention have a length from about 2 to about 5000 nts, about 10 to about 100 nts, about 50 to about 150 nts, about 100 to about 200 nts, about 150 to about 250 nts, about 200 to about 300 nts, about 250 to about 350 nts, about 300 to about 500 nts, about 10 to about 1000 nts, about 50 to about 1000 nts, about 100 to about 1000 nts, about 1000 to about 2000 nts, about 2000 to about 3000 nts, about 3000 to about 4000 nts, about 4000 to about 5000 nts, or any range therebetween.


The polynucleotides of the invention may include nucleosides, e.g., purines or pyrimidines, e.g., adenine, cytosine, guanine, thymine and uracil. In some embodiments, the polynucleotides include one or more nucleoside analogs. The nucleoside analogs include, but are not limited to, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 4-methylbenzimidazole, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, dihydrouridine, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, 2,6-diaminopurine, 3-nitropyrrole, inosine, thiouridine, queuosine, wyosine, diaminopurine, isoguanine, isocytosine, diaminopyrimidine, 2,4-difluorotoluene, isoquinoline, pyrrolo[2,3-]pyridine, and any others that can base pair with a purine or a pyrimidine side chain.


In some embodiments, the site-specific HNF4α targeting moieties of the invention comprise a polynucleotide encoding a polypeptide that comprises a DNA-binding domain (DBD), or fragment thereof, of a zinc finger or TALE, that is engineered to specifically target an HNF4α expression control region to modulate expression of an HNF4α gene.


The design and preparation of such zinc finger polypeptides which specifically bind to a DNA target region of interest, such as an HNF4α expression control region, are well known in the art. For example, zinc finger (ZNF) proteins contain a DNA binding domain that specifically binds a triplet of nucleotides. Thus, to design and prepare the site-specific HNF4α targeting moieties of the invention, a modular assembly process may be used. Such a process includes combining separate zinc finger DNA binding domains, each of which can recognize a specific 3-base pair DNA sequence, to generate 3-, 4-, 5-, 6-, 7-, or 8-finger zinc finger polypeptides that recognize specific target sites ranging from 9 base pairs to 24 base pairs in length. Another suitable method may include 2-finger modules to generate ZNF polynucleotides with up to six individual zinc fingers. See, e.g., Shukla V K, et al., Nature. 459 (7245) 2009: 437-41; Dreier B, et al., JBC. 280 (42) 2005: 35588-97; Dreier B, et al., JBC 276 (31) 2001: 29466-78; Bae K H, et al., Nature Biotechnology. 21 (3) 2003: 275-80.
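The arithmetic of the modular assembly process described above (one finger per 3-base pair triplet, 3 to 8 fingers, 9 to 24 base pairs) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name split_into_triplets is hypothetical, the example target is made up, and the sketch does not address the orientation in which fingers are arranged relative to the target strand.

```python
from __future__ import annotations


def split_into_triplets(target: str) -> list[str]:
    """Split a DNA target site into consecutive 3-base pair triplets.

    Assumes one zinc finger per triplet and 3 to 8 fingers per polypeptide,
    i.e., a target site of 9 to 24 base pairs whose length is a multiple of 3.
    """
    target = target.upper()
    if len(target) % 3 != 0:
        raise ValueError("target length must be a multiple of 3")
    n_fingers = len(target) // 3
    if not 3 <= n_fingers <= 8:
        raise ValueError("target must span 9 to 24 base pairs (3 to 8 fingers)")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


if __name__ == "__main__":
    # Made-up 9 bp site; three fingers would be needed.
    print(split_into_triplets("GAGAGGGGC"))
```

Each resulting triplet may then be matched to a finger sequence that recognizes it.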


In some embodiments, a site-specific HNF4α targeting moiety of the invention comprises a polynucleotide encoding a polypeptide that comprises a DNA-binding domain (DBD), or fragment thereof, of a zinc finger, that is engineered to specifically target an HNF4α expression control region to modulate expression of an HNF4α gene. Exemplary amino acid sequences of zinc fingers that bind to a nucleotide triplet and are suitable for use in the present invention are provided in Table 1A below. (See, e.g., Gersbach et al., Synthetic Zinc Finger Proteins: The Advent of Targeted Gene Regulation and Genome Modification Technologies).













TABLE 1A

Amino Acid Sequence of Zinc Finger DNA Binding Domain (Finger) | Nucleotide Triplet | SEQ ID NO.
RKDALRG | TTG | 1
TTGALTE | CTT | 2
QRHHLVE | CTC | 3
QNSTLTE | CTA | 4
RNDALTE | CTG | 5
HKNALQN | ATT | 6
RRSACRR | ATC | 7
QKSSLIA | ATA | 8
RRDELNV | ATG | 9
TSGSLVR | GTT | 10
DPGALVR | GTC | 11
QSSSLVR | GTA | 12
RSDELVR | GTG | 13
RLRDIQF | TCT | 14
RSDERKR | TCC | 15
RSDHLTT | TCA | 16
RLRALDR | TCG | 17
TKNSLTE | CCT | 18
SKKHLAE | CCC | 19
TSHSLTE | CCA | 20
RNDTLTE | CCG | 21
THLDLIR | ACT | 22
DKKDLTR | ACC | 23
SPADLTR | ACA | 24
RTDTLRD | ACG | 25
TSGELVR | GCT | 26
DCRDLAR | GCC | 27
QSGDLRR | GCA | 28
RSDDLVR | GCG | 29
ARGNLRT | TAT | 30
SRGNLKS | TAC | 31
QASNLIS | TAA | 32
REDNLHT | TAG | 33
TSGNLTE | CAT | 34
SKKALTE | CAC | 35
QSGNLTE | CAA | 36
RADNLTE | CAG | 37
TTGNLTV | AAT | 38
DSGNLRV | AAC | 39
QRANLRA | AAA | 40
RKDNLKN | AAG | 41
TSGNLVR | GAT | 42
DPGNLVR | GAC | 43
QSSNLVR | GAA | 44
RSDNLVR | GAG | 45
APKALGW | TGC | 46
QAGHLAS | TGA | 47
RSDHLTT | TGG | 48
SRRTCRA | CGT | 49
HTGHLLE | CGC | 50
QSGHLTE | CGA | 51
RSDKLTE | CGG | 52
HRTTLTN | AGT | 53
ERSHLRE | AGC | 54
QLAHLRA | AGA | 55
RSDHLTN | AGG | 56
TSGHLVR | GGT | 57
DPGHLVR | GGC | 58
QRAHLER | GGA | 59
RSDKLVR | GGG | 60










A zinc finger DNA binding domain comprises an N-terminal region and a C-terminal region with the “fingers” that bind to the target DNA sequence in between. The N-terminal region generally is 7 amino acids in length. The C-terminal region is generally 6 amino acids in length. Thus, the N-terminal region generally comprises the amino acid sequence of X1X2X3X4X5X6X7. In some embodiments, the N-terminal region comprises the exemplary amino acid sequence of LEPGEKP (SEQ ID NO: 309). The C-terminal region generally comprises the amino acid sequence of X25X26X27X28X29X30. In certain embodiments, the C-terminal region comprises the exemplary amino acid sequence of TGKKTS (SEQ ID NO: 310). In each case, “X” can be any amino acid.


Each finger in the DNA binding domain is flanked by an N-terminal backbone located to the N-terminus of the finger and a C-terminal backbone located to the C-terminus of the finger. The N-terminal backbone of the finger generally is 11 amino acids long with two conserved cysteines (C) located at the 3rd and 6th positions. Thus, the N-terminal backbone of the finger generally comprises the amino acid sequence of X8X9CX10X11CX12X13X14X15X16. “X” can be any amino acid. The C-terminal backbone of the finger generally is 5 amino acids long with two conserved histidines (H) located at the 1st and 5th positions. Thus, the C-terminal backbone of the finger generally comprises the amino acid sequence of HX17X18X19H. “X” can be any amino acid. In some embodiments, the N-terminal backbone comprises the exemplary amino acid sequence of YKCPECGKSFS (SEQ ID No. 61) and the C-terminal backbone comprises the exemplary amino acid sequence of HQRTH (SEQ ID No. 62). Two “fingers” are linked through a linker. A linker generally is 5 amino acids in length and comprises the amino acid sequence of X20X21X22X23X24. “X” can be any amino acid. In certain embodiments, the linker comprises the exemplary amino acid sequence of TGEKP (SEQ ID No. 63). Thus, the zinc finger of an HNF4α site-specific disrupting agent has a structure as follows: (N-terminal backbone-finger-C-terminal backbone-linker)n and the zinc finger DNA binding domain of an HNF4α site-specific disrupting agent has a structure as follows: [N-terminal region (N-terminal backbone-finger-C-terminal backbone-linker)n-C-terminal region]. “n” represents the number of triplets of nucleotides to which the zinc finger DNA binding domain and, thus, the HNF4α site-specific disrupting agent binds.
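The domain structure set out above, [N-terminal region (N-terminal backbone-finger-C-terminal backbone-linker)n-C-terminal region], can be illustrated by a short Python sketch that concatenates the exemplary sequences named in the text. The function name assemble_zinc_finger_domain is hypothetical, and the 7-residue finger length is an assumption based on the finger sequences of Table 1A. The output is an illustration of the structure only and is not asserted to match any particular SEQ ID NO.

```python
# Exemplary sequences recited in the description.
N_TERMINAL_REGION = "LEPGEKP"        # SEQ ID NO: 309
N_TERMINAL_BACKBONE = "YKCPECGKSFS"  # SEQ ID No. 61
C_TERMINAL_BACKBONE = "HQRTH"        # SEQ ID No. 62
LINKER = "TGEKP"                     # SEQ ID No. 63
C_TERMINAL_REGION = "TGKKTS"         # SEQ ID NO: 310


def assemble_zinc_finger_domain(fingers):
    """Concatenate finger sequences into a zinc finger DNA binding domain.

    Structure: N-terminal region
               + n x (N-terminal backbone + finger + C-terminal backbone + linker)
               + C-terminal region
    """
    for finger in fingers:
        if len(finger) != 7:  # assumption: fingers are 7 residues, as in Table 1A
            raise ValueError(f"finger {finger!r} is not 7 residues long")
    units = [
        N_TERMINAL_BACKBONE + finger + C_TERMINAL_BACKBONE + LINKER
        for finger in fingers
    ]
    return N_TERMINAL_REGION + "".join(units) + C_TERMINAL_REGION


if __name__ == "__main__":
    # Three fingers from Table 1A (GAG, AGG, GGC), chosen only as an example.
    domain = assemble_zinc_finger_domain(["RSDNLVR", "RSDHLTN", "DPGHLVR"])
    print(domain, len(domain))
```

With the exemplary sequences, each repeated unit is 28 residues long (11 + 7 + 5 + 5), so a domain with n fingers is 13 + 28n residues in length.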


The “finger” amino acid sequences for four nucleotide triplets are unknown; however, if such a triplet is identified in a target area of interest, two “linker span sequences,” linker span 1 and linker span 2, are useful to circumvent the issue. Linker span 1 is used to skip one base pair if a “finger” amino acid sequence of a triplet is not available. Linker span 2 is used to skip 2 base pairs if a “finger” amino acid sequence of a triplet is not available. Linker span 1 is generally 12 amino acids long. Linker span 2 is generally 16 amino acids long. Thus, linker span 1 generally comprises the amino acid sequence of X31X32X33X34X35X36X37X38X39X40X41X42. Linker span 2 generally comprises the amino acid sequence of X43X44X45X46X47X48X49X50X51X52X53X54X55X56X57X58. In some embodiments, linker span 1 comprises the amino acid sequence of THPRAPIPKPFQ (SEQ ID NO: 311). In certain embodiments, linker span 2 comprises the amino acid sequence of TPNPHRRTDPSHKPFQ (SEQ ID NO: 312). When linker span 1 and/or linker span 2 is used, the finger-linker span 1/span 2-finger comprises the structure as follows: N-terminal backbone-finger-C-terminal backbone-linker span 1/span 2-N-terminal backbone-finger-C-terminal backbone-linker.
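The four triplets for which no finger sequence is listed can be identified by comparing the triplets of Table 1A against all 64 possible nucleotide triplets. The following Python sketch does so; the constant and function names are hypothetical, and the triplet list is transcribed from Table 1A.

```python
from itertools import product

# Nucleotide triplets listed in Table 1A (SEQ ID NOs: 1-60), in table order.
TABLE_1A_TRIPLETS = (
    "TTG CTT CTC CTA CTG ATT ATC ATA ATG GTT GTC GTA GTG TCT TCC TCA TCG "
    "CCT CCC CCA CCG ACT ACC ACA ACG GCT GCC GCA GCG TAT TAC TAA TAG "
    "CAT CAC CAA CAG AAT AAC AAA AAG GAT GAC GAA GAG TGC TGA TGG "
    "CGT CGC CGA CGG AGT AGC AGA AGG GGT GGC GGA GGG"
).split()


def triplets_without_finger(known_triplets):
    """Return, in sorted order, the triplets absent from the known set."""
    all_triplets = {"".join(bases) for bases in product("ACGT", repeat=3)}
    return sorted(all_triplets - set(known_triplets))


if __name__ == "__main__":
    print(triplets_without_finger(TABLE_1A_TRIPLETS))
    # -> ['TGT', 'TTA', 'TTC', 'TTT']
```

A target site containing one of these triplets is a candidate for the use of linker span 1 or linker span 2 as described above.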


Table 1B provides the amino acid sequence structure of exemplary zinc finger DNA binding domains of the disrupting agents comprising a zinc finger DNA binding domain described in the working examples below (see Table 6A). Table 10 also provides the nucleotide sequence of suitable target sequences in the expression control region, the amino acid sequences of exemplary zinc finger DNA binding domains suitable for use in the disrupting agents comprising a zinc finger DNA binding domain of the present invention as well as the amino acid sequence structure of the exemplary zinc finger DNA binding domains suitable for use in the disrupting agents comprising a zinc finger DNA binding domain of the present invention. The “X,” as used in Table 1B, represents any amino acid.


In some embodiments, a zinc finger DNA binding domain suitable for use in the disrupting agents of the invention comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% amino acid identity to the entire amino acid sequence of any one of the zinc finger DNA binding domains provided in any one of Tables 6A and 10.











TABLE 1B





Name of




Exemplary Zinc

SEQ


Finger DNA

ID


Binding Domain
Amino Acid Sequence Structure
NO:







ZF1
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18
313



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVR




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH




LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SR




RTCRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




RSDKLTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16RNDTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF2
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18
314



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH




X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDL




ARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP




GHLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




RNDALTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16DKKDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF3
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18
315



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVR




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNL




VRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK




DNLKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X61




QAGHLASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16TSGSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF4
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18
316



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAE




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGAL




TEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSG




NLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




QSGDLRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16TSHSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF5
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18
317



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQN




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT




LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR




AHLERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




RSDKLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16DPGHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF6
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18
318



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTE




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHL




REHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPG




ALVRHXI7X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




TSGHLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16RNDTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF7
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18
319



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLAS




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKL




TEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSG




NLTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16Q




SSNLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




THLDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF8
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18
320



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNV




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADL




TRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSD




ELVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16R




SDKLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16TTGALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF9
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18
321



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTE




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH




LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS




GNLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




QSGHLTEHX17X18X19HX43X44X45X46X47X48X49X50X51X52X53X54X55




X56X57X58X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19HX20




X21X22X23X24X25X26X27X28X29X30






ZF10
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18
322



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH




X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT




EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCR




DLARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




QSGDLRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16TKNSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF11
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18
323



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH




X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHL




AEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK




NSLTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




DKKDLTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16SKKHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF12
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18
324



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKN




HX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD




LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE




DNLHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




DPGHLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16QLAHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF13
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18
325



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH




X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV




RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPG




HLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




RNDALTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15




X16RSDHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF14
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18
326



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH




X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV




RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE




LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS




DKLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




QRAHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






ZF15
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18
327



X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH




X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV




RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE




LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS




DKLVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16




QSSNLVRHX17X18X1911X20X21X22X23X24X25X26X27X28X29X30









Similarly, the design and preparation of TALE polypeptides which specifically bind to a DNA target region of interest, such as an HNF4α expression control region, are well known in the art. For example, the DNA binding domain of a TALE contains a repeated, highly conserved 33-34 amino acid sequence with divergent 12th and 13th amino acids. These two positions, referred to as the Repeat Variable Diresidue (RVD), are highly variable and show a strong correlation with specific nucleotide recognition. This straightforward relationship between amino acid sequence and DNA recognition has allowed for the engineering of specific DNA-binding domains by selecting a combination of repeat segments containing the appropriate RVDs. See, e.g., Boch J Nature Biotechnology. 29 (2) 2011: 135-6; Boch J, et al., Science. 326 (5959) 2009: 1509-12; Moscou M J & Bogdanove A J Science. 326 (5959) 2009: 1501.
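
By way of illustration only, the relationship between RVDs and nucleotides can be sketched in a few lines of Python. The mapping used here (NI for A, HD for C, NN for G, NG for T) is the commonly cited RVD code from the references above and is an assumption of this sketch, not a limitation of the invention; the function name `rvds_for_target` and the four-nucleotide example target are hypothetical.

```python
# Commonly cited RVD code (assumed for illustration): each RVD
# preferentially recognizes one nucleotide.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

def rvds_for_target(target):
    """Return one RVD per nucleotide of the DNA target, 5' to 3'."""
    target = target.upper()
    unknown = set(target) - set(RVD_FOR_BASE)
    if unknown:
        raise ValueError(f"unsupported nucleotide(s): {sorted(unknown)}")
    return [RVD_FOR_BASE[base] for base in target]

# Hypothetical 4-nt target, far shorter than a practical TALE site.
print(" ".join(rvds_for_target("TGCA")))
```

A practical design would cover a longer target and may use alternative RVDs for a given nucleotide; the sketch only shows how a repeat array follows from the chosen target sequence.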


In some embodiments, the site-specific HNF4α targeting moieties of the invention comprising a polynucleotide comprise a guide RNA (or gRNA) or nucleic acid encoding a guide RNA. A gRNA is a short synthetic RNA molecule comprising a “scaffold” sequence necessary for, e.g., directing an effector to an HNF4α expression control element, and which may, e.g., include a site-specific sequence of about 20 nucleotides targeting a genomic target sequence comprising the HNF4α expression control element.


Generally, guide RNA sequences are designed to have a length of between about 17 and about 24 nucleotides (e.g., 19, 20, or 21 nucleotides) and are complementary to the target sequence. Custom gRNA generators and algorithms are available commercially for use in the design of effective guide RNAs. Gene editing has also been achieved using a chimeric “single guide RNA” (“sgRNA”), an engineered (synthetic) single RNA molecule that mimics a naturally occurring crRNA-tracrRNA complex and contains both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing). Chemically modified sgRNAs have also been demonstrated to be effective in genome editing; see, for example, Hendel et al. (2015) Nature Biotechnol., 985-991.


Exemplary site-specific HNF4α promoter 1 targeting moieties are provided in Table 2, below. In some embodiments, the polynucleotide comprises a nucleotide sequence at least 75%, at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identical to the entire nucleotide sequence of any one of the nucleotide sequences in Table 2.


Exemplary site-specific HNF4α promoter 2 targeting moieties are provided in Table 3, below. In some embodiments, the polynucleotide comprises a nucleotide sequence at least 75%, at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identical to the entire nucleotide sequence of any one of the nucleotide sequences in Table 3.


Exemplary site-specific HNF4α promoter targeting moieties are also provided in Table 9, below. In some embodiments, the polynucleotide comprises a nucleotide sequence at least 75%, at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% identical to the entire nucleotide sequence of any one of the nucleotide sequences in Table 9.
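
As a minimal sketch of how a percent identity over the entire length of a reference sequence might be computed, the following Python function compares two sequences position by position. It assumes an ungapped comparison of equal-length sequences, which is a simplification; sequence comparisons in practice typically rely on an alignment algorithm. The function name `percent_identity` and the example sequences are hypothetical and are not taken from Tables 2, 3, or 9.

```python
def percent_identity(query, reference):
    """Percent of matching positions over the entire reference length.

    Assumes an ungapped, position-by-position comparison of two
    equal-length sequences (a simplification for illustration).
    """
    query, reference = query.upper(), reference.upper()
    if not reference or len(query) != len(reference):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)

# Hypothetical 20-nt sequences that differ at two positions.
reference = "ACGTACGTACGTACGTACGT"
variant = "ACGTACGTACTTACGTACGA"
print(f"{percent_identity(variant, reference):.0f}% identical")
```

Under these assumptions, a 20-nucleotide sequence with two mismatches is 90% identical to the reference and would satisfy, for example, an "at least 85%" threshold but not an "at least 95%" threshold.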


It will be understood that, although the sequences in Tables 2, 3, 4, and 9 are described as modified (or unmodified), the nucleic acid molecule encompassed by the invention, e.g., a site-specific disrupting agent, may comprise any one of the sequences set forth in any one of Tables 2, 3, 4, or 9 that is un-modified or modified differently than described therein. It will also be understood that although some of the sequences in Table 9 have “Ts”, when used as an RNA molecule, such as a guide RNA, in the site-specific targeting moieties of the invention, the “Ts” may be replaced with “Us.”


In some embodiments, a site-specific HNF4α targeting moiety comprising a polynucleotide, e.g., gRNA, comprises a nucleotide sequence complementary to an anchor sequence. In one embodiment, the anchor sequence comprises a CTCF-binding motif or consensus sequence:


(SEQ ID NO: 64)
N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C),


where N is any nucleotide. A CTCF-binding motif or consensus sequence may also be in the opposite orientation, e.g.,


(SEQ ID NO: 65)
(G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N.







In some embodiments, the nucleic acid sequence comprises a sequence complementary to a CTCF-binding motif or consensus sequence.
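
For illustration, the degenerate consensus of SEQ ID NO: 64 can be converted into a regular expression and used to scan a DNA sequence for candidate anchor sequences. The sketch below is one possible approach under stated assumptions: the helper names `motif_to_regex` and `find_motif` and the example sequence are hypothetical, and only the strand as written is scanned.

```python
import re

# SEQ ID NO: 64 written position by position; "N" is any nucleotide.
CTCF_MOTIF = ("N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)"
              "GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C)")

def motif_to_regex(motif):
    """Convert '(A/B)' alternatives and 'N' into a regular expression."""
    pattern = re.sub(r"\(([ACGT/]+)\)",
                     lambda m: "[" + m.group(1).replace("/", "") + "]",
                     motif)
    return pattern.replace("N", "[ACGT]")

def find_motif(sequence, motif=CTCF_MOTIF):
    """Return 0-based start positions of motif matches (overlaps allowed)."""
    regex = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in regex.finditer(sequence.upper())]

# Hypothetical sequence with one motif instance placed at offset 3.
example = "AAA" + "ATAGCCACCAGGGGGCGCCG" + "TTT"
print(find_motif(example))
```

Passing the opposite-orientation consensus (SEQ ID NO: 65) as the `motif` argument would scan for the motif in the other orientation in the same way.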


In some embodiments, the polynucleotide comprises a nucleotide sequence at least 75%, at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% complementary to an anchor sequence.


In some embodiments, the polynucleotide comprises a nucleotide sequence at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% complementary to a CTCF-binding motif or consensus sequence. In some embodiments, the polynucleotide is selected from the group consisting of a gRNA and a sequence that is complementary, or at least 80%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% complementary, to an anchor sequence.


In some embodiments, a site-specific HNF4α targeting moiety comprising a polynucleotide of the invention is an RNAi molecule. RNAi molecules comprise RNA or RNA-like structures typically containing 15-50 base pairs (such as about 18-25 base pairs) and having a nucleobase sequence identical (complementary) or nearly identical (substantially complementary) to a coding sequence in an expressed target gene within the cell. RNAi molecules include, but are not limited to: short interfering RNAs (siRNAs), double-strand RNAs (dsRNA), micro RNAs (miRNAs), short hairpin RNAs (shRNA), meroduplexes, and dicer substrates (U.S. Pat. Nos. 8,084,599, 8,349,809, and 8,513,207). In one embodiment, the invention includes a composition to inhibit expression of a gene encoding a polypeptide described herein, e.g., a conjunction nucleating molecule.


RNAi molecules comprise a sequence substantially complementary, or fully complementary, to all or a fragment of a target gene. RNAi molecules may complement sequences at the boundary between introns and exons to prevent the maturation of newly-generated nuclear RNA transcripts of specific genes into mRNA for translation. RNAi molecules complementary to specific genes can hybridize with the mRNA for that gene and prevent its translation. The antisense molecule can be DNA, RNA, or a derivative or hybrid thereof. Examples of such derivative molecules include, but are not limited to, peptide nucleic acid (PNA) and phosphorothioate-based molecules such as deoxyribonucleic guanidine (DNG) or ribonucleic guanidine (RNG).


RNAi molecules can be provided to the cell as “ready-to-use” RNA synthesized in vitro or as an antisense gene transfected into cells which will yield RNAi molecules upon transcription. Hybridization with mRNA results in degradation of the hybridized molecule by RNase H and/or inhibition of the formation of translation complexes. Both result in a failure to produce the product of the original gene.


The length of the RNAi molecule that hybridizes to the transcript of interest should be around 10 nucleotides, between about 15 and 30 nucleotides, or about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more nucleotides. The degree of identity of the antisense sequence to the targeted transcript should be at least 75%, at least 80%, at least 85%, at least 90%, or at least 95%.


RNAi molecules may also comprise overhangs, i.e., typically unpaired, overhanging nucleotides which are not directly involved in the double helical structure normally formed by the core sequences of the herein defined pair of sense strand and antisense strand. RNAi molecules may contain 3′ and/or 5′ overhangs of about 1-5 bases independently on each of the sense strands and antisense strands. In one embodiment, both the sense strand and the antisense strand contain 3′ and 5′ overhangs. In one embodiment, one or more of the 3′ overhang nucleotides of one strand base pairs with one or more 5′ overhang nucleotides of the other strand. In another embodiment, the one or more 3′ overhang nucleotides of one strand do not base pair with the one or more 5′ overhang nucleotides of the other strand. The sense and antisense strands of an RNAi molecule may or may not contain the same number of nucleotide bases. The antisense and sense strands may form a duplex wherein the 5′ end only has a blunt end, the 3′ end only has a blunt end, both the 5′ and 3′ ends are blunt ended, or neither the 5′ end nor the 3′ end is blunt ended. In another embodiment, one or more of the nucleotides in the overhang contains a thiophosphate, phosphorothioate, deoxynucleotide, or inverted (3′ to 3′ linked) nucleotide, or is a modified ribonucleotide or deoxynucleotide.


Small interfering RNA (siRNA) molecules comprise a nucleotide sequence that is identical to about 15 to about 25 contiguous nucleotides of the target mRNA. In some embodiments, the siRNA sequence commences with the dinucleotide AA, comprises a GC-content of about 30-70% (about 50-60%, about 40-60%, or about 45%-55%), and does not have a high percentage identity to any nucleotide sequence other than the target in the genome of the mammal in which it is to be introduced, for example as determined by standard BLAST search.
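
The sequence criteria above (a window of the target mRNA that commences with AA and has a GC content of about 30-70%) can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the function names, the default window length of 21 nucleotides, and the example transcript fragment are hypothetical, and the specificity check against the genome (e.g., by BLAST search) is not implemented here.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def candidate_sirna_targets(mrna, length=21, gc_min=30.0, gc_max=70.0):
    """Yield (position, site) for windows of the mRNA that start with AA
    and whose GC content lies within [gc_min, gc_max] percent."""
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input
    for start in range(len(mrna) - length + 1):
        site = mrna[start:start + length]
        if site.startswith("AA") and gc_min <= gc_percent(site) <= gc_max:
            yield start, site

# Hypothetical transcript fragment; not an HNF4α sequence.
fragment = "GGAACGUGCAUCGGAUCCAUGCA"
for position, site in candidate_sirna_targets(fragment):
    print(position, site, f"{gc_percent(site):.1f}% GC")
```

Candidates passing this filter would still need to be screened for unintended identity to other sequences in the genome of the mammal into which the siRNA is to be introduced.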


siRNAs and shRNAs resemble intermediates in the processing pathway of the endogenous microRNA (miRNA) genes (Bartel, Cell 116:281-297, 2004). In some embodiments, siRNAs can function as miRNAs and vice versa (Zeng et al., Mol Cell 9: 1327-1333, 2002; Doench et al., Genes Dev 17:438-442, 2003). MicroRNAs, like siRNAs, use RISC to downregulate target genes, but unlike siRNAs, most animal miRNAs do not cleave the mRNA. Instead, miRNAs reduce protein output through translational suppression or polyA removal and mRNA degradation (Wu et al., Proc Natl Acad Sci USA 103:4034-4039, 2006). Known miRNA binding sites are within mRNA 3′ UTRs; miRNAs seem to target sites with near-perfect complementarity to nucleotides 2-8 from the miRNA's 5′ end (Rajewsky, Nat Genet 38 Suppl: S8-13, 2006; Lim et al, Nature 433:769-773, 2005). This region is known as the seed region. Because siRNAs and miRNAs are interchangeable, exogenous siRNAs downregulate mRNAs with seed complementarity to the siRNA (Birmingham et al., Nat Methods 3: 199-204, 2006). Multiple target sites within a 3′ UTR give stronger downregulation (Doench et al., Genes Dev 17:438-442, 2003).
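
As a rough illustration of seed complementarity, the following Python sketch locates sites in a 3′ UTR that are perfectly complementary to nucleotides 2-8 of a guide strand. It assumes perfect Watson-Crick pairing across the seven-nucleotide seed; the function name `seed_sites`, the guide sequence, and the UTR sequence are all hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")

def seed_sites(guide, utr):
    """Return 0-based UTR positions complementary to guide nucleotides 2-8."""
    guide = guide.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = guide[1:8]                        # nucleotides 2-8 (7 nt)
    site = seed.translate(COMPLEMENT)[::-1]  # reverse complement of the seed
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions

# Hypothetical guide strand and 3' UTR carrying two seed-matched sites.
guide = "UACGAUCGGAUUCCAGUACGA"
utr = "GGG" + "CGAUCGU" + "AAACC" + "CGAUCGU" + "UU"
print(seed_sites(guide, utr))
```

A scan of this kind could be used both to look for intended target sites and to flag transcripts that might be downregulated unintentionally through seed complementarity.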


Lists of known miRNA sequences can be found in databases maintained by research organizations, such as Wellcome Trust Sanger Institute, Penn Center for Bioinformatics, Memorial Sloan Kettering Cancer Center, and European Molecular Biology Laboratory, among others. Known effective siRNA sequences and cognate binding sites are also well represented in the relevant literature. RNAi molecules are readily designed and produced by technologies known in the art. In addition, there are computational tools that increase the chance of finding effective and specific sequence motifs (Pei et al. 2006, Reynolds et al. 2004, Khvorova et al. 2003, Schwarz et al. 2003, Ui-Tei et al. 2004, Heale et al. 2005, Chalk et al. 2004, Amarzguioui et al. 2004).


An RNAi molecule modulates expression of RNA encoded by a gene. Because multiple genes can share some degree of sequence homology with each other, in some embodiments, the RNAi molecule can be designed to target a class of genes with sufficient sequence homology. In some embodiments, the RNAi molecule can contain a sequence that has complementarity to sequences that are shared amongst different gene targets or are unique for a specific gene target. In some embodiments, the RNAi molecule can be designed to target conserved regions of an RNA sequence having homology between several genes thereby targeting several genes in a gene family (e.g., different gene isoforms, splice variants, mutant genes, etc.). In some embodiments, the RNAi molecule can be designed to target a sequence that is unique to a specific RNA sequence of a single gene.


In some embodiments, the RNAi molecule targets a sequence in a conjunction nucleating molecule, e.g., CTCF, cohesin, USF1, YY1, TATA-box binding protein associated factor 3 (TAF3), ZNF143, or another polypeptide that promotes the formation of an anchor sequence-mediated conjunction, or an epigenetic modifying agent, e.g., an enzyme involved in post-translational modifications including, but not limited to, DNA methylases (e.g., DNMT3a, DNMT3b, DNMTL), DNA demethylation (e.g., the TET family enzymes catalyze oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives), histone methyltransferases, histone deacetylase (e.g., HDAC1, HDAC2, HDAC3), sirtuin 1, 2, 3, 4, 5, 6, or 7, lysine-specific histone demethylase 1 (LSD1), histone-lysine-N-methyltransferase (Setdb1), euchromatic histone-lysine N-methyltransferase 2 (G9a), histone-lysine N-methyltransferase (SUV39H1), enhancer of zeste homolog 2 (EZH2), viral lysine methyltransferase (vSET), histone methyltransferase (SET2), protein-lysine N-methyltransferase (SMYD2), and others. In one embodiment, the RNAi molecule targets a protein deacetylase, e.g., sirtuin 1, 2, 3, 4, 5, 6, or 7. In one embodiment, the invention includes a composition comprising an RNAi that targets a conjunction nucleating molecule, e.g., CTCF.


In some embodiments, the site-specific HNF4α targeting moiety comprises a peptide or protein moiety. In some embodiments, a site-specific disrupting agent comprises a fusion protein. In some embodiments, an effector is a peptide or protein moiety. The peptide or protein moieties may include, but are not limited to, a peptide ligand, antibody fragment, or targeting aptamer that binds a receptor such as an extracellular receptor, neuropeptide, hormone peptide, peptide drug, toxic peptide, viral or microbial peptide, synthetic peptide, and agonist or antagonist peptide.


Exemplary peptides or proteins include a DNA-binding protein, a CRISPR component protein, a conjunction nucleating molecule, a dominant negative conjunction nucleating molecule, an epigenetic modifying agent, or any combination thereof. In some embodiments, the peptide comprises a nuclease, a physical blocker, an epigenetic recruiter, and an epigenetic CpG modifier, and fragments and combinations of any of the foregoing. In some embodiments, the peptide comprises a DNA-binding domain of a protein, such as a helix-turn-helix motif, a leucine zipper, a Zn-finger, a TATA box binding protein, or a transcription factor.


Peptides or proteins may be linear or branched. The peptide or protein moiety may have a length from about 5 to about 200 amino acids, about 15 to about 150 amino acids, about 20 to about 125 amino acids, about 25 to about 100 amino acids, 20-70 amino acids, 20-80 amino acids, 20-90 amino acids, 30-100 amino acids, 30-60 amino acids, 30-80 amino acids, 35-85 amino acids, 40-100 amino acids, or 50-125 amino acids or any range therebetween.


As indicated above, in some embodiments, the site-specific HNF4α targeting moieties of the invention comprise a fusion protein.


In some embodiments, the fusion proteins of the invention include a site-specific HNF4α targeting moiety which targets an HNF4α expression control region and an effector molecule. In other embodiments, a fusion protein of the invention comprises an effector molecule. Exemplary effector molecules are described below and in some embodiments include, for example, nucleases, physical blockers, epigenetic recruiters, e.g., a transcriptional enhancer or a transcriptional repressor, and epigenetic CpG modifiers, e.g., a DNA methylase, a DNA demethylase, a histone modifying agent, or a histone deacetylase, and combinations of any of the foregoing.


For example, a site-specific targeting moiety may comprise a gRNA and an effector, such as a nuclease, e.g., a Cas9, e.g., a wild type Cas9, a nickase Cas9 (e.g., Cas9 D10A), a dead Cas9 (dCas9), eSpCas9, Cpf1, C2C1, or C2C3, or a nucleic acid encoding such a nuclease. The choice of nuclease and gRNA(s) is determined by whether the targeted mutation is a deletion, substitution, or addition of nucleotides, e.g., a deletion, substitution, or addition of nucleotides to a targeted sequence. Fusions of a catalytically inactive endonuclease e.g., a dead Cas9 (dCas9, e.g., D10A; H840A) tethered with all or a portion of (e.g., biologically active portion of) an (one or more) effector domain create chimeric proteins that can be linked to the polypeptide to guide the composition to specific DNA sites by one or more RNA sequences (e.g., DNA recognition elements including, but not restricted to zinc finger arrays, sgRNA, TAL arrays, peptide nucleic acids described herein) to modulate activity and/or expression of one or more target nucleic acids sequences (e.g., to methylate or demethylate a DNA sequence).


In one embodiment, a fusion protein of the invention may comprise an effector molecule comprising, for example, a CRISPR associated protein (Cas) polypeptide, or fragment thereof, (e.g., a Cas9 polypeptide, or fragment thereof) and an epigenetic recruiter or an epigenetic CpG modifier.


In one embodiment, a suitable Cas polypeptide is an enzymatically inactive Cas polypeptide, e.g., a “dead Cas polypeptide” or “dCas” polypeptide.


Exemplary Cas polypeptides that are adaptable to the methods and compositions described herein are described below. Using methods known in the art, a Cas polypeptide can be fused to any of a variety of agents and/or molecules as described herein; such resulting fusion molecules can be useful in various disclosed methods.


In one aspect, the invention includes a composition comprising a protein comprising a domain, e.g., an effector, that acts on DNA (e.g., a nuclease domain, e.g., a Cas9 domain, e.g., a dCas9 domain; a DNA methyltransferase, a demethylase, a deaminase), in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to a site-specific target sequence, wherein the composition is effective to alter, in a human cell, the expression of a target gene. In some embodiments, the enzyme domain is a Cas9 or a dCas9. In some embodiments, the protein comprises two enzyme domains, e.g., a dCas9 and a methylase or demethylase domain.


In one aspect, the invention includes a composition comprising a protein comprising a domain, e.g., an effector, that comprises a transcriptional control element (e.g., a nuclease domain, e.g., a Cas9 domain, e.g., a dCas9 domain; a transcriptional enhancer; a transcriptional repressor), in combination with at least one guide RNA (gRNA) or antisense DNA oligonucleotide that targets the protein to a site-specific target sequence, wherein the composition is effective to alter, in a human cell, the expression of a target gene. In some embodiments, the enzyme domain is a Cas9 or a dCas9. In some embodiments, the protein comprises two enzyme domains, e.g., a dCas9 and a transcriptional enhancer or transcriptional repressor domain.


As used herein, a “biologically active portion of an effector domain” is a portion that maintains the function (e.g. completely, partially, minimally) of an effector domain (e.g., a “minimal” or “core” domain).


The chimeric proteins described herein may also comprise a linker, e.g., an amino acid linker. In some aspects, a linker comprises 2 or more amino acids, e.g., one or more GS sequences. In some aspects, fusion of Cas9 (e.g., dCas9) with two or more effector domains (e.g., of a DNA methylase or enzyme with a role in DNA demethylation or protein acetyl transferase or deacetylase) comprises one or more interspersed linkers (e.g., GS linkers) between the domains. In some aspects, dCas9 is fused with 2-5 effector domains with interspersed linkers.


In some embodiments, a site-specific HNF4α targeting moiety comprises a conjunction nucleating molecule, a nucleic acid encoding a conjunction nucleating molecule, or a combination thereof. In some embodiments, an anchor sequence-mediated conjunction is mediated by a first conjunction nucleating molecule bound to the first anchor sequence, a second conjunction nucleating molecule bound to the noncontiguous second anchor sequence, and an association between the first and second conjunction nucleating molecules. In some embodiments, a conjunction nucleating molecule may disrupt, e.g., by competitive binding, the binding of an endogenous conjunction nucleating molecule to its binding site.


The conjunction nucleating molecule may be, e.g., CTCF, cohesin, USF1, YY1, TATA-box binding protein associated factor 3 (TAF3), ZNF143 binding motif, or another polypeptide that promotes the formation of an anchor sequence-mediated conjunction. The conjunction nucleating molecule may be an endogenous polypeptide or other protein, such as a transcription factor, e.g., autoimmune regulator (AIRE), another factor, e.g., X-inactivation specific transcript (XIST), or an engineered polypeptide that is engineered to recognize a specific DNA sequence of interest, e.g., having a zinc finger, leucine zipper or bHLH domain for sequence recognition. The conjunction nucleating molecule may modulate DNA interactions within or around the anchor sequence-mediated conjunction. For example, the conjunction nucleating molecule can recruit other factors to the anchor sequence that alters an anchor sequence-mediated conjunction formation or disruption.


The conjunction nucleating molecule may also have a dimerization domain for homo- or heterodimerization. One or more conjunction nucleating molecules, e.g., endogenous and engineered, may interact to form the anchor sequence-mediated conjunction. In some embodiments, the conjunction nucleating molecule is engineered to further include a stabilization domain, e.g., cohesin interaction domain, to stabilize the anchor sequence-mediated conjunction. In some embodiments, the conjunction nucleating molecule is engineered to bind a target sequence, e.g., target sequence binding affinity is modulated. In some embodiments, the conjunction nucleating molecule is selected or engineered with a selected binding affinity for an anchor sequence within the anchor sequence-mediated conjunction. Conjunction nucleating molecules and their corresponding anchor sequences may be identified through the use of cells that harbor inactivating mutations in CTCF and Chromosome Conformation Capture or 3C-based methods, e.g., Hi-C or high-throughput sequencing, to examine topologically associated domains, e.g., topological interactions between distal DNA regions or loci, in the absence of CTCF. Long-range DNA interactions may also be identified. Additional analyses may include ChIA-PET analysis using a bait, such as Cohesin, YY1 or USF1, ZNF143 binding motif, and MS to identify complexes that are associated with the bait.


B. Effector Molecules


Effector molecules for use in the compositions and methods of the invention include those that modulate a biological activity, for example increasing or decreasing enzymatic activity, gene expression, cell signalling, and cellular or organ function. Preferred effector molecules of the invention are nucleases, physical blockers, epigenetic recruiters, e.g., a transcriptional enhancer or a transcriptional repressor, and epigenetic CpG modifiers, e.g., a DNA methylase, a DNA demethylase, a histone modifying agent, or a histone deacetylase, and combinations of any of the foregoing.


Additional effector activities of the effector molecules of the invention may also include binding regulatory proteins to modulate activity of the regulator, such as transcription or translation. Effector molecules also may include activator or inhibitor (or “negative effector”) functions as described herein. For example, the effector molecule may inhibit substrate binding to a receptor and inhibit its activation, e.g., naltrexone and naloxone bind opioid receptors without activating them and block the receptors' ability to bind opioids. Effector molecules may also modulate protein stability/degradation and/or transcript stability/degradation. For example, proteins may be targeted for degradation by attaching the polypeptide co-factor, ubiquitin, onto the proteins to mark them for degradation. In another example, an effector molecule inhibits enzymatic activity by blocking the enzyme's active site, e.g., methotrexate is a structural analog of tetrahydrofolate, a coenzyme for the enzyme dihydrofolate reductase, that binds to dihydrofolate reductase 1000-fold more tightly than the natural substrate and inhibits nucleotide base synthesis.


In some embodiments, the effector molecule is a chemical, e.g., a chemical that modulates a cytosine (C) or an adenine (A) (e.g., Na bisulfite, ammonium bisulfite). In some embodiments, the effector molecule has enzymatic activity (methyl transferase, demethylase, nuclease (e.g., Cas9), a deaminase). In some embodiments, the effector molecule sterically hinders formation of an anchor sequence-mediated conjunction or binding of an RNA polymerase to a promoter.


The effector molecule with effector activity may be any one of the small molecules, peptides, fusion proteins, nucleic acids, nanoparticle, aptamers, or pharmacoagents with poor PK/PD described herein.


In some embodiments, the effector molecule is an inhibitor or “negative effector molecule”. In the context of a negative effector molecule that modulates formation of an anchor sequence-mediated conjunction, in some embodiments, the negative effector molecule is characterized in that dimerization of an endogenous nucleating polypeptide is reduced when the negative effector molecule is present as compared with when it is absent. For example, in some embodiments, the negative effector molecule is or comprises a variant of the endogenous nucleating polypeptide's dimerization domain, or a dimerizing portion thereof.


For example, in certain embodiments, an anchor sequence-mediated conjunction is altered (e.g., disrupted) by use of a dominant negative effector, e.g., a protein that recognizes and binds an anchor sequence, (e.g., a CTCF binding motif), but with an inactive (e.g., mutated) dimerization domain, e.g., a dimerization domain that is unable to form a functional anchor sequence-mediated conjunction. For example, the Zinc Finger domain of CTCF can be altered so that it binds a specific anchor sequence (by adding zinc fingers that recognize flanking nucleic acids), while the homo-dimerization domain is altered to prevent the interaction between the engineered CTCF and endogenous forms of CTCF.


In some embodiments, the effector molecule comprises a synthetic conjunction nucleating molecule with a selected binding affinity for an anchor sequence within a target anchor sequence-mediated conjunction (the binding affinity may be at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or higher or lower than the affinity of an endogenous conjunction nucleating molecule that associates with the target anchor sequence). The synthetic conjunction nucleating molecule may have between 30-90%, 30-85%, 30-80%, 30-70%, 50-80%, or 50-90% amino acid sequence identity to the endogenous conjunction nucleating molecule. The conjunction nucleating molecule may disrupt, such as through competitive binding, the binding of an endogenous conjunction nucleating molecule to its anchor sequence. In some embodiments, the conjunction nucleating molecule is engineered to bind a novel anchor sequence within the anchor sequence-mediated conjunction.


In some embodiments, a dominant negative effector molecule has a domain that recognizes specific DNA sequences (e.g., an anchor sequence, a CTCF anchor sequence, flanked by sequences that confer sequence specificity), and a second domain that provides a steric presence in the vicinity of the anchoring sequence. The second domain may include a dominant negative conjunction nucleating molecule or fragment thereof, a polypeptide that interferes with conjunction nucleating molecule sequence recognition (e.g., the amino acid backbone of a peptide/nucleic acid or PNA), a nucleic acid sequence ligated to a small molecule that imparts steric interference, or any other combination of DNA recognition elements and a steric blocker.


In some embodiments, the effector molecule is an epigenetic modifying agent. Epigenetic modifying agents useful in the methods and compositions described herein include agents that affect, e.g., DNA methylation, histone acetylation, and RNA-associated silencing. In some embodiments, the effectors sequence-specifically target an epigenetic enzyme (e.g., an enzyme that generates or removes epigenetic marks, e.g., acetylation and/or methylation). Exemplary epigenetic effectors may target an expression control region comprising, e.g., a transcriptional control element or an anchor sequence, by a site-specific disrupting agent comprising a site-specific targeting moiety.


In some embodiments, an effector molecule comprises one or more components of a gene editing system. Components of gene editing systems may be used in a variety of contexts including but not limited to gene editing. For example, such components may be used to target agents that physically modify, genetically modify, and/or epigenetically modify HNF4α sequences.


Exemplary gene editing systems include the clustered regularly interspaced short palindromic repeat (CRISPR) system, zinc finger nucleases (ZFNs), and Transcription Activator-Like Effector-based Nucleases (TALEN). ZFNs, TALENs, and CRISPR-based methods are described, e.g., in Gaj et al. Trends Biotechnol. 31.7(2013):397-405; CRISPR methods of gene editing are described, e.g., in Guan et al, Application of CRISPR-Cas system in gene therapy: Pre-clinical progress in animal model. DNA Repair 2016 Jul. 30 [Epub ahead of print]; Zheng et al, Precise gene deletion and replacement using the CRISPR/Cas9 system in human cells. BioTechniques, Vol. 57, No. 3, September 2014, pp. 115-124.


CRISPR systems are adaptive defense systems originally discovered in bacteria and archaea. CRISPR systems use RNA-guided nucleases termed CRISPR-associated or “Cas” endonucleases (e.g., Cas9 or Cpf1) to cleave foreign DNA. In a typical CRISPR/Cas system, an endonuclease is directed to a target nucleotide sequence (e.g., a site in the genome that is to be sequence-edited) by sequence-specific, non-coding “guide RNAs” that target single- or double-stranded DNA sequences. Three classes (I-III) of CRISPR systems have been identified. The class II CRISPR systems use a single Cas endonuclease (rather than multiple Cas proteins). One class II CRISPR system includes a type II Cas endonuclease such as Cas9, a CRISPR RNA (“crRNA”), and a trans-activating crRNA (“tracrRNA”). The crRNA contains a “guide RNA”, typically about 20-nucleotide RNA sequence that corresponds to a target DNA sequence. The crRNA also contains a region that binds to the tracrRNA to form a partially double-stranded structure which is cleaved by RNase III, resulting in a crRNA/tracrRNA hybrid. The crRNA/tracrRNA hybrid then directs the Cas9 endonuclease to recognize and cleave the target DNA sequence. The target DNA sequence must generally be adjacent to a “protospacer adjacent motif” (“PAM”) that is specific for a given Cas endonuclease; however, PAM sequences appear throughout a given genome. CRISPR endonucleases identified from various prokaryotic species have unique PAM sequence requirements; examples of PAM sequences include 5′-NGG (Streptococcus pyogenes), 5′-NNAGAA (Streptococcus thermophilus CRISPR1), 5′-NGGNG (Streptococcus thermophilus CRISPR3), and 5′-NNNGATT (Neisseria meningitidis). Some endonucleases, e.g., Cas9 endonucleases, are associated with G-rich PAM sites, e.g., 5′-NGG, and perform blunt-end cleaving of the target DNA at a location 3 nucleotides upstream from (5′ from) the PAM site.
Another class II CRISPR system includes the type V endonuclease Cpf1, which is smaller than Cas9; examples include AsCpf1 (from Acidaminococcus sp.) and LbCpf1 (from Lachnospiraceae sp.). Cpf 1-associated CRISPR arrays are processed into mature crRNAs without the requirement of a tracrRNA; in other words a Cpf1 system requires only the Cpf1 nuclease and a crRNA to cleave the target DNA sequence. Cpf1 endonucleases, are associated with T-rich PAM sites, e.g., 5′-TTN. Cpf1 can also recognize a 5′-CTA PAM motif. Cpf1 cleaves the target DNA by introducing an offset or staggered double-strand break with a 4- or 5-nucleotide 5′ overhang, for example, cleaving a target DNA with a 5-nucleotide offset or staggered cut located 18 nucleotides downstream from (3′ from) from the PAM site on the coding strand and 23 nucleotides downstream from the PAM site on the complimentary strand; the 5-nucleotide overhang that results from such offset cleavage allows more precise genome editing by DNA insertion by homologous recombination than by insertion at blunt-end cleaved DNA. See, e.g., Zetsche et al. (2015) Cell, 163:759-771.
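
By way of illustration only, the PAM and cleavage-position relationships described above may be sketched in Python as follows. The function names, the 20-nucleotide default protospacer length, and the 0-based indexing convention are assumptions made for this sketch and are not part of the disclosure; only the given strand is scanned, and only the 5′-NGG and 5′-TTN motifs named above are considered.

```python
import re


def find_cas9_ngg_sites(seq, guide_len=20):
    """Scan one strand for 5'-NGG PAMs preceded by a full-length protospacer.

    Hypothetical helper. cut_index is the 0-based position of the first base
    3' of the blunt cut, i.e. 3 nucleotides upstream of the PAM.
    """
    seq = seq.upper()
    sites = []
    # Zero-width lookahead so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        pam_start = match.start()
        proto_start = pam_start - guide_len
        if proto_start < 0:
            continue  # not enough sequence upstream for a protospacer
        sites.append({
            "protospacer": seq[proto_start:pam_start],
            "pam": seq[pam_start:pam_start + 3],
            "cut_index": pam_start - 3,
        })
    return sites


def find_cpf1_ttn_sites(seq):
    """Scan one strand for 5'-TTN PAMs and report the staggered cut positions.

    Hypothetical helper. Cuts are placed 18 nt (this strand) and 23 nt
    (complementary strand) downstream of the PAM, as in the example above.
    """
    seq = seq.upper()
    sites = []
    for match in re.finditer(r"(?=TT[ACGT])", seq):
        pam_end = match.start() + 3
        cut_top, cut_bottom = pam_end + 18, pam_end + 23
        if cut_bottom > len(seq):
            continue  # target region runs off the end of the sequence
        sites.append({
            "pam": seq[match.start():pam_end],
            "cut_top": cut_top,
            "cut_bottom": cut_bottom,
            "overhang": cut_bottom - cut_top,
        })
    return sites
```

A practical design tool would additionally scan the reverse complement and screen candidate guides for off-target matches; those steps are omitted from this sketch.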


A variety of CRISPR associated (Cas) genes or proteins can be used in the present invention and the choice of Cas protein will depend upon the particular conditions of the method.


Specific examples of Cas proteins include class II systems including Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cpf1, C2C1, or C2C3. In some embodiments, a Cas protein, e.g., a Cas9 protein, may be from any of a variety of prokaryotic species. In some embodiments a particular Cas protein, e.g., a particular Cas9 protein, is selected to recognize a particular protospacer-adjacent motif (PAM) sequence. In some embodiments, the site-specific targeting moiety includes a sequence targeting polypeptide, such as an enzyme, e.g., Cas9. In certain embodiments a Cas protein, e.g., a Cas9 protein, may be obtained from a bacterium or archaeon or synthesized using known methods. In certain embodiments, a Cas protein may be from a gram positive bacterium or a gram negative bacterium. In certain embodiments, a Cas protein may be from a Streptococcus (e.g., a S. pyogenes, a S. thermophilus), a Crptococcus, a Corynebacterium, a Haemophilus, a Eubacterium, a Pasteurella, a Prevotella, a Veillonella, or a Marinobacter. In some embodiments nucleic acids encoding two or more different Cas proteins, or two or more Cas proteins, may be introduced into a cell, zygote, embryo, or animal, e.g., to allow for recognition and modification of sites comprising the same, similar or different PAM motifs. In some embodiments, the Cas protein is modified to deactivate the nuclease, e.g., nuclease-deficient Cas9, and to recruit transcription activators or repressors, e.g., the ω-subunit of the E. coli Pol, VP64, the activation domain of p65, KRAB, or SID4X, to induce epigenetic modifications, e.g., histone acetyltransferase, histone methyltransferase and demethylase, DNA methyltransferase and enzyme with a role in DNA demethylation (e.g., the TET family enzymes catalyze oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives).


For the purposes of gene editing, CRISPR arrays can be designed to contain one or multiple guide RNA sequences corresponding to a desired target DNA sequence; see, for example, Cong et al. (2013) Science, 339:819-823; Ran et al. (2013) Nature Protocols, 8:2281-2308. At least about 16 or 17 nucleotides of gRNA sequence are required by Cas9 for DNA cleavage to occur; for Cpf1 at least about 16 nucleotides of gRNA sequence is needed to achieve detectable DNA cleavage.


Whereas wild-type Cas9 generates double-strand breaks (DSBs) at specific DNA sequences targeted by a gRNA, a number of CRISPR endonucleases having modified functionalities are available, for example: a “nickase” version of Cas9 generates only a single-strand break; a catalytically inactive Cas9 (“dCas9”) does not cut the target DNA but interferes with transcription by steric hindrance. dCas9 can further be fused with a heterologous effector to repress (CRISPRi) or activate (CRISPRa) expression of a target gene. For example, dCas9 can be fused to a transcriptional silencer (e.g., a KRAB domain) or a transcriptional activator (e.g., VP64, yielding a dCas9-VP64 fusion). A catalytically inactive Cas9 (dCas9) fused to FokI nuclease (“dCas9-FokI”) can be used to generate DSBs at target sequences homologous to two gRNAs. See, e.g., the numerous CRISPR/Cas9 plasmids disclosed in and publicly available from the Addgene repository (Addgene, 75 Sidney St., Suite 550A, Cambridge, Mass. 02139; addgene.org/crispr). A “double nickase” Cas9 that introduces two separate single-strand breaks, each directed by a separate guide RNA, is described as achieving more accurate genome editing by Ran et al. (2013) Cell, 154: 1380-1389.


CRISPR technology for editing the genes of eukaryotes is disclosed in US Patent Application Publications 2016/0138008A1 and US2015/0344912A1, and in U.S. Pat. Nos. 8,697,359, 8,771,945, 8,945,839, 8,999,641, 8,993,233, 8,895,308, 8,865,406, 8,889,418, 8,871,445, 8,889,356, 8,932,814, 8,795,965, and 8,906,616. Cpf1 endonuclease and corresponding guide RNAs and PAM sites are disclosed in US Patent Application Publication 2016/0208243 A1.


In some embodiments, an effector comprises one or more components of a CRISPR system described hereinabove.


In some embodiments, suitable effectors for use in the agents, compositions, and methods of the invention include, for example, nucleases, physical blockers, epigenetic recruiters, e.g., a transcriptional enhancer or a transcriptional repressor, and epigenetic CpG modifiers, e.g., a DNA methylase, a DNA demethylase, a histone modifying agent, or a histone deacetylase, and combinations of any of the foregoing.


Suitable effectors include a polypeptide or its variant. The term “variant,” as used herein, refers to a polypeptide that is derived by incorporation of one or more amino acid insertions, substitutions, or deletions in a precursor polypeptide (e.g., “parent” polypeptide). In certain embodiments, a variant polypeptide has at least about 85% amino acid sequence identity, e.g., about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or about 100%, amino acid sequence identity to the entire amino acid sequence of a parent polypeptide.


The term “sequence identity,” as used herein, refers to a comparison between pairs of nucleic acid or amino acid molecules, i.e., the relatedness between two amino acid sequences or between two nucleotide sequences. In general, the sequences are aligned so that the highest order match is obtained. Methods for determining sequence identity are known and can be determined by commercially available computer programs that can calculate the percentage of identity between two or more sequences. A typical example of such a computer program is CLUSTAL.
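
As a non-limiting illustration, percent sequence identity over two sequences that have already been aligned (for example, by a program such as CLUSTAL) may be computed as sketched below. The function names and the denominator convention (all alignment columns other than gap-versus-gap columns) are assumptions made for this sketch; alignment programs differ in how gaps are counted, so reported values may differ.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned, equal-length sequences.

    Hypothetical helper: identical non-gap columns divided by all columns
    in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        columns += 1
        if a == b:
            matches += 1  # both residues present and identical
    if columns == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_parent, aligned_variant, threshold=85.0):
    """True if the variant has at least `threshold` percent identity to the parent."""
    return percent_identity(aligned_parent, aligned_variant) >= threshold
```

For example, two aligned 10-residue sequences differing at a single position are 90% identical under this convention and would satisfy an 85% threshold.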


Exemplary effectors include ubiquitin, bicyclic peptides as ubiquitin ligase inhibitors, transcription factors, DNA and protein modification enzymes such as topoisomerases, topoisomerase inhibitors such as topotecan, DNA methyltransferases such as the DNMT family (e.g., DNMT3a, DNMT3b, DNMT3L), protein methyltransferases (e.g., viral lysine methyltransferase (vSET), protein-lysine N-methyltransferase (SMYD2)), deaminases (e.g., APOBEC, UG1), histone methyltransferases such as enhancer of zeste homolog 2 (EZH2), PRMT1, histone-lysine-N-methyltransferase (Setdb1), histone methyltransferase (SET2), euchromatic histone-lysine N-methyltransferase 2 (G9a), and histone-lysine N-methyltransferase (SUV39H1), histone deacetylases (e.g., HDAC1, HDAC2, HDAC3), enzymes with a role in DNA demethylation (e.g., the TET family enzymes catalyze oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives), protein demethylases such as KDM1A and lysine-specific histone demethylase 1 (LSD1), helicases such as DHX9, acetyltransferases, deacetylases (e.g., sirtuin 1, 2, 3, 4, 5, 6, or 7), kinases, phosphatases, DNA-intercalating agents such as ethidium bromide, sybr green, and proflavine, efflux pump inhibitors such as peptidomimetics like phenylalanine arginyl-naphthylamide or quinoline derivatives, nuclear receptor activators and inhibitors, proteasome inhibitors, competitive inhibitors for enzymes such as those involved in lysosomal storage diseases, zinc finger proteins, TALENs, specific domains from proteins, such as a KRAB domain, a VP64 domain, a p300 domain (e.g., p300 core domain), an MeCP2 domain, an MQ1 domain, a DNMT3a-3L domain, a TET1 domain, and a TET2 domain, protein synthesis inhibitors, nucleases (e.g., Cpf1, Cas9, zinc finger nuclease), fusions of one or more thereof (e.g., dCas9-DNMT, dCas9-APOBEC, dCas9-UG1, dCas9-VP64, dCas9-p300 core, dCas9-KRAB, dCas9-KRAB-MeCP2, dCas9-MQ1, dCas9-DNMT3a-3L, dCas9-TET1, dCas9-TET2, and dCas9-MC/MN).


In some embodiments, a suitable nuclease for use in the agent, compositions, and methods of the invention comprises a Cas9 polypeptide, or enzymatically active portion thereof. In one embodiment, the Cas9 polypeptide, or enzymatically active portion thereof, further comprises a catalytically active domain of human exonuclease 1 (hEXO1), e.g., 5′ to 3′ exonuclease activity and/or an RNase H activity. In other embodiments, a suitable nuclease comprises a transcription activator-like effector nuclease (TALEN). In yet other embodiments, a suitable nuclease comprises a zinc finger protein.


The term TALEN, as used herein, is broad and includes a monomeric TALEN that can cleave double stranded DNA without assistance from another TALEN. The term TALEN is also used to refer to one or both members of a pair of TALENs that are engineered to work together to cleave DNA at the same site. TALENs that work together may be referred to as a left-TALEN and a right-TALEN, which references the handedness of DNA. See U.S. Ser. No. 12/965,590; U.S. Ser. No. 13/426,991 (U.S. Pat. No. 8,450,471); U.S. Ser. No. 13/427,040 (U.S. Pat. No. 8,440,431); U.S. Ser. No. 13/427,137 (U.S. Pat. No. 8,440,432); and U.S. Ser. No. 13/738,381, all of which are incorporated by reference herein in their entirety.


TAL effectors are proteins secreted by Xanthomonas bacteria. The DNA binding domain contains a highly conserved 33-34 amino acid sequence with the exception of the 12th and 13th amino acids. These two locations are highly variable (Repeat Variable Diresidue (RVD)) and show a strong correlation with specific nucleotide recognition. This simple relationship between amino acid sequence and DNA recognition has allowed for the engineering of specific DNA binding domains by selecting a combination of repeat segments containing the appropriate RVDs.
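
As a non-limiting illustration of selecting repeat segments by their RVDs, the sketch below maps a target DNA sequence to an ordered list of RVDs. The particular RVD-to-nucleotide code used (NI for A, HD for C, NN for G, NG for T) is the commonly cited correspondence from the literature and is an assumption of this sketch rather than something specified herein; the function and variable names are likewise hypothetical.

```python
# Commonly cited RVD code (assumption of this sketch, not taken from the text):
# NI -> A, HD -> C, NN -> G (NN is also reported to bind A), NG -> T.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target_dna):
    """Return the ordered RVDs of a TALE repeat array for a 5'->3' target.

    Hypothetical helper: one repeat (33-34 residues, variable at positions
    12 and 13) is chosen per target nucleotide.
    """
    rvds = []
    for position, base in enumerate(target_dna.upper(), start=1):
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base {base!r} at position {position}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds
```

Because the mapping is one repeat per nucleotide, the length of the returned list equals the length of the target site.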


The non-specific DNA cleavage domain from the end of the FokI endonuclease can be used to construct hybrid nucleases that are active in a yeast assay. These reagents are also active in plant cells and in animal cells. Initial TALEN studies used the wild-type FokI cleavage domain, but some subsequent TALEN studies also used FokI cleavage domain variants with mutations designed to improve cleavage specificity and cleavage activity. The FokI domain functions as a dimer, requiring two constructs with unique DNA binding domains for sites in the target genome with proper orientation and spacing. Both the number of amino acid residues between the TALEN DNA binding domain and the FokI cleavage domain and the number of bases between the two individual TALEN binding sites are parameters for achieving high levels of activity. The number of amino acid residues between the TALEN DNA binding domain and the FokI cleavage domain may be modified by introduction of a spacer (distinct from the spacer sequence) between the plurality of TAL effector repeat sequences and the FokI endonuclease domain. The spacer sequence may be 12 to 30 nucleotides.
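
The spacing requirement for a TALEN pair may be checked, purely by way of example, as sketched below. The coordinate convention (0-based positions on the top strand, with the right-hand binding site described by the position of its first base on that strand) and the function names are assumptions made for this sketch; the 12 to 30 nucleotide range is taken from the paragraph above.

```python
def spacer_length(left_start, left_len, right_start):
    """Number of bases between the left and right TALEN binding sites.

    Hypothetical helper using 0-based top-strand coordinates.
    """
    left_end = left_start + left_len  # first base after the left site
    if right_start < left_end:
        raise ValueError("binding sites overlap")
    return right_start - left_end


def spacing_is_suitable(left_start, left_len, right_start, min_nt=12, max_nt=30):
    """True if the spacer sequence between the two sites is 12 to 30 nucleotides."""
    return min_nt <= spacer_length(left_start, left_len, right_start) <= max_nt
```

For instance, a 17-base left site starting at position 100 and a right site starting at position 132 are separated by a 15-nucleotide spacer, which falls within the stated range.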


The relationship between amino acid sequence and DNA recognition of the TALEN binding domain allows for designable proteins. In this case artificial gene synthesis is problematic because of improper annealing of the repetitive sequence found in the TALE binding domain. One solution to this is to use a publicly available software program (DNAWorks) to calculate oligonucleotides suitable for assembly in a two step PCR; oligonucleotide assembly followed by whole gene amplification. A number of modular assembly schemes for generating engineered TALE constructs have also been reported. Both methods offer a systematic approach to engineering DNA binding domains that is conceptually similar to the modular assembly method for generating zinc finger DNA recognition domains.


Once the TALEN genes have been assembled they are inserted into plasmids; the plasmids are then used to transfect the target cell where the gene products are expressed and enter the nucleus to access the genome. TALENs can be used to edit genomes by inducing double-strand breaks (DSB), which cells respond to with repair mechanisms. In this manner, they can be used to correct mutations in the genome which, for example, cause disease.


As used herein, a “zinc finger polypeptide” or “zinc finger protein” is a protein that binds to DNA, RNA and/or protein, in a sequence-specific manner, by virtue of a metal stabilized domain known as a zinc finger. Zinc finger proteins are nucleases having a DNA cleavage domain and a DNA binding zinc finger domain. Zinc finger polypeptides may be made by fusing the nonspecific DNA cleavage domain of an endonuclease with site-specific DNA binding zinc finger domains. Such nucleases are powerful tools for gene editing and can be assembled to induce double strand breaks (DSBs) site-specifically into genomic DNA. ZFNs allow specific gene disruption because, during DNA repair, the targeted genes can be disrupted via mutagenic non-homologous end joining (NHEJ) or modified via homologous recombination (HR) if a closely related DNA template is supplied.


Zinc finger nucleases are chimeric enzymes made by fusing the nonspecific DNA cleavage domain of the endonuclease FokI with site-specific DNA binding zinc finger domains. Due to the flexible nature of zinc finger proteins (ZFPs), ZFNs can be assembled that induce double strand breaks (DSBs) site-specifically into genomic DNA. ZFNs allow specific gene disruption because, during DNA repair, the targeted genes can be disrupted via mutagenic non-homologous end joining (NHEJ) or modified via homologous recombination (HR) if a closely related DNA template is supplied.


In some embodiments, a suitable physical blocker for use in the agent, compositions, and methods of the invention comprises a gRNA, antisense DNA, or triplex forming oligonucleotide (which may target an expression control unit) that sterically blocks a transcriptional control element or anchoring sequence. The gRNA recognizes specific DNA sequences and further includes sequences that interfere with, e.g., a conjunction nucleating molecule sequence to act as a steric blocker. In some embodiments, the gRNA is combined with one or more peptides, e.g., S-adenosyl methionine (SAM), that act as a steric presence. In other embodiments, a physical blocker comprises an enzymatically inactive Cas9 polypeptide, or fragment thereof (e.g., dCas9).


In one embodiment, an epigenetic recruiter activates or enhances transcription of a target gene. In some embodiments, a suitable epigenetic recruiter for use in the agent, compositions, and methods of the invention comprises a VP64 domain or a p300 core domain.


In one embodiment, an epigenetic recruiter silences or represses transcription of a target gene. In some embodiments, a suitable epigenetic recruiter for use in the agent, compositions, and methods of the invention comprises a KRAB domain, or an MeCP2 domain.


In one embodiment, a suitable epigenetic recruiter for use in the agent, compositions, and methods of the invention comprises dCas9-VP64 fusion, a dCas9-p300 core fusion, a dCas9-KRAB fusion, or a dCas9-KRAB-MeCP2 fusion.


As used herein, “VP64” is a transcriptional activator composed of four tandem copies of VP16 (Herpes Simplex Viral Protein 16, amino acids 437-447*: DALDDFDLDML (SEQ ID NO: 328)) connected with glycine-serine (GS) linkers. In one embodiment, the VP64 further comprises the transcription factors p65 (RelA) and Rta at the C terminus. An effector that comprises VP64, p65 and Rta is referred to as “VPR.” The GenBank Accession number of VP64 is ADD60007.1, the GenBank Accession number of p65 is NP_001138610.1, and the GenBank Accession number of Rta is AAA66528.1.
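
The tandem arrangement described above can be illustrated with a short sketch that joins four copies of the VP16 minimal activation sequence with GS linkers. The function and constant names are hypothetical and are used only for this illustration; the repeat sequence and linker are those recited above.

```python
VP16_MINIMAL = "DALDDFDLDML"  # VP16 amino acids 437-447 as recited above


def build_tandem_activator(unit=VP16_MINIMAL, copies=4, linker="GS"):
    """Join `copies` of `unit` with `linker` between adjacent copies.

    Hypothetical helper; the defaults give four VP16 units separated by
    three glycine-serine linkers.
    """
    if copies < 1:
        raise ValueError("at least one copy is required")
    return linker.join([unit] * copies)


vp64 = build_tandem_activator()
# 4 x 11 residues + 3 x 2 linker residues = 50 residues in total.
```

Under these assumptions the assembled 50-residue sequence is identical to the first 50 residues of the exemplary VPR sequence of SEQ ID NO: 66.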


An exemplary amino acid sequence of a VPR is as follows:

(SEQ ID NO.: 66)
DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDML
SGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDP
RPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQIS
QASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPP
APKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEF
QQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLP
NGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVF
EGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPV
PQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICG
QMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECL
LHAMHISTGLSIFDTSLF






As used herein, “p300 core domain” refers to the catalytic core of the human acetyltransferase p300. The GenBank Accession number for the protein comprising p300 is NP_001420.2.


An exemplary amino acid sequence of a p300 is as follows:

(SEQ ID NO.: 67)
IFKPEELRQALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYEDIVKSPM
DLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRVYKYCSKLSE
VFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNR
YHFCEKCFNEIQGESVSLGDDPSQPQTTINKEQFSKRKNDTLDPELFVEC
TECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTR
LGTFLENRVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSG
EMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPPPNQRRVYISY
LDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYI
FHCHPPDQKIPKPKRLQEWYKKMLDKAVSERIVHDYKDIFKQATEDRLTS
AKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKN
AKKKNNKKTSKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFV
IRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARDKHLEFSSLRRA
QWSTMCMLVELHTQSQD.






As used herein, “KRAB” refers to a Krüppel associated box (KRAB) transcriptional repression domain present in human zinc finger protein-based transcription factors (KRAB zinc finger proteins).


As used herein, “MeCP2” refers to methyl CpG binding protein 2 which represses transcription, e.g., by binding to a promoter comprising methylated DNA.


In one embodiment, an epigenetic CpG modifier methylates DNA and inactivates or represses transcription. In some embodiments, a suitable epigenetic CpG modifier for use in the agent, compositions, and methods of the invention comprises a MQ1 domain or a DNMT3a-3L domain.


In one embodiment, an epigenetic CpG modifier demethylates DNA and activates or stimulates transcription. In some embodiments, a suitable epigenetic CpG modifier for use in the agent, compositions, and methods of the invention comprises a TET1 or TET2 domain.


As used herein, “TET1” refers to “ten-eleven translocation methylcytosine dioxygenase 1,” a member of the TET family of enzymes, encoded by the TET1 gene. TET1 is a dioxygenase that catalyzes the conversion of the modified DNA base 5-methylcytosine (5-mC) to 5-hydroxymethylcytosine (5-hmC) by oxidation of 5-mC in an iron and alpha-ketoglutarate dependent manner, the initial step of active DNA demethylation in mammals. Methylation at the C5 position of cytosine bases is an epigenetic modification of the mammalian genome which plays an important role in transcriptional regulation. In addition to its role in DNA demethylation, TET1 plays a more general role in chromatin regulation. TET1 preferentially binds to CpG-rich sequences at promoters of both transcriptionally active and Polycomb-repressed genes. TET1 is also involved in the recruitment of the O-GlcNAc transferase OGT to CpG-rich transcription start sites of active genes, thereby promoting histone H2B GlcNAcylation by OGT.


As used herein, “TET2” refers to “ten-eleven translocation 2 (TET2),” a member of the TET family of enzymes, encoded by the TET2 gene. Similarly to TET1, TET2 is a dioxygenase that catalyzes the conversion of the modified genomic base 5-methylcytosine (5mC) into 5-hydroxymethylcytosine (5hmC) and plays a key role in active DNA demethylation. TET2 has a preference for 5-hydroxymethylcytosine in CpG motifs. TET2 also mediates subsequent conversion of 5hmC into 5-formylcytosine (5fC), and conversion of 5fC to 5-carboxylcytosine (5caC). The conversion of 5mC into 5hmC, 5fC and 5caC probably constitutes the first step in cytosine demethylation. Methylation at the C5 position of cytosine bases is an epigenetic modification of the mammalian genome which plays an important role in transcriptional regulation. In addition to its role in DNA demethylation, TET2 is also involved in the recruitment of the O-GlcNAc transferase OGT to CpG-rich transcription start sites of active genes, thereby promoting histone H2B GlcNAcylation by OGT.


As used herein “DNMT3a-3L” refers to a fusion of a DNA methyltransferase, Dnmt3a and a Dnmt3L which is catalytically inactive, but directly interacts with the catalytic domains of Dnmt3a.


In some embodiments, a suitable epigenetic CpG modifier for use in the agent, compositions, and methods of the invention comprises a MQ1 domain, a DNMT3a-3L domain, a TET1 domain, or a TET2 domain. In one embodiment, a suitable epigenetic CpG modifier for use in the agent, compositions, and methods of the invention comprises a dCas9-MQ1 fusion, a dCas9-DNMT3a-3L fusion, a dCas9-TET1 fusion or a dCas9-TET2 fusion.


III. Delivery of a Site-Specific HNF4α Disrupting Agent of the Invention and Compositions Comprising a Site-Specific HNF4α Disrupting Agent of the Invention

The delivery of the disrupting agents of the invention to a cell, e.g., a cell within a subject, such as a human subject (e.g., a subject in need thereof, such as a subject having an HNF4α-associated disorder, e.g., cirrhosis), may be achieved in a number of different ways. For example, delivery may be performed by contacting a cell with a disrupting agent of the invention either in vitro, ex vivo, or in vivo. In vivo delivery may be performed directly by administering a composition, such as a lipid composition, comprising a disrupting agent to a subject. Alternatively, in vivo delivery may be performed indirectly by administering one or more vectors that encode and direct the expression of the disrupting agent. These alternatives are discussed further below.


In some embodiments, the disrupting agent comprises a nucleic acid molecule encoding a fusion protein, the fusion protein comprising a site-specific HNF4α targeting moiety, such as a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically targets and binds to the HNF4α expression control region and an effector molecule, such as a VPR.


In other embodiments, the disrupting agent comprises a guide RNA and an mRNA encoding an effector molecule. The ratio of guide RNA to mRNA may be about 100:1 to about 1:100 (wt:wt).
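
As a worked illustration of the weight ratio, the sketch below splits a total RNA mass between guide RNA and mRNA for a chosen wt:wt ratio and checks that the ratio lies within about 100:1 to about 1:100. The function name, the microgram units, and the treatment of the bounds as hard limits are assumptions made for this sketch only.

```python
def split_rna_mass(total_ug, guide_parts, mrna_parts):
    """Split a total RNA mass (micrograms) by a guide:mRNA wt:wt ratio.

    Hypothetical helper; returns (guide_ug, mrna_ug).
    """
    if total_ug <= 0 or guide_parts <= 0 or mrna_parts <= 0:
        raise ValueError("mass and ratio parts must be positive")
    ratio = guide_parts / mrna_parts
    # Bounds from the text (about 100:1 to about 1:100), treated here as limits.
    if not (1 / 100 <= ratio <= 100):
        raise ValueError("ratio outside about 100:1 to about 1:100")
    guide_ug = total_ug * guide_parts / (guide_parts + mrna_parts)
    return guide_ug, total_ug - guide_ug
```

For example, under these assumptions a 1:1 ratio splits 10 micrograms of total RNA into 5 micrograms of guide RNA and 5 micrograms of mRNA, and a 1:3 ratio splits 8 micrograms into 2 and 6 micrograms, respectively.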


In general, any method of delivery of a site-specific HNF4α disrupting agent of the invention (in vitro, ex vivo, or in vivo) may be adapted for use with the disrupting agents of the invention (see e.g., Akhtar S. and Julian R L., (1992) Trends Cell. Biol. 2(5):139-144 and WO94/02595, which are incorporated herein by reference in their entireties). For in vivo delivery, factors to be considered for delivering a site-specific HNF4α disrupting agent of the invention include, for example, biological stability of the disrupting agent, prevention of non-specific effects, and accumulation of the disrupting agent in the target tissue. The non-specific effects of a disrupting agent can be minimized by local administration, for example, by direct injection or implantation into a tissue or topically administering a composition comprising the disrupting agent. Local administration to a treatment site maximizes local concentration of the disrupting agent, limits the exposure of the disrupting agent to systemic tissues that can otherwise be harmed by the disrupting agent or that can degrade the disrupting agent, and permits a lower total dose of the disrupting agent to be administered.


For administering a site-specific HNF4α disrupting agent systemically for the treatment of a disease, such as an HNF4α-associated disease, the disrupting agent, e.g., a disrupting agent comprising a site-specific targeting moiety comprising a nucleic acid molecule, can be modified or alternatively delivered using a drug delivery system; both methods act to prevent the rapid degradation of a site-specific targeting moiety comprising a nucleic acid molecule by endo- and exo-nucleases in vivo. Modification of a disrupting agent comprising a site-specific targeting moiety comprising a nucleic acid molecule or a pharmaceutical carrier also permits targeting of the disrupting agent to a target tissue and avoidance of undesirable off-target effects. For example, a disrupting agent of the invention may be modified by chemical conjugation to lipophilic groups such as cholesterol to enhance cellular uptake and prevent degradation.


Alternatively, a disrupting agent of the invention may be delivered using a drug delivery system such as a nanoparticle, a dendrimer, a polymer, a liposome, or a cationic delivery system. Positively charged cationic delivery systems facilitate binding of a disrupting agent (e.g., a negatively charged molecule) and also enhance interactions at the negatively charged cell membrane to permit efficient uptake of a disrupting agent by the cell. Cationic lipids, dendrimers, or polymers can either be bound to a disrupting agent, or induced to form a vesicle or micelle (see e.g., Kim S H. et al., (2008) Journal of Controlled Release 129(2):107-116) that encases the disrupting agent. The formation of vesicles or micelles further prevents degradation of the disrupting agent when administered systemically. Methods for making and administering cationic complexes are well within the abilities of one skilled in the art (see e.g., Sorensen, D R., et al. (2003) J. Mol. Biol 327:761-766; Verma, U N. et al., (2003) Clin. Cancer Res. 9:1291-1300; Arnold, A S et al. (2007) J. Hypertens. 25:197-205, which are incorporated herein by reference in their entirety). Some non-limiting examples of drug delivery systems useful for systemic delivery of a disrupting agent of the invention include DOTAP (Sorensen, D R., et al (2003), supra; Verma, U N. et al., (2003), supra), Oligofectamine, “solid nucleic acid lipid particles” (Zimmermann, T S. et al., (2006) Nature 441:111-114), cardiolipin (Chien, P Y. et al., (2005) Cancer Gene Ther. 12:321-328; Pal, A. et al., (2005) Int J. Oncol. 26:1087-1091), polyethyleneimine (Bonnet M E. et al., (2008) Pharm. Res. August 16 Epub ahead of print; Aigner, A. (2006) J. Biomed. Biotechnol. 71659), Arg-Gly-Asp (RGD) peptides (Liu, S. (2006) Mol. Pharm. 3:472-487), and polyamidoamines (Tomalia, D A. et al., (2007) Biochem. Soc. Trans. 35:61-67; Yoo, H. et al., (1999) Pharm. Res. 16:1799-1804).
In some embodiments, a disrupting agent (e.g., gRNA, or mRNA) forms a complex with cyclodextrin for systemic administration. Methods for administration and pharmaceutical compositions comprising cyclodextrins may be found in U.S. Pat. No. 7,427,605, the entire contents of which are incorporated herein by reference.


The disrupting agents of the invention may be incorporated into pharmaceutical compositions suitable for administration. Such compositions typically include one or more species of disrupting agent and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, use thereof in the compositions is contemplated. Supplementary active compounds can also be incorporated into the compositions.


The pharmaceutical compositions of the present invention may be administered in a number of ways depending upon whether local or systemic treatment is desired and upon the area to be treated. Administration may be topical (including ophthalmic, vaginal, rectal, intranasal, transdermal), oral, or parenteral. Parenteral administration includes intravenous drip, subcutaneous, intraperitoneal or intramuscular injection, or intrathecal or intraventricular administration.


The route and site of administration may be chosen to enhance delivery or targeting of the disrupting agent comprising a site-specific targeting moiety to a particular location. For example, to target liver cells, intravenous injection may be used. Lung cells may be targeted by administering the disrupting agent in aerosol form. Jejunum cells may be targeted by anal administration.


Formulations for topical administration may include transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids, and powders. Conventional pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be necessary or desirable. Coated condoms, gloves and the like may also be useful.


Compositions for oral administration include powders or granules, suspensions or solutions in water, syrups, elixirs or non-aqueous media, tablets, capsules, lozenges, or troches. In the case of tablets, carriers that can be used include lactose, sodium citrate and salts of phosphoric acid. Various disintegrants such as starch, and lubricating agents such as magnesium stearate, sodium lauryl sulfate and talc, are commonly used in tablets. For oral administration in capsule form, useful diluents are lactose and high molecular weight polyethylene glycols. When aqueous suspensions are required for oral use, the nucleic acid compositions can be combined with emulsifying and suspending agents. If desired, certain sweetening or flavoring agents can be added.


Compositions for intravenous administration may include sterile aqueous solutions which may also contain buffers, diluents, and other suitable additives.


Formulations for parenteral administration may include sterile aqueous solutions which may also contain buffers, diluents, and other suitable additives. For intravenous use, the total concentration of solutes may be controlled to render the preparation isotonic.


In one embodiment, the administration of a disrupting agent composition of the invention is parenteral, e.g., intravenous (e.g., as a bolus or as a diffusible infusion), intradermal, intraperitoneal, intramuscular, intrathecal, intraventricular, intracranial, subcutaneous, transmucosal, buccal, sublingual, endoscopic, rectal, oral, vaginal, topical, pulmonary, intranasal, urethral, or ocular. Administration can be provided by the subject or by another person, e.g., a health care provider. The composition may be provided in measured doses or in a dispenser which delivers a metered dose. Selected modes of delivery are discussed in more detail below.


In certain embodiments, the disrupting agents of the invention are polynucleotides, such as mRNAs, and are formulated in lipid nanoparticles (LNPs).


A. Compositions Comprising a Site-Specific HNF4α Disrupting Agent of the Invention


The site-specific HNF4α disrupting agents of the invention may be formulated into compositions, such as pharmaceutical compositions, using one or more excipients to: (1) increase stability; (2) increase cell transfection; (3) permit sustained or delayed release (e.g., from a depot formulation); (4) alter the biodistribution (e.g., target the disrupting agent to specific tissues or cell types); (5) increase the translation of an encoded protein in vivo; and/or (6) alter the release profile of an encoded protein in vivo. In addition to traditional excipients, such as any and all solvents, dispersion media, diluents, or other liquid vehicles, dispersion or suspension aids, surface active agents, isotonic agents, thickening or emulsifying agents, preservatives, excipients for use in the compositions of the invention may include, without limitation, lipidoids, liposomes, lipid nanoparticles, polymers, lipoplexes, core-shell nanoparticles, peptides, proteins, cells transfected with nucleic acid molecules, modified nucleic acid molecules, or RNA (e.g., for transplantation into a subject), hyaluronidase, nanoparticle mimics and combinations thereof. Accordingly, the pharmaceutical compositions of the invention can include one or more excipients, each in an amount that together increases the stability of the disrupting agent, increases cell transfection by the disrupting agent, increases the expression of modified nucleic acid, or mRNA encoded protein, and/or alters the release profile of a disrupting agent. Further, the disrupting agents of the present invention may be formulated using self-assembled nucleic acid nanoparticles (see, e.g., U.S. Patent Publication No. 2016/0038612A1, which is incorporated herein by reference in its entirety).


i. Lipidoid


The synthesis of lipidoids has been extensively described and formulations containing these compounds are particularly suited for delivery of a disrupting agent of the invention, such as a disrupting agent comprising a site-specific HNF4α targeting moiety comprising a nucleic acid molecule, e.g., comprising modified nucleic acid molecules or mRNA (see Mahon et al., Bioconjug Chem. 2010 21:1448-1454; Schroeder et al., J Intern Med. 2010 267:9-21; Akinc et al., Nat Biotechnol. 2008 26:561-569; Love et al., Proc Natl Acad Sci USA. 2010 107:1864-1869; Siegwart et al., Proc Natl Acad Sci USA. 2011 108:12996-3001; the contents of all of which are incorporated herein in their entireties).


For example, lipidoids have been used to effectively deliver double stranded small interfering RNA molecules, single stranded nucleic acid molecules, modified nucleic acid molecules or modified mRNA (see, e.g., US Patent Publication 2016/0038612A1). Complexes, micelles, liposomes or particles can be prepared containing these lipidoids and, therefore, provide effective delivery of a site-specific HNF4α targeting moiety comprising a nucleic acid molecule, as judged by the production of an encoded protein, following the administration of a lipidoid formulation, e.g., via localized and/or systemic administration. Lipidoid complexes can be administered by various means including, but not limited to, intravenous, intramuscular, intradermal, intraperitoneal or subcutaneous routes.


In vivo delivery of a site-specific HNF4α targeting moiety comprising, e.g., a nucleic acid molecule, may be affected by many parameters, including, but not limited to, the formulation composition, nature of particle PEGylation, degree of loading, polynucleotide to lipid ratio, and biophysical parameters such as, but not limited to, particle size (Akinc et al., Mol Ther. 2009 17:872-879; herein incorporated by reference in its entirety). As an example, small changes in the anchor chain length of poly(ethylene glycol) (PEG) lipids may result in significant effects on in vivo efficacy. Formulations with different lipidoids, including, but not limited to, penta[3-(1-laurylaminopropionyl)]-triethylenetetramine hydrochloride (TETA-5LAP; aka 98N12-5, see Murugaiah et al., Analytical Biochemistry, 401:61 (2010); the contents of which are herein incorporated by reference in their entirety), C12-200 (including derivatives and variants), and MDI, may be used.


In one embodiment, a disrupting agent comprising a site-specific HNF4α targeting moiety comprising, e.g., a nucleic acid molecule, is formulated with a lipidoid for systemic intravenous administration to target cells of the liver. For example, a final optimized intravenous formulation comprising a disrupting agent comprising a site-specific HNF4α targeting moiety comprising a nucleic acid molecule, and a lipid molar composition of 42% 98N12-5, 48% cholesterol, and 10% PEG-lipid with a final weight ratio of about 7.5 to 1 total lipid to nucleic acid molecule, and a C14 alkyl chain length on the PEG lipid, with a mean particle size of roughly 50-60 nm, can result in greater than 90% of the formulation being distributed to the liver (see, Akinc et al., Mol Ther. 2009 17:872-879; the contents of which are herein incorporated by reference in their entirety). In another example, an intravenous formulation using a C12-200 lipidoid (see, e.g., PCT Publication No. WO 2010/129709, which is herein incorporated by reference in its entirety) having a molar ratio of 50/10/38.5/1.5 of C12-200/distearoylphosphatidylcholine/cholesterol/PEG-DMG, with a weight ratio of 7 to 1 total lipid to nucleic acid molecule, and a mean particle size of 80 nm may be used to deliver a disrupting agent comprising a site-specific HNF4α targeting moiety comprising a nucleic acid molecule, to hepatocytes (see, Love et al., Proc Natl Acad Sci USA. 2010 107:1864-1869; the contents of which are herein incorporated by reference in their entirety). In another embodiment, an MDI lipidoid-containing formulation may be used to effectively deliver a disrupting agent comprising a site-specific HNF4α targeting moiety comprising a nucleic acid molecule, to hepatocytes in vivo.
The characteristics of optimized lipidoid formulations for intramuscular or subcutaneous routes may vary significantly depending on the target cell type and the ability of formulations to diffuse through the extracellular matrix into the blood stream. While a particle size of less than 150 nm may be desired for effective hepatocyte delivery due to the size of the endothelial fenestrae (see, Akinc et al., Mol Ther. 2009 17:872-879; the contents of which are herein incorporated by reference in their entirety), use of lipidoid-formulated nucleic acid molecules to deliver the formulation to other cell types including, but not limited to, endothelial cells, myeloid cells, and muscle cells may not be similarly size-limited. Use of lipidoid formulations to deliver siRNA in vivo to other non-hepatocyte cells such as myeloid cells and endothelium has been reported (see Akinc et al., Nat Biotechnol. 2008 26:561-569; Leuschner et al., Nat Biotechnol. 2011 29:1005-1010; Cho et al. Adv. Funct. Mater. 2009 19:3112-3118; 8th International Judah Folkman Conference, Cambridge, Mass. Oct. 8-9, 2010; the contents of each of which are herein incorporated by reference in their entirety). For delivery to myeloid cells, such as monocytes, lipidoid formulations may have a similar component molar ratio. Different ratios of lipidoids and other components including, but not limited to, distearoylphosphatidylcholine, cholesterol and PEG-DMG, may be used to optimize the formulation for delivery to different cell types including, but not limited to, hepatocytes, myeloid cells, muscle cells, etc. For example, the component molar ratio may include, but is not limited to, 50% C12-200, 10% distearoylphosphatidylcholine, 38.5% cholesterol, and 1.5% PEG-DMG (see Leuschner et al., Nat Biotechnol 2011 29:1005-1010; the contents of which are herein incorporated by reference in their entirety).
The use of lipidoid formulations for the localized delivery to cells (such as, but not limited to, adipose cells and muscle cells) via either subcutaneous, intradermal or intramuscular delivery, may not require all of the formulation components desired for systemic delivery and, as such, may comprise only the lipidoid and a disrupting agent comprising a site-specific HNF4α targeting moiety comprising, e.g., a nucleic acid molecule, as described herein.
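
By way of illustration only, the relationship between a lipid molar composition and a total lipid to nucleic acid weight ratio, such as the 50/10/38.5/1.5 molar ratio and 7 to 1 weight ratio recited above, may be sketched as follows. The function name, the 1 mg nucleic acid batch size, and the approximate molecular weights are assumptions introduced for this sketch and are not taken from the cited references.

```python
def lipid_masses_mg(molar_percent, mol_weight, nucleic_acid_mg, lipid_to_na_weight_ratio):
    """Split a total lipid mass among components according to their molar percentages.

    molar_percent: component name -> mol % (must sum to 100)
    mol_weight: component name -> molecular weight in g/mol
    """
    if abs(sum(molar_percent.values()) - 100.0) > 1e-6:
        raise ValueError("molar percentages must sum to 100")
    # Mass contributed by each component per 100 mol of total lipid.
    relative_mass = {name: pct * mol_weight[name] for name, pct in molar_percent.items()}
    total_relative = sum(relative_mass.values())
    # Total lipid mass fixed by the lipid:nucleic acid weight ratio.
    total_lipid_mg = nucleic_acid_mg * lipid_to_na_weight_ratio
    return {name: total_lipid_mg * m / total_relative for name, m in relative_mass.items()}


# Molar ratio and weight ratio as recited in the text.
composition = {"C12-200": 50.0, "DSPC": 10.0, "cholesterol": 38.5, "PEG-DMG": 1.5}
# ASSUMPTION: approximate molecular weights (g/mol), for illustration only.
weights = {"C12-200": 1137.0, "DSPC": 790.0, "cholesterol": 387.0, "PEG-DMG": 2509.0}

masses = lipid_masses_mg(composition, weights, nucleic_acid_mg=1.0, lipid_to_na_weight_ratio=7.0)
for name, mg in masses.items():
    print(f"{name}: {mg:.3f} mg")
```

The component masses sum to the total lipid mass (here 7 mg per 1 mg of nucleic acid molecule); actual molecular weights should be taken from the supplier of each lipid.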


Combinations of different lipidoids may be used to improve the efficacy of the formulations by increasing cell transfection and/or increasing the translation of encoded protein contained therein (see Whitehead et al., Mol. Ther. 2011, 19:1688-1694, the contents of which are herein incorporated by reference in their entirety).


In one embodiment, the lipidoid may be prepared from the conjugate addition of alkylamines to acrylates. As a non-limiting example, a lipidoid may be prepared by the methods described in PCT Patent Publication No. WO 2014/028487, the contents of which are herein incorporated by reference in their entirety. In one embodiment, the lipidoid may comprise a compound having formula (I), formula (II), formula (III), formula (IV) or formula (V) as described in PCT Patent Publication No. WO 2014/028487, the contents of which are herein incorporated by reference in their entirety. In one embodiment, the lipidoid may be biodegradable.


ii. Liposomes, Lipoplexes, and Lipid Nanoparticles


A disrupting agent of the invention may be formulated using one or more liposomes, lipoplexes, or lipid nanoparticles. In one embodiment, pharmaceutical compositions of the invention include liposomes. Liposomes are artificially-prepared vesicles which are primarily composed of a lipid bilayer and may be used as a delivery vehicle for the administration of nutrients and pharmaceutical formulations. Liposomes may be of different sizes such as, but not limited to, a multilamellar vesicle (MLV) which may be hundreds of nanometers in diameter and may contain a series of concentric bilayers separated by narrow aqueous compartments, a small unilamellar vesicle (SUV) which may be smaller than 50 nm in diameter, and a large unilamellar vesicle (LUV) which may be between 50 and 500 nm in diameter. Liposome design may include, but is not limited to, opsonins or ligands in order to improve the attachment of liposomes to unhealthy tissue or to activate events such as, but not limited to, endocytosis. Liposomes may contain a low or a high pH in order to improve the delivery of the pharmaceutical formulations. The formation of liposomes may depend on the physicochemical characteristics such as, but not limited to, the pharmaceutical formulation entrapped and the liposomal ingredients, the nature of the medium in which the lipid vesicles are dispersed, the effective concentration of the entrapped substance and its potential toxicity, any additional processes involved during the application and/or delivery of the vesicles, the optimization of size, polydispersity and the shelf-life of the vesicles for the intended application, and the batch-to-batch reproducibility and possibility of large-scale production of safe and efficient liposomal products.


As a non-limiting example, liposomes, such as synthetic membrane vesicles, may be prepared by the methods, apparatus and devices described in U.S. Patent Publication Nos. 2013/0177638, 2013/0177637, 2013/0177636, 2013/0177635, 2013/0177634, 2013/0177633, 2013/0183375, 2013/0183373, 2013/0183372 and 2016/0038612, and PCT Patent Publication No. WO 2008/042973, the contents of each of which are herein incorporated by reference in their entirety.


In one embodiment, a pharmaceutical composition described herein may include, without limitation, liposomes such as those formed from 1,2-dioleyloxy-N,N-dimethyl aminopropane (DODMA) liposomes, DiLa2 liposomes from Marina Biotech (Bothell, Wash.), 1,2-dilinoleyloxy-3-dimethylaminopropane (DLin-DMA), 2,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLin-KC2-DMA), and MC3 (US20100324120; herein incorporated by reference in its entirety) and liposomes which may deliver small molecule drugs such as, but not limited to, DOXIL® from Janssen Biotech, Inc. (Horsham, Pa.). In one embodiment, a pharmaceutical composition described herein may include, without limitation, liposomes such as those formed from the synthesis of stabilized plasmid-lipid particles (SPLP) or stabilized nucleic acid lipid particle (SNALP) that have been previously described and shown to be suitable for oligonucleotide delivery in vitro and in vivo (see Wheeler et al. Gene Therapy. 1999 6:271-281; Zhang et al. Gene Therapy. 1999 6:1438-1447; Jeffs et al. Pharm Res. 2005 22:362-372; Morrissey et al., Nat Biotechnol. 2005 23:1002-1007; Zimmermann et al., Nature. 2006 441:111-114; Heyes et al. J Contr Rel. 2005 107:276-287; Semple et al. Nature Biotech. 2010 28:172-176; Judge et al. J Clin Invest. 2009 119:661-673; deFougerolles Hum Gene Ther. 2008 19:125-132; U.S. Patent Publication Nos. 2013/0122104, 2013/0303587, and 2016/0038612; the contents of each of which are incorporated herein in their entireties). The original manufacturing method of Wheeler et al. was a detergent dialysis method, which was later improved by Jeffs et al. and is referred to as the spontaneous vesicle formation method. The liposome formulations of the invention may be composed of 3 to 4 lipid components in addition to a disrupting agent comprising a site-specific HNF4α targeting moiety.
As an example, a liposome of the invention can contain, but is not limited to, 55% cholesterol, 20% distearoylphosphatidylcholine (DSPC), 10% PEG-S-DSG, and 15% 1,2-dioleyloxy-N,N-dimethylaminopropane (DODMA), as described by Jeffs et al. As another example, liposome formulations of the invention may contain, but are not limited to, 48% cholesterol, 20% DSPC, 2% PEG-c-DMA, and 30% cationic lipid, where the cationic lipid can be 1,2-distearyloxy-N,N-dimethylaminopropane (DSDMA), DODMA, DLin-DMA, or 1,2-dilinolenyloxy-3-dimethylaminopropane (DLenDMA), as described by Heyes et al. In some embodiments, liposome formulations may comprise from about 25.0% cholesterol to about 40.0% cholesterol, from about 30.0% cholesterol to about 45.0% cholesterol, from about 35.0% cholesterol to about 50.0% cholesterol and/or from about 48.5% cholesterol to about 60% cholesterol. In another embodiment, formulations of the invention may comprise a percentage of cholesterol selected from the group consisting of 28.5%, 31.5%, 33.5%, 36.5%, 37.0%, 38.5%, 39.0% and 43.5%. In some embodiments, liposome formulations of the invention may comprise from about 5.0% to about 10.0% DSPC and/or from about 7.0% to about 15.0% DSPC.
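
As a non-limiting sketch, the two exemplary compositions above may be checked for internal consistency (component percentages totaling 100%) and compared against the recited cholesterol ranges. The function name and dictionary keys are hypothetical, and the word "about" in each range is treated here as a hard bound purely for simplicity.

```python
# Cholesterol ranges recited in the text, in percent.
CHOLESTEROL_RANGES = [(25.0, 40.0), (30.0, 45.0), (35.0, 50.0), (48.5, 60.0)]


def matching_cholesterol_ranges(percent):
    """Return the recited cholesterol ranges that contain this composition's cholesterol value."""
    total = sum(percent.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"component percentages sum to {total}, not 100")
    cholesterol = percent["cholesterol"]
    # ASSUMPTION: "about" is ignored; range endpoints are treated as inclusive limits.
    return [(lo, hi) for lo, hi in CHOLESTEROL_RANGES if lo <= cholesterol <= hi]


# Exemplary compositions from the text (attributed there to Jeffs et al. and Heyes et al.).
jeffs_example = {"cholesterol": 55.0, "DSPC": 20.0, "PEG-lipid": 10.0, "DODMA": 15.0}
heyes_example = {"cholesterol": 48.0, "DSPC": 20.0, "PEG-lipid": 2.0, "cationic lipid": 30.0}

print(matching_cholesterol_ranges(jeffs_example))
print(matching_cholesterol_ranges(heyes_example))
```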


In one embodiment, a pharmaceutical composition may include liposomes which may be formed to deliver a disrupting agent of the invention. The disrupting agent comprising a site-specific HNF4α targeting moiety may be encapsulated by the liposome and/or it may be contained in an aqueous core which may then be encapsulated by the liposome (see, e.g., PCT Patent Publication Nos. WO 2012/031046, WO 2012/031043, WO 2012/030901 and WO 2012/006378 and U.S. Patent Publication Nos. 2013/0189351, 2013/0195969 and 2013/0202684, the contents of each of which are herein incorporated by reference in their entirety).


In another embodiment, liposomes for use in the present invention may be formulated for targeted delivery. As a non-limiting example, the liposome may be formulated for targeted delivery to the liver. Such a liposome may include, but is not limited to, a liposome described in U.S. Patent Publication No. 2013/0195967, the contents of which are herein incorporated by reference in their entirety.


In one embodiment, formulations comprising liposomes and a disrupting agent may be administered intramuscularly, intradermally, or intravenously.


In another embodiment, a lipid formulation of the invention may include at least one cationic lipid, a lipid which enhances transfection and at least one lipid which contains a hydrophilic head group linked to a lipid moiety (International Pub. No. WO2011076807 and U.S. Pub. No. 20110200582; the contents of each of which are herein incorporated by reference in their entirety). In another embodiment, a lipid formulation of the invention is a lipid vesicle which may have crosslinks between functionalized lipid bilayers (see U.S. Patent Publication No. 2012/0177724, the contents of which are herein incorporated by reference in their entirety).


In one embodiment, a formulation comprising a disrupting agent is a lipid nanoparticle (LNP) which may comprise at least one lipid. The lipid may be selected from, but is not limited to, DLin-DMA, DLin-K-DMA, 98N12-5, C12-200, DLin-MC3-DMA, DLin-KC2-DMA, DODMA, PLGA, PEG, PEG-DMG, PEGylated lipids and amino alcohol lipids. In another aspect, the lipid may be a cationic lipid such as, but not limited to, DLin-DMA, DLin-D-DMA, DLin-MC3-DMA, DLin-KC2-DMA, DODMA and amino alcohol lipids. The amino alcohol cationic lipid may be the lipids described in and/or made by the methods described in U.S. Patent Publication No. 2013/0150625.


In one embodiment, the cationic lipid may be selected from, but not limited to, a cationic lipid described in PCT Publication Nos. WO 2012/040184, WO 2011/153120, WO 2011/149733, WO 2011/090965, WO 2011/043913, WO 2011/022460, WO 2012/061259, WO 2012/054365, WO 2012/044638, WO 2010/080724, WO 2010/21865, WO 2008/103276, WO 2013/086373 and WO 2013/086354, U.S. Pat. Nos. 7,893,302, 7,404,969, 8,283,333, 8,466,122 and 8,569,256, and U.S. Patent Publication Nos. 2010/0036115, 2012/0202871, 2013/0064894, 2013/0129785, 2013/0150625, 2013/0178541, 2013/0225836 and 2014/0039032; the contents of each of which are herein incorporated by reference in their entirety. In another embodiment, the cationic lipid may be selected from, but not limited to, formula A described in PCT Publication Nos. WO 2012/040184, WO 2011/153120, WO 2011/149733, WO 2011/090965, WO 2011/043913, WO 2011/022460, WO 2012/061259, WO 2012/054365, WO 2012/044638 and WO 2013/116126 or U.S. Patent Publication Nos. 2013/0178541 and 2013/0225836; the contents of each of which are herein incorporated by reference in their entirety. In yet another embodiment, the cationic lipid may be selected from, but not limited to, formula CLI-CLXXIX of PCT Publication No. WO 2008/103276, formula CLI-CLXXIX of U.S. Pat. No. 7,893,302, formula CLI-CLXXXXII of U.S. Pat. No. 7,404,969, formula I-VI of U.S. Patent Publication No. 2010/0036115, and formula I of U.S. Patent Publication No. 2013/0123338; each of which is herein incorporated by reference in its entirety.


In one embodiment, the cationic lipid may be synthesized by methods known in the art and/or as described in PCT Publication Nos. WO 2012/040184, WO 2011/153120, WO 2011/149733, WO 2011/090965, WO 2011/043913, WO 2011/022460, WO 2012/061259, WO 2012/054365, WO 2012/044638, WO 2010/080724, WO 2010/21865, WO 2013/126803, WO 2013/086373, and WO 2013/086354; the contents of each of which are herein incorporated by reference in their entirety.


In one embodiment, the lipids which may be used in the formulations and/or for delivery of the disrupting agents described herein may be a cleavable lipid. As a non-limiting example, a cleavable lipid and/or pharmaceutical compositions comprising cleavable lipids include those described in PCT Patent Publication No. WO 2012/170889, the contents of which are herein incorporated by reference in their entirety. As another non-limiting example, the cleavable lipid may be HGT4001, HGT4002, HGT4003, HGT4004 and/or HGT4005 as described in PCT Patent Publication No. WO 2012/170889, the contents of which are herein incorporated by reference in their entirety.


In one embodiment, polymers which may be used in the formulation and/or delivery of the disrupting agents described herein may include, but are not limited to, poly(ethylene) glycol (PEG), polyethylenimine (PEI), dithiobis(succinimidylpropionate) (DSP), dimethyl-3,3′-dithiobispropionimidate (DTBP), poly(ethylene imine) biscarbamate (PEIC), poly(L-lysine) (PLL), histidine modified PLL, poly(N-vinylpyrrolidone) (PVP), poly(propylenimine) (PPI), poly(amidoamine) (PAMAM), poly(amido ethylenimine) (SS-PAEI), triethylenetetramine (TETA), poly(β-aminoester), poly(4-hydroxy-L-proline ester) (PHP), poly(allylamine), poly(α-[4-aminobutyl]-L-glycolic acid) (PAGA), poly(D,L-lactic-co-glycolic acid) (PLGA), poly(N-ethyl-4-vinylpyridinium bromide), poly(phosphazene)s (PPZ), poly(phosphoester)s (PPE), poly(phosphoramidate)s (PPA), poly(N-2-hydroxypropylmethacrylamide) (pHPMA), poly(2-(dimethylamino)ethyl methacrylate) (pDMAEMA), poly(2-aminoethyl propylene phosphate) (PPE-EA), chitosan, galactosylated chitosan, N-dodecylated chitosan, histone, collagen and dextran-spermine. In one embodiment, the polymer may be an inert polymer such as, but not limited to, PEG.


In one embodiment, the polymer may be a cationic polymer such as, but not limited to, PEI, PLL, TETA, poly(allylamine), poly(N-ethyl-4-vinylpyridinium bromide), pHPMA and pDMAEMA. In one embodiment, the polymer may be a biodegradable PEI such as, but not limited to, DSP, DTBP and PEIC. In one embodiment, the polymer may be biodegradable such as, but not limited to, histidine modified PLL, SS-PAEI, poly(β-aminoester), PHP, PAGA, PLGA, PPZ, PPE, PPA and PPE-EA. In one embodiment, an LNP formulation of the invention may be prepared according to the methods described in PCT Publication Nos. WO 2011/127255 or WO 2008/103276, the contents of each of which are herein incorporated by reference in their entirety. As a non-limiting example, a disrupting agent comprising a site-specific HNF4α targeting moiety may be encapsulated in an LNP formulation as described in PCT Publication Nos. WO 2011/127255 and/or WO 2008/103276; the contents of each of which are herein incorporated by reference in their entirety. As another non-limiting example, a disrupting agent comprising a site-specific HNF4α targeting moiety as described herein, may be formulated in a nanoparticle to be delivered by a parenteral route as described in U.S. Patent Publication No. 2012/0207845 and PCT Publication No. WO 2014/008334; the contents of each of which are herein incorporated by reference in their entirety.


In one embodiment, LNP formulations described herein may be administered intramuscularly. The LNP formulation may comprise a cationic lipid described herein, such as, but not limited to, DLin-DMA, DLin-KC2-DMA, DLin-MC3-DMA, DODMA and C12-200.


In one embodiment, LNP formulations described herein comprising a disrupting agent as described herein, may be administered intradermally. The LNP formulation may comprise a cationic lipid described herein, such as, but not limited to, DLin-DMA, DLin-KC2-DMA, DLin-MC3-DMA, DODMA and C12-200.


The nanoparticle formulations may comprise a conjugate, such as a phosphate conjugate, a polymer conjugate, or a conjugate that enhances the delivery of the nanoparticle, as described in US Patent Publication No. US20160038612 A1.


In one embodiment, the lipid nanoparticle formulation comprises DLin-MC3-DMA as described in US Patent Publication No. US20100324120.


In one embodiment, the lipid nanoparticle comprises a lipid compound, or a pharmaceutically acceptable salt, tautomer or stereoisomer thereof, or a lipid nanoparticle formulation, as described in U.S. Pat. No. 10,723,692B2, US Patent Publication Nos. US20200172472A1, US20200163878A1, US20200046838A1, US20190359556A1, US20190314524A1, US20190274968A1, US20190022247A1, US20180303925A1, US20180185516A1, US20160317676A1, or International Patent Publication Nos. WO20200146805A1, WO2020081938A1, WO2019089828A1, WO2019036030A1, WO2019036028A1, WO2019036008A1, WO2018200943A1, WO2018191719A1, WO2018107026A1, WO2018081480A1, the contents of each of which are herein incorporated by reference in their entirety (Acuitas Therapeutics, Inc.).


In one embodiment, the lipid nanoparticle comprises an amino lipid, or a pharmaceutically acceptable salt, tautomer or stereoisomer thereof, or a lipid nanoparticle formulation, described by Tekmira Pharmaceuticals Corp. in U.S. Pat. No. 9,139,554B2, U.S. Pat. No. 9,051,567B2, U.S. Pat. No. 8,883,203B2, or US Patent Publication US20110117125A1, the contents of each of which are herein incorporated by reference in their entirety. In one particular example, the compound described in U.S. Pat. No. 9,139,554B2 is DLin-KC2-DMA.


In one embodiment, the lipid nanoparticle comprises an amino lipid, or a pharmaceutically acceptable salt, tautomer or stereoisomer thereof, or a lipid nanoparticle formulation, described by Arbutus Biopharma Corp. in U.S. Pat. No. 10,561,732B2, U.S. Pat. No. 9,938,236B2, U.S. Pat. No. 9,687,550B2, US Patent Publication US20190240354A1, US20170027658A1, WO2020097493A1, WO2020097520A1, WO2020097540A1, WO2020097548A1, the contents of each of which are herein incorporated by reference in their entirety.


Lipid nanoparticles may be engineered to alter the surface properties of particles so the lipid nanoparticles may penetrate the mucosal barrier. Mucus is located on mucosal tissue such as, but not limited to, oral (e.g., the buccal and esophageal membranes and tonsil tissue), ophthalmic, gastrointestinal (e.g., stomach, small intestine, large intestine, colon, rectum), nasal, respiratory (e.g., nasal, pharyngeal, tracheal and bronchial membranes), and genital (e.g., vaginal, cervical and urethral membranes) tissue. Nanoparticles larger than 10-200 nm, which are preferred for higher drug encapsulation efficiency and the ability to provide the sustained delivery of a wide array of drugs, have been thought to be too large to rapidly diffuse through mucosal barriers. Mucus is continuously secreted, shed, discarded or digested and recycled so most of the trapped particles may be removed from the mucosal tissue within seconds or within a few hours. Large polymeric nanoparticles (200 nm-500 nm in diameter) which have been coated densely with a low molecular weight polyethylene glycol (PEG) diffused through mucus at rates only 4- to 6-fold lower than the same particles diffusing in water (Lai et al. PNAS 2007 104(5):1482-1487; Lai et al. Adv Drug Deliv Rev. 2009 61(2):158-171; the contents of each of which are herein incorporated by reference in their entirety). The transport of nanoparticles may be determined using rates of permeation and/or fluorescent microscopy techniques including, but not limited to, fluorescence recovery after photobleaching (FRAP) and high resolution multiple particle tracking (MPT). As a non-limiting example, compositions which can penetrate a mucosal barrier may be made as described in U.S. Pat. No. 8,241,670 or International Patent Publication No. WO2013110028, the contents of each of which are herein incorporated by reference in their entirety.


In one embodiment, a disrupting agent comprising a site-specific HNF4α targeting moiety as described herein, is formulated as a lipoplex, such as, without limitation, the ATUPLEX™ system, the DACC system, the DBTC system and other siRNA-lipoplex technology from Silence Therapeutics (London, United Kingdom), STEMFECT™ from STEMGENT® (Cambridge, Mass.), and polyethylenimine (PEI) or protamine-based targeted and non-targeted delivery of nucleic acids (Aleku et al. Cancer Res. 2008 68:9788-9798; Strumberg et al. Int J Clin Pharmacol Ther 2012 50:76-78; Santel et al., Gene Ther 2006 13:1222-1234; Santel et al., Gene Ther 2006 13:1360-1370; Gutbier et al., Pulm Pharmacol. Ther. 2010 23:334-344; Kaufmann et al. Microvasc Res 2010 80:286-293; Weide et al. J Immunother. 2009 32:498-507; Weide et al. J Immunother. 2008 31:180-188; Pascolo Expert Opin. Biol. Ther. 4:1285-1294; Fotin-Mleczek et al., 2011 J. Immunother. 34:1-15; Song et al., Nature Biotechnol. 2005, 23:709-717; Peer et al., Proc Natl Acad Sci USA. 2007 104:4095-4100; deFougerolles Hum Gene Ther. 2008 19:125-132; all of which are incorporated herein by reference in their entirety).


In one embodiment, such formulations may also be constructed or compositions altered such that they passively or actively are directed to different cell types in vivo, including but not limited to hepatocytes, immune cells, tumor cells, endothelial cells, antigen presenting cells, and leukocytes (Akinc et al. Mol Ther. 2010 18:1357-1364; Song et al., Nat Biotechnol. 2005 23:709-717; Judge et al., J Clin Invest. 2009 119:661-673; Kaufmann et al., Microvasc Res 2010 80:286-293; Santel et al., Gene Ther 2006 13:1222-1234; Santel et al., Gene Ther 2006 13:1360-1370; Gutbier et al., Pulm Pharmacol. Ther. 2010 23:334-344; Basha et al., Mol. Ther. 2011 19:2186-2200; Fenske and Cullis, Expert Opin Drug Deliv. 2008 5:25-44; Peer et al., Science. 2008 319:627-630; Peer and Lieberman, Gene Ther. 2011 18:1127-1133; all of which are incorporated herein by reference in their entirety). One example of passive targeting of formulations to liver cells includes the DLin-DMA, DLin-KC2-DMA and DLin-MC3-DMA-based lipid nanoparticle formulations which have been shown to bind to apolipoprotein E and promote binding and uptake of these formulations into hepatocytes in vivo (Akinc et al. Mol Ther. 2010 18:1357-1364; the contents of which are herein incorporated by reference in their entirety). Formulations can also be selectively targeted through expression of different ligands on their surface as exemplified by, but not limited by, folate, transferrin, N-acetylgalactosamine (GalNAc), and antibody targeted approaches (Kolhatkar et al., Curr Drug Discov Technol. 2011 8:197-206; Musacchio and Torchilin, Front Biosci. 2011 16:1388-1412; Yu et al., Mol Membr Biol. 2010 27:286-298; Patil et al., Crit Rev Ther Drug Carrier Syst. 2008 25:1-61; Benoit et al., Biomacromolecules. 2011 12:2708-2714; Zhao et al., Expert Opin Drug Deliv. 2008 5:309-319; Akinc et al., Mol Ther. 2010 18:1357-1364; Srinivasan et al., Methods Mol Biol. 2012 820:105-116; Ben-Arie et al., Methods Mol Biol. 2012 757:497-507; Peer 2010 J Control Release. 20:63-68; Peer et al., Proc Natl Acad Sci USA. 2007 104:4095-4100; Kim et al., Methods Mol Biol. 2011 721:339-353; Subramanya et al., Mol Ther. 2010 18:2028-2037; Song et al., Nat Biotechnol. 2005 23:709-717; Peer et al., Science. 2008 319:627-630; Peer and Lieberman, Gene Ther. 2011 18:1127-1133; the contents of all of which are incorporated herein by reference in their entirety).


In one embodiment, a disrupting agent comprising a site-specific HNF4α targeting moiety of the invention may be formulated as a solid lipid nanoparticle. A solid lipid nanoparticle (SLN) may be spherical with an average diameter between 10 and 1000 nm. SLNs possess a solid lipid core matrix that can solubilize lipophilic molecules and may be stabilized with surfactants and/or emulsifiers. In a further embodiment, the lipid nanoparticle may be a self-assembly lipid-polymer nanoparticle (see Zhang et al., ACS Nano, 2008, 2 (8), pp 1696-1702; herein incorporated by reference in its entirety). As a non-limiting example, the SLN may be the SLN described in PCT Publication No. WO 2013/105101, the contents of which are herein incorporated by reference in their entirety. As another non-limiting example, the SLN may be made by the methods or processes described in PCT Publication No. WO 2013/105101, the contents of which are herein incorporated by reference in their entirety.


Liposomes, lipoplexes, or lipid nanoparticles may be used to improve the efficacy of a disrupting agent comprising a site-specific HNF4α targeting moiety comprising, e.g., a nucleic acid molecule, to direct protein production as these formulations may be able to increase cell transfection by a nucleic acid molecule; and/or increase the translation of encoded protein (e.g., an effector of the invention). One such example involves the use of lipid encapsulation to enable the effective systemic delivery of polyplex plasmid DNA (Heyes et al., Mol Ther. 2007 15:713-720; the contents of which are herein incorporated by reference in their entirety). The liposomes, lipoplexes, or lipid nanoparticles of the invention may also increase the stability of a disrupting agent comprising a site-specific HNF4α targeting moiety comprising, e.g., a nucleic acid molecule. Liposomes, lipoplexes, or lipid nanoparticles are described in U.S. Patent Publication No. 2016/0038612, the contents of which are incorporated herein by reference in their entirety.


In one embodiment, a disrupting agent comprising a site-specific HNF4α targeting moiety may be formulated for controlled release and/or targeted delivery. As used herein, “controlled release” refers to a pharmaceutical composition or compound release profile that conforms to a particular pattern of release to effect a therapeutic outcome. In one embodiment, a disrupting agent comprising a site-specific HNF4α targeting moiety, as described herein, may be encapsulated into a delivery agent described herein and/or known in the art for controlled release and/or targeted delivery. As used herein, the term “encapsulate” means to enclose, surround or encase. As it relates to the formulation of the compounds of the invention, encapsulation may be substantial, complete or partial. The term “substantially encapsulated” means that at least greater than 50, 60, 70, 80, 85, 90, 95, 96, 97, 98, 99, 99.9, 99.99 or greater than 99.999% of the pharmaceutical composition or disrupting agent of the invention may be enclosed, surrounded or encased within the delivery agent. “Partial encapsulation” or “partially encapsulated” means that less than 10, 10, 20, 30, 40, 50% or less of the pharmaceutical composition or disrupting agent of the invention may be enclosed, surrounded or encased within the delivery agent. Advantageously, encapsulation may be determined by measuring the escape or the activity of the pharmaceutical composition or compound of the invention using fluorescence and/or electron micrograph. For example, at least 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 85, 90, 95, 96, 97, 98, 99, 99.9, 99.99 or greater than 99.99% of the pharmaceutical composition or disrupting agent of the invention is encapsulated in the delivery agent.
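
As a non-limiting illustration of the definitions above, a percent encapsulation may be computed from a fluorescence-type measurement and labeled as substantial or partial. The sketch assumes a hypothetical assay that reports one signal for the total agent in the sample and one for the unencapsulated (free) agent; the function names and the numeric readings are illustrative only.

```python
def encapsulation_percent(total_signal, free_signal):
    """Percent of agent enclosed in the delivery agent, from total and free (escaped) signal."""
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    if not 0 <= free_signal <= total_signal:
        raise ValueError("free_signal must lie between 0 and total_signal")
    return 100.0 * (total_signal - free_signal) / total_signal


def encapsulation_class(percent):
    """Label per the definitions in the text: more than 50% is substantial, 50% or less is partial."""
    return "substantially encapsulated" if percent > 50.0 else "partially encapsulated"


# ASSUMPTION: hypothetical readings in arbitrary fluorescence units.
percent = encapsulation_percent(total_signal=200.0, free_signal=10.0)
print(percent, encapsulation_class(percent))
```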


In one embodiment, a disrupting agent comprising a site-specific HNF4α targeting moiety, as described herein, may be encapsulated in a therapeutic nanoparticle (e.g., a therapeutic nanoparticle from BIND Therapeutics). Therapeutic nanoparticles may be formulated by methods described herein and known in the art such as, but not limited to, PCT Publication Nos. WO 2010/005740, WO 2010/030763, WO 2010/005721, WO 2010/005723, WO 2012/054923, U.S. Patent Publication Nos. 2011/0262491, 2010/0104645, 2010/0087337, 2010/0068285, 2011/0274759, 2010/0068286, 2012/0288541, 2013/0123351, 2013/0230567, 2013/0236500, 2013/0302433, 2013/0302432, 2013/0280339 and 2013/0251757, and U.S. Pat. Nos. 8,206,747, 8,293,276, 8,318,208, 8,318,211, 8,623,417, 8,617,608, 8,613,954, 8,613,951, 8,609,142, 8,603,534 and 8,563,041; the contents of each of which are herein incorporated by reference in their entirety. In another embodiment, therapeutic polymer nanoparticles may be prepared by the methods described in U.S. Patent Publication No. 2012/0140790, herein incorporated by reference in its entirety. As a non-limiting example, the therapeutic nanoparticle may comprise about 4 to about 25 weight percent of a disrupting agent and about 10 to about 99 weight percent of a diblock poly(lactic) acid-poly(ethylene)glycol copolymer comprising poly(lactic) acid as described in US Patent Publication No. 2013/0236500 (Bind), the contents of which are herein incorporated by reference in their entirety. As another non-limiting example, the nanoparticle may comprise about 0.2 to about 35 weight percent of a disrupting agent and about 10 to about 99 weight percent of a diblock poly(lactic) acid-poly(ethylene)glycol copolymer as described in U.S. Patent Publication Nos. 2013/0280339 (Bind) and 2010251757 and U.S. Pat. No. 8,652,528, the contents of each of which are herein incorporated by reference in their entirety.


In one embodiment, a disrupting agent formulated in therapeutic nanoparticles may be administered intramuscularly, intradermally, or intravenously.


In one embodiment, a disrupting agent formulated in ACCURINS™ nanoparticles may be administered intramuscularly, intradermally, or intravenously.


In one embodiment, a disrupting agent may be delivered in therapeutic nanoparticles having a high glass transition temperature such as, but not limited to, the nanoparticles described in US Patent Publication Nos. 2014/0030351 and 2011/0294717, the entire contents of each of which are incorporated herein by reference.


In one embodiment, the therapeutic nanoparticle may be formulated for sustained release. As used herein, “sustained release” refers to a pharmaceutical composition or compound that conforms to a release rate over a specific period of time. The period of time may include, but is not limited to, hours, days, weeks, months and years. As a nonlimiting example, the sustained release nanoparticle may comprise a polymer and a disrupting agent of the present invention (see PCT Publication No. WO2010075072 and U.S. Patent Publication Nos. 2010/0216804, 2011/0217377, 2012/0201859, 2013/0243848 and 2013/0243827, each of which is herein incorporated by reference in their entirety).


In one embodiment, a disrupting agent of the invention may be encapsulated in, linked to and/or associated with synthetic nanocarriers. Synthetic nanocarriers include, but are not limited to, those described in PCT Publication Nos. WO 2010/005740, WO 2010/030763, WO 2012/13501, WO 2012/149252, WO 2012149255, WO 2012149259, WO 2012149265, WO 2012149268, WO 2012149282, WO 2012149301, WO 2012149393, WO 2012149405, WO 2012149411 and WO 2012149454 and US Patent Publication Nos. 20110262491, 20100104645, 20100087337, 20120244222 and US20130236533, and U.S. Pat. No. 8,652,487, the contents of each of which is herein incorporated by reference in their entirety. The synthetic nanocarriers may be formulated using methods known in the art and/or described herein. As a nonlimiting example, the synthetic nanocarriers may be formulated by the methods described in PCT Publication Nos. WO 2010005740, WO 2010030763 and WO 201213501 and US Patent Publication Nos. 20110262491, 20100104645, 20100087337 and 20120244222, each of which is herein incorporated by reference in their entirety. In another embodiment, the synthetic nanocarrier formulations may be lyophilized by methods described in PCT Publication No. WO 2011072218 and U.S. Pat. No. 8,211,473; each of which is herein incorporated by reference in their entirety. In yet another embodiment, formulations of the present invention, including, but not limited to, synthetic nanocarriers, may be lyophilized or reconstituted by the methods described in US Patent Publication No. 20130230568, the contents of which are herein incorporated by reference in its entirety.


In one embodiment, synthetic nanocarriers comprising a disrupting agent may be administered intramuscularly, intradermally, or intravenously.


In some embodiments, a disrupting agent may be formulated for delivery using smaller LNPs. Such particles may comprise a diameter from below 0.1 μm up to 1000 μm such as, but not limited to, less than 0.1 μm, less than 1.0 μm, less than 5 μm, less than 10 μm, less than 15 μm, less than 20 μm, less than 25 μm, less than 30 μm, less than 35 μm, less than 40 μm, less than 50 μm, less than 55 μm, less than 60 μm, less than 65 μm, less than 70 μm, less than 75 μm, less than 80 μm, less than 85 μm, less than 90 μm, less than 95 μm, less than 100 μm, less than 125 μm, less than 150 μm, less than 175 μm, less than 200 μm, less than 225 μm, less than 250 μm, less than 275 μm, less than 300 μm, less than 325 μm, less than 350 μm, less than 375 μm, less than 400 μm, less than 425 μm, less than 450 μm, less than 475 μm, less than 500 μm, less than 525 μm, less than 550 μm, less than 575 μm, less than 600 μm, less than 625 μm, less than 650 μm, less than 675 μm, less than 700 μm, less than 725 μm, less than 750 μm, less than 775 μm, less than 800 μm, less than 825 μm, less than 850 μm, less than 875 μm, less than 900 μm, less than 925 μm, less than 950 μm, less than 975 μm.


In another embodiment, a disrupting agent may be formulated for delivery using smaller LNPs which may comprise a diameter from about 1 nm to about 100 nm, from about 1 nm to about 10 nm, from about 1 nm to about 20 nm, from about 1 nm to about 30 nm, from about 1 nm to about 40 nm, from about 1 nm to about 50 nm, from about 1 nm to about 60 nm, from about 1 nm to about 70 nm, from about 1 nm to about 80 nm, from about 1 nm to about 90 nm, from about 5 nm to about 100 nm, from about 5 nm to about 10 nm, from about 5 nm to about 20 nm, from about 5 nm to about 30 nm, from about 5 nm to about 40 nm, from about 5 nm to about 50 nm, from about 5 nm to about 60 nm, from about 5 nm to about 70 nm, from about 5 nm to about 80 nm, from about 5 nm to about 90 nm, from about 10 to about 50 nm, from about 20 to about 50 nm, from about 30 to about 50 nm, from about 40 to about 50 nm, from about 20 to about 60 nm, from about 30 to about 60 nm, from about 40 to about 60 nm, from about 20 to about 70 nm, from about 30 to about 70 nm, from about 40 to about 70 nm, from about 50 to about 70 nm, from about 60 to about 70 nm, from about 20 to about 80 nm, from about 30 to about 80 nm, from about 40 to about 80 nm, from about 50 to about 80 nm, from about 60 to about 80 nm, from about 20 to about 90 nm, from about 30 to about 90 nm, from about 40 to about 90 nm, from about 50 to about 90 nm, from about 60 to about 90 nm and/or from about 70 to about 90 nm.


In one embodiment, a disrupting agent may be formulated in smaller LNPs and may be administered intramuscularly, intradermally, or intravenously.


In one embodiment, a disrupting agent may be formulated for delivery using the drug encapsulating microspheres described in PCT Patent Publication No. WO 2013063468 or U.S. Pat. No. 8,440,614, each of which is herein incorporated by reference in its entirety. In another aspect, the amino acid, peptide, polypeptide, lipids (APPL) are useful in delivering the disrupting agents of the invention to cells (see PCT Patent Publication No. WO 2013063468, herein incorporated by reference in its entirety).


In one aspect, the lipid nanoparticle may be a limit size lipid nanoparticle described in PCT Patent Publication No. WO 2013059922, herein incorporated by reference in its entirety. The limit size lipid nanoparticle may comprise a lipid bilayer surrounding an aqueous core or a hydrophobic core; where the lipid bilayer may comprise a phospholipid such as, but not limited to, diacylphosphatidylcholine, a diacylphosphatidylethanolamine, a ceramide, a sphingomyelin, a dihydrosphingomyelin, a cephalin, a cerebroside, a C8-C20 fatty acid diacylphosphatidylcholine, and 1-palmitoyl-2-oleoyl phosphatidylcholine (POPC). In another aspect the limit size lipid nanoparticle may comprise a polyethylene glycol-lipid such as, but not limited to, DLPE-PEG, DMPE-PEG, DPPC-PEG and DSPE-PEG.


In one embodiment, a disrupting agent of the invention may be delivered, localized and/or concentrated in a specific location using the delivery methods described in PCT Patent Publication No. WO 2013063530, the contents of which are herein incorporated by reference in its entirety. As a non-limiting example, a subject may be administered an empty polymeric particle prior to, simultaneously with or after delivering the disrupting agent to the subject. The empty polymeric particle undergoes a change in volume once in contact with the subject and becomes lodged, embedded, immobilized or entrapped at a specific location in the subject.


In one embodiment, a disrupting agent may be formulated in an active substance release system (See e.g., US Patent Publication No. 20130102545, herein incorporated by reference in its entirety). The active substance release system may comprise 1) at least one nanoparticle bonded to an oligonucleotide inhibitor strand which is hybridized with a catalytically active nucleic acid and 2) a compound bonded to at least one substrate molecule bonded to a therapeutically active substance (e.g., a disrupting agent of the invention), where the therapeutically active substance is released by the cleavage of the substrate molecule by the catalytically active nucleic acid.


In one embodiment, the nanoparticles of the present invention may be water soluble nanoparticles such as, but not limited to, those described in PCT Publication No. WO 2013090601, the contents of which are herein incorporated by reference in its entirety. The nanoparticles may be inorganic nanoparticles which have a compact and zwitterionic ligand in order to exhibit good water solubility. The nanoparticles may also have small hydrodynamic diameters (HD), stability with respect to time, pH, and salinity and a low level of non-specific protein binding.


In one embodiment, the nanoparticles of the present invention are stealth nanoparticles or target-specific stealth nanoparticles such as, but not limited to, those described in U.S. Patent Publication Nos. 20130172406 (Bind), US20130251817 (Bind), 2013251816 (Bind) and 20130251766 (Bind), the contents of each of which are herein incorporated by reference in their entirety. The stealth nanoparticles may comprise a diblock copolymer and a chemotherapeutic agent. These stealth nanoparticles may be made by the methods described in US Patent Publication Nos. 20130172406, 20130251817, 2013251816 and 20130251766, the contents of each of which are herein incorporated by reference in their entirety. As a non-limiting example, the stealth nanoparticles may target cancer cells such as the nanoparticles described in US Patent Publication Nos. 20130172406, 20130251817, 2013251816 and 20130251766, the contents of each of which are herein incorporated by reference in their entirety.


In one embodiment, stealth nanoparticles comprising a disrupting agent of the invention may be administered intramuscularly, intradermally, or intravenously.


In one embodiment, a disrupting agent of the invention may be formulated in and/or delivered in a lipid nanoparticle comprising a plurality of cationic lipids such as, but not limited to, the lipid nanoparticles described in US Patent Publication No. 20130017223, the contents of which are herein incorporated by reference in their entirety. As a non-limiting example, the LNP formulation may comprise a first cationic lipid and a second cationic lipid. As another non-limiting example, the LNP formulation may comprise DLin-MC2-DMA and DLin-MC4-DMA. As yet another non-limiting example, the LNP formulation may comprise DLin-MC3-DMA and C12-200. In one embodiment, the LNP formulations comprising a plurality of cationic lipids (such as, but not limited to, those described in US Patent Publication No. US20130017223, the contents of which are herein incorporated by reference in their entirety) may be administered intramuscularly, intradermally, or intravenously.


In one embodiment, a disrupting agent as described herein, may be formulated in and/or delivered in a lipid nanoparticle comprising the cationic lipid DLin-MC3-DMA and the neutral lipid DOPE. The lipid nanoparticle may also comprise a PEG based lipid and a cholesterol or antioxidant. These lipid nanoparticle formulations comprising DLin-MC3-DMA and DOPE and a disrupting agent may be administered intramuscularly, intradermally, or intravenously.


In one embodiment, the lipid nanoparticle comprising DLin-MC3-DMA and DOPE may comprise a PEG lipid such as, but not limited to, pentaerythritol PEG ester tetrasuccinimidyl and pentaerythritol PEG ether tetra-thiol, PEG-c-DOMG, PEG-DMG (1,2-Dimyristoyl-sn-glycerol, methoxypolyethylene glycol), PEG-DSG (1,2-Distearoyl-sn-glycerol, methoxypolyethylene glycol), PEG-DPG (1,2-Dipalmitoyl-sn-glycerol, methoxypolyethylene glycol), PEG-DSA (PEG coupled to 1,2-distearyloxypropyl-3-amine), PEG-DMA (PEG coupled to 1,2-dimyristyloxypropyl-3-amine), PEG-c-DNA, PEG-c-DMA, PEG-S-DSG, PEG-c-DMA, PEG-DPG, PEG-DMG 2000 and those described herein and/or known in the art.


In one embodiment, the lipid nanoparticle comprising DLin-MC3-DMA and DOPE may include 0.5% to about 3.0%, from about 1.0% to about 3.5%, from about 1.5% to about 4.0%, from about 2.0% to about 4.5%, from about 2.5% to about 5.0% and/or from about 3.0% to about 6.0% of the lipid molar ratio of a PEG lipid.


In one embodiment, the lipid nanoparticle comprising DLin-MC3-DMA and DOPE may include 25.0% cholesterol to about 50.0% cholesterol, from about 30.0% cholesterol to about 45.0% cholesterol, from about 35.0% cholesterol to about 50.0% cholesterol and/or from about 48.5% cholesterol to about 60% cholesterol. In one embodiment, formulations may comprise a percentage of cholesterol selected from the group consisting of 28.5%, 31.5%, 33.5%, 36.5%, 37.0%, 38.5%, 39.0%, 43.5% and 48.5%.


In one embodiment, the lipid nanoparticle comprising DLin-MC3-DMA and DOPE may include 25.0% antioxidant to about 50.0% antioxidant, from about 30.0% antioxidant to about 45.0% antioxidant, from about 35.0% antioxidant to about 50.0% antioxidant and/or from about 48.5% antioxidant to about 60% antioxidant. In one embodiment, formulations may comprise a percentage of antioxidant selected from the group consisting of 28.5%, 31.5%, 33.5%, 36.5%, 37.0%, 38.5%, 39.0%, 43.5% and 48.5%.
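As an illustrative aid only, the compositional ranges recited above for lipid nanoparticles comprising DLin-MC3-DMA and DOPE can be checked arithmetically. The Python sketch below assumes that all percentages are molar percentages of total lipid and that the components sum to 100%; those assumptions, the function name, and the example amounts of DLin-MC3-DMA and DOPE are hypothetical and are not stated in the specification. The outer bounds used (0.5% to 6.0% PEG lipid; 25.0% to 60% cholesterol or antioxidant) are the widest limits recited in the preceding paragraphs.

```python
from typing import Dict, List

# Widest bounds recited above (treated here as mol% of total lipid; an assumption).
PEG_RANGE = (0.5, 6.0)
STEROL_OR_ANTIOXIDANT_RANGE = (25.0, 60.0)


def check_lnp_composition(mol_percent: Dict[str, float],
                          third_component: str = "cholesterol") -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems: List[str] = []

    total = sum(mol_percent.values())
    if abs(total - 100.0) > 1e-6:
        problems.append(f"components sum to {total:.2f}%, not 100%")

    peg = mol_percent.get("PEG lipid", 0.0)
    if not PEG_RANGE[0] <= peg <= PEG_RANGE[1]:
        problems.append(f"PEG lipid at {peg}% is outside {PEG_RANGE}")

    third = mol_percent.get(third_component, 0.0)
    low, high = STEROL_OR_ANTIOXIDANT_RANGE
    if not low <= third <= high:
        problems.append(f"{third_component} at {third}% is outside ({low}, {high})")

    return problems


if __name__ == "__main__":
    # Hypothetical example; only the 38.5% cholesterol value appears in the text.
    example = {"DLin-MC3-DMA": 50.0, "DOPE": 10.0,
               "cholesterol": 38.5, "PEG lipid": 1.5}
    print(check_lnp_composition(example))
```

Passing `third_component="antioxidant"` applies the same range check to a formulation that uses an antioxidant in place of cholesterol.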


The disrupting agent of the invention can be formulated using natural and/or synthetic polymers. Non-limiting examples of polymers which may be used for delivery include, but are not limited to, DYNAMIC POLYCONJUGATE (Arrowhead Research Corp., Pasadena, Calif.) formulations from MIRUS® Bio (Madison, Wis.) and Roche Madison (Madison, Wis.), PHASERX™ polymer formulations such as, without limitation, SMARTT POLYMER TECHNOLOGY™ (Seattle, Wash.), DMRI/DOPE, poloxamer, VAXFECTIN® adjuvant from Vical (San Diego, Calif.), chitosan, cyclodextrin from Calando Pharmaceuticals (Pasadena, Calif.), dendrimers and poly(lactic-co-glycolic acid) (PLGA) polymers, RONDEL™ (RNAi/Oligonucleotide Nanoparticle Delivery) polymers (Arrowhead Research Corporation, Pasadena, Calif.) and pH responsive co-block polymers such as, but not limited to, PHASERX™ (Seattle, Wash.).


The polymer formulations may permit the sustained or delayed release of a disrupting agent (e.g., following intramuscular, intradermal or subcutaneous injection). The altered release profile of the disrupting agent can result in, for example, translation of an encoded protein over an extended period of time. The polymer formulation may also be used to increase the stability of the disrupting agent. For example, biodegradable polymers have been previously used to protect nucleic acids other than modified mRNA from degradation and have been shown to result in sustained release of payloads in vivo (Rozema et al., Proc Natl Acad Sci USA. 2007 104:12982-12987; Sullivan et al., Expert Opin Drug Deliv. 2010 7:1433-1446; Convertine et al., Biomacromolecules. 2010 Oct. 1; Chu et al., Acc Chem Res. 2012 Jan. 13; Manganiello et al., Biomaterials. 2012 33:2301-2309; Benoit et al., Biomacromolecules. 2011 12:2708-2714; Singha et al., Nucleic Acid Ther. 2011 2:133-147; deFougerolles Hum Gene Ther. 2008 19:125-132; Schaffert and Wagner, Gene Ther. 2008 16:1131-1138; Chaturvedi et al., Expert Opin Drug Deliv. 2011 8:1455-1468; Davis, Mol Pharm. 2009 6:659-668; Davis, Nature 2010 464:1067-1070; each of which is herein incorporated by reference in its entirety).


In one embodiment, the pharmaceutical compositions may be sustained release formulations. In a further embodiment, the sustained release formulations may be for subcutaneous delivery. Sustained release formulations may include, but are not limited to, PLGA microspheres, ethylene vinyl acetate (EVAc), poloxamer, GELSITE® (Nanotherapeutics, Inc., Alachua, Fla.), HYLENEX® (Halozyme Therapeutics, San Diego, Calif.), surgical sealants such as fibrinogen polymers (Ethicon Inc., Cornelia, Ga.), TISSELL® (Baxter International, Inc., Deerfield, Ill.), PEG-based sealants, and COSEAL® (Baxter International, Inc., Deerfield, Ill.).


B. Vector Encoded Site-Specific HNF4α Disrupting Agents of the Invention


Disrupting agents comprising a site-specific HNF4α targeting moiety, e.g., comprising a nucleic acid molecule, may be expressed from transcription units inserted into DNA or RNA vectors (see, e.g., Couture, A, et al., TIG. (1996), 12:5-10; WO 00/22113, WO 00/22114, and U.S. Pat. No. 6,054,299). In some embodiments, expression is sustained (months or longer), depending upon the specific construct used and the target tissue or cell type. These transgenes can be introduced as a linear construct, a circular plasmid, or a viral vector, which can be an integrating or non-integrating vector. The transgene can also be constructed to permit it to be inherited as an extrachromosomal plasmid (Gassmann, et al., (1995) Proc. Natl. Acad. Sci. USA 92:1292). Different components of the disrupting agent, e.g., gRNA and effector, can be located on separate expression vectors that can be co-introduced (e.g., by transfection or infection) into a target cell. Alternatively, each individual component can be transcribed by promoters both of which are located on the same expression plasmid.


Delivery of a disrupting agent expressing vector can be systemic, such as by intravenous or intramuscular administration, by administration to target cells explanted from the patient followed by reintroduction into the patient, or by any other means that allows for introduction into a desired target cell.


In certain embodiments, the nucleic acids described herein or the nucleic acids encoding a protein described herein, e.g., an effector, are incorporated into a vector, e.g., a viral vector.


The individual strand or strands of a disrupting agent comprising a site-specific HNF4α targeting moiety comprising a nucleic acid molecule can be transcribed from a promoter in an expression vector. Where two separate strands are to be expressed to generate, for example, a dsRNA, two separate expression vectors can be co-introduced (e.g., by transfection or infection) into a target cell. Alternatively, each individual strand of a nucleic acid molecule can be transcribed by promoters both of which are located on the same expression plasmid. In one embodiment, a nucleic acid molecule is expressed as inverted repeat polynucleotides joined by a linker polynucleotide sequence such that the nucleic acid molecule has a stem and loop structure.
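The inverted-repeat arrangement described above can be illustrated with a minimal sketch. The Python example below builds a DNA insert in which a stem sequence is followed by a linker and then by the reverse complement of the stem, so that the resulting transcript could fold into a stem and loop structure. The function names and the example stem and linker sequences are hypothetical and chosen only for illustration; they are not sequences of the invention.

```python
# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hairpin_insert(stem: str, linker: str) -> str:
    """Join a stem, a linker, and the stem's reverse complement.

    The two stem arms are inverted repeats; the linker forms the loop.
    """
    stem, linker = stem.upper(), linker.upper()
    for name, part in (("stem", stem), ("linker", linker)):
        if not part or set(part) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty A/C/G/T sequence")
    return stem + linker + reverse_complement(stem)


if __name__ == "__main__":
    # Hypothetical 7-nt stem and 4-nt linker, for illustration only.
    print(hairpin_insert("GATTACA", "TTCG"))
```

In practice the stem would be much longer than in this toy example, and the choice of linker length and sequence would depend on the expression system used.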


Expression vectors are generally DNA plasmids or viral vectors. Expression vectors compatible with eukaryotic cells, preferably those compatible with vertebrate cells, can be used to produce recombinant constructs for the expression of a disrupting agent as described herein.


Constructs for the recombinant expression of a disrupting agent will generally require regulatory elements, e.g., promoters, enhancers, etc., to ensure the expression of the disrupting agent in target cells.


Expression of natural or synthetic nucleic acids is typically achieved by operably linking a nucleic acid encoding the nucleic acid of interest to a regulatory region, such as a promoter, and incorporating the construct into an expression vector. The vectors can be suitable for replication and integration in eukaryotes.


Regulatory regions, such as a promoter, suitable for operable linking to a nucleic acid molecule can be from any species. Any type of promoter can be operably linked to a nucleic acid sequence. Examples of promoters include, without limitation, tissue-specific promoters, constitutive promoters, and promoters responsive or unresponsive to a particular stimulus (e.g., inducible promoters). Additional promoter elements, e.g., enhancing sequences, regulate the frequency of transcriptional initiation. Typically, these are located in the region 30-110 bp upstream of the start site, although a number of promoters have recently been shown to contain functional elements downstream of the start site as well. The spacing between promoter elements frequently is flexible, so that promoter function is preserved when elements are inverted or moved relative to one another. In the thymidine kinase (tk) promoter, the spacing between promoter elements can be increased to 50 bp apart before activity begins to decline. Depending on the promoter, individual elements can function either cooperatively or independently to activate transcription.


One example of a suitable promoter is the immediate early cytomegalovirus (CMV) promoter sequence. This promoter sequence is a strong constitutive promoter sequence capable of driving high levels of expression of any polynucleotide sequence operatively linked thereto. Another example of a suitable promoter is Elongation Factor-1α (EF-1α). However, other constitutive promoter sequences may also be used, including, but not limited to the simian virus 40 (SV40) early promoter, mouse mammary tumor virus (MMTV), human immunodeficiency virus (HIV) long terminal repeat (LTR) promoter, MoMuLV promoter, an avian leukemia virus promoter, an Epstein-Barr virus immediate early promoter, a Rous sarcoma virus promoter, as well as human gene promoters such as, but not limited to, the actin promoter, the myosin promoter, the hemoglobin promoter, and the creatine kinase promoter.


Further, the present invention should not be limited to the use of constitutive promoters. Inducible promoters are also contemplated as part of the invention. The use of an inducible promoter provides a molecular switch capable of turning on expression of the polynucleotide sequence to which it is operatively linked when such expression is desired, or turning off the expression when expression is not desired. Examples of inducible promoters include, but are not limited to a metallothionein promoter, a glucocorticoid promoter, a progesterone promoter, and a tetracycline promoter.


Additional regulatory regions that may be useful in nucleic acid constructs, include, but are not limited to, transcription and translation terminators, initiation sequences, polyadenylation sequences, translation control sequences (e.g., an internal ribosome entry segment, IRES), enhancers, inducible elements, or introns. Such regulatory regions may not be necessary, although they may increase expression by affecting transcription, stability of the mRNA, translational efficiency, or the like. Such regulatory regions can be included in a nucleic acid construct as desired to obtain optimal expression of the nucleic acids in the cell(s). Sufficient expression, however, can sometimes be obtained without such additional elements.


The expression vector to be introduced can also contain either a selectable marker gene or a reporter gene or both to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected through viral vectors. In other aspects, the selectable marker may be carried on a separate piece of DNA and used in a co-transfection procedure. Both selectable markers and reporter genes may be flanked with appropriate transcriptional control sequences to enable expression in the host cells. Useful selectable markers include, for example, antibiotic-resistance genes, such as neo and the like. Non-limiting examples of selectable markers include puromycin, ganciclovir, adenosine deaminase (ADA), aminoglycoside phosphotransferase (neo, G418, APH), dihydrofolate reductase (DHFR), hygromycin-B-phosphotransferase, thymidine kinase (TK), and xanthine-guanine phosphoribosyltransferase (XGPRT). Such markers are useful for selecting stable transformants in culture. Other selectable markers include fluorescent polypeptides, such as green fluorescent protein or yellow fluorescent protein.


Signal peptides may also be included and can be used such that an encoded polypeptide is directed to a particular cellular location (e.g., the cell surface).


Reporter genes may be used for identifying potentially transfected cells and for evaluating the functionality of transcriptional control sequences. In general, a reporter gene is a gene that is not present in or expressed by the recipient source and that encodes a polypeptide whose expression is manifested by some easily detectable property, e.g., enzymatic activity. Expression of the reporter gene is assayed at a suitable time after the DNA has been introduced into the recipient cells. Suitable reporter genes may include genes encoding luciferase, beta-galactosidase, chloramphenicol acetyl transferase, secreted alkaline phosphatase, or the green fluorescent protein gene (e.g., Ui-Tei et al., 2000 FEBS Letters 479: 79-82). Suitable expression systems are well known and may be prepared using known techniques or obtained commercially. In general, the construct with the minimal 5′ flanking region showing the highest level of expression of reporter gene is identified as the promoter. Such promoter regions may be linked to a reporter gene and used to evaluate agents for the ability to modulate promoter-driven transcription.


Other aspects to consider for vectors and constructs are known in the art.


In some embodiments, a vector, e.g., a viral vector comprises a disrupting agent comprising a site-specific HNF4α targeting moiety comprising a nucleic acid molecule.


Viral vector systems which can be utilized with the methods and compositions described herein include, but are not limited to, (a) adenovirus vectors (e.g., an Ad5/F35 vector); (b) retrovirus vectors, including but not limited to lentiviral vectors (including integration competent or integration-defective lentiviral vectors), Moloney murine leukemia virus, etc.; (c) adeno-associated virus vectors; (d) herpes simplex virus vectors; (e) SV40 vectors; (f) polyoma virus vectors; (g) papilloma virus vectors; (h) picornavirus vectors; (i) pox virus vectors such as an orthopox, e.g., vaccinia virus vectors or avipox, e.g., canary pox or fowl pox; and (j) a helper-dependent or gutless adenovirus. Replication-defective viruses can also be advantageous. Different vectors will or will not become incorporated into the cells' genome. The constructs can include viral sequences for transfection, if desired. Alternatively, the construct can be incorporated into vectors capable of episomal replication, e.g., EPV and EBV vectors. See, e.g., U.S. Pat. Nos. 6,534,261; 6,607,882; 6,824,978; 6,933,113; 6,979,539; 7,013,219; and 7,163,824, the entire contents of each of which are incorporated by reference herein.


Vectors, including those derived from viruses such as retroviruses (e.g., lentiviruses), adenoviruses, and adeno-associated viruses, are suitable tools to achieve long-term gene transfer; integrating vectors, such as retroviral vectors, allow long-term, stable integration of a transgene and its propagation in daughter cells. Examples of vectors include expression vectors, replication vectors, probe generation vectors, and sequencing vectors. The expression vector may be provided to a cell in the form of a viral vector. Viral vector technology is well known in the art, and described in a variety of virology and molecular biology manuals.


In one embodiment, a suitable viral vector for use in the present invention is an adeno-associated viral vector, such as a recombinant adeno-associated viral vector.


Recombinant adeno-associated virus vectors (rAAV) are gene delivery systems based on the defective and nonpathogenic parvovirus adeno-associated type 2 virus. All vectors are derived from a plasmid that retains only the AAV 145 bp inverted terminal repeats flanking the transgene expression cassette. Efficient gene transfer and stable transgene delivery due to integration into the genomes of the transduced cell are key features for this vector system. (Wagner et al., Lancet 351:9117 1702-3 (1998), Kearns et al., Gene Ther. 9:748-55 (1996)). AAV serotypes, including AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8 and AAV9, can be used in accordance with the present invention.


Replication-deficient recombinant adenoviral vectors (Ad) can be produced at high titer and readily infect a number of different cell types. Most adenovirus vectors are engineered such that a transgene replaces the Ad E1a, E1b, and/or E3 genes; subsequently the replication defective vector is propagated in human 293 cells that supply deleted gene function in trans. Ad vectors can transduce multiple types of tissues in vivo, including nondividing, differentiated cells such as those found in liver, kidney and muscle. Conventional Ad vectors have a large carrying capacity. An example of the use of an Ad vector in a clinical trial involved polynucleotide therapy for antitumor immunization with intramuscular injection (Sterman et al., Hum. Gene Ther. 7:1083-9 (1998)). Additional examples of the use of adenovirus vectors for gene transfer in clinical trials include Rosenecker et al., Infection 24:1 5-10 (1996); Sterman et al., Hum. Gene Ther. 9:7 1083-1089 (1998); Welsh et al., Hum. Gene Ther. 2:205-18 (1995); Alvarez et al., Hum. Gene Ther. 5:597-613 (1997); Topf et al., Gene Ther. 5:507-513 (1998); Sterman et al., Hum. Gene Ther. 7:1083-1089 (1998).


Packaging cells are used to form virus particles that are capable of infecting a host cell. Such cells include 293 cells, which package adenovirus, and ψ2 cells or PA317 cells, which package retrovirus. Viral vectors used in gene therapy are usually generated by a producer cell line that packages a nucleic acid vector into a viral particle. The vectors typically contain the minimal viral sequences required for packaging and subsequent integration into a host (if applicable), other viral sequences being replaced by an expression cassette encoding the protein to be expressed. The missing viral functions are supplied in trans by the packaging cell line. For example, AAV vectors used in gene therapy typically only possess inverted terminal repeat (ITR) sequences from the AAV genome which are required for packaging and integration into the host genome. Viral DNA is packaged in a cell line, which contains a helper plasmid encoding the other AAV genes, namely rep and cap, but lacking ITR sequences. The cell line is also infected with adenovirus as a helper. The helper virus promotes replication of the AAV vector and expression of AAV genes from the helper plasmid. The helper plasmid is not packaged in significant amounts due to a lack of ITR sequences. Contamination with adenovirus can be reduced by, e.g., heat treatment, to which adenovirus is more sensitive than AAV.


IV. Methods of the Invention

The present invention also provides methods of use of the agents and compositions described herein to modulate expression of hepatocyte nuclear factor 4 alpha (HNF4α) in a cell. The methods include contacting the cell with a site-specific HNF4α disrupting agent, the disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, and an effector molecule, thereby modulating expression of HNF4α in the cell. The site-specific disrupting agent, the effector, or both the site-specific disrupting agent and the effector may be present in a composition, such as a composition described above. In some embodiments, the site-specific disrupting agent and the effector are present in the same composition. In other embodiments, the site-specific disrupting agent and the effector are present in different compositions. In some embodiments, the methods of the invention include contacting a cell with two site-specific HNF4α disrupting agents (a first and a second agent). The two site-specific HNF4α disrupting agents may be present in the same composition, e.g., a pharmaceutical composition, e.g., a pharmaceutical composition comprising an LNP, or in separate compositions, e.g., pharmaceutical compositions, e.g., pharmaceutical compositions comprising an LNP. The cell may be contacted with the first site-specific HNF4α disrupting agent at one time and contacted with the second site-specific HNF4α disrupting agent at a second time, or the cell may be contacted with both agents at the same time.


As indicated above, in fibrotic liver disease, HNF4α is dysregulated and, as a result, gene expression in its network declines significantly or stops. As reported by Guzman-Lepe (Guzman-Lepe J, et al. Liver-enriched transcription factor expression relates to chronic hepatic failure in humans. Hepatol Commun. 2018; 2(5):582-594. doi:10.1002/hep4.1172), HNF4α expression was down-regulated and correlated well with the extent of liver dysfunction (P=0.001), stage of fibrosis (P=0.0005), and serum levels of total bilirubin (P=0.009; r=0.35), albumin (P<0.001; r=0.52), and prothrombin time activity (P=0.002; r=0.41). HNF4α expression also correlated with CYP3A4, ornithine transcarbamylase (OTC), and F7 as well as CDH1 RNA levels. This dysregulation of the network contributes to the pathology of liver failure in the organ itself, and to co-morbidities throughout the patient. In addition to a repression of proteins associated with healthy liver (e.g. albumin and CYP3A enzyme production), proteins that contribute to the production of fibrosis are activated, including COL1a1 and αSMA.


Proof of principle that increased expression of HNF4α can revert senescent and irreversibly dysfunctional hepatocytes from terminal rodent livers to normal function has been established in several studies (Nishikawa, supra; Scholten D, Trebicka J, Liedtke C, Weiskirchen R. The carbon tetrachloride model in mice. Lab Anim 2015; 49(1 Suppl):4-11. doi:10.1177/0023677215571192; Varga J, Brenner D A, Phan S H, eds. Fibrosis Research: Methods and Protocols. Humana Press; 2005. doi:10.1385/1592599400). Interestingly, as reported by Nishikawa et al. (Nishikawa, supra), reversal of the distorted extracellular matrix is not absolutely required to reverse hepatic failure in degenerative liver disease, as only minimal resolution of fibrosis was found by histology two weeks after forced re-expression of HNF4α, well after improvement in hepatic function was documented. Significant improvement in histology, however, was observed at 100 days. In addition, long-term correction took place even though forced re-expression generated only 0.01% of the endogenous level of HNF4α. Thus, improvement in hepatic function may only require increasing expression of HNF4α in a relatively modest number of hepatocytes in end-stage degenerative disease.


Recently, Huang et al. (Huang K-W, et al. Liver Activation of Hepatocellular Nuclear Factor-4a by Small Activating RNA Rescues Dyslipidemia and Improves Metabolic Profile. Mol Ther Nucleic Acids. 2019; 19:361-370. doi:10.1016/j.omtn.2019.10.044) also observed that stimulating HNF4α expression with a small-activating RNA in a rat model of non-alcoholic fatty liver disease (NAFLD) restored metabolic regulation and improved lipid profile.


As demonstrated in the examples below, one embodiment of the invention is to increase the expression of the HNF4α gene by delivering an engineered transcription factor to the gene that is pathologically dysregulated. The transcriptional activator, VPR (Chavez A, et al. Highly efficient Cas9-mediated transcriptional programming. Nat Methods. 2015; 12(4):326-328. doi:10.1038/nmeth.3312), is a concatemer of the HSV transcriptional activator VP16, nuclear factor NF-kappa-B p65 subunit, and the EBV R transactivator. Each of these transcription factors is capable, individually, of attracting the cellular machinery of transcription, resulting in an upregulation of RNA production from the target locus. As a group, their synergistic cooperation results in physiologic or supra-physiologic expression of a target gene.


Expression of HNF4α may be enhanced or reduced as compared to, for example, a cell that was not contacted with the site-specific HNF4α disrupting agent. Modulation in gene expression can be assessed by any method known in the art. For example, a modulation in the expression may be determined by determining the mRNA expression level of a gene, e.g., in a cell, a plurality of cells, and/or a tissue sample, using methods routine to one of ordinary skill in the art, e.g., northern blotting or qRT-PCR; or by determining the protein level of a gene using methods routine to one of ordinary skill in the art, such as western blotting or immunological techniques.


The term “reduced” in the context of the level of HNF4α gene expression or HNF4α protein production in a subject, or a disease marker or symptom, refers to a statistically significant decrease in such level. The decrease can be, for example, at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or below the level of detection for the detection method. In certain embodiments, the expression of the target is normalized, i.e., decreased towards or to a level accepted as within the range of normal for an individual without such disorder. As used here, “lower” in a subject can refer to lowering of gene expression or protein production in a cell in a subject and does not require lowering of expression in all cells or tissues of a subject. For example, as used herein, lowering in a subject can include lowering of gene expression or protein production in the liver of a subject.


The term “reduced” can also be used in association with normalizing a symptom of a disease or condition, i.e. decreasing the difference between a level in a subject suffering from an HNF4α-associated disease towards or to a level in a normal subject not suffering from an HNF4α-associated disease. As used herein, if a disease is associated with an elevated value for a symptom, “normal” is considered to be the upper limit of normal. If a disease is associated with a decreased value for a symptom, “normal” is considered to be the lower limit of normal.


The term “enhanced” in the context of the level of HNF4α gene expression or HNF4α protein production in a subject, or a disease marker or symptom, refers to a statistically significant increase in such level. The increase can be, for example, at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or above the level of detection for the detection method. In certain embodiments, the expression of the target is normalized, i.e., increased towards or to a level accepted as within the range of normal for an individual without such disorder. As used here, “higher” in a subject can refer to increasing gene expression or protein production in a cell in a subject and does not require increasing expression in all cells or tissues of a subject. For example, as used herein, increasing in a subject can include increasing gene expression or protein production in the liver of a subject.


The term “enhanced” can also be used in association with normalizing a symptom of a disease or condition, i.e., increasing a level in a subject suffering from an HNF4α-associated disease towards or to a level in a normal subject not suffering from an HNF4α-associated disease. As used herein, if a disease is associated with an elevated value for a symptom, “normal” is considered to be the upper limit of normal. If a disease is associated with a decreased value for a symptom, “normal” is considered to be the lower limit of normal.
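By way of illustration only, the percentage thresholds recited in the definitions of “reduced” and “enhanced” above can be applied as in the following Python sketch. The function names and the default 20% threshold are hypothetical choices made for this illustration. The sketch does not test statistical significance, which the definitions also require, and would need to be paired with an appropriate statistical test on replicate measurements.

```python
def percent_change(treated: float, control: float) -> float:
    """Signed percent change of a treated level relative to a control level."""
    if control <= 0:
        raise ValueError("control level must be positive")
    return (treated - control) / control * 100.0


def classify_modulation(treated: float, control: float,
                        threshold_pct: float = 20.0) -> str:
    """Label a change as 'enhanced', 'reduced', or 'unchanged'.

    threshold_pct is an illustrative cutoff (20% is the lowest value
    listed in the definitions); significance is NOT assessed here.
    """
    change = percent_change(treated, control)
    if change >= threshold_pct:
        return "enhanced"
    if change <= -threshold_pct:
        return "reduced"
    return "unchanged"
```

For instance, a treated level of 75 against a control level of 100 (arbitrary units) is a 25% decrease and would be labeled “reduced” under the 20% cutoff.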


In some embodiments, a suitable cell for use in the methods of the invention is a mammalian cell. In some embodiments, the cell is a somatic cell. In some embodiments, the cell is a primary cell. For example, in some embodiments, the cell is a mammalian somatic cell. In some embodiments, the mammalian somatic cell is a primary cell. In some embodiments, the mammalian somatic cell is a non-embryonic cell.


The step of contacting may be performed in vitro, in vivo (i.e., the cell may be within a subject), or ex vivo. In some embodiments, contacting a cell is performed ex vivo and the methods further include, prior to the step of contacting, a step of removing the cell (e.g., a mammalian cell) from a subject. In some embodiments, the methods further comprise, after the step of contacting, a step of administering the cell (e.g., a mammalian cell) to a subject.


The in vivo methods of the invention may include administering to a subject an agent or composition of the invention.


The term “subject,” as used herein, refers to an organism, for example, a mammal (e.g., a human, a non-human mammal, a non-human primate, a primate, a laboratory animal, a mouse, a rat, a hamster, a gerbil, a cat, or a dog). In some embodiments a human subject is an adult, adolescent, or pediatric subject. In some embodiments, a subject had a disease or a condition. In some embodiments, the subject is suffering from a disease, disorder or condition, e.g., a disease, disorder or condition that can be treated as provided herein. In some embodiments, a subject is susceptible to a disease, disorder, or condition; in some embodiments, a susceptible subject is predisposed to and/or shows an increased risk (as compared to the average risk observed in a reference subject or population) of developing the disease, disorder or condition. In some embodiments, a subject displays one or more symptoms of a disease, disorder or condition. In some embodiments, a subject does not display a particular symptom (e.g., clinical manifestation of disease) or characteristic of a disease, disorder, or condition. In some embodiments, a subject does not display any symptom or characteristic of a disease, disorder, or condition. In some embodiments, a subject is a patient. In some embodiments, a subject is an individual to whom diagnosis and/or therapy is and/or has been administered.


Subjects that would benefit from the methods of the invention include subjects having an “HNF4α-associated disease” or a subject at risk of an “HNF4α-associated disease.”


Thus, the present invention further provides methods of treatment of a subject in need thereof. The treatment methods of the invention include administering an agent or composition of the invention to a subject, e.g., a subject that would benefit from a modulation of HNF4α expression, such as a subject having an HNF4α-associated disease, in a therapeutically effective amount. In some embodiments, the methods of the invention include administering to the subject two site-specific HNF4α disrupting agents (a first and a second agent). The two site-specific HNF4α disrupting agents may be present in the same composition, e.g., a pharmaceutical composition, e.g., a pharmaceutical composition comprising an LNP, or in separate compositions, e.g., pharmaceutical compositions, e.g., pharmaceutical compositions comprising an LNP. The subject may be administered the first site-specific HNF4α disrupting agent at one time and administered the second site-specific HNF4α disrupting agent at a second time, or the subject may be administered both agents at the same time.


In addition, the present invention provides methods for preventing at least one symptom in a subject that would benefit from a modulation of HNF4α expression, such as a subject having an HNF4α-associated disease, by administering to the subject an agent or composition of the invention in a prophylactically effective amount.


“Therapeutically effective amount,” as used herein, is intended to include the amount of an agent or composition that, when administered to a patient for treating a subject having an HNF4α-associated disease, is sufficient to effect treatment of the disease (e.g., by diminishing, ameliorating, or maintaining the existing disease or one or more symptoms of disease or its related comorbidities). The “therapeutically effective amount” may vary depending on the agent or composition, how it is administered, the disease and its severity and the history, age, weight, family history, genetic makeup, stage of pathological processes mediated by HNF4α gene expression, the types of preceding or concomitant treatments, if any, and other individual characteristics of the patient to be treated.


“Prophylactically effective amount,” as used herein, is intended to include the amount of an agent or composition that, when administered to a subject who does not yet experience or display symptoms of an HNF4α-associated disease, but who may be predisposed to an HNF4α-associated disease, is sufficient to prevent or delay the development or progression of the disease or one or more symptoms of the disease for a clinically significant period of time. The “prophylactically effective amount” may vary depending on the agent or composition, how it is administered, the degree of risk of disease, and the history, age, weight, family history, genetic makeup, the types of preceding or concomitant treatments, if any, and other individual characteristics of the patient to be treated.


As used herein, “prevention” or “preventing,” when used in reference to a disease, disorder or condition thereof, that would benefit from a reduction in expression of an HNF4α gene or production of HNF4α protein, refers to a reduction in the likelihood that a subject will develop a symptom associated with such a disease, disorder, or condition, e.g., a sign or symptom of HNF4α gene expression or HNF4α activity.


A “therapeutically effective amount” or “prophylactically effective amount” also includes an amount of an agent or composition that produces some desired local or systemic effect at a reasonable benefit/risk ratio applicable to any treatment. Agents and compositions employed in the methods of the present invention may be administered in a sufficient amount to produce a reasonable benefit/risk ratio applicable to such treatment. In some embodiments, a therapeutically effective amount or prophylactically effective amount is administered in a single dose; in some embodiments, multiple unit doses are required to deliver a therapeutically or prophylactically effective amount.


As used herein, the phrase “symptoms are reduced” may be used when one or more symptoms of a particular disease, disorder or condition is reduced in magnitude (e.g., intensity, severity, etc.) and/or frequency. In some embodiments, a delay in the onset of a particular symptom is considered one form of reducing the frequency of that symptom.


When the subject to be treated is a mammal such as a human, the composition can be administered by any means known in the art including, but not limited to oral, intraperitoneal, or parenteral routes, including intracranial (e.g., intraventricular, intraparenchymal, and intrathecal), intravenous, intramuscular, subcutaneous, transdermal, airway (aerosol), nasal, rectal, and topical (including buccal and sublingual) administration. In certain embodiments, the compositions are administered by intravenous infusion or injection. In certain embodiments, the compositions are administered by subcutaneous injection.


As used herein, the term “HNF4α-associated disease,” is a disease or disorder that is caused by, or associated with HNF4α gene expression or HNF4α protein production. The term “HNF4α-associated disease” includes a disease, disorder or condition that would benefit from a decrease in HNF4α gene expression, replication, or protein activity. Non-limiting examples of HNF4α-associated diseases include, for example, liver disease (e.g., fatty liver, steatohepatitis including non-alcoholic steatohepatitis (NASH)), inflammatory bowel disease (IBD), hepatocellular carcinoma, MODY I, polycystic kidney disease, dyslipidemia (e.g., hyperlipidemia, high LDL cholesterol, low HDL cholesterol, hypertriglyceridemia, postprandial hypertriglyceridemia), disorders of glycemic control (e.g., insulin resistance not related to immune response to insulin, type 2 diabetes), cardiovascular disease (e.g., hypertension, endothelial cell dysfunction), kidney disease (e.g., acute kidney disorder, tubular dysfunction, proinflammatory changes to the proximal tubules, chronic kidney disease), metabolic syndrome, disease of lipid deposition or dysfunction (e.g., adipocyte dysfunction, visceral adipose deposition, obesity), disease of elevated uric acid (e.g., hyperuricemia, gout), and eating disorders such as excessive sugar craving. Details regarding signs and symptoms of the various diseases or conditions are well known in the art.


In one embodiment, the HNF4α-associated disease is selected from the group consisting of fatty liver (steatosis), nonalcoholic steatohepatitis (NASH), cirrhosis of the liver, accumulation of fat in the liver, inflammation of the liver, hepatocellular necrosis, liver fibrosis, and nonalcoholic fatty liver disease (NAFLD), polycystic kidney disease, inflammatory bowel disease (IBD), and MODY I.


Administration of the agents or compositions according to the methods of the invention may result in a reduction of the severity, signs, symptoms, or markers of an HNF4α-associated disease or disorder in a patient with an HNF4α-associated disease or disorder. By “reduction” in this context is meant a statistically significant decrease in such level. The reduction (absolute reduction or reduction of the difference between the elevated level in the subject and a normal level) can be, for example, at least about 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, or 95%, or to below the level of detection of the assay used.


Administration of the agents or compositions according to the methods of the invention may stably or transiently modulate expression of a target gene. In some embodiments, a modulation of expression persists for at least about 1 hr to about 30 days, or at least about 2 hrs, 6 hrs, 12 hrs, 18 hrs, 24 hrs, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days, 15 days, 16 days, 17 days, 18 days, 19 days, 20 days, 21 days, 22 days, 23 days, 24 days, 25 days, 26 days, 27 days, 28 days, 29 days, 30 days, or longer or any time therebetween. In some other embodiments, a modulation of expression persists for no more than about 30 mins to about 7 days, or no more than about 1 hr, 2 hrs, 3 hrs, 4 hrs, 5 hrs, 6 hrs, 7 hrs, 8 hrs, 9 hrs, 10 hrs, 11 hrs, 12 hrs, 13 hrs, 14 hrs, 15 hrs, 16 hrs, 17 hrs, 18 hrs, 19 hrs, 20 hrs, 21 hrs, 22 hrs, 24 hrs, 36 hrs, 48 hrs, 60 hrs, 72 hrs, 4 days, 5 days, 6 days, 7 days, or any time therebetween.


The agents or compositions may be administered once to the subject or, alternatively, multiple administrations may be performed over a period of time. For example, two, three, four, five, or more administrations may be given to the subject during one treatment or over a period of time. In some embodiments, six, eight, ten, 12, 15 or 20 or more administrations may be given to the subject during one treatment or over a period of time as a treatment regimen.


In some embodiments, administrations may be given as needed, e.g., for as long as symptoms associated with the disease, disorder or condition persist. In some embodiments, repeated administrations may be indicated for the remainder of the subject's life. Treatment periods may vary and could be, e.g., one day, two days, three days, one week, two weeks, one month, two months, three months, six months, a year, or longer.


Efficacy of treatment or prevention of disease can be assessed, for example by measuring disease progression, disease remission, symptom severity, reduction in pain, quality of life, dose of a medication required to sustain a treatment effect, level of a disease marker, or any other measurable parameter appropriate for a given disease being treated or targeted for prevention. It is well within the ability of one skilled in the art to monitor efficacy of treatment or prevention by measuring any one of such parameters, or any combination of parameters. As discussed herein, the specific parameters to be measured depend on the HNF4α-associated disease that the subject is suffering from.


Comparisons of the later readings with the initial readings provide a physician an indication of whether the treatment is effective. It is well within the ability of one skilled in the art to monitor efficacy of treatment or prevention by measuring any one of such parameters, or any combination of parameters. In connection with the administration of an agent or composition, “effective against” an HNF4α-associated disorder indicates that administration in a clinically appropriate manner results in a beneficial effect for at least a statistically significant fraction of patients, such as an improvement of symptoms, a cure, a reduction in disease, extension of life, improvement in quality of life, or other effect generally recognized as positive by medical doctors familiar with treating HNF4α-associated disorders.


A treatment or preventive effect is evident when there is a statistically significant improvement in one or more parameters of disease status, or by a failure to worsen or to develop symptoms where they would otherwise be anticipated. As an example, a favorable change of at least 10% in a measurable parameter of disease, and preferably at least 20%, 30%, 40%, 50% or more can be indicative of effective treatment. Efficacy for a given agent or composition can also be judged using an experimental animal model for the given disease as known in the art. When using an experimental animal model, efficacy of treatment is evidenced when a statistically significant reduction in a marker or symptom is observed.


Alternatively, the efficacy can be measured by a reduction in the severity of disease as determined by one skilled in the art of diagnosis based on a clinically accepted disease severity grading scale. Any positive change resulting in e.g., lessening of severity of disease measured using the appropriate scale, represents adequate treatment using an agent or composition as described herein.


As used herein, the terms “treating” or “treatment” refer to a beneficial or desired result including, but not limited to, alleviation or amelioration of one or more signs or symptoms associated with HNF4α gene expression or HNF4α protein production. “Treatment” can also mean prolonging survival as compared to expected survival in the absence of treatment.


The present invention is next described by means of the following examples. However, the use of these and other examples anywhere in the specification is illustrative only, and in no way limits the scope and meaning of the invention or of any exemplified form. The invention is not limited to any particular preferred embodiments described herein. Many modifications and variations of the invention may be apparent to those skilled in the art and can be made without departing from its spirit and scope. The contents of all references, patents and published patent applications cited throughout this application, including the figures and informal sequence listing, are incorporated herein by reference.


EXAMPLES
Example 1. Modulation of HNF4α Expression

This example describes silencing of HNF4α expression with a site-specific HNF4α disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, i.e., a guide RNA, and an effector comprising a fusion molecule comprising dCAS9 and KRAB.


Guide RNAs were designed to site-specifically target the transcriptional control region comprising promoter 1 (near the transcriptional start site) of the HNF4α gene (see, e.g., FIG. 1) and synthesized according to standard methods for oligonucleotide synthesis. The nucleotide sequences of the guide RNAs are provided in Table 2, below. Exemplary guide RNAs designed to site-specifically target the transcriptional control region comprising promoter 2 (near the transcriptional start site) of the HNF4α gene were also designed and are provided in Table 3, below. Additional exemplary guide RNAs designed to site-specifically target the transcriptional control region comprising promoter 2 (near the transcriptional start site) of the HNF4α gene were also designed and are provided in Table 9, below. Table 4, below, includes the unmodified nucleotide sequences, reverse complement nucleotide sequences, and chromosomal coordinates of the targeting portion of the guide RNAs in Tables 2 and 3.


HepG2 cells were seeded in 96-well plates (at a density of 3×10E4 cells/well) in appropriate media 24 hours prior to transfection. Cells were transfected with SSOP Lipid Nano-Particles (LNPs) containing mRNA encoding fusion proteins for dCAS9-KRAB (MR_28122) and guide RNAs (sgRNA) according to standard methods (see, e.g., Akita, et al., Advanced Healthcare Materials (2013) 2(8):1120-1125).


LNP formulations prepared using individual sgRNAs or pools of sgRNAs (see FIG. 1) were added to the cells for a final concentration of 5.0, 2.5, 1.25, or 0.625 μg/ml. The experiment was ended after 72 hours by lysing/freezing in RLT buffer for downstream mRNA purification or treatment with the Cell Titer Glo 2 reagent to quantify final cell number. qPCR was performed using HNF4α and ACTB probes to quantify relative HNF4α RNA transcription.
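For illustration, the two-fold dose series and one common way of calculating relative expression from qPCR data can be sketched in Python as follows. The specification does not state how relative HNF4α levels were calculated; the comparative Ct (2^-ΔΔCt) calculation shown here is an assumption, as are the function and variable names and any Ct values used with it, and it presumes approximately 100% amplification efficiency for both the HNF4α and ACTB probes.

```python
# Two-fold dilution series starting at 5.0 ug/ml (four doses, as in the text).
DOSES_UG_PER_ML = [5.0 / (2 ** i) for i in range(4)]  # [5.0, 2.5, 1.25, 0.625]


def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene (e.g., HNF4a) in treated vs. control cells,
    normalized to a reference gene (e.g., ACTB), by the comparative Ct method.

    Assumes both assays amplify with ~100% efficiency (an assumption,
    not a statement from the specification).
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical Ct values, relative_expression(26.0, 18.0, 24.0, 18.0) evaluates to 0.25, i.e., the treated sample would carry one quarter of the control level of target transcript.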



FIG. 2 demonstrates that Pool 1 or Pool 2 guides in combination with the effector dCAS-KRAB inhibited HNF4α expression in a dose dependent manner relative to dCAS alone, SH-KRAB, and untreated. dCAS alone does affect transcription in some cases, and without wishing to be bound by theory, the inhibition may be the result of dCAS interference with polymerase binding.


In addition, and as demonstrated in FIG. 3, Pool 1 at the highest concentration has the strongest silencing activity and guide RNAs GD-28432 and GD-28433 each have strong silencing activity alone, even at the lowest concentration.


Example 2. Modulation of HNF4α Expression

This example describes activation of HNF4α expression with a site-specific HNF4α disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, i.e., a guide RNA, and an effector comprising a fusion molecule comprising dCAS9 (a non-editing Cas9 protein) and VPR.


Guide RNAs were designed to site-specifically target the transcriptional control region comprising promoter 1 (near the transcriptional start site) of the HNF4α gene (see, e.g., FIG. 1) and were synthesized according to standard methods for oligonucleotide synthesis. The nucleotide sequences of the guide RNAs are provided in Table 2, below. Exemplary guide RNAs designed to site-specifically target the transcriptional control region comprising promoter 2 (near the transcriptional start site) of the HNF4α gene were also designed and are provided in Table 3, below. Additional exemplary guide RNAs designed to site-specifically target the transcriptional control region comprising promoter 2 (near the transcriptional start site) of the HNF4α gene were also designed and are provided in Table 9, below. Table 4, below, includes the unmodified nucleotide sequences, reverse complement nucleotide sequences, and chromosomal coordinates of the guide RNAs in Tables 2 and 3.


LX-2, A549, and HepG2 cells were seeded in 96-well plates (at a density of 1×10E4 cells/well) in appropriate media 24 hours prior to transfection. LX-2 cells normally do not express HNF4α; HepG2 cells, a liver cancer cell line, normally express more HNF4α than normal non-cancerous liver cells.


Cells were transfected with mRNA encoding fusion proteins for dCAS9-VPR (MR_28196) and guide RNAs (sgRNA). Individual sgRNAs or pools of sgRNAs were prepared using the Lipofectamine MessengerMAX Transfection protocol as described by the manufacturer. Untreated HepG2 cells were also included in the experiment as an internal control (comparator).


Briefly, 2.5 μl Lipofectamine per μg of RNA was mixed with OptiMEM media and incubated for 10 minutes at room temperature (RT), and then added to cells in a 1:10 ratio of the media volume to the cells. The Lipofectamine media mix was combined with RNA for a total volume of 120 μl at a final concentration of 100 μg/ml. This mix was incubated at RT for 5 minutes and added to the cells for a final concentration of 5.0, 2.5, 1.25, or 0.625 μg/ml.
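The reagent arithmetic implied by the preceding paragraph can be sketched in Python as follows. This is a minimal illustration only: the function names are hypothetical, and the 100 μl final well volume used in the usage note is an assumption that is not stated in the specification.

```python
def rna_mass_ug(volume_ul: float, conc_ug_per_ml: float) -> float:
    """Mass of RNA (ug) contained in a given volume at a given concentration."""
    return volume_ul * conc_ug_per_ml / 1000.0


def lipofectamine_volume_ul(rna_ug: float, ul_per_ug: float = 2.5) -> float:
    """Reagent volume (ul) at the stated ratio of 2.5 ul per ug of RNA."""
    return rna_ug * ul_per_ug


def stock_volume_for_final_ul(final_conc_ug_per_ml: float,
                              stock_conc_ug_per_ml: float,
                              final_volume_ul: float) -> float:
    """Volume of stock (ul) needed so that the final volume reaches the
    requested concentration (C1 * V1 = C2 * V2)."""
    if stock_conc_ug_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc_ug_per_ml * final_volume_ul / stock_conc_ug_per_ml
```

Under these assumptions, 120 μl of mix at 100 μg/ml contains 12 μg of RNA, which corresponds to 30 μl of reagent at the stated ratio, and a final concentration of 5.0 μg/ml in a hypothetical 100 μl well would call for 5 μl of the 100 μg/ml mix.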


The experiment was ended after 48 hours by lysing/freezing in RLT buffer for downstream mRNA purification or treatment with the Cell Titer Glo 2 reagent to quantify final cell number. Briefly, the plate for RNA extraction was washed three times with PBS, following which 150 μL RLT buffer was added to each well. The plate was then frozen at −80° C. for later RNA processing. RNA was extracted following thawing of the plates, using the Qiagen RNeasy 96-well kit. RNA was quantified using Ribogreen. Reverse transcription and qPCR were carried out following the protocol for Absolute Quantitation for mRNA Expression by RT-qPCR.


The standard curves for HNF4α and β-actin (ACTB) were prepared as follows: HNF4α and ACTB gene block stocks (a synthesized reference copy of target cDNA) were prepared in nuclease-free water to a concentration of 10 mg/mL. A mixture containing 0.5 mg/mL of each gene block stock was prepared by combining 5 μL of each individual gene block stock and 40 μL H2O to a final volume of 50 μL. This mixture was then serially diluted (10-fold dilutions) 8 times. Two microliters (2 μL) of each standard curve dilution was used as the cDNA for the standard curve, to which was added 8 μL of Taqman Master Mix with probes. The standard curve was set up in duplicate. Wells were analyzed in technical triplicates.
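As a non-limiting illustration of how a ten-fold dilution series of this kind is typically used for absolute quantitation, the following Python sketch fits a straight line of Ct against log10 of the standard quantity and interpolates unknown samples from it. The function names are hypothetical, the least-squares fit is a generic approach rather than the specific protocol referenced above, and any Ct values supplied to it would come from the instrument.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    if n < 2 or n != len(cts):
        raise ValueError("need at least two matched (quantity, Ct) pairs")
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the quantity of an unknown sample from its Ct value."""
    return 10.0 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; about 1.0 (100%) near a slope of -3.32."""
    return 10.0 ** (-1.0 / slope) - 1.0
```

A slope close to -3.32 corresponds to a doubling of product per cycle; slopes far from that value suggest the standard curve should be re-examined before unknowns are interpolated.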


RNA was extracted and HNF4α mRNA levels were determined by qPCR.


qPCR was performed using HNF4α and ACTB probes to quantify relative HNF4α RNA transcription. HepG2 HNF4α RNA expression was measured as a positive control.



FIGS. 4A and 4B demonstrate that Pool 1 guides in combination with the effector dCAS-VPR show strong activation of HNF4α expression in both A549 cells and LX-2 cells in a dose dependent manner relative to SH-VPR, and untreated.


Delivering dCAS9-P300 using Lipofectamine MessengerMAX Transfection also upregulates HNF4α, but significantly less than dCAS-VPR (data not shown).


Similar experiments were conducted to evaluate the effect of dCas9 fusion proteins comprising dCas9-VPR and guides targeting the promoter region of HNF4α using DLin-MC3-DMA (MC3), an LNP formulated for in vivo delivery. Using MC3-LNP, mRNA for dCas9-VPR and individual and pooled sgRNAs (Pool 1) were delivered to cells in culture. RNA was extracted and HNF4α mRNA levels were determined by qPCR as described above. Untreated HepG2 cells were included in the experiment as an internal control (comparator).



FIG. 5 demonstrates that, in this experiment, the MC3 LNPs are more efficient at transfecting LX-2 cells than MessengerMAX, as judged by HNF4α levels relative to HepG2 (internal control). The individual sgRNAs targeting the HNF4α promoter show upregulation, but significantly less than the pooled guides.


Activation of HNF4α in HepG2 Cells using MC3-LNP mediated delivery was also evaluated. HepG2 cells express a basal level of HNF4α that is higher than normal cells. Supraphysiological expression may facilitate greater durability of efficacy.


Using MC3-LNP, mRNA for dCas9-VPR and pooled sgRNAs were delivered to HepG2 cells in culture. A dCas9-VPR control with an SH (safe harbor, non-targeting) control guide was also included. HNF4α mRNA levels were determined by qPCR.



FIG. 6 demonstrates that MC3-LNP formulations induced over-expression of HNF4α in HepG2 cells. Expression levels of up to about 200% of control levels were observed. Such expression levels were tolerated in cells in culture. The result of the SH control demonstrates that upregulation of HNF4α is specifically due to the promoter-targeted delivery of the activating effector.


To investigate whether the HNF4α protein induced by dCas9-VPR Pool 1 was localized to the nucleus of the cell, LX-2 cells were transfected with mRNA for dCas9-VPR and pooled sgRNAs (Pool 1). In order to visualize cells, treated LX-2 cells were fixed and permeabilized. Cells were grown in 96 well plates with black walls and a clear bottom. All incubations were at room temperature (RT) and stationary. All solutions were made with 10×PBS stock solution. Media was removed and cells were washed 2 times with 1×PBS. To wash the cells, 100 μl of 1×PBS was added and removed. The wash was repeated twice. One hundred microliters (100 μl) of 4% PFA in 1×PBS was added and incubated for 15 minutes at RT. Cells were washed 3 times with 1×PBS, with 5 minutes incubation between each wash. Cells were then incubated in 100 μl of 0.1% Triton X 100 in 1×PBS for 15 minutes at RT. One hundred microliters (100 μl) of 10% normal goat serum in 1×PBS with 0.1% Triton X 100 was added and kept at 4° C. until further use.


The fixed and permeabilized cells were stained with anti-HNF4α antibody and DAPI. FIG. 7 shows that the HNF4α protein induced by dCas9-VPR Pool 1 can be detected and travels to the nucleus.









TABLE 2

Site-Specific HNF4α Promoter 1 Targeting Moieties (sgRNA). The first 20 nucleotides in each moiety below comprise the targeting portion of the moiety.

Identifier | SEQ ID NO. | Modified Nucleotide Sequence 5′ to 3′
GD-28431 | 68 | mAs; mUs; mUs; rG; rA; rA; rU; rU; rA; rG; rG; rG; rG; rA; rU; rC; rU; rC; rG; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28432 | 69 | mGs; mAs; mCs; rU; rU; rG; rG; rG; rG; rU; rG; rA; rC; rA; rA; rU; rG; rG; rC; rU; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28433 | 70 | mAs; mAs; mCs; rU; rG; rA; rA; rC; rA; rU; rC; rG; rG; rU; rG; rA; rG; rU; rU; rA; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28434 | 71 | mUs; mGs; mGs; rU; rU; rU; rC; rU; rG; rG; rC; rU; rG; rA; rC; rA; rC; rC; rC; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28435 | 72 | mAs; mUs; mGs; rG; rU; rU; rA; rA; rU; rC; rG; rG; rU; rC; rC; rC; rC; rC; rG; rC; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28436 | 73 | mGs; mUs; mCs; rC; rU; rC; rU; rG; rG; rG; rA; rA; rG; rA; rU; rC; rU; rG; rC; rU; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28437 | 74 | mGs; mGs; mUs; rU; rU; rG; rA; rA; rA; rG; rG; rA; rA; rG; rG; rC; rA; rG; rA; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28438 | 75 | mAs; mCs; mCs; rC; rU; rG; rG; rG; rC; rG; rC; rC; rC; rA; rC; rC; rC; rC; rG; rA; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28439 | 76 | mUs; mUs; mCs; rU; rC; rC; rC; rU; rG; rC; rC; rU; rC; rC; rA; rC; rG; rC; rC; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
















TABLE 3

Site-Specific HNF4α Promoter 2 Targeting Moieties (sgRNA). The first 20 nucleotides in each moiety below comprise the targeting portion of the moiety.

Identifier | SEQ ID NO. | Modified Nucleotide Sequence 5′ to 3′
GD-28427 | 77 | mAs; mUs; mGs; rC; rC; rC; rC; rC; rA; rG; rC; rU; rC; rU; rC; rC; rG; rG; rC; rU; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28428 | 78 | mCs; mAs; mGs; rC; rG; rU; rG; rA; rA; rC; rG; rC; rG; rC; rC; rC; rC; rU; rC; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28429 | 79 | mCs; mUs; mUs; rA; rC; rG; rG; rU; rA; rA; rG; rU; rG; rG; rG; rG; rC; rU; rG; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28430 | 80 | mCs; mCs; mCs; rG; rU; rA; rA; rG; rA; rA; rA; rC; rA; rC; rA; rC; rG; rG; rG; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
GD-28431 | 68 | mAs; mUs; mUs; rG; rA; rA; rU; rU; rA; rG; rG; rG; rG; rA; rU; rC; rU; rC; rG; rG; rG; rU; rU; rU; rU; rA; rG; rA; rG; rC; rU; rA; rG; rA; rA; rA; rU; rA; rG; rC; rA; rA; rG; rU; rU; rA; rA; rA; rA; rU; rA; rA; rG; rG; rC; rU; rA; rG; rU; rC; rC; rG; rU; rU; rA; rU; rC; rA; rA; rC; rU; rU; rG; rA; rA; rA; rA; rA; rG; rU; rG; rG; rC; rA; rC; rC; rG; rA; rG; rU; rC; rG; rG; rU; rG; rC; rUs; mUs; mUs; mU
















TABLE 4

Unmodified Nucleotide Sequences of the Targeting Portion of the Site-Specific HNF4α Targeting Moieties in Tables 2 and 3.

Identifier | SEQ ID NO. | Unmodified Nucleotide Sequence 5′ to 3′ | SEQ ID NO. | Reverse Complement Nucleotide Sequence 5′ to 3′ | Chromosomal Coordinates
GD-28427 | 329 | ATGCCCCCAGCTCTCCGGCT | 335 | AGCCGGAGAGCTGGGGGCAT | chr20: 42984411-42984433
GD-28428 | 330 | CAGCGTGAACGCGCCCCTCG | 336 | CGAGGGGCGCGTTCACGCTG | chr20: 42984450-42984472
GD-28429 | 331 | CTTACGGTAAGTGGGGCTGG | 337 | CCAGCCCCACTTACCGTAAG | chr20: 42984488-42984510
GD-28430 | 332 | CCCGTAAGAAACACACGGGG | 338 | CCCCGTGTGTTTCTTACGGG | chr20: 42984560-42984582
GD-28431 | 333 | ATTGAATTAGGGGATCTCGG | 339 | CCGAGATCCCCTAATTCAAT | chr20: 43029535-43029557
GD-28432 | 334 | GACTTGGGGTGACAATGGCT | 340 | AGCCATTGTCACCCCAAGTC | chr20: 43029596-43029618
GD-28433 | 81 | AACTGAACATCGGTGAGTTA | 82 | TAACTCACCGATGTTCAGTT | chr20: 43029685-43029707
GD-28434 | 83 | TGGTTTCTGGCTGACACCCG | 84 | CGGGTGTCAGCCAGAAACCA | chr20: 43029729-43029751
GD-28435 | 85 | ATGGTTAATCGGTCCCCCGC | 86 | GCGGGGGACCGATTAACCAT | chr20: 43029792-43029814
GD-28436 | 87 | GTCCTCTGGGAAGATCTGCT | 88 | AGCAGATCTTCCCAGAGGAC | chr20: 43029873-43029895
GD-28437 | 89 | GGTTTGAAAGGAAGGCAGAG | 90 | CTCTGCCTTCCTTTCAAACC | chr20: 43029896-43029918
GD-28438 | 91 | ACCCTGGGCGCCCACCCCGA | 92 | TCGGGGTGGGCGCCCAGGGT | chr20: 43029957-43029979
GD-28439 | 93 | TTCTCCCTGCCTCCACGCCG | 94 | CGGCGTGGAGGCAGGGAGAA | chr20: 43029991-43030013
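As a simple illustrative check, the reverse complement column of Table 4 can be recomputed from the unmodified 5′ to 3′ sequences. The sketch below uses three rows copied from Table 4; the function and variable names are illustrative only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# (targeting sequence, reverse complement) pairs copied from Table 4.
TABLE_4_ROWS = {
    "GD-28427": ("ATGCCCCCAGCTCTCCGGCT", "AGCCGGAGAGCTGGGGGCAT"),
    "GD-28431": ("ATTGAATTAGGGGATCTCGG", "CCGAGATCCCCTAATTCAAT"),
    "GD-28435": ("ATGGTTAATCGGTCCCCCGC", "GCGGGGGACCGATTAACCAT"),
}

# Identifiers whose tabulated reverse complement disagrees with the computed one.
mismatches = [
    name
    for name, (forward, tabulated) in TABLE_4_ROWS.items()
    if reverse_complement(forward) != tabulated
]
```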
















TABLE 5

Abbreviations of nucleotide monomers used in nucleic acid sequence representation. It will be understood that these monomers, when present in an oligonucleotide, are mutually linked by 5′-3′-phosphodiester bonds.

Abbreviation | Nucleotide(s)
A | adenosine-3′-phosphate
As | adenosine-3′-phosphorothioate
C | cytidine-3′-phosphate
Cs | cytidine-3′-phosphorothioate
G | guanosine-3′-phosphate
Gs | guanosine-3′-phosphorothioate
U | uridine-3′-phosphate
Us | uridine-3′-phosphorothioate
N | any nucleotide, modified or unmodified
mA | 2′-O-methyladenosine-3′-phosphate
mAs | 2′-O-methyladenosine-3′-phosphorothioate
mC | 2′-O-methylcytidine-3′-phosphate
mCs | 2′-O-methylcytidine-3′-phosphorothioate
mG | 2′-O-methylguanosine-3′-phosphate
mGs | 2′-O-methylguanosine-3′-phosphorothioate
mU | 2′-O-methyluridine-3′-phosphate
mUs | 2′-O-methyluridine-3′-phosphorothioate
s | phosphorothioate linkage
r | ribonucleotide
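For illustration only, the modified-nucleotide notation of Tables 2 and 3 can be decoded with the abbreviations of Table 5: an "m" prefix marks a 2′-O-methyl nucleotide, an "r" prefix marks a ribonucleotide, and an "s" suffix marks a phosphorothioate linkage. The sketch below is a minimal parser under those assumptions; the function names are hypothetical, and the example input is the targeting portion of GD-28431 as written in Table 2.

```python
import re

# Optional "m"/"r" prefix, one base letter, optional "s" (phosphorothioate) suffix.
_TOKEN = re.compile(r"^(m|r)?([ACGU])(s)?$")


def parse_modified_sequence(text):
    """Parse a semicolon-separated modified sequence into per-monomer records."""
    monomers = []
    for raw in text.split(";"):
        token = raw.strip()
        if not token:
            continue  # tolerate a trailing semicolon
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized monomer: {token!r}")
        prefix, base, thio = match.groups()
        monomers.append({
            "base": base,
            "two_prime_o_methyl": prefix == "m",
            "phosphorothioate": thio == "s",
        })
    return monomers


def targeting_portion_dna(monomers, length=20):
    """First `length` bases, written in the DNA alphabet used in Table 4."""
    return "".join(m["base"] for m in monomers[:length]).replace("U", "T")


gd_28431_targeting = parse_modified_sequence(
    "mAs; mUs; mUs; rG; rA; rA; rU; rU; rA; rG; "
    "rG; rG; rG; rA; rU; rC; rU; rC; rG; rG;"
)
```

Applied to this input, targeting_portion_dna returns the unmodified targeting sequence listed for GD-28431 in Table 4.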










Example 3. Design of Zinc Finger DNA Binding Domain and TALE Fusion Protein

As described in Examples 1 and 2, the sites of effective activation of HNF4α gene expression were identified using dCas9 fusion proteins and guides directed to specific nucleotide regions 5′ of promoter 1 of the HNF4α gene. Based on these data, Zinc Finger DNA binding domain polypeptides (ZF) and TALE polypeptides were designed to target the same or similar sequence regions 5′ of promoter 1 of the HNF4α gene. FIG. 9 depicts the target areas for design of the exemplary ZF proteins and TALE proteins of the invention.



FIG. 10 depicts the structure of the exemplary site-specific HNF4α disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, e.g., a zinc finger (ZF) or a TALE, and an effector comprising, e.g., VPR (e.g., ZF-VPR or TALE-VPR fusion proteins) of the invention. As shown in FIG. 10, an exemplary ZF-VPR protein may include one or more nuclear localization signals (NLS), such as SV40 NLS or nucleoplasmin NLS. In some embodiments, the NLSs are located at the N-terminus, between the VP64 and RelA (p65) activation domain (VPR), and at the C-terminus.


In some embodiments, the mRNAs encoding the ZF fusion proteins or the TAL fusion proteins may contain a “natural cap” structure at the 5′-terminus, e.g., the mRNAs may be cap0 (no methyl), cap1 (methyl on first ribose), or cap2 (methyl on second ribose).


Downstream of the cap is a 5′ untranslated region (UTR), a sequence which is designed to promote high levels of protein translation. Downstream of the 5′ UTR is the coding sequence, 3′ UTR and the polyA tail.


An exemplary nucleotide sequence of a 5′ UTR for use in the constructs is









(SEQ ID NO.: 341)


AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC.






An exemplary nucleotide sequence of a 3′ UTR for use in the constructs is









(SEQ ID NO.: 342)


CUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGA





AGUCUAG






An exemplary nucleotide sequence of a poly-A tail for use in the constructs is









(SEQ ID NO.: 343)


AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAA
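The order of elements described above (5′ UTR, coding sequence, 3′ UTR, poly-A tail) can be illustrated with a short sketch that concatenates the exemplary UTR sequences of SEQ ID NOs: 341 and 342 around a coding sequence. The coding sequence shown is a hypothetical toy placeholder, the poly-A length is passed as a parameter rather than taken from SEQ ID NO: 343, and the 5′ cap is a chemical structure that is not represented in the string.

```python
# Exemplary UTR sequences copied from SEQ ID NO: 341 and SEQ ID NO: 342.
FIVE_PRIME_UTR = "AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACC"
THREE_PRIME_UTR = (
    "CUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGA"
    "AGUCUAG"
)


def assemble_mrna(coding_sequence, poly_a_length):
    """Concatenate 5' UTR, coding sequence, 3' UTR and a poly-A tail (5' to 3')."""
    cds = coding_sequence.upper().replace("T", "U")
    if set(cds) - set("ACGU"):
        raise ValueError("coding sequence contains non-nucleotide characters")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    if poly_a_length < 0:
        raise ValueError("poly-A length must be non-negative")
    return FIVE_PRIME_UTR + cds + THREE_PRIME_UTR + "A" * poly_a_length


# Toy coding sequence (start codon, one codon, stop codon); placeholder only,
# not a sequence from the specification. The tail length is also a placeholder.
example_mrna = assemble_mrna("ATGGCCTAA", poly_a_length=10)
```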






The mRNAs encoding the ZF fusion proteins or the TALE fusion proteins may be complexed with a lipid nanoparticle (e.g., MC3). The ZF domains of the fusion proteins are targeted to P1 of HNF4α, which controls the transcription of the isoforms of HNF4α expressed primarily in the liver.


Tables 6A and 6B below provide the amino acid sequences of various exemplary fusion protein constructs, the corresponding encoding mRNA sequences, the target genomic sequences thereof, and the amino acid sequences of the DNA binding domains for use in the disrupting agents of the constructs.















TABLE 6A

Column 1: Exemplary HNF4α Disrupting Agents Comprising a Zinc Finger DNA Binding Domain and an Effector (Name)
Column 2: Amino Acid Sequence of the Disrupting Agent in Column 1 (SEQ ID NO.)
Column 3: Nucleotide Sequence of the Disrupting Agents in Column 1 (SEQ ID NO.)
Column 4: Sequence of Target Site in HNF4α Expression Control Region Targeted by the Disrupting Agents in Column 1 (SEQ ID NO.)
Column 5: Amino Acid Sequence of the DNA Binding Domain of the Disrupting Agent in Column 1 (SEQ ID NO.)
Column 6: Nucleotide Sequence of the DNA Binding Domain of the Disrupting Agent in Column 1 (SEQ ID NO.)
Column 7: Genomic Coordinates of the Target Site in the HNF4α Expression Control Region (Genome Reference Consortium Human Build 37 (GRCh37): Chromosome 20)

Column 1 | Column 2 | Column 3 | Column 4 | Column 5 | Column 6 | Column 7
ZF1-VPR | 100 | 101 | 102 | 103 | 211 | GRCh37: chr20:43029991-43030011
ZF2-VPR | 104 | 105 | 106 | 107 | 212 | GRCh37: chr20:43029959-43029979
ZF3-VPR | 108 | 109 | 110 | 111 | 213 | GRCh37: chr20:43029897-43029917
ZF4-VPR | 112 | 113 | 114 | 115 | 214 | GRCh37: chr20:43029874-43029894
ZF5-VPR | 116 | 117 | 118 | 119 | 215 | GRCh37: chr20:43029794-43029814
ZF6-VPR | 120 | 121 | 122 | 123 | 216 | GRCh37: chr20:43029731-43029751
ZF7-VPR | 124 | 125 | 126 | 127 | 217 | GRCh37: chr20:43029686-43029706
ZF8-VPR | 128 | 129 | 130 | 131 | 218 | GRCh37: chr20:43029598-43029618
ZF9-VPR | 132 | 133 | 134 | 135 | 219 | GRCh37: chr20:43029536-43029558
ZF10-VPR | 136 | 137 | 138 | 139 | 220 | GRCh37: chr20:43029767-43029787
ZF11-VPR | 140 | 141 | 142 | 143 | 221 | GRCh37: chr20:43029820-43029840
ZF12-VPR | 144 | 145 | 146 | 147 | 222 | GRCh37: chr20:43029855-43029875
ZF13-VPR | 148 | 149 | 150 | 151 | 223 | GRCh37: chr20:43029766-43029786
ZF14-VPR | 152 | 153 | 154 | 155 | 224 | GRCh37: chr20:43029810-43029830
ZF15-VPR | 156 | 157 | 158 | 159 | 225 | GRCh37: chr20:43029832-43029852
ZF5-VPR ATUM Opt_1 (ZF5.1-VPR) | 116 | 160 | 118 | 119 | 226 | GRCh37: chr20:43029794-43029814
ZF5-VPR ATUM Opt_2 (ZF5.2-VPR) | 116 | 161 | 118 | 119 | 227 | GRCh37: chr20:43029794-43029814
ZF5-VPR ATUM Opt_3 (ZF5.3-VPR) | 116 | 162 | 118 | 119 | 228 | GRCh37: chr20:43029794-43029814
ZF5-VPR ATUM Opt_4 (ZF5.4-VPR) | 116 | 163 | 118 | 119 | 229 | GRCh37: chr20:43029794-43029814
ZF5-VPR ATUM Opt_5 (ZF5.5-VPR) | 116 | 164 | 118 | 119 | 230 | GRCh37: chr20:43029794-43029814
ZF5-VPR ATUM Opt_6 (ZF5.6-VPR) | 116 | 165 | 118 | 119 | 231 | GRCh37: chr20:43029794-43029814
ZF5-P300 | 166 | 167 | 118 | 119 | 232 | GRCh37: chr20:43029794-43029814
ZF5-VPR + ZF7-VPR | 116 + 124 | 117 + 125 | 118 + 126 | 119 + 127 | 215 + 217 | GRCh37: chr20:43029794-43029814; GRCh37: chr20:43029686-43029706
ZF5-VPR ATUM Opt_3 + ZF7-VPR | 116 + 124 | 162 + 125 | 118 + 126 | 119 + 127 | 228 + 217 | GRCh37: chr20:43029794-43029814; GRCh37: chr20:43029686-43029706
ZF7-P300 | 168 | 169 | 126 | 127 | 233 | GRCh37: chr20:43029686-43029706
ZF5.3-VPR3 | 170 | 171 | 118 | 119 | 234 | GRCh37: chr20:43029794-43029814
ZF5-no effector | 174 | 175 | 118 | 119 | 235 | GRCh37: chr20:43029794-43029814
ZF5.3-VPR-tPT2a-ZF7-VPR | 176 | 177 | 118 + 126 | 119 + 127 | 236 + 237 | GRCh37: chr20:43029794-43029814; GRCh37: chr20:43029686-43029706
ZF7-VPR-tPT2a-ZF5.3-VPR | 178 | 179 | 126 + 118 | 127 + 119 | 238 + 239 | GRCh37: chr20:43029686-43029706; GRCh37: chr20:43029794-43029814
ZF5.3-VPR-tPT2a-ZF7-p300 | 180 | 181 | 118 + 126 | 119 + 127 | 240 + 241 | GRCh37: chr20:43029794-43029814; GRCh37: chr20:43029686-43029706
ZF7-p300-tPT2a-ZF5.3-VPR | 182 | 183 | 126 + 118 | 127 + 119 | 242 + 243 | GRCh37: chr20:43029686-43029706; GRCh37: chr20:43029794-43029814
TAL1-VPR | 184 | 185 | 102 | 186 | 244 | GRCh37: chr20:43029991-43030011
TAL2-VPR | 187 | 188 | 106 | 189 | 245 | GRCh37: chr20:43029959-43029979
TAL3-VPR | 190 | 191 | 110 | 192 | 246 | GRCh37: chr20:43029897-43029917
TAL4-VPR | 193 | 194 | 114 | 195 | 247 | GRCh37: chr20:43029874-43029894
TAL5-VPR | 196 | 197 | 118 | 198 | 248 | GRCh37: chr20:43029794-43029814
TAL6-VPR | 199 | 200 | 122 | 201 | 249 | GRCh37: chr20:43029731-43029751
TAL7-VPR | 202 | 203 | 126 | 204 | 250 | GRCh37: chr20:43029686-43029706
TAL8-VPR | 205 | 206 | 130 | 207 | 251 | GRCh37: chr20:43029598-43029618
TAL9-VPR | 208 | 209 | 134 | 210 | 252 | GRCh37: chr20:43029536-43029558



















TABLE 6B

Column 1: Exemplary Effector Fusion Protein of Exemplary HNF4α Disrupting Agents (Name)
Column 2: Amino acid sequence of the Effector Fusion Protein in Column 1 (SEQ ID NO:)
Column 3: Nucleotide sequence of the Effector Fusion Protein in Column 1 (SEQ ID NO:)
Column 4: Amino acid Sequence of DNA Binding Domain of the Effector Fusion Protein in Column 1 (SEQ ID NO:)

Column 1 | Column 2 | Column 3 | Column 4
dCas9-VPR | 95 | 96 | 97
dCas9-P300 | 98 | 99 | 97
dCas9-VPR3 | 172 | 173 | 97









Example 4. Modulation of HNF4α Expression by ZF-Fusion Proteins

In order to test a single mRNA encoding a site-specific HNF4α disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, e.g., a zinc finger (ZF), and an effector comprising, e.g., VPR (ZF-VPR fusion proteins), rather than dCas9-effector fusions with pooled sgRNAs, Zinc Finger DNA Binding Domain (DBD)-VPR (ZF-VPR) fusion proteins were designed as described above. Such fusion proteins, which bind sites similar to or identical to those of the P1 targeting guide RNAs described in Examples 1 and 2, were evaluated for their ability to affect expression of HNF4α in vitro.
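To illustrate what "similar to or identical to" can mean in terms of coordinates, the following sketch compares a few ZF target sites from Table 6A with a few guide RNA sites from Table 4 and reports the length of any overlap. It assumes that both coordinate sets are on the same GRCh37 chromosome 20 scale and that the ranges are inclusive; neither assumption is stated explicitly in the tables, and the names used are illustrative.

```python
# Guide RNA sites copied from Table 4 (start, end).
GUIDE_SITES = {
    "GD-28433": (43029685, 43029707),
    "GD-28434": (43029729, 43029751),
    "GD-28435": (43029792, 43029814),
    "GD-28439": (43029991, 43030013),
}

# ZF target sites copied from Table 6A (start, end).
ZF_SITES = {
    "ZF1-VPR": (43029991, 43030011),
    "ZF5-VPR": (43029794, 43029814),
    "ZF7-VPR": (43029686, 43029706),
    "ZF10-VPR": (43029767, 43029787),
}


def overlap_length(a, b):
    """Number of positions shared by two inclusive (start, end) intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def overlapping_guides(zf_site):
    """Map each overlapping guide identifier to the length of its overlap."""
    overlaps = {}
    for name, guide_site in GUIDE_SITES.items():
        shared = overlap_length(zf_site, guide_site)
        if shared > 0:
            overlaps[name] = shared
    return overlaps


zf_to_guides = {name: overlapping_guides(site) for name, site in ZF_SITES.items()}
```

Under these assumptions, the ZF5-VPR and ZF7-VPR sites fall entirely within the GD-28435 and GD-28433 sites, respectively, whereas the ZF10-VPR site lies between guide sites without overlapping either neighbor.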


Several mRNAs encoding fusion proteins comprising a ZF DBD and a VPR domain were constructed (see Table 6A). Individual and pooled fusion protein encoding mRNAs were delivered to LX-2 cells as MC3 LNP formulations as described above. Expression of HNF4α was measured by qPCR as described above. Untreated HepG2 and dCas9-VPR-Pool 1 were included as positive controls.


As shown in FIG. 11, it was found that ZF005-VPR (also referred to as ZF5-VPR) showed robust upregulation of HNF4α. ZF007-VPR (also referred to as ZF7-VPR) also showed a quantifiable upregulation.


Example 5. Modulation of HNF4α Expression by TALE-Fusion Proteins

In a further effort to use a single mRNA encoding a site-specific HNF4α disrupting agent, fusion proteins comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, e.g., a TALE-based DNA Binding Domain (DBD), and an effector comprising, e.g., VPR (TAL-VPR fusion proteins), were designed. Such fusion proteins, which bind sites similar to or identical to those of the P1 targeting guide RNAs described in Examples 1 and 2, were evaluated for their ability to affect expression of HNF4α in vitro.


Several mRNAs encoding fusion proteins of a TALE DBD with a VPR domain were constructed (see Table 6A). Individual and pooled fusion protein encoding mRNAs were delivered to LX-2 cells as MC3 LNP formulations as described above. Expression of HNF4α was measured by qPCR as described above. Untreated HepG2 and dCas9-VPR-Pool 1 were included as positive controls. ZF5-VPR and ZF7-VPR were also included in the study as comparators.


Table 7 below shows the experiment design for the pooled TAL-VPR experiments.











TABLE 7

Pool | TAL fusion proteins
Pool X | TAL1-VPR, TAL2-VPR, TAL3-VPR
Pool Y | TAL4-VPR, TAL5-VPR, TAL6-VPR
Pool Z | TAL7-VPR, TAL8-VPR, Unrelated domain-VPR

Pools are made by mixing 1:1:1 of each TAL fusion protein.









As shown in FIG. 12, it was found that pooled TALE-based VPR fusion proteins showed strong up-regulation of HNF4α expression. Individually, each TAL-VPR fusion protein did not have significant activity to up-regulate the expression of HNF4α (data not shown). This Figure also demonstrates that the results described above for ZF5-VPR and ZF7-VPR were reproducible.


Example 6. Modulation of HNF4α Expression with Codon Optimized ZF5-VPR and Synergy of the Combination of ZF5-VPR and ZF7-VPR in LX-2 Cells

In this example, activation of HNF4α in LX-2 cells with codon optimized ZF5-VPR fusion proteins was evaluated. The objective was to analyze whether codon optimization of ZF5-VPR would improve activity of the ZF5-VPR fusion proteins at the HNF4α promoter. Another objective was to analyze whether the combination of ZF5-VPR and ZF7-VPR fusion proteins was synergistic in activating HNF4α expression.


The sequence for the ZF5-VPR fusion protein was sent to ATUM for codon optimization, a process that uses an algorithm to re-design ideal codon frequency and context to optimize translational efficiency. ATUM returned 6 new variants, which all encode the same protein, each with a different RNA sequence (see Table 6A).


mRNAs for all five ATUM codon optimized ZF5-VPR, unaltered ZF5-VPR and ZF7-VPR were transfected into LX-2 cells as MC3 LNPs. Expression of HNF4α was measured by qPCR 48 hours after transfection.


As shown in FIG. 13, ZF-5-VPR ATUM codon optimized variant 3 (ZF5.3-VPR) showed stronger upregulation of HNF4α expression as compared to the other codon-optimized variants. Co-transfecting ZF5-VPR and ZF7-VPR led to a supraphysiological (>200%) increase in expression of HNF4α, demonstrating synergy of the combination of these two fusion proteins. (In FIG. 13, Pool 1 is dCas9-VPR+GD-28431+GD-28432+GD-28433; Pool 2 is dCas9-VPR+GD-28434+GD-28435+GD-28436).


Example 7. Durability of Modulation of HNF4α Expression in K562 Cells

This example identified the effectors (VPR and/or P300) that were able to induce the longest lasting up-regulation of HNF4α expression. The example also measures how long the up-regulation of HNF4α in K562 cells could last when treated with various ZF-effector fusion proteins and various combinations of ZF-effector fusion proteins.


Various individual mRNAs encoding ZF-effector fusion proteins and combinations of the mRNAs were transfected into K562 cells and allowed to grow for 10 days in culture.


The following individual fusion proteins and combinations were tested in K562 cells, transfected with MC3 LNPs: ZF5-VPR, ZF5-P300, a combination of ZF5-VPR and ZF5-P300, or a combination of ZF5-VPR and ZF7-VPR (see Table 6A).


Transfections and quantifications were performed as described above. Briefly, untreated K562 cells were included as an assay control. K562 cells were treated with individual effectors and different combinations of effectors (2.5 μg/mL) in triplicate. Data points were collected over a period of 10 days. qPCR readout was used to measure mRNA expression. K562 cells were seeded at 100 k/well in triplicate. Time points were collected every 2 days for a total of 10 days. Three wells of untreated K562 were included as a control. RT qPCR for HNF4α was performed to measure the expression of HNF4α at each time point for similarly treated LX-2 cells.
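As an illustrative sketch only, the time-course design described above (several treatment conditions, triplicate wells, collection every 2 days for 10 days) can be enumerated as a sample sheet. The sketch assumes that the first collection is at day 2 and that a separate set of triplicate wells is harvested at each time point; neither detail is stated in the text, and the names are hypothetical.

```python
from itertools import product

# Conditions named in this example, plus the untreated control.
CONDITIONS = [
    "Untreated",
    "ZF5-VPR",
    "ZF5-P300",
    "ZF5-VPR + ZF5-P300",
    "ZF5-VPR + ZF7-VPR",
]


def sample_sheet(conditions, last_day=10, step_days=2, replicates=3):
    """List one record per condition, collection day and replicate well."""
    days = range(step_days, last_day + 1, step_days)  # assumed: day 2, 4, ..., last_day
    return [
        {"condition": condition, "day": day, "replicate": replicate}
        for condition, day, replicate in product(
            conditions, days, range(1, replicates + 1)
        )
    ]


sheet = sample_sheet(CONDITIONS)
```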


As shown in FIGS. 14 and 15, upregulation of HNF4α expression was observed in K562 cells until day 6 when any of the single fusion proteins or combinations of the fusion proteins were transfected. Co-transfection with mRNAs encoding ZF5-VPR and ZF7-VPR led to the highest and most durable increase of HNF4α expression in cultured cells, with detectable expression out to 10 days.


In another experiment, the durability of up-regulation of HNF4α expression in K562 cells was determined following treatment with ZF-5-VPR, ZF-5-P300, a combination of ZF5-VPR-ATUM 3 and ZF7-VPR (also referred to as ZF (5.3+7)), or a combination of ZF5-VPR and ZF7-VPR (also referred to as ZF-(5+7)-VPR).


K562 cells were treated with individual effectors and different combinations of effectors (2.5 μg/mL) in triplicate. Time points were collected over a period of 10 days. qPCR readout was used to measure mRNA expression. K562 cells were seeded at 100 k/well in triplicate. Time points were collected every 2 days for a total of 10 days. Three wells of untreated K562 were included as a control. RT qPCR was performed to measure the expression of HNF4α at each time point. Fusion proteins were formulated in MC3 LNPs.


As shown in FIG. 16, HNF4α upregulation was observed in K562 cells until day 6 using both the combination of ZF5-VPR and ZF7-VPR and the combination of ZF5.3-VPR and ZF7-VPR. Both combinations showed slight upregulation at days 8 and 10.


Example 8. Modulation of Biomarker Genes Expression Following Activation of HNF4α in LX-2 Cells

The first experiment of this example demonstrates the change in expression level of downstream biomarker genes following activation of HNF4α in LX-2 cells. The objective of this experiment was to demonstrate that the HNF4α induced by dCas9-VPR in LX-2 cells is an active transcription factor by examining the effect of upregulating HNF4α expression on the expression of two downstream genes, Col1a1 and aSMA. Collagen 1a1 (Col1a1) and alpha-Smooth Muscle Actin (aSMA) are 2 proteins highly expressed in damaged liver cells and are key drivers of fibrosis in end-stage liver disease. Both of these genes are negatively regulated by HNF4α.


LX-2 cells, which highly express Col1a1 and aSMA, were transfected with dCas9-VPR-Pool 1 as described above, followed by qPCR measurement of HNF4α, Col1a1, and aSMA.


As shown in FIG. 17, upregulation of HNF4α significantly down-regulated expression of Col1a1 and aSMA.


The second experiment of this example demonstrates that the ZF-VPR fusion proteins also down-regulate expression of Col1a1 and aSMA, biomarkers of fibrotic liver disease.


As shown in FIG. 18, biomarkers of fibrotic liver disease, Col1a1 and aSMA, were down-regulated in LX-2 cells following up-regulation of HNF4α with ZF5.3-VPR, a combination of ZF5-VPR and ZF7-VPR, or a combination of ZF5.3-VPR and ZF7-VPR.


Example 9. Screening of Additional ZF-VPR Fusion Proteins and Combinations Thereof and Assessment of dCas9-VPR3

This example demonstrates that, in addition to the nine ZF-VPR proteins tested in Example 4, other ZF-VPR fusion proteins and various combinations thereof can upregulate the expression of HNF4α in LX-2 cells.


In the first experiment, fusion proteins were screened in LX-2 cells. Untreated Hep G2 cells were included as an assay control. LX-2 cells were treated with a single concentration of effector (2.5 μg/mL) in triplicate and incubated for 48 hours. ZF-5-VPR was used as a positive control for qPCR readout. LX-2 cells were transfected as described above using MC3-LNP formulations. The mRNAs encoding the following ZF-VPR fusion proteins and various combinations were transfected: ZF5-VPR, ZF5-VPR-ATUM3 (ZF5.3-VPR), ZF-7-VPR, ZF-11-VPR, ZF-13-VPR, and ZF-15-VPR, or combinations thereof.


As shown in FIG. 19, stronger upregulation was observed when ZF5-VPR or ZF5.3-VPR were combined with ZF7-VPR. ZF7-VPR alone upregulated HNF4α in LX-2 cells to a low level.


In the second experiment, activation of HNF4α using dCas9-VPR3-Pool 1 or ZF-VPR fusion protein combinations was assessed. The objective was to evaluate the effects of various ZF-VPR fusion proteins in combination in LX-2 cells and to test the new effector dCas9-VPR3 on LX-2 cells. dCas9-VPR3 is a dCas9 DNA binding domain fused to 3 consecutive VPR domains, expressed as a single protein. The combinations of ZF5.3-VPR and other ZF-VPRs that slightly upregulated the expression of HNF4α in LX-2 cells were tested for possible synergy. LX-2 cells were treated with a single concentration of effectors (2.5 μg/mL) in MC3 LNP formulations in triplicate for 48 hours. ZF-5-VPR was used as a positive control for qPCR readout. The following combinations were tested: ZF5.3-VPR and ZF10-VPR, ZF5.3-VPR and ZF14-VPR, and ZF5.3-VPR and ZF15-VPR.


As shown in FIG. 20, dCas9-VPR3 Pool 1 upregulates HNF4α in LX-2 cells. ZF14-VPR and ZF15-VPR caused low upregulation of HNF4α when transfected individually. A strong synergistic upregulation (similar to the synergistic upregulation observed with ZF5.3-VPR+ZF7-VPR) was observed when ZF14-VPR or ZF15-VPR was combined with ZF5.3-VPR.


Example 10. HNF4α Activation in FRG-KO Mouse Liver Humanized Hepatocytes (Yecuris Human Hepatocytes)

Yecuris human hepatocytes are primary human hepatocytes ex-planted into an immunocompromised mouse, allowed to proliferate, and then harvested for in vitro tissue culture. Yecuris hepatocytes were obtained as a cell suspension and plated in 96-well format at 40K cells/well. Cells were treated with ZF fusion protein-MC3 LNPs for 24 hrs at two concentrations, 2.5 μg/ml and 1.25 μg/ml. HNF4α gene expression was measured at 48 hrs post treatment. ZF7.4-VPR was used instead of ZF7-VPR.


Yecuris cells were transfected with mRNAs encoding ZF5-VPR, ZF5.3-VPR, ZF7.4-VPR, ZF5.3-VPR3 (a ZF5.3 protein fused to 3 consecutive VPR domains in a row, expressed as a single protein), ZF7.4-VPR3, and ZF5-alone (with no VPR fused to ZF5).


As shown in FIG. 21, ZF5.3-VPR3 does not increase HNF4α gene expression compared to ZF5.3-VPR. An increase in HNF4α gene expression was observed with ZF7.4-VPR3 as compared to ZF7.4-VPR. As compared to the LX-2 cells, the ZF7-VPR constructs upregulate HNF4α to a greater level.


Example 11. Activation of LX-2 Cells with Bicistronic ZF5.3-VPR and ZF7-VPR

This example evaluates whether ZF5.3-VPR (ZF5-VPR ATUM variant 3) and ZF7-VPR bicistronic constructs upregulate HNF4α in LX-2 cells to the same level as ZF5.3-VPR and ZF7-VPR when individually combined.


Untreated Hep G2 cells were included as an assay control. LX-2 cells were treated with a single concentration of the mRNAs encoding the fusion proteins (2.5 μg/mL) in triplicate. The combination of ZF5.3-VPR and ZF7-VPR was used as a positive control. qPCR readout was used to measure mRNA expression.


The following bicistronic mRNA constructs were tested: ZF5.3-VPR-tPT2A-ZF7-VPR, ZF7-VPR-tPT2A-ZF5.3-VPR, ZF5.3-VPR-tPT2A-ZF7-p300, and ZF7-p300-tPT2A-ZF5.3-VPR (see Table 6A). tPT2A is a linker that covalently links a first fusion protein to a second fusion protein to generate a bicistronic fusion protein.


As shown in FIG. 22, the bicistronic mRNA ZF5.3-VPR-tPT2A-ZF7-VPR induced stronger upregulation of HNF4α in LX-2 cells than the other 3 bicistronic mRNAs tested.


Example 12. Effect of Repeat Dosing of Yecuris Hepatocytes with VPR and p300 on HNF4α Gene Expression

This example describes the effect of delivering multiple doses of mRNA encoding ZF-effector fusion proteins to cells. The objective was to evaluate whether additive or synergistic upregulation of HNF4α can be achieved by multiple dosing.


Yecuris hepatocytes were plated at 64K cells/well. Cells were treated with the combination of ZF5.3-VPR and ZF7-VPR or ZF7-p300 via MC3 LNP formulations at a final concentration of 1.25 μg/ml. Bicistronic mRNAs ZF5.3-VPR-tPT2A-ZF7-VPR, ZF7-VPR-tPT2A-ZF5.3-VPR, ZF5.3-VPR-tPT2A-ZF7-p300, ZF7-p300-tPT2A-ZF5.3-VPR in MC3 LNP formulations were used to treat cells at a final concentration of 1.25 μg/ml.


The expression level of HNF4α was measured at 48 hrs after the last dose. The dosing and harvesting schedule followed in this experiment is provided in Table 8, below.












TABLE 8

Day | Dose | Harvest
Day 0 | Dose 1 |
Day 1 | Media change |
Day 2 | | Harvest
Day 3 | Dose 2 |
Day 4 | Media change |
Day 5 | | Harvest
Day 6 | Dose 3 |
Day 7 | Media change |
Day 8 | | Harvest
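The schedule in Table 8 follows a repeating three-day cycle: a dose, a media change the next day, and a harvest 48 hrs after the dose. A minimal sketch that regenerates this schedule is given below. The function name and its parameters are hypothetical and serve only to illustrate the pattern.

```python
def dosing_schedule(n_doses=3, cycle_days=3):
    """Return (day, event) pairs for a repeat-dosing experiment.

    Each cycle starts with a dose, has a media change one day later,
    and a harvest two days (48 hrs) after the dose.
    """
    schedule = []
    for i in range(n_doses):
        start = i * cycle_days  # Day on which dose i + 1 is given.
        schedule.append((start, f"Dose {i + 1}"))
        schedule.append((start + 1, "Media change"))
        schedule.append((start + 2, "Harvest"))
    return schedule


if __name__ == "__main__":
    for day, event in dosing_schedule():
        print(f"Day {day}: {event}")
```

With the default arguments the sketch yields nine entries, from Dose 1 on Day 0 through the final harvest on Day 8.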










As shown in FIG. 23, repeated dosing of cells with MC3 LNP formulations containing mRNAs encoding ZF-effector fusion proteins resulted in an additive increase of expression of HNF4α. The combination of ZF5.3-VPR and ZF7-VPR was stronger than ZF7-p300 in its activation potential. In addition, and as shown in FIG. 24, bicistronic constructs increased HNF4α in Yecuris hepatocytes, with ZF7-p300-tPT2A-ZF5.3-VPR providing the strongest upregulation.


Example 13. Activation of HNF4α in K562 Cells with Bicistronic mRNA: 10-Day Durability Study

This example describes the durability of the bicistronic constructs in K562 cells. K562 cells were treated with a single concentration of mRNAs encoding the fusion proteins in MC3 LNPs (2.5 μg/mL) in triplicate as described above.


The bicistronic mRNAs encoding the following constructs were tested: ZF5.3-VPR-tPT2A-ZF7-VPR, ZF7-VPR-tPT2A-ZF5.3-VPR, ZF5.3-VPR-tPT2A-ZF7-p300, ZF7-p300-tPT2A-ZF5.3-VPR.


As shown in FIG. 25, ZF7-p300-tPT2A-ZF5.3-VPR showed better durability in K562 cells at later time points as compared to the other bicistronic constructs.


Table 9 below provides exemplary target nucleotide sequences and corresponding sgRNA nucleotide sequences suitable for use in the present invention.
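In the entries of Table 9, each sgRNA sequence appears to consist of the 20-nucleotide target sequence followed by a constant 80-nucleotide scaffold, written in the DNA alphabet (T rather than U), and each listed PAM has the form NGG. The sketch below illustrates that composition. The function names are hypothetical, and the scaffold constant is copied from the table entries rather than from any external source.

```python
# Constant scaffold observed at the 3' end of the sgRNA entries in Table 9.
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTA"
    "AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG"
    "GTGCTTTT"
)


def build_sgrna(target):
    """Append the scaffold to a 20-nt target written with A, C, G, T."""
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("target must be 20 nucleotides of A, C, G, T")
    return target + SCAFFOLD


def is_ngg_pam(pam):
    """Return True if the PAM is three nucleotides of the form NGG."""
    pam = pam.upper()
    return len(pam) == 3 and pam[0] in "ACGT" and pam[1:] == "GG"
```

For instance, `build_sgrna("CCCTCACCCCCACCCCCTCC")` returns a 100-nucleotide string that begins with the target at coordinate 43029193, and `is_ngg_pam("CGG")` returns `True` for the PAM listed for that target.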















TABLE 9

Genomic coordinates of the target start site (Genome Reference Consortium Human Build 37 (GRCh37), Chromosome 20) | Strand | Target sequence | SEQ ID NO | PAM | sgRNA sequence | SEQ ID NO
43029193 | −1 | CCCTCACCCCCACCCCCTCC | 344 | CGG | CCCTCACCCCCACCCCCTCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 596
43029203 | 1 | TCCGGGAGGGGGTGGGGGTG | 345 | AGG | TCCGGGAGGGGGTGGGGGTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 597
43029204 | 1 | CCGGGAGGGGGTGGGGGTGA | 346 | GGG | CCGGGAGGGGGTGGGGGTGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 598
43029211 | 1 | GGGGTGGGGGTGAGGGAAAC | 347 | AGG | GGGGTGGGGGTGAGGGAAACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 599
43029223 | 1 | AGGGAAACAGGAGAATGTGA | 348 | TGG | AGGGAAACAGGAGAATGTGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 600
43029224 | 1 | GGGAAACAGGAGAATGTGAT | 349 | GGG | GGGAAACAGGAGAATGTGATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 601
43029238 | 1 | TGTGATGGGAAAATCCGAGA | 350 | TGG | TGTGATGGGAAAATCCGAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 602
43029241 | −1 | GGCCCAGGCTGGCTCCATCT | 351 | CGG | GGCCCAGGCTGGCTCCATCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 603
43029249 | 1 | AATCCGAGATGGAGCCAGCC | 352 | TGG | AATCCGAGATGGAGCCAGCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 604
43029250 | 1 | ATCCGAGATGGAGCCAGCCT | 353 | GGG | ATCCGAGATGGAGCCAGCCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 605
43029252 | −1 | CCAGTGTTTCTGGCCCAGGC | 354 | TGG | CCAGTGTTTCTGGCCCAGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 606
43029256 | −1 | GCTCCCAGTGTTTCTGGCCC | 355 | AGG | GCTCCCAGTGTTTCTGGCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 607
43029262 | −1 | CCCACAGCTCCCAGTGTTTC | 356 | TGG | CCCACAGCTCCCAGTGTTTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 608
43029263 | 1 | CCAGCCTGGGCCAGAAACAC | 357 | TGG | CCAGCCTGGGCCAGAAACACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 609
43029264 | 1 | CAGCCTGGGCCAGAAACACT | 358 | GGG | CAGCCTGGGCCAGAAACACTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 610
43029272 | 1 | GCCAGAAACACTGGGAGCTG | 359 | TGG | GCCAGAAACACTGGGAGCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 611
43029273 | 1 | CCAGAAACACTGGGAGCTGT | 360 | GGG | CCAGAAACACTGGGAGCTGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 612
43029279 | 1 | ACACTGGGAGCTGTGGGAGA | 361 | CGG | ACACTGGGAGCTGTGGGAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 613
43029284 | 1 | GGGAGCTGTGGGAGACGGAG | 362 | AGG | GGGAGCTGTGGGAGACGGAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 614
43029285 | 1 | GGAGCTGTGGGAGACGGAGA | 363 | GGG | GGAGCTGTGGGAGACGGAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 615
43029286 | 1 | GAGCTGTGGGAGACGGAGAG | 364 | GGG | GAGCTGTGGGAGACGGAGAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 616
43029290 | 1 | TGTGGGAGACGGAGAGGGGC | 365 | AGG | TGTGGGAGACGGAGAGGGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 617
43029291 | 1 | GTGGGAGACGGAGAGGGGCA | 366 | GGG | GTGGGAGACGGAGAGGGGCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 618
43029294 | 1 | GGAGACGGAGAGGGGCAGGG | 367 | TGG | GGAGACGGAGAGGGGCAGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 619
43029295 | 1 | GAGACGGAGAGGGGCAGGGT | 368 | GGG | GAGACGGAGAGGGGCAGGGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 620
43029303 | 1 | GAGGGGCAGGGTGGGATCAC | 369 | AGG | GAGGGGCAGGGTGGGATCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 621
43029304 | 1 | AGGGGCAGGGTGGGATCACA | 370 | GGG | AGGGGCAGGGTGGGATCACAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 622
43029310 | 1 | AGGGTGGGATCACAGGGAGC | 371 | AGG | AGGGTGGGATCACAGGGAGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 623
43029315 | 1 | GGGATCACAGGGAGCAGGAG | 372 | CGG | GGGATCACAGGGAGCAGGAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 624
43029316 | 1 | GGATCACAGGGAGCAGGAGC | 373 | GGG | GGATCACAGGGAGCAGGAGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 625
43029317 | 1 | GATCACAGGGAGCAGGAGCG | 374 | GGG | GATCACAGGGAGCAGGAGCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 626
43029323 | 1 | AGGGAGCAGGAGCGGGGAAT | 375 | TGG | AGGGAGCAGGAGCGGGGAATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 627
43029326 | 1 | GAGCAGGAGCGGGGAATTGG | 376 | AGG | GAGCAGGAGCGGGGAATTGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 628
43029335 | 1 | CGGGGAATTGGAGGTGAATC | 377 | TGG | CGGGGAATTGGAGGTGAATCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 629
43029347 | −1 | AATGGACTGGAAGTTTGGGA | 378 | GGG | AATGGACTGGAAGTTTGGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 630
43029348 | −1 | GAATGGACTGGAAGTTTGGG | 379 | AGG | GAATGGACTGGAAGTTTGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 631
43029351 | −1 | GCAGAATGGACTGGAAGTTT | 380 | GGG | GCAGAATGGACTGGAAGTTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 632
43029352 | −1 | AGCAGAATGGACTGGAAGTT | 381 | TGG | AGCAGAATGGACTGGAAGTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 633
43029360 | −1 | CCCCTGGGAGCAGAATGGAC | 382 | TGG | CCCCTGGGAGCAGAATGGACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 634
43029365 | −1 | CGGTTCCCCTGGGAGCAGAA | 383 | TGG | CGGTTCCCCTGGGAGCAGAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 635
43029369 | 1 | TTCCAGTCCATTCTGCTCCC | 384 | AGG | TTCCAGTCCATTCTGCTCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 636
43029370 | 1 | TCCAGTCCATTCTGCTCCCA | 385 | GGG | TCCAGTCCATTCTGCTCCCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 637
43029371 | 1 | CCAGTCCATTCTGCTCCCAG | 386 | GGG | CCAGTCCATTCTGCTCCCAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 638
43029375 | −1 | CGCAGTTTCCCGGTTCCCCT | 387 | GGG | CGCAGTTTCCCGGTTCCCCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 639
43029376 | −1 | CCGCAGTTTCCCGGTTCCCC | 388 | TGG | CCGCAGTTTCCCGGTTCCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 640
43029377 | 1 | CATTCTGCTCCCAGGGGAAC | 389 | CGG | CATTCTGCTCCCAGGGGAACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 641
43029378 | 1 | ATTCTGCTCCCAGGGGAACC | 390 | GGG | ATTCTGCTCCCAGGGGAACCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 642
43029385 | −1 | CCAGTTCCCCCGCAGTTTCC | 391 | CGG | CCAGTTCCCCCGCAGTTTCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 643
43029387 | 1 | CCAGGGGAACCGGGAAACTG | 392 | CGG | CCAGGGGAACCGGGAAACTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 644
43029388 | 1 | CAGGGGAACCGGGAAACTGC | 393 | GGG | CAGGGGAACCGGGAAACTGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 645
43029389 | 1 | AGGGGAACCGGGAAACTGCG | 394 | GGG | AGGGGAACCGGGAAACTGCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 646
43029390 | 1 | GGGGAACCGGGAAACTGCGG | 395 | GGG | GGGGAACCGGGAAACTGCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 647
43029396 | 1 | CCGGGAAACTGCGGGGGAAC | 396 | TGG | CCGGGAAACTGCGGGGGAACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 648
43029400 | 1 | GAAACTGCGGGGGAACTGGA | 397 | AGG | GAAACTGCGGGGGAACTGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 649
43029401 | 1 | AAACTGCGGGGGAACTGGAA | 398 | GGG | AAACTGCGGGGGAACTGGAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 650
43029417 | 1 | GGAAGGGAGCTCCCAGAACA | 399 | AGG | GGAAGGGAGCTCCCAGAACAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 651
43029417 | −1 | ATCTTCTGGATCCTTGTTCT | 400 | GGG | ATCTTCTGGATCCTTGTTCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 652
43029418 | −1 | AATCTTCTGGATCCTTGTTC | 401 | TGG | AATCTTCTGGATCCTTGTTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 653
43029431 | 1 | AGAACAAGGATCCAGAAGAT | 402 | TGG | AGAACAAGGATCCAGAAGATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 654
43029431 | −1 | GGCCCCAGATGCCAATCTTC | 403 | TGG | GGCCCCAGATGCCAATCTTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 655
43029438 | 1 | GGATCCAGAAGATTGGCATC | 404 | TGG | GGATCCAGAAGATTGGCATCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 656
43029439 | 1 | GATCCAGAAGATTGGCATCT | 405 | GGG | GATCCAGAAGATTGGCATCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 657
43029440 | 1 | ATCCAGAAGATTGGCATCTG | 406 | GGG | ATCCAGAAGATTGGCATCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 658
43029445 | 1 | GAAGATTGGCATCTGGGGCC | 407 | TGG | GAAGATTGGCATCTGGGGCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 659
43029446 | 1 | AAGATTGGCATCTGGGGCCT | 408 | GGG | AAGATTGGCATCTGGGGCCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 660
43029452 | −1 | GATTTAGAAACCTAAATCCC | 409 | AGG | GATTTAGAAACCTAAATCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 661
43029453 | 1 | GCATCTGGGGCCTGGGATTT | 410 | AGG | GCATCTGGGGCCTGGGATTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 662
43029467 | 1 | GGATTTAGGTTTCTAAATCG | 411 | TGG | GGATTTAGGTTTCTAAATCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 663
43029468 | 1 | GATTTAGGTTTCTAAATCGT | 412 | GGG | GATTTAGGTTTCTAAATCGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 664
43029474 | 1 | GGTTTCTAAATCGTGGGCCA | 413 | TGG | GGTTTCTAAATCGTGGGCCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 665
43029475 | 1 | GTTTCTAAATCGTGGGCCAT | 414 | GGG | GTTTCTAAATCGTGGGCCATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 666
43029476 | 1 | TTTCTAAATCGTGGGCCATG | 415 | GGG | TTTCTAAATCGTGGGCCATGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 667
43029480 | −1 | GCAGAGATAAGGCTGCCCCA | 416 | TGG | GCAGAGATAAGGCTGCCCCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 668
43029491 | −1 | TCAATGCTTTTGCAGAGATA | 417 | AGG | TCAATGCTTTTGCAGAGATAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 669
43029504 | 1 | TTATCTCTGCAAAAGCATTG | 418 | AGG | TTATCTCTGCAAAAGCATTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 670
43029505 | 1 | TATCTCTGCAAAAGCATTGA | 419 | GGG | TATCTCTGCAAAAGCATTGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 671
43029523 | 1 | GAGGGTAGAAGTCAATGATT | 420 | TGG | GAGGGTAGAAGTCAATGATTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 672
43029524 | 1 | AGGGTAGAAGTCAATGATTT | 421 | GGG | AGGGTAGAAGTCAATGATTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 673
43029540 | 1 | ATTTGGGAAGTTATTGAATT | 422 | AGG | ATTTGGGAAGTTATTGAATTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 674
43029541 | 1 | TTTGGGAAGTTATTGAATTA | 423 | GGG | TTTGGGAAGTTATTGAATTAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 675
43029542 | 1 | TTGGGAAGTTATTGAATTAG | 424 | GGG | TTGGGAAGTTATTGAATTAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 676
43029549 | 1 | GTTATTGAATTAGGGGATCT | 425 | CGG | GTTATTGAATTAGGGGATCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 677
43029552 | 1 | ATTGAATTAGGGGATCTCGG | 333 | AGG | ATTGAATTAGGGGATCTCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 678
43029556 | 1 | AATTAGGGGATCTCGGAGGT | 426 | AGG | AATTAGGGGATCTCGGAGGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 679
43029577 | −1 | GCATTCTAACTGATACTATC | 427 | AGG | GCATTCTAACTGATACTATCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 680
43029597 | 1 | ATCAGTTAGAATGCCTGACT | 428 | TGG | ATCAGTTAGAATGCCTGACTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 681
43029598 | 1 | TCAGTTAGAATGCCTGACTT | 429 | GGG | TCAGTTAGAATGCCTGACTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 682
43029599 | 1 | CAGTTAGAATGCCTGACTTG | 430 | GGG | CAGTTAGAATGCCTGACTTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 683
43029599 | −1 | AGCCATTGTCACCCCAAGTC | 340 | AGG | AGCCATTGTCACCCCAAGTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 684
43029608 | 1 | TGCCTGACTTGGGGTGACAA | 431 | TGG | TGCCTGACTTGGGGTGACAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 685
43029613 | 1 | GACTTGGGGTGACAATGGCT | 334 | TGG | GACTTGGGGTGACAATGGCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 686
43029616 | 1 | TTGGGGTGACAATGGCTTGG | 432 | AGG | TTGGGGTGACAATGGCTTGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 687
43029617 | 1 | TGGGGTGACAATGGCTTGGA | 433 | GGG | TGGGGTGACAATGGCTTGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 688
43029618 | 1 | GGGGTGACAATGGCTTGGAG | 434 | GGG | GGGGTGACAATGGCTTGGAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 689
43029621 | 1 | GTGACAATGGCTTGGAGGGG | 435 | TGG | GTGACAATGGCTTGGAGGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 690
43029622 | 1 | TGACAATGGCTTGGAGGGGT | 436 | GGG | TGACAATGGCTTGGAGGGGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 691
43029632 | 1 | TTGGAGGGGTGGGTGAGTCA | 437 | AGG | TTGGAGGGGTGGGTGAGTCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 692
43029633 | 1 | TGGAGGGGTGGGTGAGTCAA | 438 | GGG | TGGAGGGGTGGGTGAGTCAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 693
43029656 | −1 | AGGCAGGCATCATGACTCAC | 439 | GGG | AGGCAGGCATCATGACTCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 694
43029657 | −1 | AAGGCAGGCATCATGACTCA | 440 | CGG | AAGGCAGGCATCATGACTCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 695
43029672 | −1 | AGTTATCAATTGTACAAGGC | 441 | AGG | AGTTATCAATTGTACAAGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 696
43029676 | −1 | GTTCAGTTATCAATTGTACA | 442 | AGG | GTTCAGTTATCAATTGTACAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 697
43029692 | 1 | TACAATTGATAACTGAACAT | 443 | CGG | TACAATTGATAACTGAACATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 698
43029701 | 1 | TAACTGAACATCGGTGAGTT | 444 | AGG | TAACTGAACATCGGTGAGTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 699
43029702 | 1 | AACTGAACATCGGTGAGTTA | 2511 | GGG | AACTGAACATCGGTGAGTTAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 700
43029714 | −1 | GGTGCTAATTACAACTGCTG | 445 | GGG | GGTGCTAATTACAACTGCTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 701
43029715 | −1 | GGGTGCTAATTACAACTGCT | 446 | GGG | GGGTGCTAATTACAACTGCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 702
43029716 | −1 | GGGGTGCTAATTACAACTGC | 447 | TGG | GGGGTGCTAATTACAACTGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 703
43029729 | 1 | AGCAGTTGTAATTAGCACCC | 448 | CGG | AGCAGTTGTAATTAGCACCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 704
43029730 | 1 | GCAGTTGTAATTAGCACCCC | 449 | GGG | GCAGTTGTAATTAGCACCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 705
43029735 | −1 | TGGTTTCTGGCTGACACCCG | 2512 | GGG | TGGTTTCTGGCTGACACCCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 706
43029736 | −1 | TTGGTTTCTGGCTGACACCC | 450 | GGG | TTGGTTTCTGGCTGACACCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 707
43029737 | −1 | GTTGGTTTCTGGCTGACACC | 451 | CGG | GTTGGTTTCTGGCTGACACCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 708
43029748 | −1 | TTTGGCTGTTTGTTGGTTTC | 452 | TGG | TTTGGCTGTTTGTTGGTTTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 709
43029755 | −1 | GCAGGGATTTGGCTGTTTGT | 453 | TGG | GCAGGGATTTGGCTGTTTGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 710
43029766 | −1 | TGGGCGGGGCTGCAGGGATT | 454 | TGG | TGGGCGGGGCTGCAGGGATTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 711
43029772 | −1 | ATAGGCTGGGCGGGGCTGCA | 455 | GGG | ATAGGCTGGGCGGGGCTGCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 712
43029773 | −1 | GATAGGCTGGGCGGGGCTGC | 456 | AGG | GATAGGCTGGGCGGGGCTGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 713
43029780 | −1 | GCCGGTGGATAGGCTGGGCG | 457 | GGG | GCCGGTGGATAGGCTGGGCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 714
43029781 | −1 | CGCCGGTGGATAGGCTGGGC | 458 | GGG | CGCCGGTGGATAGGCTGGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 715
43029782 | −1 | CCGCCGGTGGATAGGCTGGG | 459 | CGG | CCGCCGGTGGATAGGCTGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 716
43029785 | −1 | CCCCCGCCGGTGGATAGGCT | 460 | GGG | CCCCCGCCGGTGGATAGGCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 717
43029786 | −1 | TCCCCCGCCGGTGGATAGGC | 461 | TGG | TCCCCCGCCGGTGGATAGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 718
43029790 | 1 | GCCCCGCCCAGCCTATCCAC | 462 | CGG | GCCCCGCCCAGCCTATCCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 719
43029790 | −1 | TCGGTCCCCCGCCGGTGGAT | 463 | AGG | TCGGTCCCCCGCCGGTGGATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 720
43029793 | 1 | CCGCCCAGCCTATCCACCGG | 464 | CGG | CCGCCCAGCCTATCCACCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 721
43029794 | 1 | CGCCCAGCCTATCCACCGGC | 465 | GGG | CGCCCAGCCTATCCACCGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 722
43029795 | 1 | GCCCAGCCTATCCACCGGCG | 466 | GGG | GCCCAGCCTATCCACCGGCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 723
43029795 | −1 | GTTAATCGGTCCCCCGCCGG | 467 | TGG | GTTAATCGGTCCCCCGCCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 724
43029796 | 1 | CCCAGCCTATCCACCGGCGG | 468 | GGG | CCCAGCCTATCCACCGGCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 725
43029798 | −1 | ATGGTTAATCGGTCCCCCGC | 2513 | CGG | ATGGTTAATCGGTCCCCCGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 726
43029809 | −1 | GGTGGGGGTTAATGGTTAAT | 469 | CGG | GGTGGGGGTTAATGGTTAATGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 727
43029817 | −1 | CGGGGAGGGGTGGGGGTTAA | 470 | TGG | CGGGGAGGGGTGGGGGTTAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 728
43029824 | −1 | GCTCTGCCGGGGAGGGGTGG | 471 | GGG | GCTCTGCCGGGGAGGGGTGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 729
43029825 | −1 | GGCTCTGCCGGGGAGGGGTG | 472 | GGG | GGCTCTGCCGGGGAGGGGTGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 730
43029826 | −1 | AGGCTCTGCCGGGGAGGGGT | 473 | GGG | AGGCTCTGCCGGGGAGGGGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 731
43029827 | −1 | GAGGCTCTGCCGGGGAGGGG | 474 | TGG | GAGGCTCTGCCGGGGAGGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 732
43029829 | 1 | CATTAACCCCCACCCCTCCC | 475 | CGG | CATTAACCCCCACCCCTCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 733
43029830 | −1 | GTGGAGGCTCTGCCGGGGAG | 476 | GGG | GTGGAGGCTCTGCCGGGGAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 734
43029831 | −1 | GGTGGAGGCTCTGCCGGGGA | 477 | GGG | GGTGGAGGCTCTGCCGGGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 735
43029832 | −1 | GGGTGGAGGCTCTGCCGGGG | 478 | AGG | GGGTGGAGGCTCTGCCGGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 736
43029835 | −1 | AAGGGGTGGAGGCTCTGCCG | 479 | GGG | AAGGGGTGGAGGCTCTGCCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 737
43029836 | −1 | GAAGGGGTGGAGGCTCTGCC | 480 | GGG | GAAGGGGTGGAGGCTCTGCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 738
43029837 | −1 | TGAAGGGGTGGAGGCTCTGC | 481 | CGG | TGAAGGGGTGGAGGCTCTGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 739
43029846 | −1 | TAGCCTCTGTGAAGGGGTGG | 482 | AGG | TAGCCTCTGTGAAGGGGTGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 740
43029849 | −1 | GCCTAGCCTCTGTGAAGGGG | 483 | TGG | GCCTAGCCTCTGTGAAGGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 741
43029852 | −1 | TTGGCCTAGCCTCTGTGAAG | 484 | GGG | TTGGCCTAGCCTCTGTGAAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 742
43029853 | −1 | CTTGGCCTAGCCTCTGTGAA | 485 | GGG | CTTGGCCTAGCCTCTGTGAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 743
43029854 | 1 | GAGCCTCCACCCCTTCACAG | 486 | AGG | GAGCCTCCACCCCTTCACAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 744
43029854 | −1 | TCTTGGCCTAGCCTCTGTGA | 487 | AGG | TCTTGGCCTAGCCTCTGTGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 745
43029859 | 1 | TCCACCCCTTCACAGAGGCT | 488 | AGG | TCCACCCCTTCACAGAGGCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 746
43029871 | −1 | GGAAGATCTGCTGGGAGTCT | 489 | TGG | GGAAGATCTGCTGGGAGTCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 747
43029879 | −1 | GTCCTCTGGGAAGATCTGCT | 2514 | GGG | GTCCTCTGGGAAGATCTGCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 748
43029880 | −1 | CGTCCTCTGGGAAGATCTGC | 490 | TGG | CGTCCTCTGGGAAGATCTGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 749
43029888 | 1 | CTCCCAGCAGATCTTCCCAG | 491 | AGG | CTCCCAGCAGATCTTCCCAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 750
43029892 | 1 | CAGCAGATCTTCCCAGAGGA | 492 | CGG | CAGCAGATCTTCCCAGAGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 751
43029892 | −1 | TTCCTTTCAAACCGTCCTCT | 493 | GGG | TTCCTTTCAAACCGTCCTCTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 752
43029893 | −1 | CTTCCTTTCAAACCGTCCTC | 494 | TGG | CTTCCTTTCAAACCGTCCTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 753
43029901 | 1 | TTCCCAGAGGACGGTTTGAA | 495 | AGG | TTCCCAGAGGACGGTTTGAAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 754
43029905 | 1 | CAGAGGACGGTTTGAAAGGA | 496 | AGG | CAGAGGACGGTTTGAAAGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 755
43029913 | 1 | GGTTTGAAAGGAAGGCAGAG | 2515 | AGG | GGTTTGAAAGGAAGGCAGAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 756
43029914 | 1 | GTTTGAAAGGAAGGCAGAGA | 497 | GGG | GTTTGAAAGGAAGGCAGAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 757
43029920 | 1 | AAGGAAGGCAGAGAGGGCAC | 498 | TGG | AAGGAAGGCAGAGAGGGCACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 758
43029921 | 1 | AGGAAGGCAGAGAGGGCACT | 499 | GGG | AGGAAGGCAGAGAGGGCACTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 759
43029924 | 1 | AAGGCAGAGAGGGCACTGGG | 500 | AGG | AAGGCAGAGAGGGCACTGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 760
43029927 | 1 | GCAGAGAGGGCACTGGGAGG | 501 | AGG | GCAGAGAGGGCACTGGGAGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 761
43029933 | 1 | AGGGCACTGGGAGGAGGCAG | 502 | TGG | AGGGCACTGGGAGGAGGCAGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 762
43029934 | 1 | GGGCACTGGGAGGAGGCAGT | 503 | GGG | GGGCACTGGGAGGAGGCAGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 763
43029937 | 1 | CACTGGGAGGAGGCAGTGGG | 504 | AGG | CACTGGGAGGAGGCAGTGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 764
43029938 | 1 | ACTGGGAGGAGGCAGTGGGA | 505 | GGG | ACTGGGAGGAGGCAGTGGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 765
43029941 | 1 | GGGAGGAGGCAGTGGGAGGG | 506 | CGG | GGGAGGAGGCAGTGGGAGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 766
43029944 | 1 | AGGAGGCAGTGGGAGGGCGG | 507 | AGG | AGGAGGCAGTGGGAGGGCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 767
43029945 | 1 | GGAGGCAGTGGGAGGGCGGA | 508 | GGG | GGAGGCAGTGGGAGGGCGGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 768
43029948 | 1 | GGCAGTGGGAGGGCGGAGGG | 509 | CGG | GGCAGTGGGAGGGCGGAGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 769
43029949 | 1 | GCAGTGGGAGGGCGGAGGGC | 510 | GGG | GCAGTGGGAGGGCGGAGGGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 770
43029950 | 1 | CAGTGGGAGGGCGGAGGGCG | 511 | GGG | CAGTGGGAGGGCGGAGGGCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 771
43029951 | 1 | AGTGGGAGGGCGGAGGGCGG | 512 | GGG | AGTGGGAGGGCGGAGGGCGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 772
43029958 | 1 | GGGCGGAGGGCGGGGGCCTT | 513 | CGG | GGGCGGAGGGCGGGGGCCTTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 773
43029959 | 1 | GGCGGAGGGCGGGGGCCTTC | 514 | GGG | GGCGGAGGGCGGGGGCCTTCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 774
43029960 | 1 | GCGGAGGGCGGGGGCCTTCG | 515 | GGG | GCGGAGGGCGGGGGCCTTCGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 775
43029963 | 1 | GAGGGCGGGGGCCTTCGGGG | 516 | TGG | GAGGGCGGGGGCCTTCGGGGGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 776
43029963 | −1 | ACCCTGGGCGCCCACCCCGA | 2516 | AGG | ACCCTGGGCGCCCACCCCGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 777
43029964 | 1 | AGGGCGGGGGCCTTCGGGGT | 517 | GGG | AGGGCGGGGGCCTTCGGGGTGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 778
43029972 | 1 | GGCCTTCGGGGTGGGCGCCC | 518 | AGG | GGCCTTCGGGGTGGGCGCCCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 779
43029973 | 1 | GCCTTCGGGGTGGGCGCCCA | 519 | GGG | GCCTTCGGGGTGGGCGCCCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTT | 780






43029977
1
TCGGGGTGGGCGCCC
2517
AGG
TCGGGGTGGGCGCCCAGGGTGTTTTAGAGCTAGAAATAGCAAGTTA
781




AGGGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029978
1
CGGGGTGGGCGCCCA
520
GGG
CGGGGTGGGCGCCCAGGGTAGTTTTAGAGCTAGAAATAGCAAGTTA
782




GGGTA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029978
−1
GCGGCCACCTGCCCT
521
GGG
GCGGCCACCTGCCCTACCCTGTTTTAGAGCTAGAAATAGCAAGTTA
783




ACCCT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029979
−1
CGCGGCCACCTGCCC
522
TGG
CGCGGCCACCTGCCCTACCCGTTTTAGAGCTAGAAATAGCAAGTTA
784




TACCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029982
1
GTGGGCGCCCAGGGT
523
AGG
GTGGGCGCCCAGGGTAGGGCGTTTTAGAGCTAGAAATAGCAAGTTA
785




AGGGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029985
1
GGCGCCCAGGGTAGG
524
TGG
GGCGCCCAGGGTAGGGCAGGGTTTTAGAGCTAGAAATAGCAAGTTA
786




GCAGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029991
1
CAGGGTAGGGCAGGT
525
CGG
CAGGGTAGGGCAGGTGGCCGGTTTTAGAGCTAGAAATAGCAAGTTA
787




GGCCG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029996
1
TAGGGCAGGTGGCCG
526
TGG
TAGGGCAGGTGGCCGCGGCGGTTTTAGAGCTAGAAATAGCAAGTTA
788




CGGCG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029997
−1
TTCTCCCTGCCTCCA
2518
CGG
TTCTCCCTGCCTCCACGCCGGTTTTAGAGCTAGAAATAGCAAGTTA
789




CGCCG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43029999
1
GGCAGGTGGCCGCGG
527
AGG
GGCAGGTGGCCGCGGCGTGGGTTTTAGAGCTAGAAATAGCAAGTTA
790




CGTGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030003
1
GGTGGCCGCGGCGTG
528
AGG
GGTGGCCGCGGCGTGGAGGCGTTTTAGAGCTAGAAATAGCAAGTTA
791




GAGGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030004
1
GTGGCCGCGGCGTGG
529
GGG
GTGGCCGCGGCGTGGAGGCAGTTTTAGAGCTAGAAATAGCAAGTTA
792




AGGCA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030029
−1
GTCCATGTCGACGAG
530
TGG
GTCCATGTCGACGAGGGTTTGTTTTAGAGCTAGAAATAGCAAGTTA
793




GGTTT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030035
−1
GGCCATGTCCATGTC
531
GGG
GGCCATGTCCATGTCGACGAGTTTTAGAGCTAGAAATAGCAAGTTA
794




GACGA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030036
−1
CGGCCATGTCCATGT
532
AGG
CGGCCATGTCCATGTCGACGGTTTTAGAGCTAGAAATAGCAAGTTA
795




CGACG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030038
1
CTCCAAAACCCTCGT
533
TGG
CTCCAAAACCCTCGTCGACAGTTTTAGAGCTAGAAATAGCAAGTTA
796




CGACA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030044
1
AACCCTCGTCGACAT
534
TGG
AACCCTCGTCGACATGGACAGTTTTAGAGCTAGAAATAGCAAGTTA
797




GGACA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030056
−1
GTCCAGTGCAGCACT
535
CGG
GTCCAGTGCAGCACTGTAGTGTTTTAGAGCTAGAAATAGCAAGTTA
798




GTAGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030065
1
GGCCGACTACAGTGC
536
TGG
GGCCGACTACAGTGCTGCACGTTTTAGAGCTAGAAATAGCAAGTTA
799




TGCAC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030078
−1
ATTCCAGGGTGGTGT
537
GGG
ATTCCAGGGTGGTGTAGGCTGTTTTAGAGCTAGAAATAGCAAGTTA
800




AGGCT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030079
−1
AATTCCAGGGTGGTG
538
TGG
AATTCCAGGGTGGTGTAGGCGTTTTAGAGCTAGAAATAGCAAGTTA
801




TAGGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030083
−1
CTCAAATTCCAGGGT
539
AGG
CTCAAATTCCAGGGTGGTGTGTTTTAGAGCTAGAAATAGCAAGTTA
802




GGTGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030086
1
GGACCCAGCCTACACCACCC
540
TGG
GGACCCAGCCTACACCACCCGTTTTAGAGCTAGAAATAGCAAGTTA
803







AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030089
−1
CACATTCTCAAATTC
541
TGG
CACATTCTCAAATTCCAGGGGTTTTAGAGCTAGAAATAGCAAGTTA
804




CAGGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030092
−1
CTGCACATTCTCAAA
542
GGG
CTGCACATTCTCAAATTCCAGTTTTAGAGCTAGAAATAGCAAGTTA
805




TTCCA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030093
−1
CCTGCACATTCTCAA
543
AGG
CCTGCACATTCTCAAATTCCGTTTTAGAGCTAGAAATAGCAAGTTA
806




ATTCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030104
1
CCTGGAATTTGAGAA
544
AGG
CCTGGAATTTGAGAATGTGCGTTTTAGAGCTAGAAATAGCAAGTTA
807




TGTGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030116
1
GAATGTGCAGGTGTT
545
TGG
GAATGTGCAGGTGTTGACGAGTTTTAGAGCTAGAAATAGCAAGTTA
808




GACGA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030117
1
AATGTGCAGGTGTTG
546
GGG
AATGTGCAGGTGTTGACGATGTTTTAGAGCTAGAAATAGCAAGTTA
809




ACGAT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030123
1
CAGGTGTTGACGATG
547
TGG
CAGGTGTTGACGATGGGCAAGTTTTAGAGCTAGAAATAGCAAGTTA
810




GGCAA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030127
1
TGTTGACGATGGGCA
548
AGG
TGTTGACGATGGGCAATGGTGTTTTAGAGCTAGAAATAGCAAGTTA
811




ATGGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030130
1
TGACGATGGGCAATG
549
TGG
TGACGATGGGCAATGGTAGGGTTTTAGAGCTAGAAATAGCAAGTTA
812




GTAGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030131
1
GACGATGGGCAATGGTAGGT
550
GGG
GACGATGGGCAATGGTAGGTGTTTTAGAGCTAGAAATAGCAAGTTA
813







AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030132
1
ACGATGGGCAATGGT
551
GGG
ACGATGGGCAATGGTAGGTGGTTTTAGAGCTAGAAATAGCAAGTTA
814




AGGTG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030133
1
CGATGGGCAATGGTA
552
GGG
CGATGGGCAATGGTAGGTGGGTTTTAGAGCTAGAAATAGCAAGTTA
815




GGTGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030147
1
AGGTGGGGGCAGATG
553
AGG
AGGTGGGGGCAGATGTGCCCGTTTTAGAGCTAGAAATAGCAAGTTA
816




TGCCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030153
−1
CTGCCCCCACTGGCA
554
GGG
CTGCCCCCACTGGCACACCTGTTTTAGAGCTAGAAATAGCAAGTTA
817




CACCT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030154
−1
CCTGCCCCCACTGGC
555
TGG
CCTGCCCCCACTGGCACACCGTTTTAGAGCTAGAAATAGCAAGTTA
818




ACACC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030158
1
GATGTGCCCAGGTGT
556
TGG
GATGTGCCCAGGTGTGCCAGGTTTTAGAGCTAGAAATAGCAAGTTA
819




GCCAG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030159
1
ATGTGCCCAGGTGTG
557
GGG
ATGTGCCCAGGTGTGCCAGTGTTTTAGAGCTAGAAATAGCAAGTTA
820




CCAGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030160
1
TGTGCCCAGGTGTGC
558
GGG
TGTGCCCAGGTGTGCCAGTGGTTTTAGAGCTAGAAATAGCAAGTTA
821




CAGTG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030161
1
GTGCCCAGGTGTGCC
559
GGG
GTGCCCAGGTGTGCCAGTGGGTTTTAGAGCTAGAAATAGCAAGTTA
822




AGTGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030163
−1
CCAGGCACACCTGCC
560
TGG
CCAGGCACACCTGCCCCCACGTTTTAGAGCTAGAAATAGCAAGTTA
823




CCCAC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030165
1
CCAGGTGTGCCAGTG
561
AGG
CCAGGTGTGCCAGTGGGGGCGTTTTAGAGCTAGAAATAGCAAGTTA
824




GGGGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030174
1
CCAGTGGGGGCAGGT
562
TGG
CCAGTGGGGGCAGGTGTGCCGTTTTAGAGCTAGAAATAGCAAGTTA
825




GTGCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030175
1
CAGTGGGGGCAGGTG
563
GGG
CAGTGGGGGCAGGTGTGCCTGTTTTAGAGCTAGAAATAGCAAGTTA
826




TGCCT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030181
1
GGGCAGGTGTGCCTG
564
AGG
GGGCAGGTGTGCCTGGGTCCGTTTTAGAGCTAGAAATAGCAAGTTA
827




GGTCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030181
−1
AAAGATCTGCTCCTG
565
AGG
AAAGATCTGCTCCTGGACCCGTTTTAGAGCTAGAAATAGCAAGTTA
828




GACCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030188
−1
GAGTGCCAAAGATCT
566
TGG
GAGTGCCAAAGATCTGCTCCGTTTTAGAGCTAGAAATAGCAAGTTA
829




GCTCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030194
1
TGGGTCCAGGAGCAG
567
TGG
TGGGTCCAGGAGCAGATCTTGTTTTAGAGCTAGAAATAGCAAGTTA
830




ATCTT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030207
1
AGATCTTTGGCACTC
568
TGG
AGATCTTTGGCACTCAACTTGTTTTAGAGCTAGAAATAGCAAGTTA
831




AACTT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030208
1
GATCTTTGGCACTCA
569
GGG
GATCTTTGGCACTCAACTTTGTTTTAGAGCTAGAAATAGCAAGTTA
832




ACTTT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030209
1
ATCTTTGGCACTCAACTTTG
570
GGG
ATCTTTGGCACTCAACTTTGGTTTTAGAGCTAGAAATAGCAAGTTA
833







AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030212
1
TTTGGCACTCAACTT
571
TGG
TTTGGCACTCAACTTTGGGGGTTTTAGAGCTAGAAATAGCAAGTTA
834




TGGGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030213
1
TTGGCACTCAACTTT
572
GGG
TTGGCACTCAACTTTGGGGTGTTTTAGAGCTAGAAATAGCAAGTTA
835




GGGGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030216
1
GCACTCAACTTTGGG
573
AGG
GCACTCAACTTTGGGGTGGGGTTTTAGAGCTAGAAATAGCAAGTTA
836




GTGGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030233
1
GGGAGGAGAATGATA
574
TGG
GGGAGGAGAATGATACAAAAGTTTTAGAGCTAGAAATAGCAAGTTA
837




CAAAA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030237
1
GGAGAATGATACAAA
575
AGG
GGAGAATGATACAAAATGGTGTTTTAGAGCTAGAAATAGCAAGTTA
838




ATGGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030241
1
AATGATACAAAATGG
576
TGG
AATGATACAAAATGGTAGGTGTTTTAGAGCTAGAAATAGCAAGTTA
839




TAGGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030250
1
AAATGGTAGGTTGGT
577
AGG
AAATGGTAGGTTGGTCCTACGTTTTAGAGCTAGAAATAGCAAGTTA
840




CCTAC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030254
−1
CAACACCTGTGCTGG
578
AGG
CAACACCTGTGCTGGCCTGTGTTTTAGAGCTAGAAATAGCAAGTTA
841




CCTGT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030260
1
TTGGTCCTACAGGCC
579
AGG
TTGGTCCTACAGGCCAGCACGTTTTAGAGCTAGAAATAGCAAGTTA
842




AGCAC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030262
−1
TCACTTGGCAACACC
580
TGG
TCACTTGGCAACACCTGTGCGTTTTAGAGCTAGAAATAGCAAGTTA
843




TGTGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030277
−1
CTGGGCACATGGGCT
581
TGG
CTGGGCACATGGGCTTCACTGTTTTAGAGCTAGAAATAGCAAGTTA
844




TCACT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030287
−1
ATCACTGTGCCTGGG
582
GGG
ATCACTGTGCCTGGGCACATGTTTTAGAGCTAGAAATAGCAAGTTA
845




CACAT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030288
−1
GATCACTGTGCCTGG
583
TGG
GATCACTGTGCCTGGGCACAGTTTTAGAGCTAGAAATAGCAAGTTA
846




GCACA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030289
1
CAAGTGAAGCCCATG
584
AGG
CAAGTGAAGCCCATGTGCCCGTTTTAGAGCTAGAAATAGCAAGTTA
847




TGCCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030295
−1
TGCCTGTGATCACTG
585
GGG
TGCCTGTGATCACTGTGCCTGTTTTAGAGCTAGAAATAGCAAGTTA
848




TGCCT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030296
−1
ATGCCTGTGATCACT
586
TGG
ATGCCTGTGATCACTGTGCCGTTTTAGAGCTAGAAATAGCAAGTTA
849




GTGCC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030304
1
TGCCCAGGCACAGTG
587
AGG
TGCCCAGGCACAGTGATCACGTTTTAGAGCTAGAAATAGCAAGTTA
850




ATCAC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030312
1
CACAGTGATCACAGG
588
TGG
CACAGTGATCACAGGCATTCGTTTTAGAGCTAGAAATAGCAAGTTA
851




CATTC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030313
1
ACAGTGATCACAGGC
589
GGG
ACAGTGATCACAGGCATTCTGTTTTAGAGCTAGAAATAGCAAGTTA
852




ATTCT


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030319
1
ATCACAGGCATTCTGGGTGA
590
AGG
ATCACAGGCATTCTGGGTGAGTTTTAGAGCTAGAAATAGCAAGTTA
853







AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030320
1
TCACAGGCATTCTGG
591
GGG
TCACAGGCATTCTGGGTGAAGTTTTAGAGCTAGAAATAGCAAGTTA
854




GTGAA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030323
1
CAGGCATTCTGGGTG
592
AGG
CAGGCATTCTGGGTGAAGGGGTTTTAGAGCTAGAAATAGCAAGTTA
855




AAGGG


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030332
1
TGGGTGAAGGGAGGC
593
AGG
TGGGTGAAGGGAGGCCTGCAGTTTTAGAGCTAGAAATAGCAAGTTA
856




CTGCA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030333
1
GGGTGAAGGGAGGCC
594
GGG
GGGTGAAGGGAGGCCTGCAAGTTTTAGAGCTAGAAATAGCAAGTTA
857




TGCAA


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT






43030335
−1
TGCTGGAAATTGGCC
595
AGG
TGCTGGAAATTGGCCCTTGCGTTTTAGAGCTAGAAATAGCAAGTTA
858




CTTGC


AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG








GTGCTTTT
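As an illustration of how the guide entries tabulated above are composed, the following minimal Python sketch joins a 20-nucleotide protospacer to the constant scaffold that follows it in each row, and reads the three nucleotides that follow the protospacer in an excerpt of the forward strand. The function names `build_guide` and `find_pam` are hypothetical and are not part of the disclosure; the sequences are copied from the row for SEQ ID NO: 758 and from SEQ ID NO: 859, and are written as DNA, as in the table.

```python
# Constant scaffold, copied from the portion of each tabulated guide that
# follows the 20-nt protospacer (written as DNA, as in the table).
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTA"
    "AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCG"
    "GTGCTTTT"
)

# Excerpt of the forward strand of the expression control region (SEQ ID NO: 859).
FORWARD_EXCERPT = "TTGAAAGGAAGGCAGAGAGGGCACTGGGAGGAGGCAGTGGGAGGGCGGAGG"


def build_guide(protospacer: str, scaffold: str = SCAFFOLD) -> str:
    """Join a 20-nt protospacer to the scaffold (hypothetical helper)."""
    if len(protospacer) != 20:
        raise ValueError("expected a 20-nt protospacer")
    return protospacer.upper() + scaffold


def find_pam(protospacer: str, region: str):
    """Return the 3 nt that follow the protospacer in region, or None."""
    start = region.find(protospacer)
    if start < 0:
        return None
    end = start + len(protospacer)
    pam = region[end:end + 3]
    return pam if len(pam) == 3 else None


protospacer = "AAGGAAGGCAGAGAG" + "GGCAC"  # the two table cells of one row, joined
guide = build_guide(protospacer)
pam = find_pam(protospacer, FORWARD_EXCERPT)
print(guide)
print(pam)
```

For a guide listed with strand −1, the same lookup would be run against the reverse strand sequence rather than the forward strand.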









Table 10 below provides exemplary target nucleotide sequences of an HNF4α expression control genomic region and the corresponding zinc finger DNA binding domain amino acid sequences and amino acid structures of disrupting agent fusion proteins comprising a zinc finger polypeptide and an effector suitable for use in the present invention.


The forward strand of the HNF4α expression control region targeted by the exemplary target nucleotide sequence in Table 10 comprises the nucleotide sequence of:









(SEQ ID NO: 859)


CTCCGGGAGGGGGTGGGGGTGAGGGAAACAGGAGAATGTGATGGGAAAATC





CGAGATGGAGCCAGCCTGGGCCAGAAACACTGGGAGCTGTGGGAGACGGAG





AGGGGCAGGGTGGGATCACAGGGAGCAGGAGCGGGGAATTGGAGGTGAATC





TGGCCCTCCCAAACTTCCAGTCCATTCTGCTCCCAGGGGAACCGGGAAACT





GCGGGGGAACTGGAAGGGAGCTCCCAGAACAAGGATCCAGAAGATTGGCAT





CTGGGGCCTGGGATTTAGGTTTCTAAATCGTGGGCCATGGGGCAGCCTTAT





CTCTGCAAAAGCATTGAGGGTAGAAGTCAATGATTTGGGAAGTTATTGAAT





TAGGGGATCTCGGAGGTAGGCTGTCAGTGCCTGATAGTATCAGTTAGAATG





CCTGACTTGGGGTGACAATGGCTTGGAGGGGTGGGTGAGTCAAGGGTCAAA





TGAGTGCCCGTGAGTCATGATGCCTGCCTTGTACAATTGATAACTGAACAT





CGGTGAGTTAGGGCCCCAGCAGTTGTAATTAGCACCCCGGGTGTCAGCCAG





AAACCAACAAACAGCCAAATCCCTGCAGCCCCGCCCAGCCTATCCACCGGC





GGGGGACCGATTAACCATTAACCCCCACCCCTCCCCGGCAGAGCCTCCACC





CCTTCACAGAGGCTAGGCCAAGACTCCCAGCAGATCTTCCCAGAGGACGGT





TTGAAAGGAAGGCAGAGAGGGCACTGGGAGGAGGCAGTGGGAGGGCGGAGG





GCGGGGGCCTTCGGGGTGGGCGCCCAGGGTAGGGCAGGTGGCCGCGGCGTG





GAGGCAGGGAGAATGCGACTCTCCAAAACCCTCGTCGACATGGACATGGCC





GACTACAGTGCTGCACTGGACCCAGCCTACACCACCCTGGAATTTGAGAAT





GTGCAGGTGTTGACGATGGGCAATGGTAGGTGGGGGCAGATGTGCCCAGGT





GTGCCAGTGGGGGCAGGTGTGCCTGGGTCCAGGAGCAGATCTTTGGCACTC





AACTTTGGGGTGGGAGGAGAATGATACAAAATGGTAGGTTGGTCCTACAGG





CCAGCACAGGTGTTGCCAAGTGAAGCCCATGTGCCCAGGCACAGTGATCAC





AGGCATTCTGGGTGAAGGGAGGCCTGCAAGGGCCAATTTCCAGCAAAAGTT





GAT






The reverse strand of the HNF4α expression control region targeted by the exemplary target nucleotide sequence in Table 10 comprises the sequence of:









(SEQ ID NO: 860)


ATCGACTTTTGCTGGAAATTGGCCCTTGCAGGCCTCCCTTCACCCAGAATG





CCTGTGATCACTGTGCCTGGGCACATGGGCTTCACTTGGCAACACCTGTGC





TGGCCTGTAGGACCAACCTACCATTTTGTATCATTCTCCTCCCACCCCAAA





GTTGAGTGCCAAAGATCTGCTCCTGGACCCAGGCACACCTGCCCCCACTGG





CACACCTGGGCACATCTGCCCCCACCTACCATTGCCCATCGTCAACACCTG





CACATTCTCAAATTCCAGGGTGGTGTAGGCTGGGTCCAGTGCAGCACTGTA





GTCGGCCATGTCCATGTCGACGAGGGTTTTGGAGAGTCGCATTCTCCCTGC





CTCCACGCCGCGGCCACCTGCCCTACCCTGGGCGCCCACCCCGAAGGCCCC





CGCCCTCCGCCCTCCCACTGCCTCCTCCCAGTGCCCTCTCTGCCTTCCTTT





CAAACCGTCCTCTGGGAAGATCTGCTGGGAGTCTTGGCCTAGCCTCTGTGA





AGGGGTGGAGGCTCTGCCGGGGAGGGGTGGGGGTTAATGGTTAATCGGTCC





CCCGCCGGTGGATAGGCTGGGCGGGGCTGCAGGGATTTGGCTGTTTGTTGG





TTTCTGGCTGACACCCGGGGTGCTAATTACAACTGCTGGGGCCCTAACTCA





CCGATGTTCAGTTATCAATTGTACAAGGCAGGCATCATGACTCACGGGCAC





TCATTTGACCCTTGACTCACCCACCCCTCCAAGCCATTGTCACCCCAAGTC





AGGCATTCTAACTGATACTATCAGGCACTGACAGCCTACCTCCGAGATCCC





CTAATTCAATAACTTCCCAAATCATTGACTTCTACCCTCAATGCTTTTGCA





GAGATAAGGCTGCCCCATGGCCCACGATTTAGAAACCTAAATCCCAGGCCC





CAGATGCCAATCTTCTGGATCCTTGTTCTGGGAGCTCCCTTCCAGTTCCCC





CGCAGTTTCCCGGTTCCCCTGGGAGCAGAATGGACTGGAAGTTTGGGAGGG





CCAGATTCACCTCCAATTCCCCGCTCCTGCTCCCTGTGATCCCACCCTGCC





CCTCTCCGTCTCCCACAGCTCCCAGTGTTTCTGGCCCAGGCTGGCTCCATC





TCGGATTTTCCCATCACATTCTCCTGTTTCCCTCACCCCCACCCCCTCCCG





GAG
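The relationship between the two strand sequences above can be checked computationally. The sketch below is a minimal Python illustration, assuming only standard Watson-Crick base pairing; the helper name `reverse_complement` is hypothetical. It compares the first 51 nucleotides of the forward strand (SEQ ID NO: 859) with the final nucleotides of the reverse strand (SEQ ID NO: 860).

```python
# Translation table for Watson-Crick complements (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# First line of the forward strand sequence (SEQ ID NO: 859).
FORWARD_HEAD = "CTCCGGGAGGGGGTGGGGGTGAGGGAAACAGGAGAATGTGATGGGAAAATC"

# Last two lines of the reverse strand sequence (SEQ ID NO: 860), joined.
REVERSE_TAIL = "TCGGATTTTCCCATCACATTCTCCTGTTTCCCTCACCCCCACCCCCTCCCG" + "GAG"

# The start of the forward strand should pair with the end of the reverse strand.
matches = reverse_complement(FORWARD_HEAD) == REVERSE_TAIL[-len(FORWARD_HEAD):]
print(matches)
```

The same helper can be used to map a target site given on one strand to its position on the other.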






In some embodiments, the linker has the amino acid sequence of THPRAPIPKPFQ (SEQ ID NO: 311). In other embodiments, the linker has the amino acid sequence of TPNPHRRTDPSHKPFQ (SEQ ID NO: 312).
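To illustrate how a row of Table 10 relates a target site to its zinc finger array, the following Python sketch splits the target of the first row (SEQ ID NO: 861) into triplets and extracts the seven-residue variable segment of each finger from the corresponding amino acid sequence (SEQ ID NO: 1410). The function names and the regular expression are illustrative assumptions based on the repeating framework visible in that sequence. The pairing order, in which the N-terminal finger is matched to the 3'-most triplet, is an assumption reflecting the conventional antiparallel binding of zinc finger arrays and is not stated in the table itself.

```python
import re

# Column 1 of the first row of Table 10 (SEQ ID NO: 861).
TARGET = "agatggagccagcctgggcca"

# Zinc finger DNA binding domain of the same row (SEQ ID NO: 1410),
# joined from the table cells in order.
ZF_ARRAY = (
    "LEPGEKPYKCPECGKSFSTSHSL"
    "TEHQRTHTGEKPYKCPECGKSFS"
    "RSDKLVRHQRTHTGEKPYKCPEC"
    "GKSFSTKNSLTEHQRTHTGEKPY"
    "KCPECGKSFSRADNLTEHQRTHT"
    "GEKPYKCPECGKSFSERSHLREH"
    "QRTHTGEKPYKCPECGKSFSRSD"
    "HLTTHQRTHTGEKPYKCPECGKS"
    "FSQLAHLRAHQRTHTGEKPTGKK"
    "TS"
)

# Framework flanking the 7-residue variable segment of each finger
# (an assumption read off the repeating pattern in ZF_ARRAY).
HELIX_PATTERN = re.compile(r"CPECGKSFS(.{7})HQRTHTGEKP")


def split_triplets(target: str) -> list:
    """Split a target site into consecutive 3-nt subsites."""
    target = target.upper()
    if len(target) % 3 != 0:
        raise ValueError("target length must be a multiple of 3")
    return [target[i:i + 3] for i in range(0, len(target), 3)]


def variable_segments(zf_array: str) -> list:
    """Return the 7-residue variable segment of each finger, N- to C-terminal."""
    return HELIX_PATTERN.findall(zf_array)


def pair_fingers(target: str, zf_array: str) -> list:
    """Pair triplets with fingers, assuming antiparallel (3'-to-5') binding."""
    triplets = split_triplets(target)
    segments = variable_segments(zf_array)
    if len(triplets) != len(segments):
        raise ValueError("number of fingers does not match number of triplets")
    return list(zip(reversed(triplets), segments))


pairs = pair_fingers(TARGET, ZF_ARRAY)
for triplet, segment in pairs:
    print(triplet, segment)
```

Under these assumptions, a 21-nucleotide target corresponds to a seven-finger array, one finger per triplet.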















TABLE 10

Column 1: Nucleotide Sequence of Target Site in HNF4α Expression Control Region
Column 3: Target sequence with space
Column 2: SEQ ID NO. (corresponds to sequences in Columns 1 and 3)
Column 4: Amino acid Sequence of Zinc Finger DNA Binding Domain Polypeptides Targeting the Target Site in Column 1
Column 5: SEQ ID NO:
Column 6: Amino Acid Sequence Structure of Zinc Finger DNA Binding Domain Polypeptides Targeting the Target Site in Column 1 (see Table 1B above and description thereof)
Column 7: SEQ ID NO:





















agatggagc
AGA TGG
861
LEPGEKPYKCPECGKSFSTSHSL
1410
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
1959


cagcctggg
AGC CAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cca
CCT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




CCA

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






HLTTHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








cagcctggg
CAG CCT
862
LEPGEKPYKCPECGKSFSRNDAL
1411
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
1960


ccagaaaca
ggg CCA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



ctg
GAA ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




CTG

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






SLTEHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








agccagcct
AGC CAG
863
LEPGEKPYKCPECGKSFSSPADL
1412
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
1961


gggccagaa
CCT ggg

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



aca
CCA GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




ACA

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






NLTEHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








tggagccag
TGG AGC
864
LEPGEKPYKCPECGKSFSQSSNL
1413
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
1962


cctgggcca
CAG CCT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



gaa
ggg CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GAA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLREHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








ccgagatgg
CCG AGA
865
LEPGEKPYKCPECGKSFSRSDKL
1414
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
1963


agccagcct
TGG AGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



ggg
CAG CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




GGG

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLRAHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








aatccgaga
AAT CCG
866
LEPGEKPYKCPECGKSFSTKNSL
1415
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
1964


tggagccag
AGA TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



cct
AGC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




CCT

GKSFSERSHLREHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






TLTEHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








aggagaatg
AGG AGA
867
LEPGEKPYKCPECGKSFSTTGNL
1416
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
1965


tgatgggaa
ATG TGA

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



aat
TGG GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




AAT

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELN






GEKPYKCPECGKSFSRRDELNVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLRAHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








aacaggaga
AAC AGG
868
LEPGEKPYKCPECGKSFSQSSNL
1417
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
1966


atgtgatgg
AGA ATG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gaa
TGA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




GAA

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DS






HLTNHQRTHTGEKPYKCPECGKS

GNLRVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDSGNLRVHQRTHTGEKPTGKK








TS








tgatgggaa
TGA TGG
869
LEPGEKPYKCPECGKSFSRSDHL
1418
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
1967


aatccgaga
GAA AAT

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



tgg
CCG AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




TGG

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






HLTTHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








gaaaatccg
GAA AAT
870
LEPGEKPYKCPECGKSFSRADNL
1419
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
1968


agatggagc
CCG AGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



cag
TGG AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLTVHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








tgggaaaat
TGG GAA
871
LEPGEKPYKCPECGKSFSERSHL
1420
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
1969


ccgagatgg
AAT CCG

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



agc
AGA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




AGC

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cctgggcca
CCT ggg
872
LEPGEKPYKCPECGKSFSQRAHL
1421
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
1970


gaaacactg
CCA GAA

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



gga
ACA CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GGA

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






KLVRHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








ccagaaaca
CCA GAA
873
LEPGEKPYKCPECGKSFSRSDEL
1422
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
1971


ctgggagct
ACA CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



gtg
GGA GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GTG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gggccagaa
ggg CCA
874
LEPGEKPYKCPECGKSFSTSGEL
1423
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18X19
1972


acactggga
GAA ACA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gct
CTG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GCT

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLTEHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gagggaaac
GAG GGA
875
LEPGEKPYKCPECGKSFSQAGHL
1424
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
1973


aggagaatg
AAC AGG

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



tga
AGA ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




TGA

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLR






GEKPYKCPECGKSFSDSGNLRVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLERHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








ggaaacagg
GGA AAC
876
LEPGEKPYKCPECGKSFSRSDHL
1425
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
1974


agaatgtga
AGG AGA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



tgg
ATG TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




TGG

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






NLRVHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








atgtgatgg
ATG TGA
877
LEPGEKPYKCPECGKSFSQLAHL
1426
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
1975


gaaaatccg
TGG GAA

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



aga
AAT CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




AGA

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RR






HLASHQRTHTGEKPYKCPECGKS

DELNVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRRDELNVHQRTHTGEKPTGKK








TS








agaatgtga
AGA ATG
878
LEPGEKPYKCPECGKSFSRNDTL
1427
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
1976


tgggaaaat
TGA TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



ccg
GAA AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




CCG

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDE






QRTHTGEKPYKCPECGKSFSRRD

LNVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






ELNVHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








gaaacactg
GAA ACA
879
LEPGEKPYKCPECGKSFSQRAHL
1428
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
1977


ggagctgtg
CTG GGA

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



gga
GCT GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




GGA

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






DLTRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








gtgggagac
GTG GGA
880
LEPGEKPYKCPECGKSFSRADNL
1429
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
1978


ggagagggg
GAC GGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cag
GAG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




CAG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLV






GEKPYKCPECGKSFSDPGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLERHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








acactggga
ACA CTG
881
LEPGEKPYKCPECGKSFSDPGNL
1430
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18X19
1979


gctgtggga
GGA GCT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gac
GTG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GAC

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






ALTEHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








gctgtggga
GCT GTG
882
LEPGEKPYKCPECGKSFSRSDKL
1431
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
1980


gacggagag
GGA GAC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ggg
GGA GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GGG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRH






KCPECGKSFSDPGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ELVRHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








ctgggagct
CTG GGA
883
LEPGEKPYKCPECGKSFSQRAHL
1432
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
1981


gtgggagac
GCT GTG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18



gga
GGA GAC

DPGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GGA

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLERHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








ggagctgtg
GGA GCT
884
LEPGEKPYKCPECGKSFSRSDNL
1433
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
1982


ggagacgga
GTG GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gag
GAC GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17




GAG

GKSFSDPGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






ELVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








gggggtgag
GGG GGT
885
LEPGEKPYKCPECGKSFSQLAHL
1434
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
1983


ggaaacagg
GAG GGA

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



aga
AAC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17




AGA

GKSFSDSGNLRVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gacggagag
GAC GGA
886
LEPGEKPYKCPECGKSFSRSDKL
1435
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
1984


gggcagggt
GAG ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggg
CAG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




GGG

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLERHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGNLVRHQRTHTGEKPTGKK








TS








ggtgaggga
GGT GAG
887
LEPGEKPYKCPECGKSFSRRDEL
1436
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
1985


aacaggaga
GGA AAC

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



atg
AGG AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




ATG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVH






KCPECGKSFSDSGNLRVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








ggagacgga
GGA GAC
888
LEPGEKPYKCPECGKSFSTSGHL
1437
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
1986


gaggggcag
GGA GAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



ggt
ggg CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGN






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






NLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








gggggtggg
GGG GGT
889
LEPGEKPYKCPECGKSFSDSGNL
1438
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18X19
1987


ggtgaggga
GGG GGT

RVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



aac
GAG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




AAC

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








ggtgggggt
GGT GGG
890
LEPGEKPYKCPECGKSFSRSDHL
1439
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
1988


gagggaaac
GGT GAG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18



agg
GGA AAC

DSGNLRVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




AGG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








ggagggggt
GGA GGG
891
LEPGEKPYKCPECGKSFSQRAHL
1440
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
1989


gggggtgag
GGT GGG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



gga
GGT GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GGA

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






KLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








ccgggaggg
CCG GGA
892
LEPGEKPYKCPECGKSFSRSDNL
1441
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
1990


ggtgggggt
GGG GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gag
GGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GAG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLERHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








cgggagggg
CGG GAG
893
LEPGEKPYKCPECGKSFSRSDHL
1442
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
1991


gtgggggtg
GGG GTG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



agg
GGG gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








gagggggtg
GAG GGG
894
LEPGEKPYKCPECGKSFSQSSNL
1443
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
1992


ggggtgagg
GTG GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gaa
gtg AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GAA

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








ggggtgggg
GGG GTG
895
LEPGEKPYKCPECGKSFSSPADL
1444
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
1993


gtgagggaa
GGG gtg

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



aca
AGG GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




ACA

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gtgggggtg
GTG GGG
896
LEPGEKPYKCPECGKSFSQRAHL
1445
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
1994


agggaaaca
gtg AGG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



gga
GAA ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




GGA

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








ggggtgagg
GGG gtg
897
LEPGEKPYKCPECGKSFSQSSNL
1446
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
1995


gaaacagga
AGG GAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gaa
ACA GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GAA

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gggtgaggg
GGG TGA
898
LEPGEKPYKCPECGKSFSTTGNL
1447
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
1996


aaacaggag
ggg AAA

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



aat
CAG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




AAT

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLASHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gggtggggg
GGG TGG
899
LEPGEKPYKCPECGKSFSRADNL
1448
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
1997


tgagggaaa
GGG TGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



cag
ggg AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CAG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTTHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








tgagggaaa
TGA ggg
900
LEPGEKPYKCPECGKSFSRSDEL
1449
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
1998


caggagaat
AAA CAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



gtg
gag AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GTG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






KLVRHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








tgggggtga
TGG GGG
901
LEPGEKPYKCPECGKSFSRSDNL
1450
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
1999


gggaaacag
TGA ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



gag
AAA CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




GAG

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gggaggggg
ggg AGG
902
LEPGEKPYKCPECGKSFSRSDKL
1451
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2000


tgggggtga
GGG TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



ggg
GGG TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








agggggtgg
AGG GGG
903
LEPGEKPYKCPECGKSFSQRANL
1452
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2001


gggtgaggg
TGG GGG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



aaa
TGA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




AAA

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








gggaaacag
ggg AAA
904
LEPGEKPYKCPECGKSFSRRDEL
1453
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2002


gagaatgtg
CAG gag

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



atg
AAT gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




ATG

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAN






QRTHTGEKPYKCPECGKSFSQRA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLRAHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








aaacaggag
AAA CAG
905
LEPGEKPYKCPECGKSFSQRAHL
1454
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2003


aatgtgatg
gag AAT

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



gga
gtg ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GGA

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






NLTEHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








caggagaat
CAG gag
906
LEPGEKPYKCPECGKSFSQRANL
1455
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2004


gtgatggga
AAT gtg

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



aaa
ATG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




AAA

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






NLVRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








ggccagaaa
GGC CAG
907
LEPGEKPYKCPECGKSFSRNDAL
1456
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2005


cactgggag
AAA CAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ctg
TGG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CTG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ctgggccag
CTG GGC
908
LEPGEKPYKCPECGKSFSRSDNL
1457
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2006


aaacactgg
CAG AAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gag
CAC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GAG

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








aaacactgg
AAA CAC
909
LEPGEKPYKCPECGKSFSRSDNL
1458
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2007


gagctgtgg
TGG gag

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gag
CTG TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GAG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






ALTEHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








gccagcctg
GCC AGC
910
LEPGEKPYKCPECGKSFSSKKAL
1459
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2008


ggccagaaa
CTG GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



cac
CAG AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




CAC

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLREHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








agcctgggc
AGC CTG
911
LEPGEKPYKCPECGKSFSRSDHL
1460
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2009


cagaaacac
GGC CAG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



tgg
AAA CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




TGG

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






ALTEHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








cagaaacac
CAG AAA
912
LEPGEKPYKCPECGKSFSRSDHL
1461
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2010


tgggagctg
CAC TGG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



tgg
gag CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




TGG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAN






QRTHTGEKPYKCPECGKSFSQRA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






NLRAHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gatggagcc
GAT GGA
913
LEPGEKPYKCPECGKSFSRADNL
1462
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2011


agcctgggc
GCC AGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cag
CTG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CAG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLERHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








ggagccagc
GGA GCC
914
LEPGEKPYKCPECGKSFSQRANL
1463
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2012


ctgggccag
AGC CTG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



aaa
GGC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




AAA

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






DLARHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








tgggagctg
TGG gag
915
LEPGEKPYKCPECGKSFSRSDNL
1464
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2013


tgggagacg
CTG TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLRDHX17X18



gag
gag ACG

RTDTLRDHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GAG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gagctgtgg
gag CTG
916
LEPGEKPYKCPECGKSFSRSDHL
1465
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2014


gagacggag
TGG gag

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



agg
ACG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLRDHX17




AGG

GKSFSRTDTLRDHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








ctgtgggag
CTG TGG
917
LEPGEKPYKCPECGKSFSDPGHL
1466
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2015


acggagagg
gag ACG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ggc
gag AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GGC

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLRDH






KCPECGKSFSRTDTLRDHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLTTHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








gagacggag
gag ACG
918
LEPGEKPYKCPECGKSFSRSDEL
1467
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2016


aggggcagg
gag AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gtg
GGC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GTG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDT






QRTHTGEKPYKCPECGKSFSRTD

LRDHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TLRDHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








tgggagacg
TGG gag
919
LEPGEKPYKCPECGKSFSRSDHL
1468
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2017


gagaggggc
ACG gag

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



agg
AGG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




AGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLR






GEKPYKCPECGKSFSRTDTLRDH

DHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cactgggag
CAC TGG
920
LEPGEKPYKCPECGKSFSRTDTL
1469
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RTDTLRDHX17X18X19
2018


ctgtgggag
gag CTG

RDHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



acg
TGG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




ACG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTTHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








acggagagg
ACG gag
921
LEPGEKPYKCPECGKSFSQRAHL
1470
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2019


ggcagggtg
AGG GGC

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



gga
AGG GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGA

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RT






NLVRHQRTHTGEKPYKCPECGKS

DTLRDHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRTDTLRDHQRTHTGEKPTGKK








TS








cgagatgga
CGA GAT
922
LEPGEKPYKCPECGKSFSDPGHL
1471
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2020


gccagcctg
GGA GCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ggc
AGC CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




GGC

GKSFSERSHLREHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLVRHQRTHTGEKPYKCPECGKS

GHLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGHLTEHQRTHTGEKPTGKK








TS








gcctgggcc
GCC TGG
923
LEPGEKPYKCPECGKSFSRSDKL
1472
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2021


agaaacact
GCC AGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



ggg
AAC ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17




GGG

GKSFSDSGNLRVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLTTHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








ccagcctgg
CCA GCC
924
LEPGEKPYKCPECGKSFSTHLDL
1473
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2022


gccagaaac
TGG GCC

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18



act
AGA AAC

DSGNLRVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




ACT

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






DLARHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








tgggccaga
TGG GCC
925
LEPGEKPYKCPECGKSFSERSHL
1474
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2023


aacactggg
AGA AAC

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agc
ACT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




AGC

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVH






KCPECGKSFSDSGNLRVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLARHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gagccagcc
gag CCA
926
LEPGEKPYKCPECGKSFSDSGNL
1475
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18X19
2024


tgggccaga
GCC TGG

RVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



aac
GCC AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




AAC

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLTEHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








atggagcca
ATG gag
927
LEPGEKPYKCPECGKSFSQLAHL
1476
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2025


gcctgggcc
CCA GCC

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



aga
TGG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




AGA

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RR






NLVRHQRTHTGEKPYKCPECGKS

DELNVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRRDELNVHQRTHTGEKPTGKK








TS








gagatggag
gag ATG
928
LEPGEKPYKCPECGKSFSDCRDL
1477
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2026


ccagcctgg
gag CCA

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gcc
GCC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




GCC

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDE






QRTHTGEKPYKCPECGKSFSRRD

LNVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELNVHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








gggagacgg
ggg AGA
929
LEPGEKPYKCPECGKSFSRSDKL
1478
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2027


agaggggca
CGG AGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



ggg
GGG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLRAHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








agacggaga
AGA CGG
930
LEPGEKPYKCPECGKSFSRSDHL
1479
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2028


ggggcaggg
AGA GGG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



tgg
GCA GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




TGG

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






KLTEHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








cggagaggg
CGG AGA
931
LEPGEKPYKCPECGKSFSTSGNL
1480
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18X19
2029


gcagggtgg
GGG GCA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gat
GGG TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GAT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLRAHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








agaggggca
AGA GGG
932
LEPGEKPYKCPECGKSFSSKKAL
1481
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2030


gggtgggat
GCA GGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



cac
TGG GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAC

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






KLVRHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








gatcacagg
GAT CAC
933
LEPGEKPYKCPECGKSFSRSDKL
1482
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2031


gagcaggag
AGG gag

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



cgg
CAG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




CGG

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALTEHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








ggggcaggg
GGG GCA
934
LEPGEKPYKCPECGKSFSRSDHL
1483
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2032


tgggatcac
GGG TGG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



agg
GAT CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




AGG

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLRRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gggtgggat
GGG TGG
935
LEPGEKPYKCPECGKSFSRADNL
1484
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2033


cacagggag
GAT CAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



cag
AGG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CAG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTTHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








tgggatcac
TGG GAT
936
LEPGEKPYKCPECGKSFSRSDNL
1485
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2034


agggagcag
CAC AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



gag
gag CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GAG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gcagggtgg
GCA GGG
937
LEPGEKPYKCPECGKSFSRSDNL
1486
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2035


gatcacagg
TGG GAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gag
CAC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GAG

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






KLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








agggagcag
AGG gag
938
LEPGEKPYKCPECGKSFSHKNAL
1487
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2036


gagcgggga
CAG gag

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



att
CGG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




ATT

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








cacagggag
CAC AGG
939
LEPGEKPYKCPECGKSFSQRAHL
1488
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2037


caggagcgg
gag CAG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



gga
gag CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GGA

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTNHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








gagcaggag
gag CAG
940
LEPGEKPYKCPECGKSFSQRAHL
1489
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2038


cggggaatt
gag CGG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



gga
GGA ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GGA

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








caggagcgg
CAG gag
941
LEPGEKPYKCPECGKSFSTSGHL
1490
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2039


ggaattgga
CGG GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggt
ATT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




GGT

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






NLVRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gagcgggga
gag CGG
942
LEPGEKPYKCPECGKSFSQSSNL
1491
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2040


attggaggt
GGA ATT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gaa
GGA GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GAA

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNH






KCPECGKSFSHKNALQNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLTEHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








acagggagc
ACA ggg
943
LEPGEKPYKCPECGKSFSQSSNL
1492
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2041


aggagcggg
AGC AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gaa
AGC GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




GAA

GKSFSERSHLREHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






KLVRHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








gcaggagcg
GCA GGA
944
LEPGEKPYKCPECGKSFSRSDHL
1493
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2042


gggaattgg
GCG ggg

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



agg
AAT TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




AGG

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLERHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ggagcgggg
GGA GCG
945
LEPGEKPYKCPECGKSFSQAGHL
1494
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
2043


aattggagg
ggg AAT

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



tga
TGG AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




TGA

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






DLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








cagggagca
CAG GGA
946
LEPGEKPYKCPECGKSFSTTGNL
1495
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2044


ggagcgggg
GCA GGA

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



aat
GCG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




AAT

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






HLERHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








ggagcagga
GGA GCA
947
LEPGEKPYKCPECGKSFSRSDHL
1496
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2045


gcggggaat
GGA GCG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



tgg
ggg AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




TGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






DLRRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








ggccctccc
GGC CCT
948
LEPGEKPYKCPECGKSFSDPGAL
1497
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2046


aaacttcca
CCC AAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



gtc
CTT CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




GTC

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






SLTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








cctcccaaa
CCT CCC
949
LEPGEKPYKCPECGKSFSTSGNL
1498
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2047


cttccagtc
AAA CTT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18



cat
CCA GTC

DPGALVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




CAT

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






HLAEHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








gggaaactg
ggg AAA
950
LEPGEKPYKCPECGKSFSRSDHL
1499
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2048


cgggggaac
CTG CGG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18



tgg
ggg AAC

DSGNLRVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




TGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAN






QRTHTGEKPYKCPECGKSFSQRA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLRAHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








accgggaaa
ACC ggg
951
LEPGEKPYKCPECGKSFSDSGNL
1500
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18X19
2049


ctgcggggg
AAA CTG

RVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



aac
CGG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




AAC

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






KLVRHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








aaactgcgg
AAA CTG
952
LEPGEKPYKCPECGKSFSRKDNL
1501
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2050


gggaactgg
CGG ggg

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



aag
AAC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17




AAG

GKSFSDSGNLRVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






ALTEHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








ggaaccggg
GGA ACC
953
LEPGEKPYKCPECGKSFSRSDKL
1502
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2051


aaactgcgg
ggg AAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



ggg
CTG CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






DLTRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








aggggaacc
AGG GGA
954
LEPGEKPYKCPECGKSFSRSDKL
1503
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2052


gggaaactg
ACC ggg

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



cgg
AAA CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




CGG

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLERHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








ctgcggggg
CTG CGG
955
LEPGEKPYKCPECGKSFSQRAHL
1504
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2053


aactggaag
ggg AAC

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18



gga
TGG AAG

RKDNLKNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGA

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVH






KCPECGKSFSDSGNLRVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






KLTEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








cgggggaac
CGG ggg
956
LEPGEKPYKCPECGKSFSTSGEL
1505
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18X19
2054


tggaaggga
AAC TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gct
AAG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17




GCT

GKSFSRKDNLKNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLR






GEKPYKCPECGKSFSDSGNLRVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








cccagggga
CCC AGG
957
LEPGEKPYKCPECGKSFSRNDAL
1506
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2055


accgggaaa
GGA ACC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



ctg
ggg AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CTG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTNHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gctcccagg
GCT CCC
958
LEPGEKPYKCPECGKSFSQRANL
1507
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2056


ggaaccggg
AGG GGA

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



aaa
ACC ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




AAA

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLAEHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








gggaactgg
ggg AAC
959
LEPGEKPYKCPECGKSFSSKKHL
1508
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2057


aagggagct
TGG AAG

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



ccc
GGA GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




CCC

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNH






KCPECGKSFSRKDNLKNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLRVHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








aactggaag
AAC TGG
960
LEPGEKPYKCPECGKSFSQLAHL
1509
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2058


ggagctccc
AAG GGA

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



aga
GCT CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




AGA

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DS






HLTTHQRTHTGEKPYKCPECGKS

GNLRVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDSGNLRVHQRTHTGEKPTGKK








TS








tggaaggga
TGG AAG
961
LEPGEKPYKCPECGKSFSSPADL
1510
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2059


gctcccaga
GGA GCT

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



aca
CCC AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




ACA

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLKNHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








aagggagct
AAG GGA
962
LEPGEKPYKCPECGKSFSRSDHL
1511
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2060


cccagaaca
GCT CCC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



agg
AGA ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




AGG

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






HLERHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








caggggaac
CAG ggg
963
LEPGEKPYKCPECGKSFSRSDDL
1512
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2061


cgggaaact
AAC CGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



gcg
GAA ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




GCG

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLR






GEKPYKCPECGKSFSDSGNLRVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






KLVRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gggaaccgg
ggg AAC
964
LEPGEKPYKCPECGKSFSRSDKL
1513
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2062


gaaactgcg
CGG GAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



ggg
ACT GCG

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




GGG

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLRVHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








aaccgggaa
AAC CGG
965
LEPGEKPYKCPECGKSFSQSSNL
1514
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2063


actgcgggg
GAA ACT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gaa
GCG GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




GAA

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRH






KCPECGKSFSTHLDLIRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DS






KLTEHQRTHTGEKPYKCPECGKS

GNLRVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDSGNLRVHQRTHTGEKPTGKK








TS








gcgggggaa
GCG GGG
966
LEPGEKPYKCPECGKSFSERSHL
1515
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2064


ctggaaggg
GAA CTG

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agc
GAA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




AGC

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DDLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDDLVRHQRTHTGEKPTGKK








TS








actgcgggg
ACT GCG
967
LEPGEKPYKCPECGKSFSRSDKL
1516
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2065


gaactggaa
GGG GAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



ggg
CTG GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






DLVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








cgggaaact
CGG GAA
968
LEPGEKPYKCPECGKSFSRNDAL
1517
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2066


gcgggggaa
ACT GCG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



ctg
GGG GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CTG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLI






GEKPYKCPECGKSFSTHLDLIRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








gaaactgcg
GAA ACT
969
LEPGEKPYKCPECGKSFSQSSNL
1518
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2067


ggggaactg
GCG GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



gaa
GAA CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




GAA

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






DLIRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








catctgggg
CAT CTG
970
LEPGEKPYKCPECGKSFSREDNL
1519
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2068


cctgggatt
ggg CCT

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



tag
ggg ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




TAG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALTEHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








gattggcat
GAT TGG
971
LEPGEKPYKCPECGKSFSRSDKL
1520
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2069


ctggggcct
CAT CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



ggg
ggg CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTTHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








tggcatctg
TGG CAT
972
LEPGEKPYKCPECGKSFSHKNAL
1521
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2070


gggcctggg
CTG ggg

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



att
CCT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




ATT

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cagaacaag
CAG AAC
973
LEPGEKPYKCPECGKSFSTSGNL
1522
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18X19
2071


gatccagaa
AAG GAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



gat
CCA GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




GAT

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






NLRVHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gaagattgg
GAA GAT
974
LEPGEKPYKCPECGKSFSTKNSL
1523
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2072


catctgggg
TGG CAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cct
CTG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CCT

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLVRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








aacaaggat
AAC AAG
975
LEPGEKPYKCPECGKSFSRSDHL
1524
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2073


ccagaagat
GAT CCA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



tgg
GAA GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




TGG

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DS






NLKNHQRTHTGEKPYKCPECGKS

GNLRVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDSGNLRVHQRTHTGEKPTGKK








TS








ccagaagat
CCA GAA
976
LEPGEKPYKCPECGKSFSRSDKL
1525
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2074


tggcatctg
GAT TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ggg
CAT CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




GGG

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gatccagaa
GAT CCA
977
LEPGEKPYKCPECGKSFSRNDAL
1526
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2075


gattggcat
GAA GAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



ctg
TGG CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CTG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLTEHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








aaggatcca
AAG GAT
978
LEPGEKPYKCPECGKSFSTSGNL
1527
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2076


gaagattgg
CCA GAA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



cat
GAT TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




CAT

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






NLVRHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








ctggggcct
CTG ggg
979
LEPGEKPYKCPECGKSFSTSGSL
1528
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2077


gggatttag
CCT ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



gtt
ATT TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




GTT

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






KLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








ctaaatcgt
CTA AAT
980
LEPGEKPYKCPECGKSFSDPGHL
1529
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2078


gggccatgg
CGT ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



ggc
CCA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




GGC

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCR






GEKPYKCPECGKSFSSRRTCRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QN






NLTVHQRTHTGEKPYKCPECGKS

STLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQNSTLTEHQRTHTGEKPTGKK








TS








aatcgtggg
AAT CGT
981
LEPGEKPYKCPECGKSFSERSHL
1530
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2079


ccatggggc
ggg CCA

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



agc
TGG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




AGC

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRT






QRTHTGEKPYKCPECGKSFSSRR

CRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






TCRAHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








cgtgggcca
CGT ggg
982
LEPGEKPYKCPECGKSFSTTGAL
1531
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18X19
2080


tggggcagc
CCA TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



ctt
GGC AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CTT

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SR






KLVRHQRTHTGEKPYKCPECGKS

RTCRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSRRTCRAHQRTHTGEKPTGKK








TS








ctgcaaaag
CTG CAA
983
LEPGEKPYKCPECGKSFSREDNL
1532
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2081


cattgaggg
AAG CAT

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



tag
TGA GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




TAG

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






NLTEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








caaaagcat
CAA AAG
984
LEPGEKPYKCPECGKSFSRKDNL
1533
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2082


tgagggtag
CAT TGA

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



aag
GGG TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AAG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLKNHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGNLTEHQRTHTGEKPTGKK








TS








aaaagcatt
AAA AGC
985
LEPGEKPYKCPECGKSFSHRTTL
1534
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2083


gagggtaga
ATT GAG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



agt
GGT AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




AGT

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLREHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








attgagggt
ATT GAG
986
LEPGEKPYKCPECGKSFSQAGHL
1535
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
2084


agaagtcaa
GGT AGA

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



tga
AGT CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




TGA

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






NLVRHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








agcattgag
AGC ATT
987
LEPGEKPYKCPECGKSFSQSGNL
1536
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2085


ggtagaagt
GAG GGT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



caa
AGA AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




CAA

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






ALQNHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








atgatttgg
ATG ATT
988
LEPGEKPYKCPECGKSFSQSSNL
1537
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2086


gaagttatt
TGG GAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



gaa
GTT ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




GAA

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RR






ALQNHQRTHTGEKPYKCPECGKS

DELNVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRRDELNVHQRTHTGEKPTGKK








TS








taggctgtc
TAG GCT
989
LEPGEKPYKCPECGKSFSREDNL
1538
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2087


agtgcctga
GTC AGT

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



tag
GCC TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




TAG

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNH






KCPECGKSFSHRTTLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






ELVRHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








aggtaggct
AGG TAG
990
LEPGEKPYKCPECGKSFSQAGHL
1539
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
2088


gtcagtgcc
GCT GTC

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



tga
AGT GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




TGA

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLHTHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








cggaggtag
CGG AGG
991
LEPGEKPYKCPECGKSFSDCRDL
1540
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2089


gctgtcagt
TAG GCT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



gcc
GTC AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17




GCC

GKSFSDPGALVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








tagaatgcc
TAG AAT
992
LEPGEKPYKCPECGKSFSRSDEL
1541
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2090


tgacttggg
GCC TGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gtg
CTT GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




GTG

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






NLTVHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








aatgcctga
AAT GCC
993
LEPGEKPYKCPECGKSFSSPADL
1542
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2091


cttggggtg
TGA CTT

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



aca
GGG gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




ACA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






DLARHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








agttagaat
AGT TAG
994
LEPGEKPYKCPECGKSFSRSDKL
1543
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2092


gcctgactt
AAT GCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18



ggg
TGA CTT

TTGALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




GGG

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HR






NLHTHQRTHTGEKPYKCPECGKS

TTLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHRTTLTNHQRTHTGEKPTGKK








TS








gcctgactt
GCC TGA
995
LEPGEKPYKCPECGKSFSRRDEL
1544
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2093


ggggtgaca
CTT GGG

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



atg
gtg ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




ATG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALT






GEKPYKCPECGKSFSTTGALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLASHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








ggggtgaca
GGG GTG
996
LEPGEKPYKCPECGKSFSRSDHL
1545
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2094


atggcttgg
ACA ATG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



agg
GCT TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




AGG

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








tgacttggg
TGA CTT
997
LEPGEKPYKCPECGKSFSTSGEL
1546
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18X19
2095


gtgacaatg
GGG GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



gct
ACA ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GCT

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGA






QRTHTGEKPYKCPECGKSFSTTG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA 






ALTEHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








cttggggtg
CTT GGG
130
LEPGEKPYKCPECGKSFSRSDHL
131
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
320


acaatggct
GTG ACA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



tgg
ATG GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




TGG

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






KLVRHQRTHTGEKPYKCPECGKS

GALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGALTEHQRTHTGEKPTGKK








TS








gcttggagg
GCT TGG
998
LEPGEKPYKCPECGKSFSDPGAL
1547
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2096


ggtgggtga
AGG GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



gtc
GGG TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GTC

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTTHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








tggaggggt
TGG AGG
999
LEPGEKPYKCPECGKSFSRKDNL
1548
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2097


gggtgagtc
GGT GGG

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18



aag
TGA GTC

DPGALVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




AAG

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gtgacaatg
GTG ACA
1000
LEPGEKPYKCPECGKSFSTSGHL
1549
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2098


gcttggagg
ATG GCT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ggt
TGG AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGT

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELN






GEKPYKCPECGKSFSRRDELNVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLTRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








atggcttgg
ATG GCT
1001
LEPGEKPYKCPECGKSFSQAGHL
1550
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
2099


aggggtggg
TGG AGG

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



tga
GGT GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




TGA

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RR






ELVRHQRTHTGEKPYKCPECGKS

DELNVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRRDELNVHQRTHTGEKPTGKK








TS








acaatggct
ACA ATG
1002
LEPGEKPYKCPECGKSFSRSDKL
1551
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2100


tggaggggt
GCT TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggg
AGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDE






QRTHTGEKPYKCPECGKSFSRRD

LNVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






ELNVHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








gggtgagtc
GGG TGA
1003
LEPGEKPYKCPECGKSFSRRDEL
1552
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2101


aagggtcaa
GTC AAG

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



atg
GGT CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




ATG

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNH






KCPECGKSFSRKDNLKNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS 






HLASHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








ggtcaaatg
GGT CAA
1004
LEPGEKPYKCPECGKSFSRSDNL
1553
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2102


agtgcccgt
ATG AGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17X18



gag
GCC CGT

SRRTCRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




GAG

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNH






KCPECGKSFSHRTTLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELN






GEKPYKCPECGKSFSRRDELNVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








tgagtcaag
TGA GTC
1005
LEPGEKPYKCPECGKSFSHRTTL
1554
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2103


ggtcaaatg
AAG GGT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



agt
CAA ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




AGT

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGA






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






ALVRHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








gtcaagggt
GTC AAG
1006
LEPGEKPYKCPECGKSFSDCRDL
1555
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2104


caaatgagt
GGT CAA

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



gcc
ATG AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




GCC

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLKNHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








aagggtcaa
AAG GGT
1007
LEPGEKPYKCPECGKSFSSRRTC
1556
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17X18X19
2105


atgagtgcc
CAA ATG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



cgt
AGT GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




CGT

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






HLVRHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








aggggtggg
AGG GGT
1008
LEPGEKPYKCPECGKSFSTSGHL
1557
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2106


tgagtcaag
GGG TGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18



ggt
GTC AAG

RKDNLKNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17




GGT

GKSFSDPGALVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








ggtgggtga
GGT GGG
1009
LEPGEKPYKCPECGKSFSQSGNL
1558
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2107


gtcaagggt
TGA GTC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



caa
AAG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17




CAA

GKSFSRKDNLKNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








ctgacttgg
CTG ACT
1010
LEPGEKPYKCPECGKSFSDPGHL
1559
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2108


ggtgacaat
TGG GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



ggc
GAC AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17




GGC

GKSFSDPGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN 






DLIRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








gggtgacaa
GGG TGA
1011
LEPGEKPYKCPECGKSFSRSDKL
1560
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2109


tggcttgga
CAA TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggg
CTT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




GGG

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLASHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








tgacaatgg
TGA CAA
1012
LEPGEKPYKCPECGKSFSRSDEL
1561
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2110


cttggaggg
TGG CTT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gtg
GGA GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GTG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






NLTEHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








tggcttgga
TGG CTT
1013
LEPGEKPYKCPECGKSFSRSDNL
1562
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2111


ggggtgggt
GGA GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gag
GTG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GAG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGA






QRTHTGEKPYKCPECGKSFSTTG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








caatggctt
CAA TGG
1014
LEPGEKPYKCPECGKSFSTSGHL
1563
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2112


ggaggggtg
CTT GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



ggt
GGG GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALT






GEKPYKCPECGKSFSTTGALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLTTHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGNLTEHQRTHTGEKPTGKK








TS








gaggggtgg
GAG GGG
1015
LEPGEKPYKCPECGKSFSRSDKL
1564
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2113


gtgagtcaa
TGG GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



ggg
AGT CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




GGG

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








agggtcaaa
AGG GTC
1016
LEPGEKPYKCPECGKSFSRSDEL
1565
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2114


tgagtgccc
AAA TGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



gtg
GTG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GTG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGA






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALVRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








gtcaaatga
GTC AAA
1017
LEPGEKPYKCPECGKSFSHRTTL
1566
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2115


gtgcccgtg
TGA GTG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



agt
CCC GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




AGT

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAN






QRTHTGEKPYKCPECGKSFSQRA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLRAHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








aaatgagtg
AAA TGA
1018
LEPGEKPYKCPECGKSFSTSGNL
1567
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2116


cccgtgagt
GTG CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



cat
GTG AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




CAT

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLASHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








tgagtgccc
TGA GTG
1019
LEPGEKPYKCPECGKSFSTSGNL
1568
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18X19
2117


gtgagtcat
CCC GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



gat
AGT CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




GAT

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






ELVRHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








gtgcccgtg
GTG CCC
1020
LEPGEKPYKCPECGKSFSDCRDL
1569
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2118


agtcatgat
GTG AGT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



gcc
CAT GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




GCC

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNH






KCPECGKSFSHRTTLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLAEHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








ccgtgagtc
CCG TGA
1021
LEPGEKPYKCPECGKSFSDCRDL
1570
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2119


atgatgcct
GTC ATG

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



gcc
ATG CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




GCC

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLASHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








gtcagccag
GTC AGC
1022
LEPGEKPYKCPECGKSFSDSGNL
1571
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18X19
2120


aaaccaaca
CAG AAA

RVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



aac
CCA ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




AAC

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLREHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








agccagaaa
AGC CAG
1023
LEPGEKPYKCPECGKSFSERSHL
1572
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2121


ccaacaaac
AAA CCA

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18



agc
ACA AAC

DSGNLRVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




AGC

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






NLTEHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








cagaaacca
CAG AAA
1024
LEPGEKPYKCPECGKSFSQSGNL
1573
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2122


acaaacagc
CCA ACA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



caa
AAC AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17




CAA

GKSFSDSGNLRVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAN






QRTHTGEKPYKCPECGKSFSQRA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






NLRAHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gccccagca
GCC CCA
1025
LEPGEKPYKCPECGKSFSERSHL
1574
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2123


gttgtaatt
GCA GTT

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



agc
GTA ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17




AGC

GKSFSQSSSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH






KCPECGKSFSTSGSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






SLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








cggtgagtt
CGG TGA
1026
LEPGEKPYKCPECGKSFSQSGDL
1575
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2124


agggcccca
GTT AGG

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



gca
GCC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




GCA

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLASHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








ggtgtcagc
GGT GTC
1027
LEPGEKPYKCPECGKSFSSPADL
1576
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2125


cagaaacca
AGC CAG

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



aca
AAA CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




ACA

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGA






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








agggcccca
AGG GCC
1028
LEPGEKPYKCPECGKSFSHKNAL
1577
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2126


gcagttgta
CCA GCA

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17X18



att
GTT GTA

QSSSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




ATT

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLARHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








ccagcagtt
CCA GCA
1029
LEPGEKPYKCPECGKSFSDKKDL
1578
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19
2127


gtaattagc
GTT GTA

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



acc
ATT AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




ACC

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRH






KCPECGKSFSQSSSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






DLRRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ataactgaa
ATA ACT
1030
LEPGEKPYKCPECGKSFSTSGSL
1579
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2128


catcggtga
GAA CAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



gtt
CGG TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




GTT

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QK






DLIRHQRTHTGEKPYKCPECGKS

SSLIAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQKSSLIAHQRTHTGEKPTGKK








TS








catcggtga
CAT CGG
1031
LEPGEKPYKCPECGKSFSTSHSL
1580
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2129


gttagggcc
TGA GTT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



cca
AGG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CCA

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH






KCPECGKSFSTSGSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLTEHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








ccgggtgtc
CCG GGT
122
LEPGEKPYKCPECGKSFSTSHSL
123
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
318


agccagaaa
GTC AGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



cca
CAG AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




CCA

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLVRHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








actgaacat
ACT GAA
126
LEPGEKPYKCPECGKSFSRSDHL
127
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
319


cggtgagtt
CAT CGG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18



agg
TGA GTT

TSGSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




AGG

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






NLVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








gaacatcgg
GAA CAT
1032
LEPGEKPYKCPECGKSFSDCRDL
1581
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2130


tgagttagg
CGG TGA

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gcc
GTT AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




GCC

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLTEHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








accccgggt
ACC CCG
1033
LEPGEKPYKCPECGKSFSQRANL
1582
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2131


gtcagccag
GGT GTC

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



aaa
AGC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




AAA

GKSFSERSHLREHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






TLTEHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








agcaccccg
AGC ACC
1034
LEPGEKPYKCPECGKSFSRADNL
1583
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2132


ggtgtcagc
CCG GGT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



cag
GTC AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17




CAG

GKSFSDPGALVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






DLTRHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








gcagttgta
GCA GTT
1035
LEPGEKPYKCPECGKSFSRNDTL
1584
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
2133


attagcacc
GTA ATT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18



ccg
AGC ACC

DKKDLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




CCG

GKSFSERSHLREHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNH






KCPECGKSFSHKNALQNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLV






GEKPYKCPECGKSFSQSSSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






SLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








attagcacc
ATT AGC
1036
LEPGEKPYKCPECGKSFSERSHL
1585
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2134


ccgggtgtc
ACC CCG

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18



agc
GGT GTC

DPGALVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




AGC

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






HLREHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








gtaattagc
GTA ATT
1037
LEPGEKPYKCPECGKSFSDPGAL
1586
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2135


accccgggt
AGC ACC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gtc
CCG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




GTC

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ALQNHQRTHTGEKPYKCPECGKS

SSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSSLVRHQRTHTGEKPTGKK








TS








tgagttagg
TGA GTT
1038
LEPGEKPYKCPECGKSFSTSGSL
1587
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2136


gccccagca
AGG GCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



gtt
CCA GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




GTT

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






SLVRHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








gttagggcc
GTT AGG
1039
LEPGEKPYKCPECGKSFSQSSSL
1588
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17X18X19
2137


ccagcagtt
GCC CCA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18



gta
GCA GTT

TSGSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GTA

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTNHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








gttgtaatt
GTT GTA
1040
LEPGEKPYKCPECGKSFSTSGHL
1589
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2138


agcaccccg
ATT AGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



ggt
ACC CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




GGT

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSS






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLVRHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








gtgagttag
GTG AGT
1041
LEPGEKPYKCPECGKSFSHRTTL
1590
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2139


ggccccagc
TAG GGC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



agt
CCC AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




AGT

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTT






QRTHTGEKPYKCPECGKSFSHRT

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TLTNHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








cagccagaa
CAG CCA
1042
LEPGEKPYKCPECGKSFSRADNL
1591
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2140


accaacaaa
GAA ACC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



cag
AAC AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17




CAG

GKSFSDSGNLRVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






SLTEHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gaaaccaac
GAA ACC
1043
LEPGEKPYKCPECGKSFSTTGNL
1592
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2141


aaacagcca
AAC AAA

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



aat
CAG CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




AAT

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLR






GEKPYKCPECGKSFSDSGNLRVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






DLTRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








ccagaaacc
CCA GAA
1044
LEPGEKPYKCPECGKSFSTSHSL
1593
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2142


aacaaacag
ACC AAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



cca
AAA CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




CCA

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVH






KCPECGKSFSDSGNLRVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








accaacaaa
ACC AAC
1045
LEPGEKPYKCPECGKSFSSKKHL
1594
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2143


cagccaaat
AAA CAG

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



ccc
CCA AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




CCC

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






NLRVHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








gccagaaac
GCC AGA
1046
LEPGEKPYKCPECGKSFSDCRDL
1595
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2144


caacaaaca
AAC CAA

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



gcc
CAA ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




GCC

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLR






GEKPYKCPECGKSFSDSGNLRVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLRAHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








agaaaccaa
AGA AAC
1047
LEPGEKPYKCPECGKSFSQRANL
1596
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2145


caaacagcc
CAA CAA

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



aaa
ACA GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




AAA

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






NLRVHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








cctgcagcc
CCT GCA
138
LEPGEKPYKCPECGKSFSQNSTL
139
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18X19
322


ccgcccagc
GCC CCG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



cta
CCC AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CTA

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






DLRRHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








accggcggg
ACC GGC
1048
LEPGEKPYKCPECGKSFSDSGNL
1597
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18X19
2146


ggaccgatt
GGG GGA

RVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



aac
CCG ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




AAC

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






HLVRHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








ggcggggga
GGC GGG
118
LEPGEKPYKCPECGKSFSTSGNL
119
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
317


ccgattaac
GGA CCG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGNLRVHX17X18



cat
ATT AAC

DSGNLRVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




CAT

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








cccacccct
CCC ACC
142
LEPGEKPYKCPECGKSFSERSHL
143
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
323


ccccggcag
CCT CCC

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



agc
CGG CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




AGC

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






DLTRHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








acccccacc
ACC CCC
1049
LEPGEKPYKCPECGKSFSRADNL
1598
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2147


cctccccgg
ACC CCT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



cag
CCC CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CAG

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






HLAEHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








cacagaggc
CAC AGA
1050
LEPGEKPYKCPECGKSFSTHLDL
1599
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2148


taggccaag
GGC TAG

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18



act
GCC AAG

RKDNLKNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




ACT

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLRAHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








agaggctag
AGA GGC
146
LEPGEKPYKCPECGKSFSSKKHL
147
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
324


gccaagact
TAG GCC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



ccc
AAG ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17




CCC

GKSFSRKDNLKNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






HLVRHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








cttcacaga
CTT CAC
1051
LEPGEKPYKCPECGKSFSRKDNL
1600
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2149


ggctaggcc
AGA GGC

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



aag
TAG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17




AAG

GKSFSREDNLHTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






ALTEHQRTHTGEKPYKCPECGKS

GALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGALTEHQRTHTGEKPTGKK








TS








taggccaag
TAG GCC
1052
LEPGEKPYKCPECGKSFSQLAHL
1601
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2150


actcccagc
AAG ACT

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



aga
CCC AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




AGA

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRH






KCPECGKSFSTHLDLIRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






DLARHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








ggctaggcc
GGC TAG
1053
LEPGEKPYKCPECGKSFSERSHL
1602
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2151


aagactccc
GCC AAG

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



agc
ACT CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




AGC

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNH






KCPECGKSFSRKDNLKNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLHTHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ccccttcac
CCC CTT
1054
LEPGEKPYKCPECGKSFSDCRDL
1603
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2152


agaggctag
CAC AGA

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



gcc
GGC TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GCC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGA






QRTHTGEKPYKCPECGKSFSTTG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






ALTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








ccacccctt
CCA CCC
1055
LEPGEKPYKCPECGKSFSREDNL
1604
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2153


cacagaggc
CTT CAC

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



tag
AGA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




TAG

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALT






GEKPYKCPECGKSFSTTGALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLAEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








cctccaccc
CCT CCA
1056
LEPGEKPYKCPECGKSFSDPGHL
1605
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2154


cttcacaga
CCC CTT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



ggc
CAC AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GGC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






SLTEHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








gcagagcct
GCA GAG
1057
LEPGEKPYKCPECGKSFSSKKAL
1606
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2155


ccacccctt
CCT CCA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18



cac
CCC CTT

TTGALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CAC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








gagcctcca
GAG CCT
1058
LEPGEKPYKCPECGKSFSQLAHL
1607
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2156


ccccttcac
CCA CCC

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



aga
CTT CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




AGA

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLTEHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








ccggcagag
CCG GCA
1059
LEPGEKPYKCPECGKSFSTTGAL
1608
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18X19
2157


cctccaccc
GAG CCT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



ctt
CCA CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




CTT

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






DLRRHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








gcagatctt
GCA GAT
1060
LEPGEKPYKCPECGKSFSRSDKL
1609
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2158


cccagagga
CTT CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



cgg
AGA GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




CGG

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALT






GEKPYKCPECGKSFSTTGALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ccagcagat
CCA GCA
114
LEPGEKPYKCPECGKSFSQRAHL
115
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
316


cttcccaga
GAT CTT

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



gga
CCC AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




GGA

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






DLRRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ggcagagag
GGC AGA
1061
LEPGEKPYKCPECGKSFSRSDHL
1610
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2159


ggcactggg
GAG GGC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agg
ACT GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




AGG

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLRAHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








gaaggcaga
GAA GGC
1062
LEPGEKPYKCPECGKSFSRSDKL
1611
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2160


gagggcact
AGA GAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



ggg
GGC ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GGG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLVRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








gagggcact
GAG GGC
1063
LEPGEKPYKCPECGKSFSRADNL
1612
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2161


gggaggagg
ACT GGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



cag
AGG AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CAG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLI






GEKPYKCPECGKSFSTHLDLIRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








agagagggc
AGA GAG
1064
LEPGEKPYKCPECGKSFSRSDHL
1613
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2162


actgggagg
GGC ACT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



agg
GGG AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRH






KCPECGKSFSTHLDLIRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






NLVRHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








gggaggagg
GGG AGG
1065
LEPGEKPYKCPECGKSFSDPGHL
1614
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2163


cagtgggag
AGG CAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ggc
TGG GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGC

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








ggcactggg
GGC ACT
1066
LEPGEKPYKCPECGKSFSRSDHL
1615
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2164


aggaggcag
GGG AGG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



tgg
AGG CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




TGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






DLIRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








gagggcgga
GAG GGC
1067
LEPGEKPYKCPECGKSFSTKNSL
1616
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2165


gggcggggg
GGA ggg

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cct
CGG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




CCT

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








actgggagg
ACT ggg
1068
LEPGEKPYKCPECGKSFSRSDNL
1617
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2166


aggcagtgg
AGG AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gag
CAG TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




GAG

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






KLVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








tgaaaggaa
TGA AAG
1069
LEPGEKPYKCPECGKSFSDPGHL
1618
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2167


ggcagagag
GAA GGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ggc
AGA GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




GGC

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






NLKNHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








aaggaaggc
AAG GAA
1070
LEPGEKPYKCPECGKSFSTHLDL
1619
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2168


agagagggc
GGC AGA

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



act
GAG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




ACT

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






NLVRHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








tgggagggc
TGG GAG
1071
LEPGEKPYKCPECGKSFSRSDKL
1620
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2169


ggagggcgg
GGC GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



ggg
ggg CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cagtgggag
CAG TGG
1072
LEPGEKPYKCPECGKSFSRSDKL
1621
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2170


ggcggaggg
GAG GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cgg
GGA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




CGG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






HLTTHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








aggcagtgg
AGG CAG
1073
LEPGEKPYKCPECGKSFSRSDKL
1622
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2171


gagggcgga
TGG GAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggg
GGC GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GGG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








aggaggcag
AGG AGG
1074
LEPGEKPYKCPECGKSFSQRAHL
1623
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2172


tgggagggc
CAG TGG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



gga
GAG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GGA

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








gtttgaaag
GTT TGA
110
LEPGEKPYKCPECGKSFSRSDNL
111
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
315


gaaggcaga
AAG GAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



gag
GGC AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GAG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGH






QRTHTGEKPYKCPECGKSFSQAG

LASHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLASHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








acggtttga
ACG GTT
1075
LEPGEKPYKCPECGKSFSQLAHL
1624
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2173


aaggaaggc
TGA AAG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



aga
GAA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




AGA

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNH






KCPECGKSFSRKDNLKNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RT






SLVRHQRTHTGEKPYKCPECGKS

DTLRDHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRTDTLRDHQRTHTGEKPTGKK








TS








aggacggtt
AGG ACG
1076
LEPGEKPYKCPECGKSFSDPGHL
1625
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2174


tgaaaggaa
GTT TGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



ggc
AAG GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17




GGC

GKSFSRKDNLKNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDT






QRTHTGEKPYKCPECGKSFSRTD

LRDHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TLRDHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








cagaggacg
CAG AGG
1077
LEPGEKPYKCPECGKSFSQSSNL
1626
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2175


gtttgaaag
ACG GTT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18



gaa
TGA AAG

RKDNLKNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




GAA

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH






KCPECGKSFSTSGSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLR






GEKPYKCPECGKSFSRTDTLRDH

DHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






HLTNHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gggcgccca
ggg CGC
1078
LEPGEKPYKCPECGKSFSRSDHL
1627
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2176


gggtagggc
CCA GGG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



agg
TAG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17




AGG

GKSFSREDNLHTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGH






QRTHTGEKPYKCPECGKSFSHTG

LLEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLLEHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








cgcccaggg
CGC CCA
1079
LEPGEKPYKCPECGKSFSRSDHL
1628
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2177


tagggcagg
GGG TAG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



tgg
GGC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




TGG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HT






SLTEHQRTHTGEKPYKCPECGKS

GHLLEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHTGHLLEHQRTHTGEKPTGKK








TS








gggagggcg
ggg AGG
1080
LEPGEKPYKCPECGKSFSDPGHL
1629
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2178


gagggcggg
GCG GAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



ggc
GGC GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GGC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








cggcgtgga
CGG CGT
1081
LEPGEKPYKCPECGKSFSTTGNL
1630
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2179


ggcagggag
GGA GGC

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



aat
AGG gag

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




AAT

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRT






QRTHTGEKPYKCPECGKSFSSRR

CRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TCRAHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








aggaaggca
AGG AAG
1082
LEPGEKPYKCPECGKSFSRNDAL
1631
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2180


gagagggca
GCA gag

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



ctg
AGG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CTG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLKNHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








tagggcagg
TAG GGC
1083
LEPGEKPYKCPECGKSFSSRRTC
1632
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17X18X19
2181


tggccgcgg
AGG TGG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



cgt
CCG CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




CGT

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






HLVRHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








gggtagggc
GGG TAG
1084
LEPGEKPYKCPECGKSFSRSDKL
1633
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2182


aggtggccg
GGC AGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



cgg
TGG CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CGG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLHTHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








ccagggtag
CCA GGG
1085
LEPGEKPYKCPECGKSFSRNDTL
1634
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
2183


ggcaggtgg
TAG GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



ccg
AGG TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CCG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gcggagggc
GCG GAG
1086
LEPGEKPYKCPECGKSFSRSDKL
1635
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2184


gggggcctt
GGC GGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18



cgg
GGC CTT

TTGALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CGG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DDLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDDLVRHQRTHTGEKPTGKK








TS








gaaaggaag
GAA AGG
1087
LEPGEKPYKCPECGKSFSQSGDL
1636
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2185


gcagagagg
AAG GCA

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gca
gag AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GCA

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLTNHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








ggcagtggg
GGC AGT
1088
LEPGEKPYKCPECGKSFSDPGHL
1637
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2186


agggcggag
ggg AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ggc
GCG GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




GGC

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTT






QRTHTGEKPYKCPECGKSFSHRT

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






TLTNHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








agtgggagg
AGT ggg
1089
LEPGEKPYKCPECGKSFSRSDKL
1638
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2187


gcggagggc
AGG GCG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



ggg
GAG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GGG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HR






KLVRHQRTHTGEKPYKCPECGKS

TTLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHRTTLTNHQRTHTGEKPTGKK








TS








agggcggag
AGG GCG
1090
LEPGEKPYKCPECGKSFSTTGAL
1639
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18X19
2188


ggcgggggc
GAG GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



ctt
GGG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CTT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLVRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








aaggcagag
AAG GCA
1091
LEPGEKPYKCPECGKSFSQRAHL
1640
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2189


agggcactg
gag AGG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



gga
GCA CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GGA

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






DLRRHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








ccgcggcgt
CCG CGG
102
LEPGEKPYKCPECGKSFSRSDNL
103
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
313


ggaggcagg
CGT GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gag
GGC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GAG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCR






GEKPYKCPECGKSFSSRRTCRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






KLTEHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








cgtggaggc
CGT GGA
1092
LEPGEKPYKCPECGKSFSRSDDL
1641
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2190


agggagaat
GGC AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



gcg
gag AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GCG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SR






HLERHQRTHTGEKPYKCPECGKS

RTCRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSRRTCRAHQRTHTGEKPTGKK








TS








gcagagagg
GCA gag
1093
LEPGEKPYKCPECGKSFSQRAHL
1642
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2191


gcactggga
AGG GCA

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gga
CTG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGA

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








gagagggca
gag AGG
1094
LEPGEKPYKCPECGKSFSDPGHL
1643
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2192


ctgggagga
GCA CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggc
GGA GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GGC

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








aggtggccg
AGG TGG
1095
LEPGEKPYKCPECGKSFSDPGHL
1644
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2193


cggcgtgga
CCG CGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggc
CGT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17




GGC

GKSFSSRRTCRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTTHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








tggccgcgg
TGG CCG
1096
LEPGEKPYKCPECGKSFSRSDHL
1645
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2194


cgtggaggc
CGG CGT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



agg
GGA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




AGG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAH






KCPECGKSFSSRRTCRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TLTEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








ggcaggtgg
GGC AGG
1097
LEPGEKPYKCPECGKSFSQRAHL
1646
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2195


ccgcggcgt
TGG CCG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17X18



gga
CGG CGT

SRRTCRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




GGA

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLTNHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ggaggcagg
GGA GGC
1098
LEPGEKPYKCPECGKSFSTHLDL
1647
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2196


gagaatgcg
AGG gag

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



act
AAT gcg

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




ACT

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








gcactggga
GCA CTG
1099
LEPGEKPYKCPECGKSFSRSDKL
1648
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2197


ggaggcagt
GGA GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



ggg
GGC AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GGG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ALTEHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ggaggcagt
GGA GGC
1100
LEPGEKPYKCPECGKSFSRSDNL
1649
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2198


gggagggcg
AGT ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



gag
AGG GCG

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GAG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLT






GEKPYKCPECGKSFSHRTTLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








agggcactg
AGG GCA
1101
LEPGEKPYKCPECGKSFSHRTTL
1650
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2199


ggaggaggc
CTG GGA

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



agt
GGA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




AGT

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLRRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








ctgggagga
CTG GGA
1102
LEPGEKPYKCPECGKSFSRSDHL
1651
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2200


ggcagtggg
GGA GGC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agg
AGT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




AGG

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLERHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








ggtgggcgc
GGT ggg
1103
LEPGEKPYKCPECGKSFSDPGHL
1652
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2201


ccagggtag
CGC CCA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



ggc
GGG TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGC

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLL






GEKPYKCPECGKSFSHTGHLLEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








cggggtggg
CGG GGT
1104
LEPGEKPYKCPECGKSFSREDNL
1653
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2202


cgcccaggg
ggg CGC

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



tag
CCA GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




TAG

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEH






KCPECGKSFSHTGHLLEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








ggcgggggc
GGC GGG
1105
LEPGEKPYKCPECGKSFSRSDKL
1654
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2203


cttcggggt
GGC CTT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggg
CGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




GGG

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








gagggcggg
GAG GGC
1106
LEPGEKPYKCPECGKSFSTSGHL
1655
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2204


ggccttcgg
GGG GGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



ggt
CTT CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




GGT

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








gggggcctt
GGG GGC
1107
LEPGEKPYKCPECGKSFSHTGHL
1656
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17X18X19
2205


cggggtggg
CTT CGG

LEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cgc
GGT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




CGC

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALT






GEKPYKCPECGKSFSTTGALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








ggaggaggc
GGA GGA
1108
LEPGEKPYKCPECGKSFSRSDDL
1657
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2206


agtgggagg
GGC AGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gcg
ggg AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GCG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNH






KCPECGKSFSHRTTLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLERHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








ggccttcgg
GGC CTT
1109
LEPGEKPYKCPECGKSFSTSHSL
1658
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2207


ggtgggcgc
CGG GGT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17X18



cca
ggg CGC

HTGHLLEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CCA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGA






QRTHTGEKPYKCPECGKSFSTTG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






ALTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








cttcggggt
CTT CGG
1110
LEPGEKPYKCPECGKSFSRSDKL
1659
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2208


gggcgccca
GGT ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



ggg
CGC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17




GGG

GKSFSHTGHLLEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






KLTEHQRTHTGEKPYKCPECGKS

GALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGALTEHQRTHTGEKPTGKK








TS








cagagaggg
CAG AGA
1111
LEPGEKPYKCPECGKSFSRSDNL
1660
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2209


cactgggag
ggg CAC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



gag
TGG GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GAG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






HLRAHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








agagggcac
AGA ggg
1112
LEPGEKPYKCPECGKSFSQSGDL
1661
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2210


tgggaggag
CAC TGG

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



gca
GAG GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GCA

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






KLVRHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








gggcactgg
ggg CAC
1113
LEPGEKPYKCPECGKSFSRSDEL
1662
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2211


gaggaggca
TGG GAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



gtg
GAG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GTG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








aggcagaga
AGG CAG
1114
LEPGEKPYKCPECGKSFSRSDNL
1663
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2212


gggcactgg
AGA ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gag
CAC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GAG

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








gaggaggca
GAG GAG
1115
LEPGEKPYKCPECGKSFSRSDKL
1664
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2213


gtgggaggg
GCA GTG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cgg
GGA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




CGG

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








tgggaggag
TGG GAG
1116
LEPGEKPYKCPECGKSFSRSDKL
1665
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2214


gcagtggga
GAG GCA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggg
GTG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GGG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gaggcagtg
GAG GCA
1117
LEPGEKPYKCPECGKSFSRSDHL
1666
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2215


ggagggcgg
GTG GGA

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



agg
ggg CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLRRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








cactgggag
CAC TGG
1118
LEPGEKPYKCPECGKSFSQRAHL
1667
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2216


gaggcagtg
GAG GAG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



gga
GCA GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GGA

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTTHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








aaaggaagg
AAA GGA
1119
LEPGEKPYKCPECGKSFSSKKAL
1668
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2217


cagagaggg
AGG CAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cac
AGA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




CAC

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLERHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








ggaaggcag
GGA AGG
1120
LEPGEKPYKCPECGKSFSRSDHL
1669
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2218


agagggcac
CAG AGA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



tgg
ggg CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




TGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLTNHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








gtgggaggg
GTG GGA
1121
LEPGEKPYKCPECGKSFSRSDKL
1670
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2219


cggagggcg
ggg CGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



ggg
AGG GCG

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLERHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








gcagtggga
GCA GTG
1122
LEPGEKPYKCPECGKSFSRSDDL
1671
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2220


gggcggagg
GGA ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gcg
CGG AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




GCG

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ELVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ggagggcgg
GGA ggg
1123
LEPGEKPYKCPECGKSFSDCRDL
1672
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2221


agggcgggg
CGG AGG

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gcc
GCG GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




GCC

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






KLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








cgcggcgtg
CGC GGC
1124
LEPGEKPYKCPECGKSFSQLAHL
1673
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2222


gaggcaggg
GTG GAG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



aga
GCA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




AGA

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HT






HLVRHQRTHTGEKPYKCPECGKS

GHLLEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHTGHLLEHQRTHTGEKPTGKK








TS








ggccgcggc
GGC CGC
1125
LEPGEKPYKCPECGKSFSRSDKL
1674
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2223


gtggaggca
GGC GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



ggg
GAG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




GGG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGH






QRTHTGEKPYKCPECGKSFSHTG

LLEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLLEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ggcgtggag
GGC GTG
1126
LEPGEKPYKCPECGKSFSRRDEL
1675
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2224


gcagggaga
GAG GCA

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



atg
ggg AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




ATG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






ELVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








gtggaggca
GTG GAG
1127
LEPGEKPYKCPECGKSFSQSGHL
1676
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17X18X19
2225


gggagaatg
GCA ggg

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



cga
AGA ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




CGA

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








ggggtgggc
GGG GTG
1128
LEPGEKPYKCPECGKSFSRSDHL
1677
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2226


gcccagggt
GGC GCC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



agg
CAG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




AGG

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gtgggcgcc
GTG GGC
1129
LEPGEKPYKCPECGKSFSQSGDL
1678
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2227


cagggtagg
GCC CAG

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gca
GGT AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GCA

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








gcaggtggc
GCA GGT
1130
LEPGEKPYKCPECGKSFSRSDNL
1679
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2228


cgcggcgtg
GGC CGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



gag
GGC GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GAG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEH






KCPECGKSFSHTGHLLEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ggtggccgc
GGT GGC
1131
LEPGEKPYKCPECGKSFSQSGDL
1680
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2229


ggcgtggag
CGC GGC

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



gca
GTG GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GCA

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLL






GEKPYKCPECGKSFSHTGHLLEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








agggcaggt
AGG GCA
1132
LEPGEKPYKCPECGKSFSRSDEL
1681
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2230


ggccgcggc
GGT GGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



gtg
CGC GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17




GTG

GKSFSHTGHLLEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLRRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








gcccagggt
GCC CAG
1133
LEPGEKPYKCPECGKSFSDPGHL
1682
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2231


agggcaggt
GGT AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggc
GCA GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GGC

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






NLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








ggcgcccag
GGC GCC
1134
LEPGEKPYKCPECGKSFSTSGHL
1683
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2232


ggtagggca
CAG GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



ggt
AGG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGT

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






DLARHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








cagggtagg
CAG GGT
1135
LEPGEKPYKCPECGKSFSHTGHL
1684
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17X18X19
2233


gcaggtggc
AGG GCA

LEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cgc
GGT GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




CGC

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






HLVRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








ggtagggca
GGT AGG
1136
LEPGEKPYKCPECGKSFSDPGHL
1685
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2234


ggtggccgc
GCA GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17X18



ggc
GGC CGC

HTGHLLEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GGC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTNHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








agggtaggg
AGG GTA
1137
LEPGEKPYKCPECGKSFSRSDDL
1686
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2235


caggtggcc
ggg CAG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



gcg
GTG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GCG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSS






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLVRHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








cccagggta
CCC AGG
1138
LEPGEKPYKCPECGKSFSDCRDL
1687
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2236


gggcaggtg
GTA ggg

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



gcc
CAG GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




GCC

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLV






GEKPYKCPECGKSFSQSSSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTNHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gtagggcag
GTA ggg
1139
LEPGEKPYKCPECGKSFSRSDDL
1688
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2237


gtggccgcg
CAG GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



gcg
GCC GCG

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




GCG

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






KLVRHQRTHTGEKPYKCPECGKS

SSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSSLVRHQRTHTGEKPTGKK








TS








gggcaggtg
ggg CAG
1140
LEPGEKPYKCPECGKSFSRSDHL
1689
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2238


gccgcggcg
GTG GCC

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



tgg
GCG GCG

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




TGG

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gtggccgcg
GTG GCC
1141
LEPGEKPYKCPECGKSFSRADNL
1690
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2239


gcgtggagg
GCG GCG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



cag
TGG AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLARHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








caggtggcc
CAG GTG
1142
LEPGEKPYKCPECGKSFSRSDHL
1691
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2240


gcggcgtgg
GCC GCG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



agg
GCG TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




AGG

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






ELVRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gcgcccagg
gcg CCC
1143
LEPGEKPYKCPECGKSFSRSDEL
1692
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2241


gtagggcag
AGG GTA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



gtg
ggg CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GTG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRH






KCPECGKSFSQSSSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLAEHQRTHTGEKPYKCPECGKS

DDLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDDLVRHQRTHTGEKPTGKK








TS








gggtgggcg
GGG TGG
1144
LEPGEKPYKCPECGKSFSRSDKL
1693
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2242


cccagggta
gcg CCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17X18



ggg
AGG GTA

QSSSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTTHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








tgggcgccc
TGG gcg
1145
LEPGEKPYKCPECGKSFSRADNL
1694
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2243


agggtaggg
CCC AGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cag
GTA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17




CAG

GKSFSQSSSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gcggcgtgg
GCG GCG
1146
LEPGEKPYKCPECGKSFSQSSNL
1695
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2244


aggcaggga
TGG AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gaa
CAG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




GAA

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLVRHQRTHTGEKPYKCPECGKS

DDLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDDLVRHQRTHTGEKPTGKK








TS








gccgcggcg
GCC GCG
1147
LEPGEKPYKCPECGKSFSQRAHL
1696
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2245


tggaggcag
GCG TGG

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



gga
AGG CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGA

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






DLVRHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








aaccctcgt
AAC CCT
1148
LEPGEKPYKCPECGKSFSTSGNL
1697
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2246


cgacatgga
CGT CGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



cat
CAT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CAT

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEH






KCPECGKSFSQSGHLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCR






GEKPYKCPECGKSFSSRRTCRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DS






SLTEHQRTHTGEKPYKCPECGKS

GNLRVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDSGNLRVHQRTHTGEKPTGKK








TS








cctcgtcga
CCT CGT
1149
LEPGEKPYKCPECGKSFSDPGHL
1698
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2247


catggacat
CGA CAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



ggc
GGA CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GGC

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLT






GEKPYKCPECGKSFSQSGHLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRT






QRTHTGEKPYKCPECGKSFSSRR

CRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






TCRAHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








catggacat
CAT GGA
1150
LEPGEKPYKCPECGKSFSRADNL
1699
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2248


ggccgacta
CAT GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18



cag
CGA CTA

QNSTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17




CAG

GKSFSQSGHLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLERHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








caaaaccct
CAA AAC
1151
LEPGEKPYKCPECGKSFSQRAHL
1700
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2249


cgtcgacat
CCT CGT

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



gga
CGA CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17




GGA

GKSFSQSGHLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAH






KCPECGKSFSSRRTCRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLRVHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGNLTEHQRTHTGEKPTGKK








TS








cgtcgacat
CGT CGA
1152
LEPGEKPYKCPECGKSFSQSGHL
1701
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17X18X19
2250


ggacatggc
CAT GGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cga
CAT GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CGA

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGH






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SR






HLTEHQRTHTGEKPYKCPECGKS

RTCRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSRRTCRAHQRTHTGEKPTGKK








TS








cgacatgga
CGA CAT
1153
LEPGEKPYKCPECGKSFSQNSTL
1702
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18X19
2251


catggccga
GGA CAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17X18



cta
GGC CGA

QSGHLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CTA

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLTEHQRTHTGEKPYKCPECGKS

GHLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGHLTEHQRTHTGEKPTGKK








TS








gtcgacatg
GTC GAC
1154
LEPGEKPYKCPECGKSFSDPGNL
1703
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18X19
2252


gacatggcc
ATG GAC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



gac
ATG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




GAC

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRH






KCPECGKSFSDPGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELN






GEKPYKCPECGKSFSRRDELNVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGN






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLVRHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








gtgctgcac
gtg CTG
1155
LEPGEKPYKCPECGKSFSTKNSL
1704
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2253


tggacccag
CAC TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



cct
ACC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




CCT

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








acagtgctg
ACA gtg
1156
LEPGEKPYKCPECGKSFSRADNL
1705
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2254


cactggacc
CTG CAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18



cag
TGG ACC

DKKDLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






ELVRHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








ctgcactgg
CTG CAC
1157
LEPGEKPYKCPECGKSFSSPADL
1706
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2255


acccagcct
TGG ACC

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



aca
CAG CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




ACA

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






ALTEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








cactggacc
CAC TGG
1158
LEPGEKPYKCPECGKSFSTSHSL
1707
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2256


cagcctaca
ACC CAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



cca
CCT ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




CCA

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTTHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








actacagtg
ACT ACA
1159
LEPGEKPYKCPECGKSFSDKKDL
1708
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19
2257


ctgcactgg
gtg CTG

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



acc
CAC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




ACC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






DLTRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








acatggccg
ACA TGG
1160
LEPGEKPYKCPECGKSFSRNDAL
1709
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2258


actacagtg
CCG ACT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



ctg
ACA gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




CTG

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRH






KCPECGKSFSTHLDLIRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






HLTTHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








tggacatgg
TGG ACA
1161
LEPGEKPYKCPECGKSFSRSDEL
1710
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2259


ccgactaca
TGG CCG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



gtg
ACT ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




GTG

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLTRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cagcctaca
CAG CCT
1162
LEPGEKPYKCPECGKSFSTTGNL
1711
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2260


ccaccctgg
ACA CCA

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



aat
CCC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




AAT

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






SLTEHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








acatggaca
ACA TGG
1163
LEPGEKPYKCPECGKSFSSPADL
1712
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2261


tggccgact
ACA TGG

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



aca
CCG ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




ACA

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






HLTTHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








tggacccag
TGG ACC
1164
LEPGEKPYKCPECGKSFSSKKHL
1713
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2262


cctacacca
CAG CCT

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



ccc
ACA CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




CCC

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLTRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








acccagcct
ACC CAG
1165
LEPGEKPYKCPECGKSFSRSDHL
1714
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2263


acaccaccc
CCT ACA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



tgg
CCA CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




TGG

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






NLTEHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








ccgactaca
CCG ACT
1166
LEPGEKPYKCPECGKSFSRSDHL
1715
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2264


gtgctgcac
ACA gtg

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



tgg
CTG CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




TGG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






DLIRHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








tggccgact
TGG CCG
1167
LEPGEKPYKCPECGKSFSSKKAL
1716
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2265


acagtgctg
ACT ACA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



cac
gtg CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




CAC

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLI






GEKPYKCPECGKSFSTHLDLIRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TLTEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








agtgctgca
AGT GCT
1168
LEPGEKPYKCPECGKSFSDCRDL
1717
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2266


ctggaccca
GCA CTG

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



gcc
GAC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17




GCC

GKSFSDPGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HR






ELVRHQRTHTGEKPYKCPECGKS

TTLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHRTTLTNHQRTHTGEKPTGKK








TS








ctacaccac
CTA CAC
1169
LEPGEKPYKCPECGKSFSQAGHL
1718
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
2267


cctggaatt
CAC CCT

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



tga
GGA ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




TGA

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QN






ALTEHQRTHTGEKPYKCPECGKS

STLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQNSTLTEHQRTHTGEKPTGKK








TS








agcctacac
AGC CTA
1170
LEPGEKPYKCPECGKSFSHKNAL
1719
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2268


caccctgga
CAC CAC

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



att
CCT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




ATT

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNST






QRTHTGEKPYKCPECGKSFSQNS

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






TLTEHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








caccaccct
CAC CAC
1171
LEPGEKPYKCPECGKSFSQSSNL
1720
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2269


ggaatttga
CCT GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



gaa
ATT TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




GAA

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






ALTEHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








ggacccagc
GGA CCC
1172
LEPGEKPYKCPECGKSFSTKNSL
1721
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2270


ctacaccac
AGC CTA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



cct
CAC CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




CCT

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEH






KCPECGKSFSQNSTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLAEHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








cccagccta
CCC AGC
1173
LEPGEKPYKCPECGKSFSQRAHL
1722
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2271


caccaccct
CTA CAC

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



gga
CAC CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GGA

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLT






GEKPYKCPECGKSFSQNSTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLREHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








actggaccc
ACT GGA
1174
LEPGEKPYKCPECGKSFSSKKAL
1723
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2272


agcctacac
CCC AGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



cac
CTA CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17




CAC

GKSFSQNSTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






HLERHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








tggtaggtg
TGG TAG
1175
LEPGEKPYKCPECGKSFSRSDEL
1724
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2273


ggggcagat
GTG GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



gtg
GCA GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GTG

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLHTHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








taggtgggg
TAG GTG
1176
LEPGEKPYKCPECGKSFSSKKHL
1725
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2274


gcagatgtg
GGG GCA

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



ccc
GAT gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




CCC

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






ELVRHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








gatgggcaa
GAT ggg
1177
LEPGEKPYKCPECGKSFSRSDKL
1726
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2275


tggtaggtg
CAA TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



ggg
TAG GTG

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17




GGG

GKSFSREDNLHTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








gggcaatgg
ggg CAA
1178
LEPGEKPYKCPECGKSFSQSGDL
1727
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2276


taggtgggg
TGG TAG

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gca
GTG GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GCA

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








caatggtag
CAA TGG
1179
LEPGEKPYKCPECGKSFSTSGNL
1728
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18X19
2277


gtgggggca
TAG GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



gat
GGG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GAT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLTTHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGNLTEHQRTHTGEKPTGKK








TS








gtgggggca
GTG GGG
1180
LEPGEKPYKCPECGKSFSRSDHL
1729
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2278


gatgtgccc
GCA GAT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



agg
gtg CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




AGG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








gacgatggg
GAC GAT
1181
LEPGEKPYKCPECGKSFSRSDEL
1730
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2279


caatggtag
ggg CAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



gtg
TGG TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GTG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLVRHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGNLVRHQRTHTGEKPTGKK








TS








gttgacgat
GTT GAC
1182
LEPGEKPYKCPECGKSFSREDNL
1731
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2280


gggcaatgg
GAT ggg

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



tag
CAA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




TAG

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGN






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








gcaggtgtt
GCA GGT
1183
LEPGEKPYKCPECGKSFSQSGNL
1732
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2281


gacgatggg
GTT GAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



caa
GAT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




CAA

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRH






KCPECGKSFSDPGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ggtgttgac
GGT GTT
1184
LEPGEKPYKCPECGKSFSRSDHL
1733
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2282


gatgggcaa
GAC GAT

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



tgg
ggg CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




TGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLV






GEKPYKCPECGKSFSDPGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








gcaatggta
GCA ATG
1185
LEPGEKPYKCPECGKSFSQLAHL
1734
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2283


ggtgggggc
GTA GGT

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



aga
GGG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AGA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLV






GEKPYKCPECGKSFSQSSSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDE






QRTHTGEKPYKCPECGKSFSRRD

LNVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ELNVHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








tgggcaatg
TGG GCA
1186
LEPGEKPYKCPECGKSFSDPGHL
1735
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2284


gtaggtggg
ATG GTA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



ggc
GGT GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GGC

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRH






KCPECGKSFSQSSSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELN






GEKPYKCPECGKSFSRRDELNVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLRRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








tgacgatgg
TGA CGA
1187
LEPGEKPYKCPECGKSFSTSGHL
1736
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2285


gcaatggta
TGG GCA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17X18



ggt
ATG GTA

QSSSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




GGT

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGH






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QA






HLTEHQRTHTGEKPYKCPECGKS

GHLASHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQAGHLASHQRTHTGEKPTGKK








TS








cgatgggca
CGA TGG
1188
LEPGEKPYKCPECGKSFSRSDKL
1737
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2286


atggtaggt
GCA ATG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggg
GTA GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRHX17




GGG

GKSFSQSSSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLTTHQRTHTGEKPYKCPECGKS

GHLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGHLTEHQRTHTGEKPTGKK








TS








ggcaatggt
GGC AAT
1189
LEPGEKPYKCPECGKSFSRADNL
1738
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2287


aggtggggg
GGT AGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cag
TGG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLTVHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








atgggcaat
ATG GGC
1190
LEPGEKPYKCPECGKSFSRSDKL
1739
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2288


ggtaggtgg
AAT GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



ggg
AGG TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RR






HLVRHQRTHTGEKPYKCPECGKS

DELNVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRRDELNVHQRTHTGEKPTGKK








TS








aatggtagg
AAT GGT
1191
LEPGEKPYKCPECGKSFSRRDEL
1740
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2289


tgggggcag
AGG TGG

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



atg
ggg CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




ATG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






HLVRHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








acgatgggc
ACG ATG
1192
LEPGEKPYKCPECGKSFSRSDHL
1741
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2290


aatggtagg
GGC AAT

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



tgg
GGT AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




TGG

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDE






QRTHTGEKPYKCPECGKSFSRRD

LNVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RT






ELNVHQRTHTGEKPYKCPECGKS

DTLRDHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRTDTLRDHQRTHTGEKPTGKK








TS








gtgggggca
GTG GGG
1193
LEPGEKPYKCPECGKSFSRSDKL
1742
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2291


ggtgtgcct
GCA GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



ggg
gtg CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GGG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






KLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








ccagtgggg
CCA GTG
1194
LEPGEKPYKCPECGKSFSTKNSL
1743
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2292


gcaggtgtg
GGG GCA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



cct
GGT gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




CCT

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ELVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gtgccagtg
gtg CCA
1195
LEPGEKPYKCPECGKSFSRSDEL
1744
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18X19
2293


ggggcaggt
GTG GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gtg
GCA GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GTG

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLTEHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








ccaggtgtg
CCA GGT
1196
LEPGEKPYKCPECGKSFSQSGDL
1745
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2294


ccagtgggg
gtg CCA

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gca
GTG GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




GCA

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ggtgtgcca
GGT gtg
1197
LEPGEKPYKCPECGKSFSTSGHL
1746
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2295


gtgggggca
CCA GTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



ggt
GGG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ELVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








ccaggagca
CCA GGA
1198
LEPGEKPYKCPECGKSFSSKKAL
1747
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2296


gatctttgg
GCA GAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



cac
CTT TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




CAC

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLERHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ctgggtcca
CTG GGT
1199
LEPGEKPYKCPECGKSFSTTGAL
1748
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18X19
2297


ggagcagat
CCA GGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



ctt
GCA GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




CTT

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








ggtccagga
GGT CCA
1200
LEPGEKPYKCPECGKSFSRSDHL
1749
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2298


gcagatctt
GGA GCA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18



tgg
GAT CTT

TTGALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




TGG

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








gggaggaga
ggg AGG
1201
LEPGEKPYKCPECGKSFSTTGNL
1750
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2299


atgatacaa
AGA ATG

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



aat
ATA CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLIAHX17




AAT

GKSFSQKSSLIAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








tggggtggg
TGG GGT
1202
LEPGEKPYKCPECGKSFSQKSSL
1751
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QKSSLIAHX17X18X19
2300


aggagaatg
ggg AGG

IAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



ata
AGA ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17




ATA

GKSFSQLAHLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








ctttggggt
CTT TGG
1203
LEPGEKPYKCPECGKSFSRRDEL
1752
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2301


gggaggaga
GGT ggg

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18



atg
AGG AGA

QLAHLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




ATG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






HLTTHQRTHTGEKPYKCPECGKS

GALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGALTEHQRTHTGEKPTGKK








TS








ggcactcaa
GGC ACT
1204
LEPGEKPYKCPECGKSFSRSDKL
1753
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2302


ctttggggt
CAA CTT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggg
TGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






DLIRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








aggagaatg
AGG AGA
1205
LEPGEKPYKCPECGKSFSTSGHL
1754
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2303


atacaaaat
ATG ATA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



ggt
CAA AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




GGT

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLIAH






KCPECGKSFSQKSSLIAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELN






GEKPYKCPECGKSFSRRDELNVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAH






QRTHTGEKPYKCPECGKSFSQLA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLRAHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








agaatgata
AGA ATG
1206
LEPGEKPYKCPECGKSFSRSDHL
1755
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2304


caaaatggt
ATA CAA

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



agg
AAT GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




AGG

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLI






GEKPYKCPECGKSFSQKSSLIAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDE






QRTHTGEKPYKCPECGKSFSRRD

LNVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






ELNVHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








ggtgggagg
GGT ggg
1207
LEPGEKPYKCPECGKSFSQSGNL
1756
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2305


agaatgata
AGG AGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLIAHX17X18



caa
ATG ATA

QKSSLIAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




CAA

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








caactttgg
CAA CTT
1208
LEPGEKPYKCPECGKSFSQLAHL
1757
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2306


ggtgggagg
TGG GGT

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



aga
ggg AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AGA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGA






QRTHTGEKPYKCPECGKSFSTTG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ALTEHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGNLTEHQRTHTGEKPTGKK








TS








actcaactt
ACT CAA
1209
LEPGEKPYKCPECGKSFSRSDHL
1758
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2307


tggggtggg
CTT TGG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agg
GGT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




AGG

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALT






GEKPYKCPECGKSFSTTGALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






NLTEHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








gggtgggag
GGG TGG
1210
LEPGEKPYKCPECGKSFSSPADL
1759
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2308


gagaatgat
GAG gag

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



aca
AAT GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




ACA

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTTHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gagaatgat
gag AAT
1211
LEPGEKPYKCPECGKSFSREDNL
1760
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2309


acaaaatgg
GAT ACA

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



tag
AAA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




TAG

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTVHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








tgggaggag
TGG GAG
1212
LEPGEKPYKCPECGKSFSQRANL
1761
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2310


aatgataca
gag AAT

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



aaa
GAT ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




AAA

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gaggagaat
GAG gag
1213
LEPGEKPYKCPECGKSFSRSDHL
1762
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2311


gatacaaaa
AAT GAT

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



tgg
ACA AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




TGG

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








aatgataca
AAT GAT
1214
LEPGEKPYKCPECGKSFSTSGSL
1763
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2312


aaatggtag
ACA AAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



gtt
TGG TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GTT

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAH






KCPECGKSFSQRANLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






NLVRHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








ggtcctaca
GGT CCT
1215
LEPGEKPYKCPECGKSFSRSDHL
1764
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2313


ggccagcac
ACA GGC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



agg
CAG CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




AGG

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








taggttggt
TAG GTT
1216
LEPGEKPYKCPECGKSFSRADNL
1765
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2314


cctacaggc
GGT CCT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cag
ACA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




CAG

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






SLVRHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








tggtaggtt
TGG TAG
1217
LEPGEKPYKCPECGKSFSDPGHL
1766
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2315


ggtcctaca
GTT GGT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



ggc
CCT ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




GGC

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLHTHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








aaatggtag
AAA TGG
1218
LEPGEKPYKCPECGKSFSSPADL
1767
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2316


gttggtcct
TAG GTT

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



aca
GGT CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




ACA

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH






KCPECGKSFSTSGSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLTTHQRTHTGEKPYKCPECGKS

ANLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRANLRAHQRTHTGEKPTGKK








TS








acaaaatgg
ACA AAA
1219
LEPGEKPYKCPECGKSFSTKNSL
1768
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2317


taggttggt
TGG TAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



cct
GTT GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




CCT

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAN






QRTHTGEKPYKCPECGKSFSQRA

LRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






NLRAHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








gatacaaaa
GAT ACA
1220
LEPGEKPYKCPECGKSFSTSGHL
1769
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2318


tggtaggtt
AAA TGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18



ggt
TAG GTT

TSGSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17




GGT

GKSFSREDNLHTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLR






GEKPYKCPECGKSFSQRANLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






DLTRHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








gttggtcct
GTT GGT
1221
LEPGEKPYKCPECGKSFSSKKAL
1770
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2319


acaggccag
CCT ACA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



cac
GGC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CAC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLVRHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








gtcctacag
GTC CTA
1222
LEPGEKPYKCPECGKSFSTSGHL
1771
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2320


gccagcaca
CAG GCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



ggt
AGC ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17




GGT

GKSFSERSHLREHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNST






QRTHTGEKPYKCPECGKSFSQNS

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






TLTEHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








caggccagc
CAG GCC
1223
LEPGEKPYKCPECGKSFSDCRDL
1772
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2321


acaggtgtt
AGC ACA

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18



gcc
GGT GTT

TSGSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GCC

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






DLARHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








ctacaggcc
CTA CAG
1224
LEPGEKPYKCPECGKSFSTSGSL
1773
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2322


agcacaggt
GCC AGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gtt
ACA GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GTT

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QN






NLTEHQRTHTGEKPYKCPECGKS

STLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQNSTLTEHQRTHTGEKPTGKK








TS








ggtgttgcc
GGT GTT
1225
LEPGEKPYKCPECGKSFSTSHSL
1774
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2323


aagtgaagc
GCC AAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18



cca
TGA AGC

ERSHLREHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




CCA

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNH






KCPECGKSFSRKDNLKNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








agcacaggt
AGC ACA
1226
LEPGEKPYKCPECGKSFSQAGHL
1775
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18X19
2324


gttgccaag
GGT GTT

ASHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18



tga
GCC AAG

RKDNLKNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




TGA

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH






KCPECGKSFSTSGSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






DLTRHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








acaggtgtt
ACA GGT
1227
LEPGEKPYKCPECGKSFSERSHL
1776
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16ERSHLREHX17X18X19
2325


gccaagtga
GTT GCC

REHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17X18



agc
AAG TGA

QAGHLASHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17




AGC

GKSFSRKDNLKNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






HLVRHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








gccagcaca
GCC AGC
1228
LEPGEKPYKCPECGKSFSRKDNL
1777
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2326


ggtgttgcc
ACA GGT

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



aag
GTT GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




AAG

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLREHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








aggcacagt
AGG CAC
1229
LEPGEKPYKCPECGKSFSTSGNL
1778
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2327


gatcacagg
AGT GAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



cat
CAC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




CAT

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLT






GEKPYKCPECGKSFSHRTTLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








ccaagtgaa
CCA AGT
1230
LEPGEKPYKCPECGKSFSSKKHL
1779
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2328


gcccatgtg
GAA GCC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



ccc
CAT gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CCC

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTT






QRTHTGEKPYKCPECGKSFSHRT

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






TLTNHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gaagcccat
GAA GCC
1231
LEPGEKPYKCPECGKSFSSKKAL
1780
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2329


gtgcccagg
CAT gtg

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



cac
CCC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CAC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






DLARHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








agtgaagcc
AGT GAA
1232
LEPGEKPYKCPECGKSFSRSDHL
1781
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2330


catgtgccc
GCC CAT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



agg
gtg CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




AGG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HR






NLVRHQRTHTGEKPYKCPECGKS

TTLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHRTTLTNHQRTHTGEKPTGKK








TS








gtgcccagg
gtg CCC
1233
LEPGEKPYKCPECGKSFSSKKAL
1782
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2331


cacagtgat
AGG CAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



cac
AGT GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




CAC

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLAEHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








gcccatgtg
GCC CAT
1234
LEPGEKPYKCPECGKSFSHRTTL
1783
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2332


cccaggcac
gtg CCC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



agt
AGG CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




AGT

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






NLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








catgtgccc
CAT gtg
1235
LEPGEKPYKCPECGKSFSTSGNL
1784
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18X19
2333


aggcacagt
CCC AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



gat
CAC AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GAT

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ELVRHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








cccaggcac
CCC AGG
1236
LEPGEKPYKCPECGKSFSRSDHL
1785
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2334


agtgatcac
CAC AGT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



agg
GAT CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




AGG

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNH






KCPECGKSFSHRTTLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTNHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gggaggcct
ggg AGG
1237
LEPGEKPYKCPECGKSFSTTGNL
1786
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2335


gcaagggcc
CCT GCA

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



aat
AGG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




AAT

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTNHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gaagggagg
GAA ggg
1238
LEPGEKPYKCPECGKSFSDCRDL
1787
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2336


cctgcaagg
AGG CCT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



gcc
GCA AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




GCC

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






KLVRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








acaggcatt
ACA GGC
1239
LEPGEKPYKCPECGKSFSRSDKL
1788
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2337


ctgggtgaa
ATT CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



ggg
GGT GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GGG

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






HLVRHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








ctgggtgaa
CTG GGT
1240
LEPGEKPYKCPECGKSFSQSGDL
1789
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2338


gggaggcct
GAA ggg

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



gca
AGG CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GCA

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








attctgggt
ATT CTG
1241
LEPGEKPYKCPECGKSFSTKNSL
1790
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2339


gaagggagg
GGT GAA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



cct
ggg AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CCT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






ALTEHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








ggcattctg
GGC ATT
1242
LEPGEKPYKCPECGKSFSRSDHL
1791
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2340


ggtgaaggg
CTG GGT

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agg
GAA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




AGG

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRH






KCPECGKSFSTSGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






ALQNHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ggtgaaggg
GGT GAA
1243
LEPGEKPYKCPECGKSFSRSDHL
1792
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2341


aggcctgca
ggg AGG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



agg
CCT GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




AGG

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








ggaggcctg
GGA GGC
1244
LEPGEKPYKCPECGKSFSHKNAL
1793
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2342


caagggcca
CTG CAA

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



att
ggg CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




ATT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






HLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








gtgaaggga
gtg AAG
1245
LEPGEKPYKCPECGKSFSRSDKL
1794
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2343


ggcctgcaa
GGA GGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



ggg
CTG CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLKNHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








tgggtgaag
TGG gtg
1246
LEPGEKPYKCPECGKSFSQSGNL
1795
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2344


ggaggcctg
AAG GGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



caa
GGC CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CAA

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








aagggaggc
AAG GGA
1247
LEPGEKPYKCPECGKSFSTSHSL
1796
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2345


ctgcaaggg
GGC CTG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cca
CAA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




CCA

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






HLERHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








actgtgcct
ACT gtg
1248
LEPGEKPYKCPECGKSFSDPGHL
1797
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2346


gggcacatg
CCT ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



ggc
CAC ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GGC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






ELVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








gcctgggca
GCC TGG
1249
LEPGEKPYKCPECGKSFSSKKAL
1798
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2347


catgggctt
GCA CAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18



cac
ggg CTT

TTGALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CAC

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLTTHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








gctggcctg
GCT GGC
1250
LEPGEKPYKCPECGKSFSTKNSL
1799
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2348


taggaccaa
CTG TAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



cct
GAC CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17




CCT

GKSFSDPGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLVRHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








ggcctgtag
GGC CTG
1251
LEPGEKPYKCPECGKSFSDKKDL
1800
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19
2349


gaccaacct
TAG GAC

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



acc
CAA CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




ACC

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRH






KCPECGKSFSDPGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






ALTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ctgtaggac
CTG TAG
1252
LEPGEKPYKCPECGKSFSHKNAL
1801
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2350


caacctacc
GAC CAA

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18



att
CCT ACC

DKKDLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




ATT

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLV






GEKPYKCPECGKSFSDPGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






NLHTHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








ccaccccaa
CCA CCC
1253
LEPGEKPYKCPECGKSFSTSHSL
1802
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2351


agttgagtg
CAA AGT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



cca
TGA gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




CCA

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNH






KCPECGKSFSHRTTLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLAEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ccccaaagt
CCC CAA
1254
LEPGEKPYKCPECGKSFSRKDNL
1803
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2352


tgagtgcca
AGT TGA

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



aag
gtg CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




AAG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLT






GEKPYKCPECGKSFSHRTTLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






NLTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gctcctgga
GCT CCT
1255
LEPGEKPYKCPECGKSFSDKKDL
1804
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19
2353


cccaggcac
GGA CCC

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



acc
AGG CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




ACC

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLTEHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








ctggaccca
CTG GAC
1256
LEPGEKPYKCPECGKSFSDCRDL
1805
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2354


ggcacacct
CCA GGC

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



gcc
ACA CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GCC

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGN






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






NLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








gcccccact
GCC CCC
1257
LEPGEKPYKCPECGKSFSRSDKL
1806
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2355


ggcacacct
ACT GGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



ggg
ACA CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GGG

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLI






GEKPYKCPECGKSFSTHLDLIRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLAEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








cctgccccc
CCT GCC
1258
LEPGEKPYKCPECGKSFSTKNSL
1807
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2356


actggcaca
CCC ACT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



cct
GGC ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CCT

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRH






KCPECGKSFSTHLDLIRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






DLARHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








ggcacacct
GGC ACA
1259
LEPGEKPYKCPECGKSFSDPGHL
1808
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2357


gcccccact
CCT GCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



ggc
CCC ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




GGC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






DLTRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








acacctgcc
ACA CCT
1260
LEPGEKPYKCPECGKSFSSPADL
1809
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2358


cccactggc
GCC CCC

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



aca
ACT GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




ACA

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






SLTEHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








ccaggcaca
CCA GGC
1261
LEPGEKPYKCPECGKSFSTHLDL
1810
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2359


cctgccccc
ACA CCT

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



act
GCC CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




ACT

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








cccactggc
CCC ACT
1262
LEPGEKPYKCPECGKSFSSKKAL
1811
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2360


acacctggg
GGC ACA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cac
CCT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




CAC

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






DLIRHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gacccaggc
GAC CCA
1263
LEPGEKPYKCPECGKSFSSKKHL
1812
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2361


acacctgcc
GGC ACA

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



ccc
CCT GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




CCC

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






SLTEHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGNLVRHQRTHTGEKPTGKK








TS








ccactggca
CCA CTG
1264
LEPGEKPYKCPECGKSFSSPADL
1813
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2362


cacctgggc
GCA CAC

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



aca
CTG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




ACA

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALTEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gcacacctg
GCA CAC
1265
LEPGEKPYKCPECGKSFSQSGDL
1814
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2363


cccccactg
CTG CCC

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



gca
CCA CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




GCA

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ALTEHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








ctgccccca
CTG CCC
1266
LEPGEKPYKCPECGKSFSRNDAL
1815
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2364


ctggcacac
CCA CTG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



ctg
GCA CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




CTG

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLAEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








cccccactg
CCC CCA
1267
LEPGEKPYKCPECGKSFSDPGHL
1816
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2365


gcacacctg
CTG GCA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ggc
CAC CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GGC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






SLTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








caggcacac
CAG GCA
1268
LEPGEKPYKCPECGKSFSRNDAL
1817
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2366


ctgccccca
CAC CTG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



ctg
CCC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CTG

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






DLRRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








tggacccag
TGG ACC
1269
LEPGEKPYKCPECGKSFSSKKHL
1818
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2367


gcacacctg
CAG GCA

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ccc
CAC CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




CCC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLTRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








acccaggca
ACC CAG
1270
LEPGEKPYKCPECGKSFSTSHSL
1819
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2368


cacctgccc
GCA CAC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



cca
CTG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CCA

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






NLTEHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








cacctgccc
CAC CTG
1271
LEPGEKPYKCPECGKSFSSKKAL
1820
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2369


ccactggca
CCC CCA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



cac
CTG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CAC

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






ALTEHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








ccacctacc
CCA CCT
1272
LEPGEKPYKCPECGKSFSSRRTC
1821
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17X18X19
2370


attgcccat
ACC ATT

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



cgt
GCC CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




CGT

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNH






KCPECGKSFSHKNALQNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLTEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








tggcacacc
TGG CAC
1273
LEPGEKPYKCPECGKSFSRNDAL
1822
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2371


tgggcacat
ACC TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



ctg
GCA CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




CTG

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cactggcac
CAC TGG
1274
LEPGEKPYKCPECGKSFSTSGNL
1823
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2372


acctgggca
CAC ACC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



cat
TGG GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAT

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTTHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








ctgccccca
CTG CCC
1275
LEPGEKPYKCPECGKSFSDCRDL
1824
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2373


cctaccatt
CCA CCT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



gcc
ACC ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




GCC

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLAEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








tgggcacat
TGG GCA
1276
LEPGEKPYKCPECGKSFSTKNSL
1825
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2374


ctgccccca
CAT CTG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



cct
CCC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CCT

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLRRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








acctgggca
ACC TGG
1277
LEPGEKPYKCPECGKSFSTSHSL
1826
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2375


catctgccc
GCA CAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



cca
CTG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CCA

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






HLTTHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








gcacatctg
GCA CAT
1278
LEPGEKPYKCPECGKSFSDKKDL
1827
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19
2376


cccccacct
CTG CCC

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



acc
CCA CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




ACC

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLTEHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








cctaccatt
CCT ACC
1279
LEPGEKPYKCPECGKSFSQSGNL
1828
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18X19
2377


gcccatcgt
ATT GCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17X18



caa
CAT CGT

SRRTCRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CAA

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






DLTRHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








catctgccc
CAT CTG
1280
LEPGEKPYKCPECGKSFSHKNAL
1829
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2378


ccacctacc
CCC CCA

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18



att
CCT ACC

DKKDLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




ATT

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALTEHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








ccccactgg
CCC CAC
1281
LEPGEKPYKCPECGKSFSQSGDL
1830
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2379


cacacctgg
TGG CAC

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gca
ACC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




GCA

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






ALTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








cccccacct
CCC CCA
1282
LEPGEKPYKCPECGKSFSTSGNL
1831
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2380


accattgcc
CCT ACC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



cat
ATT GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




CAT

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






SLTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








cacacctgg
CAC ACC
1283
LEPGEKPYKCPECGKSFSSKKHL
1832
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2381


gcacatctg
TGG GCA

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ccc
CAT CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CCC

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






DLTRHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








attgcccat
ATT GCC
1284
LEPGEKPYKCPECGKSFSRNDAL
1833
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2382


cgtcaacac
CAT CGT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



ctg
CAA CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17




CTG

GKSFSQSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAH






KCPECGKSFSSRRTCRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






DLARHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








catcgtcaa
CAT CGT
1285
LEPGEKPYKCPECGKSFSHKNAL
1834
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2383


cacctgcac
CAA CAC

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



att
CTG CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




ATT

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRT






QRTHTGEKPYKCPECGKSFSSRR

CRAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






TCRAHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








accattgcc
ACC ATT
1286
LEPGEKPYKCPECGKSFSSKKAL
1835
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2384


catcgtcaa
GCC CAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEHX17X18



cac
CGT CAA

QSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCRAHX17




CAC

GKSFSSRRTCRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






ALQNHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








gcccatcgt
GCC CAT
1287
LEPGEKPYKCPECGKSFSSKKAL
1836
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2385


caacacctg
CGT CAA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



cac
CAC CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




CAC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLTEH






KCPECGKSFSQSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SRRTCR






GEKPYKCPECGKSFSSRRTCRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






NLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








cagggtggt
CAG GGT
1288
LEPGEKPYKCPECGKSFSDPGAL
1837
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2386


gtaggctgg
GGT GTA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gtc
GGC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GTC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLVRH






KCPECGKSFSQSSSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






HLVRHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








ggtggtgta
GGT GGT
1289
LEPGEKPYKCPECGKSFSRADNL
1838
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2387


ggctgggtc
GTA GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18



cag
TGG GTC

DPGALVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CAG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLV






GEKPYKCPECGKSFSQSSSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








aggctgggt
AGG CTG
1290
LEPGEKPYKCPECGKSFSSKKAL
1839
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2388


ccagtgcag
GGT CCA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



cac
gtg CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




CAC

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








agcactgta
AGC ACT
1291
LEPGEKPYKCPECGKSFSDPGAL
1840
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2389


gtcggccat
GTA GTC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



gtc
GGC CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GTC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSSLV






GEKPYKCPECGKSFSQSSSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ER






DLIRHQRTHTGEKPYKCPECGKS

SHLREHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSERSHLREHQRTHTGEKPTGKK








TS








actgtagtc
ACT GTA
1292
LEPGEKPYKCPECGKSFSTSGNL
1841
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2390


ggccatgtc
GTC GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18



cat
CAT GTC

DPGALVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CAT

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSS






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






SLVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








gtagtcggc
GTA GTC
1293
LEPGEKPYKCPECGKSFSDPGAL
1842
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2391


catgtccat
GGC CAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



gtc
GTC CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17




GTC

GKSFSDPGALVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGA






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






ALVRHQRTHTGEKPYKCPECGKS

SSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSSLVRHQRTHTGEKPTGKK








TS








gtcggccat
GTC GGC
1294
LEPGEKPYKCPECGKSFSDPGNL
1843
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18X19
2392


gtccatgtc
CAT GTC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18



gac
CAT GTC

DPGALVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




GAC

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLVRHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








ggccatgtc
GGC CAT
1295
LEPGEKPYKCPECGKSFSRSDNL
1844
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2393


catgtcgac
GTC CAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18



gag
GTC GAC

DPGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRHX17




GAG

GKSFSDPGALVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






NLTEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








catgtccat
CAT GTC
1296
LEPGEKPYKCPECGKSFSTSGHL
1845
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2394


gtcgacgag
CAT GTC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ggt
GAC GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17




GGT

GKSFSDPGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGA






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALVRHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLTEHQRTHTGEKPTGKK








TS








ctgcctcca
CTG CCT
1297
LEPGEKPYKCPECGKSFSSKKAL
1846
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2395


cgccgcggc
CCA CGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cac
CGC GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17




CAC

GKSFSHTGHLLEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEH






KCPECGKSFSHTGHLLEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






SLTEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








cctccacgc
CCT CCA
1298
LEPGEKPYKCPECGKSFSRNDAL
1847
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2396


cgcggccac
CGC CGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



ctg
GGC CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CTG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEH






KCPECGKSFSHTGHLLEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLL






GEKPYKCPECGKSFSHTGHLLEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






SLTEHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








ccacgccgc
CCA CGC
1299
LEPGEKPYKCPECGKSFSSKKHL
1848
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2397


ggccacctg
CGC GGC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ccc
CAC CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




CCC

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLL






GEKPYKCPECGKSFSHTGHLLEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGH






QRTHTGEKPYKCPECGKSFSHTG

LLEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLLEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








cggccacct
CGG CCA
1300
LEPGEKPYKCPECGKSFSRSDHL
1849
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2398


gccctaccc
CCT GCC

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



tgg
CTA CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17




TGG

GKSFSQNSTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLTEHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








acgccgcgg
ACG CCG
1301
LEPGEKPYKCPECGKSFSQNSTL
1850
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18X19
2399


ccacctgcc
CGG CCA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



cta
CCT GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




CTA

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RT






TLTEHQRTHTGEKPYKCPECGKS

DTLRDHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRTDTLRDHQRTHTGEKPTGKK








TS








ccgcggcca
CCG CGG
1302
LEPGEKPYKCPECGKSFSSKKHL
1851
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2400


cctgcccta
CCA CCT

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18



ccc
GCC CTA

QNSTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




CCC

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






KLTEHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








ccacctgcc
CCA CCT
1303
LEPGEKPYKCPECGKSFSRSDDL
1852
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2401


ctaccctgg
GCC CTA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gcg
CCC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




GCG

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEH






KCPECGKSFSQNSTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






SLTEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








cctgcccta
CCT GCC
1304
LEPGEKPYKCPECGKSFSSKKHL
1853
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2402


ccctgggcg
CTA CCC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



ccc
TGG gcg

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




CCC

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLT






GEKPYKCPECGKSFSQNSTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






DLARHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








gccctaccc
GCC CTA
1305
LEPGEKPYKCPECGKSFSDKKDL
1854
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18X19
2403


tgggcgccc
CCC TGG

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



acc
gcg CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




ACC

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNST






QRTHTGEKPYKCPECGKSFSQNS

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






TLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








ccctgggcg
CCC TGG
1306
LEPGEKPYKCPECGKSFSRKDNL
1855
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2404


cccaccccg
gcg CCC

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



aag
ACC CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




AAG

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTTHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








tgggcgccc
TGG gcg
1307
LEPGEKPYKCPECGKSFSDCRDL
1856
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2405


accccgaag
CCC ACC

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18



gcc
CCG AAG

RKDNLKNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




GCC

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDD






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






DLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








ctaccctgg
CTA CCC
1308
LEPGEKPYKCPECGKSFSRNDTL
1857
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
2406


gcgcccacc
TGG gcg

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17X18



ccg
CCC ACC

DKKDLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CCG

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QN






HLAEHQRTHTGEKPYKCPECGKS

STLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQNSTLTEHQRTHTGEKPTGKK








TS








cccaccccg
CCC ACC
1309
LEPGEKPYKCPECGKSFSDCRDL
1858
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2407


aaggccccc
CCG AAG

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



gcc
GCC CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




GCC

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNH






KCPECGKSFSRKDNLKNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






DLTRHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gcgcccacc
gcg CCC
1310
LEPGEKPYKCPECGKSFSSKKHL
1859
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2408


ccgaaggcc
ACC CCG

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



ccc
AAG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17




CCC

GKSFSRKDNLKNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLAEHQRTHTGEKPYKCPECGKS

DDLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDDLVRHQRTHTGEKPTGKK








TS








cctaccctg
CCT ACC
1311
LEPGEKPYKCPECGKSFSSKKHL
1860
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2409


ggcgcccac
CTG GGC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



ccc
GCC CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




CCC

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKD






QRTHTGEKPYKCPECGKSFSDKK

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






DLTRHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








ctgggcgcc
CTG GGC
1312
LEPGEKPYKCPECGKSFSDPGHL
1861
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2410


caccccgaa
GCC CAC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



ggc
CCC GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




GGC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEH






KCPECGKSFSSKKALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








accctgggc
ACC CTG
106
LEPGEKPYKCPECGKSFSQSSNL
107
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
314


gcccacccc
GGC GCC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



gaa
CAC CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17




GAA

GKSFSSKKALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DK






ALTEHQRTHTGEKPYKCPECGKS

KDLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDKKDLTRHQRTHTGEKPTGKK








TS








ggcgcccac
GGC GCC
1313
LEPGEKPYKCPECGKSFSSKKHL
1862
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2411


cccgaaggc
CAC CCC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



ccc
GAA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




CCC

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALT






GEKPYKCPECGKSFSSKKALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






DLARHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








caccccgaa
CAC CCC
1314
LEPGEKPYKCPECGKSFSTKNSL
1863
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2412


ggcccccgc
GAA GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17X18



cct
CCC CGC

HTGHLLEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CCT

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLAEHQRTHTGEKPYKCPECGKS

KALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKALTEHQRTHTGEKPTGKK








TS








gcccacccc
GCC CAC
1315
LEPGEKPYKCPECGKSFSHTGHL
1864
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17X18X19
2413


gaaggcccc
CCC GAA

LEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



cgc
GGC CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CGC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKA






QRTHTGEKPYKCPECGKSFSSKK

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






ALTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








cccgaaggc
CCC GAA
1316
LEPGEKPYKCPECGKSFSRNDTL
1865
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
2414


ccccgccct
GGC CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



ccg
CGC CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEHX17




CCG

GKSFSHTGHLLEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






NLVRHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gaaggcccc
GAA GGC
1317
LEPGEKPYKCPECGKSFSSKKHL
1866
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2415


cgccctccg
CCC CGC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



ccc
CCT CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




CCC

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLLEH






KCPECGKSFSHTGHLLEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






HLVRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








cctgggcgc
CCT ggg
1318
LEPGEKPYKCPECGKSFSRSDHL
1867
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2416


ccaccccga
CGC CCA

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17X18



agg
CCC CGA

QSGHLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




AGG

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLL






GEKPYKCPECGKSFSHTGHLLEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






KLVRHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








gggcgccca
ggg CGC
1319
LEPGEKPYKCPECGKSFSSKKHL
1868
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2417


ccccgaagg
CCA CCC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ccc
CGA AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17




CCC

GKSFSQSGHLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGH






QRTHTGEKPYKCPECGKSFSHTG

LLEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLLEHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








cgcccaccc
CGC CCA
1320
LEPGEKPYKCPECGKSFSRNDTL
1869
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
2418


cgaaggccc
CCC CGA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



ccg
AGG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CCG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLTEH






KCPECGKSFSQSGHLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HT






SLTEHQRTHTGEKPYKCPECGKS

GHLLEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHTGHLLEHQRTHTGEKPTGKK








TS








ccaccccga
CCA CCC
1321
LEPGEKPYKCPECGKSFSSKKHL
1870
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2419


aggcccccg
CGA AGG

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



ccc
CCC CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CCC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGHLT






GEKPYKCPECGKSFSQSGHLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLAEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ccactgcct
CCA CTG
1322
LEPGEKPYKCPECGKSFSDCRDL
1871
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2420


cctcccagt
CCT CCT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



gcc
CCC AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




GCC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEH






KCPECGKSFSTKNSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALTEHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








tgggaagat
TGG GAA
1323
LEPGEKPYKCPECGKSFSDPGAL
1872
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGALVRHX17X18X19
2421


ctgctggga
GAT CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



gtc
CTG GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GTC

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








gctgggagt
GCT ggg
1324
LEPGEKPYKCPECGKSFSDCRDL
1873
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2422


cttggccta
AGT CTT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18



gcc
GGC CTA

QNSTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GCC

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLT






GEKPYKCPECGKSFSHRTTLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








gcctagcct
GCC TAG
1325
LEPGEKPYKCPECGKSFSTSGHL
1874
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2423


ctgtgaagg
CCT CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ggt
TGA AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASHX17




GGT

GKSFSQAGHLASHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLT






GEKPYKCPECGKSFSTKNSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDN






QRTHTGEKPYKCPECGKSFSRED

LHTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






NLHTHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








tagcctctg
TAG CCT
1326
LEPGEKPYKCPECGKSFSQRAHL
1875
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2424


tgaaggggt
CTG TGA

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gga
AGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGA

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNS






QRTHTGEKPYKCPECGKSFSTKN

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RE






SLTEHQRTHTGEKPYKCPECGKS

DNLHTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSREDNLHTHQRTHTGEKPTGKK








TS








cctctgtga
CCT CTG
1327
LEPGEKPYKCPECGKSFSDPGHL
1876
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2425


aggggtgga
TGA AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ggc
GGT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GGC

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLA






GEKPYKCPECGKSFSQAGHLASH

SHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






ALTEHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








ggggtggag
GGG GTG
1328
LEPGEKPYKCPECGKSFSRSDKL
1877
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2426


gctctgccg
GAG GCT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



ggg
CTG CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








gaaggggtg
GAA GGG
158
LEPGEKPYKCPECGKSFSRNDTL
159
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
327


gaggctctg
GTG GAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ccg
GCT CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




CCG

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






KLVRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








gtggaggct
GTG GAG
1329
LEPGEKPYKCPECGKSFSRSDHL
1878
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2427


ctgccgggg
GCT CTG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



agg
CCG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17




AGG

GKSFSRNDTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








gaggctctg
GAG GCT
1330
LEPGEKPYKCPECGKSFSTSGHL
1879
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2428


ccggggagg
CTG CCG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ggt
ggg AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








gctctgccg
GCT CTG
1331
LEPGEKPYKCPECGKSFSRSDKL
1880
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2429


gggaggggt
CCG ggg

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



ggg
AGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






ALTEHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








ctgccgggg
CTG CCG
1332
LEPGEKPYKCPECGKSFSTSGHL
1881
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
2430


aggggtggg
ggg AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



ggt
GGT GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




GGT

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






TLTEHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








cggggaggg
CGG GGA
1333
LEPGEKPYKCPECGKSFSTTGNL
1882
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2431


gtgggggtt
GGG GTG

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18



aat
GGG GTT

TSGSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




AAT

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLERHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








ggaggggtg
GGA GGG
154
LEPGEKPYKCPECGKSFSTSGHL
155
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18X19
326


ggggttaat
GTG GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



ggt
GTT AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




GGT

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






KLVRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








ggcggggct
GGC GGG
1334
LEPGEKPYKCPECGKSFSRSDHL
1883
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2432


gcagggatt
GCT GCA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



tgg
ggg ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




TGG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






KLVRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ctgggcggg
CTG GGC
1335
LEPGEKPYKCPECGKSFSHKNAL
1884
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2433


gctgcaggg
GGG GCT

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



att
GCA ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




ATT

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








ggggctgca
GGG GCT
1336
LEPGEKPYKCPECGKSFSRNDAL
1885
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2434


gggatttgg
GCA ggg

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



ctg
ATT TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




CTG

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








aggctgggc
AGG CTG
150
LEPGEKPYKCPECGKSFSRSDKL
151
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
325


ggggctgca
GGC GGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



ggg
GCT GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




GGG

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








gtggatagg
GTG GAT
1337
LEPGEKPYKCPECGKSFSTSGEL
1886
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18X19
2435


ctgggcggg
AGG CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gct
GGC GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GCT

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








gataggctg
GAT AGG
1338
LEPGEKPYKCPECGKSFSQSGDL
1887
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2436


ggcggggct
CTG GGC

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



gca
GGG GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GCA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTNHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








ccggtggat
CCG GTG
1339
LEPGEKPYKCPECGKSFSRSDKL
1888
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2437


aggctgggc
GAT AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



ggg
CTG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGG

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






ELVRHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








ccgccggtg
CCG CCG
1340
LEPGEKPYKCPECGKSFSDPGHL
1889
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2438


gataggctg
GTG GAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



ggc
AGG CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




GGC

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






TLTEHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








cccccgccg
CCC CCG
1341
LEPGEKPYKCPECGKSFSRNDAL
1890
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2439


gtggatagg
CCG GTG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ctg
GAT AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17




CTG

GKSFSTSGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDT






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






TLTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








ggtcccccg
GGT CCC
1342
LEPGEKPYKCPECGKSFSRSDHL
1891
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18X19
2440


ccggtggat
CCG CCG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



agg
GTG GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




AGG

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEH






KCPECGKSFSRNDTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLT






GEKPYKCPECGKSFSRNDTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLAEHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








ataggctgg
ATA GGC
1343
LEPGEKPYKCPECGKSFSRADNL
1892
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2441


gcggggctg
TGG GCG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



cag
ggg CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CAG

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRH






KCPECGKSFSRSDDLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGH






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QK






HLVRHQRTHTGEKPYKCPECGKS

SSLIAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQKSSLIAHQRTHTGEKPTGKK








TS








cggtggata
CGG TGG
1344
LEPGEKPYKCPECGKSFSRSDKL
1893
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2442


ggctgggcg
ATA GGC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18



ggg
TGG GCG

RSDDLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLI






GEKPYKCPECGKSFSQKSSLIAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLTTHQRTHTGEKPYKCPECGKS

DKLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLTEHQRTHTGEKPTGKK








TS








tggataggc
TGG ATA
1345
LEPGEKPYKCPECGKSFSRNDAL
1894
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2443


tgggcgggg
GGC TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



ctg
GCG ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17




CTG

GKSFSRSDDLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSS






QRTHTGEKPYKCPECGKSFSQKS

LIAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLIAHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








cgccggtgg
CGC CGG
1346
LEPGEKPYKCPECGKSFSRSDDL
1895
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDDLVRHX17X18X19
2444


ataggctgg
TGG ATA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gcg
GGC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




GCG

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLIAH






KCPECGKSFSQKSSLIAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HT






KLTEHQRTHTGEKPYKCPECGKS

GHLLEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHTGHLLEHQRTHTGEKPTGKK








TS








gtcccccgc
GTC CCC
1347
LEPGEKPYKCPECGKSFSDPGHL
1896
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2445


cggtggata
CGC CGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLIAHX17X18



ggc
TGG ATA

QKSSLIAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGC

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGHLL






GEKPYKCPECGKSFSHTGHLLEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLAEHQRTHTGEKPYKCPECGKS

GALVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGALVRHQRTHTGEKPTGKK








TS








ccccgccgg
CCC CGC
1348
LEPGEKPYKCPECGKSFSRSDHL
1897
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2446


tggataggc
CGG TGG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



tgg
ATA GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLIAHX17




TGG

GKSFSQKSSLIAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLT






GEKPYKCPECGKSFSRSDKLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HTGH






QRTHTGEKPYKCPECGKSFSHTG

LLEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLLEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








ggctgggcg
GGC TGG
1349
LEPGEKPYKCPECGKSFSQRAHL
1898
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2447


gggctgcag
GCG ggg

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



gga
CTG CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GGA

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDDLV






GEKPYKCPECGKSFSRSDDLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






HLTTHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








ggtggatag
GGT GGA
1350
LEPGEKPYKCPECGKSFSDPGHL
1899
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2448


gctgggcgg
TAG GCT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18



ggc
ggg CGG

RSDKLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




GGC

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLH






GEKPYKCPECGKSFSREDNLHTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLERHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGHLVRHQRTHTGEKPTGKK








TS








gccggtgga
GCC GGT
1351
LEPGEKPYKCPECGKSFSRSDKL
1900
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17X18X19
2449


taggctggg
GGA TAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



cgg
GCT ggg

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




CGG

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGH






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






HLVRHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








cccgccggt
CCC GCC
1352
LEPGEKPYKCPECGKSFSRSDKL
1901
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2450


ggataggct
GGT GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



ggg
TAG GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17




GGG

GKSFSREDNLHTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLV






GEKPYKCPECGKSFSTSGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






DLARHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








gctgacacc
GCT GAC
1353
LEPGEKPYKCPECGKSFSTTGNL
1902
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18X19
2451


cggggtgct
ACC CGG

TVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



aat
GGT GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17




AAT

GKSFSTSGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEH






KCPECGKSFSRSDKLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGN






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








ctggctgac
CTG GCT
1354
LEPGEKPYKCPECGKSFSTSGEL
1903
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18X19
2452


acccggggt
GAC ACC

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGHLVRHX17X18



gct
CGG GGT

TSGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLTEHX17




GCT

GKSFSRSDKLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLV






GEKPYKCPECGKSFSDPGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






ELVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








acaactgct
ACA ACT
1355
LEPGEKPYKCPECGKSFSTHLDL
1904
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2453


ggggcccta
GCT GGG

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18



act
GCC CTA

QNSTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




ACT

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






DLIRHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








ctaattaca
CTA ATT
1356
LEPGEKPYKCPECGKSFSDCRDL
1905
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18X19
2454


actgctggg
ACA ACT

ARHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18



gcc
GCT GGG

RSDKLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17




GCC

GKSFSTSGELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRH






KCPECGKSFSTHLDLIRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QN






ALQNHQRTHTGEKPYKCPECGKS

STLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQNSTLTEHQRTHTGEKPTGKK








TS








attacaact
ATT ACA
1357
LEPGEKPYKCPECGKSFSQNSTL
1906
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18X19
2455


gctggggcc
ACT GCT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17X18



cta
GGG GCC

DCRDLARHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CTA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLI






GEKPYKCPECGKSFSTHLDLIRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






DLTRHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








gtgctaatt
gtg CTA
1358
LEPGEKPYKCPECGKSFSRSDKL
1907
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17X18X19
2456


acaactgct
ATT ACA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18



ggg
ACT GCT

TSGELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




GGG

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRH






KCPECGKSFSSPADLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNST






QRTHTGEKPYKCPECGKSFSQNS

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






TLTEHQRTHTGEKPYKCPECGKS

DELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDELVRHQRTHTGEKPTGKK








TS








ggggtgcta
GGG gtg
1359
LEPGEKPYKCPECGKSFSTSGEL
1908
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGELVRHX17X18X19
2457


attacaact
CTA ATT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



gct
ACA ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17




GCT

GKSFSSPADLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNH






KCPECGKSFSHKNALQNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLT






GEKPYKCPECGKSFSQNSTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDE






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ELVRHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS








actgctggg
ACT GCT
1360
LEPGEKPYKCPECGKSFSSKKAL
1909
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18X19
2458


gccctaact
GGG GCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



cac
CTA ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17




CAC

GKSFSQNSTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






ELVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








cccggggtg
CCC GGG
1361
LEPGEKPYKCPECGKSFSTHLDL
1910
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2459


ctaattaca
gtg CTA

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18



act
ATT ACA

SPADLTRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




ACT

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEH






KCPECGKSFSQNSTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELV






GEKPYKCPECGKSFSRSDELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






KLVRHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








tggctgaca
TGG CTG
1362
LEPGEKPYKCPECGKSFSQNSTL
1911
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18X19
2460


cccggggtg
ACA CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17X18



cta
GGG gtg

RSDELVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRHX17




CTA

GKSFSRSDKLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPADLT






GEKPYKCPECGKSFSSPADLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








acacccggg
ACA CCC
1363
LEPGEKPYKCPECGKSFSSPADL
1912
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SPADLTRHX17X18X19
2461


gtgctaatt
GGG gtg

TRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



aca
CTA ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17




ACA

GKSFSQNSTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRH






KCPECGKSFSRSDELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLV






GEKPYKCPECGKSFSRSDKLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SP






HLAEHQRTHTGEKPYKCPECGKS

ADLTRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSPADLTRHQRTHTGEKPTGKK








TS








ctgacaccc
CTG ACA
1364
LEPGEKPYKCPECGKSFSHKNAL
1913
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2462


ggggtgcta
CCC GGG

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18



att
gtg CTA

QNSTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDELVRHX17




ATT

GKSFSRSDELVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDKLVRH






KCPECGKSFSRSDKLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SPAD






QRTHTGEKPYKCPECGKSFSSPA

LTRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






DLTRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








gctggggcc
GCT GGG
1365
LEPGEKPYKCPECGKSFSQSGHL
1914
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17X18X19
2463


ctaactcac
GCC CTA

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKALTEHX17X18



cga
ACT CAC

SKKALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




CGA

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEH






KCPECGKSFSQNSTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






KLVRHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








attgtacaa
ATT GTA
1366
LEPGEKPYKCPECGKSFSTSGNL
1915
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2464


ggcaggcat
CAA GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



cat
AGG CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CAT

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGNLT






GEKPYKCPECGKSFSQSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSS






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






SLVRHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








gtacaaggc
GTA CAA
1367
LEPGEKPYKCPECGKSFSDPGNL
1916
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18X19
2465


aggcatcat
GGC AGG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



gac
CAT CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




GAC

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLV






GEKPYKCPECGKSFSDPGHLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGN






QRTHTGEKPYKCPECGKSFSQSG

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLTEHQRTHTGEKPYKCPECGKS

SSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSSLVRHQRTHTGEKPTGKK








TS








attgtcacc
ATT GTC
1368
LEPGEKPYKCPECGKSFSQSGDL
1917
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18X19
2466


ccaagtcag
ACC CCA

RRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



gca
AGT CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17




GCA

GKSFSHRTTLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLT






GEKPYKCPECGKSFSDKKDLTRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGA






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HK






ALVRHQRTHTGEKPYKCPECGKS

NALQNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSHKNALQNHQRTHTGEKPTGKK








TS








gccattgtc
GCC ATT
1369
LEPGEKPYKCPECGKSFSRADNL
1918
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2467


accccaagt
GTC ACC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18



cag
CCA AGT

HRTTLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




CAG

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRH






KCPECGKSFSDKKDLTRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALV






GEKPYKCPECGKSFSDPGALVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNA






QRTHTGEKPYKCPECGKSFSHKN

LQNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






ALQNHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








caagccatt
CAA GCC
1370
LEPGEKPYKCPECGKSFSHRTTL
1919
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2468


gtcacccca
ATT GTC

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



agt
ACC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DKKDLTRHX17




AGT

GKSFSDKKDLTRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGALVRH






KCPECGKSFSDPGALVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






DLARHQRTHTGEKPYKCPECGKS

GNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGNLTEHQRTHTGEKPTGKK








TS








ggcactgac
GGC ACT
1371
LEPGEKPYKCPECGKSFSRNDTL
1920
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18X19
2469


agcctacct
GAC AGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



ccg
CTA CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17




CCG

GKSFSQNSTLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLREH






KCPECGKSFSERSHLREHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLV






GEKPYKCPECGKSFSDPGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLD






QRTHTGEKPYKCPECGKSFSTHL

LIRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DP






DLIRHQRTHTGEKPYKCPECGKS

GHLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDPGHLVRHQRTHTGEKPTGKK








TS








actgacagc
ACT GAC
1372
LEPGEKPYKCPECGKSFSQLAHL
1921
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QLAHLRAHX17X18X19
2470


ctacctccg
AGC CTA

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDTLTEHX17X18



aga
CCT CCG

RNDTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17




AGA

GKSFSTKNSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEH






KCPECGKSFSQNSTLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGN






QRTHTGEKPYKCPECGKSFSDPG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TH






NLVRHQRTHTGEKPYKCPECGKS

LDLIRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTHLDLIRHQRTHTGEKPTGKK








TS








ccaaatcat
CCA AAT
1373
LEPGEKPYKCPECGKSFSSKKHL
1922
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2471


tgacttcta
CAT TGA

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLTEHX17X18



ccc
CTT CTA

QNSTLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




CCC

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QAGHLASH






KCPECGKSFSQAGHLASHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLT






GEKPYKCPECGKSFSTSGNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLTVHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gcagagata
GCA gag
1374
LEPGEKPYKCPECGKSFSTSGNL
1923
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18X19
2472


aggctgccc
ATA AGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



cat
CTG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CAT

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSSLI






GEKPYKCPECGKSFSQKSSLIAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






NLVRHQRTHTGEKPYKCPECGKS

GDLRRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSGDLRRHQRTHTGEKPTGKK








TS








gagataagg
gag ATA
1375
LEPGEKPYKCPECGKSFSDPGHL
1924
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2473


ctgccccat
AGG CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17X18



ggc
CCC CAT

TSGNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




GGC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QKSS






QRTHTGEKPYKCPECGKSFSQKS

LIAHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






SLIAHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








ataaggctg
ATA AGG
1376
LEPGEKPYKCPECGKSFSTSHSL
1925
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2474


ccccatggc
CTG CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cca
CAT GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEHX17




CCA

GKSFSTSGNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QK






HLTNHQRTHTGEKPYKCPECGKS

SSLIAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQKSSLIAHQRTHTGEKPTGKK








TS








aggctgccc
AGG CTG
1377
LEPGEKPYKCPECGKSFSQSGHL
1926
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSGHLTEHX17X18X19
2475


catggccca
CCC CAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



cga
GGC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




CGA

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLTEH






KCPECGKSFSTSGNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






ALTEHQRTHTGEKPYKCPECGKS

DHLTNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTNHQRTHTGEKPTGKK








TS








agagataag
AGA GAT
1378
LEPGEKPYKCPECGKSFSRSDHL
1927
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2476


gctgcccca
AAG GCT

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18



tgg
GCC CCA

TSHSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




TGG

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELVRH






KCPECGKSFSTSGELVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDNLK






GEKPYKCPECGKSFSRKDNLKNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






NLVRHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








cccacgatt
CCC ACG
1379
LEPGEKPYKCPECGKSFSQRANL
1928
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2477


tagaaacct
ATT TAG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18



aaa
AAA CCT

TKNSLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17




AAA

GKSFSQRANLRAHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTH






KCPECGKSFSREDNLHTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQ






GEKPYKCPECGKSFSHKNALQNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDT






QRTHTGEKPYKCPECGKSFSRTD

LRDHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






TLRDHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








tggcccacg
TGG CCC
1380
LEPGEKPYKCPECGKSFSTKNSL
1929
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2478


atttagaaa
ACG ATT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18



cct
TAG AAA

QRANLRAHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17




CCT

GKSFSREDNLHTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNH






KCPECGKSFSHKNALQNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLR






GEKPYKCPECGKSFSRTDTLRDH

DHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLAEHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








TS








ccatggccc
CCA TGG
1381
LEPGEKPYKCPECGKSFSQRANL
1930
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRANLRAHX17X18X19
2479


acgatttag
CCC ACG

RAHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18



aaa
ATT TAG

REDNLHTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17




AAA

GKSFSHKNALQNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLRDH






KCPECGKSFSRTDTLRDHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTTHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








gccccatgg
GCC CCA
1382
LEPGEKPYKCPECGKSFSREDNL
1931
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16REDNLHTHX17X18X19
2480


cccacgatt
TGG CCC

HTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18



tag
ACG ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLRDHX17




TAG

GKSFSRTDTLRDHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






SLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








gataaggct
GAT AAG
1383
LEPGEKPYKCPECGKSFSSKKHL
1932
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2481


gccccatgg
GCT GCC

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



ccc
CCA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17




CCC

GKSFSTSHSLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGELV






GEKPYKCPECGKSFSTSGELVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RKDN






QRTHTGEKPYKCPECGKSFSRKD

LKNHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLKNHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








gctgcccca
GCT GCC
1384
LEPGEKPYKCPECGKSFSHKNAL
1933
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2482


tggcccacg
CCA TGG

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RTDTLRDHX17X18



att
CCC ACG

RTDTLRDHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




ATT

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






DLARHQRTHTGEKPYKCPECGKS

GELVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGELVRHQRTHTGEKPTGKK








TS








aaggctgcc
AAG GCT
1385
LEPGEKPYKCPECGKSFSRTDTL
1934
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RTDTLRDHX17X18X19
2483


ccatggccc
GCC CCA

RDHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



acg
TGG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




ACG

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLTEH






KCPECGKSFSTSHSLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGE






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RK






ELVRHQRTHTGEKPYKCPECGKS

DNLKNHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRKDNLKNHQRTHTGEKPTGKK








TS








agaaaccta
AGA AAC
1386
LEPGEKPYKCPECGKSFSSKKHL
1935
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18X19
2484


aatcccagg
CTA AAT

AEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17X18



ccc
CCC AGG

RSDHLTNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




CCC

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNSTLT






GEKPYKCPECGKSFSQNSTLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DSGN






QRTHTGEKPYKCPECGKSFSDSG

LRVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QL






NLRVHQRTHTGEKPYKCPECGKS

AHLRAHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQLAHLRAHQRTHTGEKPTGKK








TS








aacctaaat
AAC CTA
1387
LEPGEKPYKCPECGKSFSRADNL
1936
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2485


cccaggccc
AAT CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17X18



cag
AGG CCC

SKKHLAEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNHX17




CAG

GKSFSRSDHLTNHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QNST






QRTHTGEKPYKCPECGKSFSQNS

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DS






TLTEHQRTHTGEKPYKCPECGKS

GNLRVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDSGNLRVHQRTHTGEKPTGKK








TS








ctaaatccc
CTA AAT
1388
LEPGEKPYKCPECGKSFSRRDEL
1937
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18X19
2486


aggccccag
CCC AGG

NVHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



atg
CCC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEHX17




ATG

GKSFSSKKHLAEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTNH






KCPECGKSFSRSDHLTNHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QN






NLTVHQRTHTGEKPYKCPECGKS

STLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQNSTLTEHQRTHTGEKPTGKK








TS








aatcccagg
AAT CCC
1389
LEPGEKPYKCPECGKSFSTSHSL
1938
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSHSLTEHX17X18X19
2487


ccccagatg
AGG CCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17X18



cca
CAG ATG

RRDELNVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17




CCA

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLAEH






KCPECGKSFSSKKHLAEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTNH

NHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






HLAEHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








caggcccca
CAG GCC
1390
LEPGEKPYKCPECGKSFSTTGAL
1939
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18X19
2488


gatgccaat
CCA GAT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17X18



ctt
GCC AAT

TTGNLTVHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARHX17




CTT

GKSFSDCRDLARHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRH






KCPECGKSFSTSGNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHSLT






GEKPYKCPECGKSFSTSHSLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






DLARHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








gatgccaat
GAT GCC
1391
LEPGEKPYKCPECGKSFSTKNSL
1940
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TKNSLTEHX17X18X19
2489


cttctggat
AAT CTT

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18



cct
CTG GAT

TSGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CCT

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEH






KCPECGKSFSTTGALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRD






QRTHTGEKPYKCPECGKSFSDCR

LARHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






DLARHQRTHTGEKPYKCPECGKS

GNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGNLVRHQRTHTGEKPTGKK








TS








gccccagat
GCC CCA
1392
LEPGEKPYKCPECGKSFSRNDAL
1941
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2490


gccaatctt
GAT GCC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17X18



ctg
AAT CTT

TTGALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




CTG

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLARH






KCPECGKSFSDCRDLARHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGNLV






GEKPYKCPECGKSFSTSGNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSHS






QRTHTGEKPYKCPECGKSFSTSH

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DC






SLTEHQRTHTGEKPYKCPECGKS

RDLARHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSDCRDLARHQRTHTGEKPTGKK








TS








ccagatgcc
CCA GAT
1393
LEPGEKPYKCPECGKSFSTSGNL
1942
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGNLVRHX17X18X19
2491


aatcttctg
GCC AAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



gat
CTT CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGALTEHX17




GAT

GKSFSTTGALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DCRDLA






GEKPYKCPECGKSFSDCRDLARH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGN






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






NLVRHQRTHTGEKPYKCPECGKS

HSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSHSLTEHQRTHTGEKPTGKK








TS








ctgggagca
CTG GGA
1394
LEPGEKPYKCPECGKSFSQRAHL
1943
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18X19
2492


gaatggact
GCA GAA

ERHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18



gga
TGG ACT

THLDLIRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGA

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLR






GEKPYKCPECGKSFSQSGDLRRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






HLERHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








cccctggga
CCC CTG
1395
LEPGEKPYKCPECGKSFSTHLDL
1944
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16THLDLIRHX17X18X19
2493


gcagaatgg
GGA GCA

IRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



act
GAA TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




ACT

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRH






KCPECGKSFSQSGDLRRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






ALTEHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








ggagcagaa
GGA GCA
1396
LEPGEKPYKCPECGKSFSHRTTL
1945
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HRTTLTNHX17X18X19
2494


tggactgga
GAA TGG

TNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



agt
ACT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16THLDLIRHX17




AGT

GKSFSTHLDLIRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGD






QRTHTGEKPYKCPECGKSFSQSG

LRRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






DLRRHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








gttcccctg
GTT CCC
1397
LEPGEKPYKCPECGKSFSRSDHL
1946
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2495


ggagcagaa
CTG GGA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



tgg
GCA GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17




TGG

GKSFSQSGDLRRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKH






QRTHTGEKPYKCPECGKSFSSKK

LAEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLAEHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








ccggttccc
CCG GTT
1398
LEPGEKPYKCPECGKSFSQSSNL
1947
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2496


ctgggagca
CCC CTG

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSGDLRRHX17X18



gaa
GGA GCA

QSGDLRRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GAA

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SKKHLA






GEKPYKCPECGKSFSSKKHLAEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






SLVRHQRTHTGEKPYKCPECGKS

DTLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDTLTEHQRTHTGEKPTGKK








TS








tgggagcag
TGG gag
1399
LEPGEKPYKCPECGKSFSQSSNL
1948
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18X19
2497


aatggactg
CAG AAT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18



gaa
GGA CTG

RNDALTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17




GAA

GKSFSQRAHLERHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVH






KCPECGKSFSTTGNLTVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLT






GEKPYKCPECGKSFSRADNLTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDN






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLVRHQRTHTGEKPYKCPECGKS

DHLTTHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDHLTTHQRTHTGEKPTGKK








ccctgggag
CCC TGG
1400
LEPGEKPYKCPECGKSFSRNDAL
1949
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RNDALTEHX17X18X19
2498


cagaatgga
gag CAG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERHX17X18



ctg
AAT GGA

QRAHLERHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLTVHX17




CTG

GKSFSTTGNLTVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEH






KCPECGKSFSRADNLTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16SK






HLTTHQRTHTGEKPYKCPECGKS

KHLAEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSSKKHLAEHQRTHTGEKPTGKK








TS








ctggaagtt
CTG GAA
1401
LEPGEKPYKCPECGKSFSRADNL
1950
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18X19
2499


tgggagggc
GTT TGG

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18



cag
GAG GGC

DPGHLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17




CAG

GKSFSRSDNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTH






KCPECGKSFSRSDHLTTHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLV






GEKPYKCPECGKSFSTSGSLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSN






QRTHTGEKPYKCPECGKSFSQSS

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RN






NLVRHQRTHTGEKPYKCPECGKS

DALTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRNDALTEHQRTHTGEKPTGKK








TS








gaagtttgg
GAA GTT
1402
LEPGEKPYKCPECGKSFSHKNAL
1951
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16HKNALQNHX17X18X19
2500


gagggccag
TGG GAG

QNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADNLTEHX17X18



att
GGC CAG

RADNLTEHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17




ATT

GKSFSDPGHLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRH






KCPECGKSFSRSDNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLT






GEKPYKCPECGKSFSRSDHLTTH

THX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGS






QRTHTGEKPYKCPECGKSFSTSG

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QS






SLVRHQRTHTGEKPYKCPECGKS

SNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQSSNLVRHQRTHTGEKPTGKK








TS








gtttgggag
GTT TGG
1403
LEPGEKPYKCPECGKSFSSKKAL
1952
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2501


ggccagatt
GAG GGC

TEHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



cac
CAG ATT

HKNALQNHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




CAC

GKSFSRADNLTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSDPGHLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLV






GEKPYKCPECGKSFSRSDNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDH






QRTHTGEKPYKCPECGKSFSRSD

LTTHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TS






HLTTHQRTHTGEKPYKCPECGKS

GSLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTSGSLVRHQRTHTGEKPTGKK








TS








gagcagaat
gag CAG
1404
LEPGEKPYKCPECGKSFSTSGSL
1953
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18X19
2502


ggactggaa
AAT GGA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17X18



gtt
CTG GAA

QSSNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEHX17




GTT

GKSFSRNDALTEHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLERH






KCPECGKSFSQRAHLERHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGNLT






GEKPYKCPECGKSFSTTGNLTVH

VHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RADN






QRTHTGEKPYKCPECGKSFSRAD

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






NLTEHQRTHTGEKPYKCPECGKS

DNLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDNLVRHQRTHTGEKPTGKK








TS








aatggactg
AAT GGA
1405
LEPGEKPYKCPECGKSFSRSDNL
1954
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18X19
2503


gaagtttgg
CTG GAA

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



gag
GTT TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17




GAG

GKSFSTSGSLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRH






KCPECGKSFSQSSNLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALT






GEKPYKCPECGKSFSRNDALTEH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAH






QRTHTGEKPYKCPECGKSFSQRA

LERHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TT






HLERHQRTHTGEKPYKCPECGKS

GNLTVHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTTGNLTVHQRTHTGEKPTGKK








TS








ggactggaa
GGA CTG
1406
LEPGEKPYKCPECGKSFSDPGHL
1955
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16DPGHLVRHX17X18X19
2504


gtttgggag
GAA GTT

VRHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDNLVRHX17X18



ggc
TGG GAG

RSDNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17




GGC

GKSFSRSDHLTTHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRH






KCPECGKSFSTSGSLVRHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLV






GEKPYKCPECGKSFSQSSNLVRH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDA






QRTHTGEKPYKCPECGKSFSRND

LTEHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QR






ALTEHQRTHTGEKPYKCPECGKS

AHLERHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSQRAHLERHQRTHTGEKPTGKK








TS








cagaatgga
CAG AAT
1407
LEPGEKPYKCPECGKSFSRSDHL
1956
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2505


ctggaagtt
GGA CTG

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TSGSLVRHX17X18



tgg
GAA GTT

TSGSLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QSSNLVRHX17




TGG

GKSFSQSSNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RNDALTEH






KCPECGKSFSRNDALTEHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QRAHLE






GEKPYKCPECGKSFSQRAHLERH

RHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TTGN






QRTHTGEKPYKCPECGKSFSTTG

LTVHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RA






NLTVHQRTHTGEKPYKCPECGKS

DNLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRADNLTEHQRTHTGEKPTGKK








TS








cctgggagc
CCT ggg
1408
LEPGEKPYKCPECGKSFSRSDHL
1957
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18X19
2506


agaatggac
AGC AGA

TTHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17X18



tgg
ATG GAC

DPGNLVRHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVHX17




TGG

GKSFSRRDELNVHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLRAH






KCPECGKSFSQLAHLRAHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSHLR






GEKPYKCPECGKSFSERSHLREH

EHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDK






QRTHTGEKPYKCPECGKSFSRSD

LVRHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16TK






KLVRHQRTHTGEKPYKCPECGKS

NSLTEHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSTKNSLTEHQRTHTGEKPTGKK








TS








gggagcaga
ggg AGC
1409
LEPGEKPYKCPECGKSFSRKDNL
1958
X1X2X3X4X5X6X7X8X9CX10X11CX12X13X14X15X16RKDNLKNHX17X18X19
2507


atggactgg
AGA ATG

KNHQRTHTGEKPYKCPECGKSFS

HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RSDHLTTHX17X18



aag
GAC TGG

RSDHLTTHQRTHTGEKPYKCPEC

X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16DPGNLVRHX17




AAG

GKSFSDPGNLVRHQRTHTGEKPY

X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RRDELNVH






KCPECGKSFSRRDELNVHQRTHT

X17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16QLAHLR






GEKPYKCPECGKSFSQLAHLRAH

AHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16ERSH






QRTHTGEKPYKCPECGKSFSERS

LREHX17X18X19HX20X21X22X23X24X8X9CX10X11CX12X13X14X15X16RS






HLREHQRTHTGEKPYKCPECGKS

DKLVRHX17X18X19HX20X21X22X23X24X25X26X27X28X29X30






FSRSDKLVRHQRTHTGEKPTGKK








TS

















Informal Sequence Listing



>dCas9-VPR Protein


SEQ ID NO.: 95



MAPKKKRKVGIHGVPAADKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRT






ARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKAD





LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK





KNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAP





LSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLL





RKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVV





DKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLK





EDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM





KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSP





AIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLY





YLQNGRDMYVDQELDINRLSDYDVAAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF





DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINN





YHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI





ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVV





AKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALP





SKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHL





FTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDKRPAATKKAGQAKKKKGRADALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA





>dCas9-VPR mRNA


SEQ ID NO.: 96



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGACAAGAAGUACAGCAUCGGCCUGGCCAUCGGCACCAACAGCGUGGGCUGGGCCGUGAUCACCGACGA





GUACAAGGUGCCCAGCAAGAAGUUCAAGGUGCUGGGCAACACCGACCGGCACAGCAUCAAGAAGAACCUGAUCGGCGCCCUGC





UGUUCGACAGCGGCGAGACCGCCGAGGCCACCCGGCUGAAGCGGACCGCCCGGCGGCGGUACACCCGGCGGAAGAACCGGAUC





UGCUACCUGCAGGAGAUCUUCAGCAACGAGAUGGCCAAGGUGGACGACAGCUUCUUCCACCGGCUGGAGGAGAGCUUCCUGGU





GGAGGAGGACAAGAAGCACGAGCGGCACCCCAUCUUCGGCAACAUCGUGGACGAGGUGGCCUACCACGAGAAGUACCCCACCA





UCUACCACCUGCGGAAGAAGCUGGUGGACAGCACCGACAAGGCCGACCUGCGGCUGAUCUACCUGGCCCUGGCCCACAUGAUC





AAGUUCCGGGGCCACUUCCUGAUCGAGGGCGACCUGAACCCCGACAACAGCGACGUGGACAAGCUGUUCAUCCAGCUGGUGCA





GACCUACAACCAGCUGUUCGAGGAGAACCCCAUCAACGCCAGCGGCGUGGACGCCAAGGCCAUCCUGAGCGCCCGGCUGAGCA





AGAGCCGGCGGCUGGAGAACCUGAUCGCCCAGCUGCCCGGCGAGAAGAAGAACGGCCUGUUCGGCAACCUGAUCGCCCUGAGC





CUGGGCCUGACCCCCAACUUCAAGAGCAACUUCGACCUGGCCGAGGACGCCAAGCUGCAGCUGAGCAAGGACACCUACGACGA





CGACCUGGACAACCUGCUGGCCCAGAUCGGCGACCAGUACGCCGACCUGUUCCUGGCCGCCAAGAACCUGAGCGACGCCAUCC





UGCUGAGCGACAUCCUGCGGGUGAACACCGAGAUCACCAAGGCCCCCCUGAGCGCCAGCAUGAUCAAGCGGUACGACGAGCAC





CACCAGGACCUGACCCUGCUGAAGGCCCUGGUGCGGCAGCAGCUGCCCGAGAAGUACAAGGAGAUCUUCUUCGACCAGAGCAA





GAACGGCUACGCCGGCUACAUCGACGGCGGCGCCAGCCAGGAGGAGUUCUACAAGUUCAUCAAGCCCAUCCUGGAGAAGAUGG





ACGGCACCGAGGAGCUGCUGGUGAAGCUGAACCGGGAGGACCUGCUGCGGAAGCAGCGGACCUUCGACAACGGCAGCAUCCCC





CACCAGAUCCACCUGGGCGAGCUGCACGCCAUCCUGCGGCGGCAGGAGGACUUCUACCCCUUCCUGAAGGACAACCGGGAGAA





GAUCGAGAAGAUCCUGACCUUCCGGAUCCCCUACUACGUGGGCCCCCUGGCCCGGGGCAACAGCCGGUUCGCCUGGAUGACCC





GGAAAUCCGAGGAGACCAUCACCCCCUGGAACUUCGAGGAGGUGGUGGACAAGGGCGCCAGCGCCCAGAGCUUCAUCGAGCGG





AUGACCAACUUCGACAAGAACCUGCCCAACGAGAAGGUGCUGCCCAAGCACAGCCUGCUGUACGAGUACUUCACCGUGUACAA





CGAGCUGACCAAGGUGAAGUACGUGACCGAGGGCAUGCGGAAGCCCGCCUUCCUGAGCGGCGAGCAGAAGAAGGCCAUCGUGG





ACCUGCUGUUCAAGACCAACCGGAAGGUGACCGUGAAGCAGCUGAAGGAGGACUACUUCAAGAAGAUCGAGUGCUUCGACAGC





GUGGAGAUCAGCGGCGUGGAGGACCGGUUCAACGCCAGCCUGGGCACCUACCACGACCUGCUGAAGAUCAUCAAGGACAAGGA





CUUCCUGGACAACGAGGAGAACGAGGACAUCCUGGAGGACAUCGUGCUGACCCUGACCCUGUUCGAGGACCGGGAGAUGAUCG





AGGAGCGGCUGAAAACCUACGCCCACCUGUUCGACGACAAGGUGAUGAAGCAGCUGAAGCGGCGGCGGUACACCGGCUGGGGC





CGGCUGAGCCGGAAGCUGAUCAACGGCAUCCGGGACAAGCAGAGCGGCAAGACCAUCCUGGACUUCCUGAAAUCCGACGGCUU





CGCCAACCGGAACUUCAUGCAGCUGAUCCACGACGACAGCCUGACCUUCAAGGAGGACAUCCAGAAGGCCCAGGUGAGCGGCC





AGGGCGACAGCCUGCACGAGCACAUCGCCAACCUGGCCGGCAGCCCCGCCAUCAAGAAGGGCAUCCUGCAGACCGUGAAGGUG





GUGGACGAGCUGGUGAAGGUGAUGGGCCGGCACAAGCCCGAGAACAUCGUGAUCGAGAUGGCCCGGGAGAACCAGACCACCCA





GAAGGGCCAGAAGAACAGCCGGGAGCGGAUGAAGCGGAUCGAGGAGGGCAUCAAGGAGCUGGGCAGCCAGAUCCUGAAGGAGC





ACCCCGUGGAGAACACCCAGCUGCAGAACGAGAAGCUGUACCUGUACUACCUGCAGAACGGCCGGGACAUGUACGUGGACCAG





GAGCUGGACAUCAACCGGCUGAGCGACUACGACGUGGCCGCCAUCGUGCCCCAGAGCUUCCUGAAGGACGACAGCAUCGACAA





CAAGGUGCUGACCCGGAGCGACAAGGCCCGGGGCAAGAGCGACAACGUGCCCAGCGAGGAGGUGGUGAAGAAGAUGAAGAACU





ACUGGCGGCAGCUGCUGAACGCCAAGCUGAUCACCCAGCGGAAGUUCGACAACCUGACCAAGGCCGAGCGGGGCGGCCUGAGC





GAGCUGGACAAGGCCGGCUUCAUCAAGCGGCAGCUGGUGGAGACCCGGCAGAUCACCAAGCACGUGGCCCAGAUCCUGGACAG





CCGGAUGAACACCAAGUACGACGAGAACGACAAGCUGAUCCGGGAGGUGAAGGUGAUCACCCUGAAAUCCAAGCUGGUGAGCG





ACUUCCGGAAGGACUUCCAGUUCUACAAGGUGCGGGAGAUCAACAACUACCACCACGCCCACGACGCCUACCUGAACGCCGUG





GUGGGCACCGCCCUGAUCAAGAAGUACCCCAAGCUGGAGAGCGAGUUCGUGUACGGCGACUACAAGGUGUACGACGUGCGGAA





GAUGAUCGCCAAGAGCGAGCAGGAGAUCGGCAAGGCCACCGCCAAGUACUUCUUCUACAGCAACAUCAUGAACUUCUUCAAGA





CCGAGAUCACCCUGGCCAACGGCGAGAUCCGGAAGCGGCCCCUGAUCGAGACCAACGGCGAGACCGGCGAGAUCGUGUGGGAC





AAGGGCCGGGACUUCGCCACCGUGCGGAAGGUGCUGAGCAUGCCCCAGGUGAACAUCGUGAAGAAAACCGAGGUGCAGACCGG





CGGCUUCAGCAAGGAGAGCAUCCUGCCCAAGCGGAACAGCGACAAGCUGAUCGCCCGGAAGAAGGACUGGGACCCCAAGAAGU





ACGGCGGCUUCGACAGCCCCACCGUGGCCUACAGCGUGCUGGUGGUGGCCAAGGUGGAGAAGGGCAAGAGCAAGAAGCUGAAA





UCCGUGAAGGAGCUGCUGGGCAUCACCAUCAUGGAGCGGAGCAGCUUCGAGAAGAACCCCAUCGACUUCCUGGAGGCCAAGGG





CUACAAGGAGGUGAAGAAGGACCUGAUCAUCAAGCUGCCCAAGUACAGCCUGUUCGAGCUGGAGAACGGCCGGAAGCGGAUGC





UGGCCAGCGCCGGCGAGCUGCAGAAGGGCAACGAGCUGGCCCUGCCCAGCAAGUACGUGAACUUCCUGUACCUGGCCAGCCAC





UACGAGAAGCUGAAGGGCAGCCCCGAGGACAACGAGCAGAAGCAGCUGUUCGUGGAGCAGCACAAGCACUACCUGGACGAGAU





CAUCGAGCAGAUCAGCGAGUUCAGCAAGCGGGUGAUCCUGGCCGACGCCAACCUGGACAAGGUGCUGAGCGCCUACAACAAGC





ACCGGGACAAGCCCAUCCGGGAGCAGGCCGAGAACAUCAUCCACCUGUUCACCCUGACCAACCUGGGCGCCCCCGCCGCCUUC





AAGUACUUCGACACCACCAUCGACCGGAAGCGGUACACCAGCACCAAGGAGGUGCUGGACGCCACCCUGAUCCACCAGAGCAU





CACCGGCCUGUACGAGACCCGGAUCGACCUGAGCCAGCUGGGCGGCGACAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGG





CCAAGAAGAAGAAGGGCCGGGCCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGAUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>dCas9 amino acid sequence


SEQ ID NO.: 97



DKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQE






IFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRGH





FLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTP





NFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLT





LLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHL





GELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFD





KNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISG





VEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRK





LINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELV





KVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDIN





RLSDYDVAAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKA





GFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTAL





IKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDF





ATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKEL





LGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLK





GSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDT





TIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD





>dCas9 mRNA sequence


SEQ ID NO.: 300



GACAAGAAGUACAGCAUCGGCCUGGCCAUCGGCACCAACAGCGUGGGCUGGGCCGUGAUCACCGACGAGUACAAGGUGCCCAG






CAAGAAGUUCAAGGUGCUGGGCAACACCGACCGGCACAGCAUCAAGAAGAACCUGAUCGGCGCCCUGCUGUUCGACAGCGGCG





AGACCGCCGAGGCCACCCGGCUGAAGCGGACCGCCCGGCGGCGGUACACCCGGCGGAAGAACCGGAUCUGCUACCUGCAGGAG





AUCUUCAGCAACGAGAUGGCCAAGGUGGACGACAGCUUCUUCCACCGGCUGGAGGAGAGCUUCCUGGUGGAGGAGGACAAGAA





GCACGAGCGGCACCCCAUCUUCGGCAACAUCGUGGACGAGGUGGCCUACCACGAGAAGUACCCCACCAUCUACCACCUGCGGA





AGAAGCUGGUGGACAGCACCGACAAGGCCGACCUGCGGCUGAUCUACCUGGCCCUGGCCCACAUGAUCAAGUUCCGGGGCCAC





UUCCUGAUCGAGGGCGACCUGAACCCCGACAACAGCGACGUGGACAAGCUGUUCAUCCAGCUGGUGCAGACCUACAACCAGCU





GUUCGAGGAGAACCCCAUCAACGCCAGCGGCGUGGACGCCAAGGCCAUCCUGAGCGCCCGGCUGAGCAAGAGCCGGCGGCUGG





AGAACCUGAUCGCCCAGCUGCCCGGCGAGAAGAAGAACGGCCUGUUCGGCAACCUGAUCGCCCUGAGCCUGGGCCUGACCCCC





AACUUCAAGAGCAACUUCGACCUGGCCGAGGACGCCAAGCUGCAGCUGAGCAAGGACACCUACGACGACGACCUGGACAACCU





GCUGGCCCAGAUCGGCGACCAGUACGCCGACCUGUUCCUGGCCGCCAAGAACCUGAGCGACGCCAUCCUGCUGAGCGACAUCC





UGCGGGUGAACACCGAGAUCACCAAGGCCCCCCUGAGCGCCAGCAUGAUCAAGCGGUACGACGAGCACCACCAGGACCUGACC





CUGCUGAAGGCCCUGGUGCGGCAGCAGCUGCCCGAGAAGUACAAGGAGAUCUUCUUCGACCAGAGCAAGAACGGCUACGCCGG





CUACAUCGACGGCGGCGCCAGCCAGGAGGAGUUCUACAAGUUCAUCAAGCCCAUCCUGGAGAAGAUGGACGGCACCGAGGAGC





UGCUGGUGAAGCUGAACCGGGAGGACCUGCUGCGGAAGCAGCGGACCUUCGACAACGGCAGCAUCCCCCACCAGAUCCACCUG





GGCGAGCUGCACGCCAUCCUGCGGCGGCAGGAGGACUUCUACCCCUUCCUGAAGGACAACCGGGAGAAGAUCGAGAAGAUCCU





GACCUUCCGGAUCCCCUACUACGUGGGCCCCCUGGCCCGGGGCAACAGCCGGUUCGCCUGGAUGACCCGGAAAUCCGAGGAGA





CCAUCACCCCCUGGAACUUCGAGGAGGUGGUGGACAAGGGCGCCAGCGCCCAGAGCUUCAUCGAGCGGAUGACCAACUUCGAC





AAGAACCUGCCCAACGAGAAGGUGCUGCCCAAGCACAGCCUGCUGUACGAGUACUUCACCGUGUACAACGAGCUGACCAAGGU





GAAGUACGUGACCGAGGGCAUGCGGAAGCCCGCCUUCCUGAGCGGCGAGCAGAAGAAGGCCAUCGUGGACCUGCUGUUCAAGA





CCAACCGGAAGGUGACCGUGAAGCAGCUGAAGGAGGACUACUUCAAGAAGAUCGAGUGCUUCGACAGCGUGGAGAUCAGCGGC





GUGGAGGACCGGUUCAACGCCAGCCUGGGCACCUACCACGACCUGCUGAAGAUCAUCAAGGACAAGGACUUCCUGGACAACGA





GGAGAACGAGGACAUCCUGGAGGACAUCGUGCUGACCCUGACCCUGUUCGAGGACCGGGAGAUGAUCGAGGAGCGGCUGAAAA





CCUACGCCCACCUGUUCGACGACAAGGUGAUGAAGCAGCUGAAGCGGCGGCGGUACACCGGCUGGGGCCGGCUGAGCCGGAAG





CUGAUCAACGGCAUCCGGGACAAGCAGAGCGGCAAGACCAUCCUGGACUUCCUGAAAUCCGACGGCUUCGCCAACCGGAACUU





CAUGCAGCUGAUCCACGACGACAGCCUGACCUUCAAGGAGGACAUCCAGAAGGCCCAGGUGAGCGGCCAGGGCGACAGCCUGC





ACGAGCACAUCGCCAACCUGGCCGGCAGCCCCGCCAUCAAGAAGGGCAUCCUGCAGACCGUGAAGGUGGUGGACGAGCUGGUG





AAGGUGAUGGGCCGGCACAAGCCCGAGAACAUCGUGAUCGAGAUGGCCCGGGAGAACCAGACCACCCAGAAGGGCCAGAAGAA





CAGCCGGGAGCGGAUGAAGCGGAUCGAGGAGGGCAUCAAGGAGCUGGGCAGCCAGAUCCUGAAGGAGCACCCCGUGGAGAACA





CCCAGCUGCAGAACGAGAAGCUGUACCUGUACUACCUGCAGAACGGCCGGGACAUGUACGUGGACCAGGAGCUGGACAUCAAC





CGGCUGAGCGACUACGACGUGGCCGCCAUCGUGCCCCAGAGCUUCCUGAAGGACGACAGCAUCGACAACAAGGUGCUGACCCG





GAGCGACAAGGCCCGGGGCAAGAGCGACAACGUGCCCAGCGAGGAGGUGGUGAAGAAGAUGAAGAACUACUGGCGGCAGCUGC





UGAACGCCAAGCUGAUCACCCAGCGGAAGUUCGACAACCUGACCAAGGCCGAGCGGGGCGGCCUGAGCGAGCUGGACAAGGCC





GGCUUCAUCAAGCGGCAGCUGGUGGAGACCCGGCAGAUCACCAAGCACGUGGCCCAGAUCCUGGACAGCCGGAUGAACACCAA





GUACGACGAGAACGACAAGCUGAUCCGGGAGGUGAAGGUGAUCACCCUGAAAUCCAAGCUGGUGAGCGACUUCCGGAAGGACU





UCCAGUUCUACAAGGUGCGGGAGAUCAACAACUACCACCACGCCCACGACGCCUACCUGAACGCCGUGGUGGGCACCGCCCUG





AUCAAGAAGUACCCCAAGCUGGAGAGCGAGUUCGUGUACGGCGACUACAAGGUGUACGACGUGCGGAAGAUGAUCGCCAAGAG





CGAGCAGGAGAUCGGCAAGGCCACCGCCAAGUACUUCUUCUACAGCAACAUCAUGAACUUCUUCAAGACCGAGAUCACCCUGG





CCAACGGCGAGAUCCGGAAGCGGCCCCUGAUCGAGACCAACGGCGAGACCGGCGAGAUCGUGUGGGACAAGGGCCGGGACUUC





GCCACCGUGCGGAAGGUGCUGAGCAUGCCCCAGGUGAACAUCGUGAAGAAAACCGAGGUGCAGACCGGCGGCUUCAGCAAGGA





GAGCAUCCUGCCCAAGCGGAACAGCGACAAGCUGAUCGCCCGGAAGAAGGACUGGGACCCCAAGAAGUACGGCGGCUUCGACA





GCCCCACCGUGGCCUACAGCGUGCUGGUGGUGGCCAAGGUGGAGAAGGGCAAGAGCAAGAAGCUGAAAUCCGUGAAGGAGCUG





CUGGGCAUCACCAUCAUGGAGCGGAGCAGCUUCGAGAAGAACCCCAUCGACUUCCUGGAGGCCAAGGGCUACAAGGAGGUGAA





GAAGGACCUGAUCAUCAAGCUGCCCAAGUACAGCCUGUUCGAGCUGGAGAACGGCCGGAAGCGGAUGCUGGCCAGCGCCGGCG





AGCUGCAGAAGGGCAACGAGCUGGCCCUGCCCAGCAAGUACGUGAACUUCCUGUACCUGGCCAGCCACUACGAGAAGCUGAAG





GGCAGCCCCGAGGACAACGAGCAGAAGCAGCUGUUCGUGGAGCAGCACAAGCACUACCUGGACGAGAUCAUCGAGCAGAUCAG





CGAGUUCAGCAAGCGGGUGAUCCUGGCCGACGCCAACCUGGACAAGGUGCUGAGCGCCUACAACAAGCACCGGGACAAGCCCA





UCCGGGAGCAGGCCGAGAACAUCAUCCACCUGUUCACCCUGACCAACCUGGGCGCCCCCGCCGCCUUCAAGUACUUCGACACC





ACCAUCGACCGGAAGCGGUACACCAGCACCAAGGAGGUGCUGGACGCCACCCUGAUCCACCAGAGCAUCACCGGCCUGUACGA





GACCCGGAUCGACCUGAGCCAGCUGGGCGGCGAC





>dCas9-p300 protein


SEQ ID NO.: 98



MAPKKKRKVGIHGVPAADKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRT






ARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKAD





LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK





KNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAP





LSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLL





RKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVV





DKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLK





EDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM





KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSP





AIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLY





YLQNGRDMYVDQELDINRLSDYDVAAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF





DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINN





YHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI





ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVV





AKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALP





SKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHL





FTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDKRPAATKKAGQAKKKKGRAIFKPEELR





QALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRV





YKYCSKLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDPSQ





PQTTINKEQFSKRKNDTLDPELFVECTECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLEN





RVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPP





PNQRRVYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKM





LDKAVSERIVHDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKNAKKKNN





KKTSKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARD





KHLEFSSLRRAQWSTMCMLVELHTQSQDSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>dCas9-p300 mRNA


SEQ ID NO.: 99



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGACAAGAAGUACAGCAUCGGCCUGGCCAUCGGCACCAACAGCGUGGGCUGGGCCGUGAUCACCGACGA





GUACAAGGUGCCCAGCAAGAAGUUCAAGGUGCUGGGCAACACCGACCGGCACAGCAUCAAGAAGAACCUGAUCGGCGCCCUGC





UGUUCGACAGCGGCGAGACCGCCGAGGCCACCCGGCUGAAGCGGACCGCCCGGCGGCGGUACACCCGGCGGAAGAACCGGAUC





UGCUACCUGCAGGAGAUCUUCAGCAACGAGAUGGCCAAGGUGGACGACAGCUUCUUCCACCGGCUGGAGGAGAGCUUCCUGGU





GGAGGAGGACAAGAAGCACGAGCGGCACCCCAUCUUCGGCAACAUCGUGGACGAGGUGGCCUACCACGAGAAGUACCCCACCA





UCUACCACCUGCGGAAGAAGCUGGUGGACAGCACCGACAAGGCCGACCUGCGGCUGAUCUACCUGGCCCUGGCCCACAUGAUC





AAGUUCCGGGGCCACUUCCUGAUCGAGGGCGACCUGAACCCCGACAACAGCGACGUGGACAAGCUGUUCAUCCAGCUGGUGCA





GACCUACAACCAGCUGUUCGAGGAGAACCCCAUCAACGCCAGCGGCGUGGACGCCAAGGCCAUCCUGAGCGCCCGGCUGAGCA





AGAGCCGGCGGCUGGAGAACCUGAUCGCCCAGCUGCCCGGCGAGAAGAAGAACGGCCUGUUCGGCAACCUGAUCGCCCUGAGC





CUGGGCCUGACCCCCAACUUCAAGAGCAACUUCGACCUGGCCGAGGACGCCAAGCUGCAGCUGAGCAAGGACACCUACGACGA





CGACCUGGACAACCUGCUGGCCCAGAUCGGCGACCAGUACGCCGACCUGUUCCUGGCCGCCAAGAACCUGAGCGACGCCAUCC





UGCUGAGCGACAUCCUGCGGGUGAACACCGAGAUCACCAAGGCCCCCCUGAGCGCCAGCAUGAUCAAGCGGUACGACGAGCAC





CACCAGGACCUGACCCUGCUGAAGGCCCUGGUGCGGCAGCAGCUGCCCGAGAAGUACAAGGAGAUCUUCUUCGACCAGAGCAA





GAACGGCUACGCCGGCUACAUCGACGGCGGCGCCAGCCAGGAGGAGUUCUACAAGUUCAUCAAGCCCAUCCUGGAGAAGAUGG





ACGGCACCGAGGAGCUGCUGGUGAAGCUGAACCGGGAGGACCUGCUGCGGAAGCAGCGGACCUUCGACAACGGCAGCAUCCCC





CACCAGAUCCACCUGGGCGAGCUGCACGCCAUCCUGCGGCGGCAGGAGGACUUCUACCCCUUCCUGAAGGACAACCGGGAGAA





GAUCGAGAAGAUCCUGACCUUCCGGAUCCCCUACUACGUGGGCCCCCUGGCCCGGGGCAACAGCCGGUUCGCCUGGAUGACCC





GGAAAUCCGAGGAGACCAUCACCCCCUGGAACUUCGAGGAGGUGGUGGACAAGGGCGCCAGCGCCCAGAGCUUCAUCGAGCGG





AUGACCAACUUCGACAAGAACCUGCCCAACGAGAAGGUGCUGCCCAAGCACAGCCUGCUGUACGAGUACUUCACCGUGUACAA





CGAGCUGACCAAGGUGAAGUACGUGACCGAGGGCAUGCGGAAGCCCGCCUUCCUGAGCGGCGAGCAGAAGAAGGCCAUCGUGG





ACCUGCUGUUCAAGACCAACCGGAAGGUGACCGUGAAGCAGCUGAAGGAGGACUACUUCAAGAAGAUCGAGUGCUUCGACAGC





GUGGAGAUCAGCGGCGUGGAGGACCGGUUCAACGCCAGCCUGGGCACCUACCACGACCUGCUGAAGAUCAUCAAGGACAAGGA





CUUCCUGGACAACGAGGAGAACGAGGACAUCCUGGAGGACAUCGUGCUGACCCUGACCCUGUUCGAGGACCGGGAGAUGAUCG





AGGAGCGGCUGAAAACCUACGCCCACCUGUUCGACGACAAGGUGAUGAAGCAGCUGAAGCGGCGGCGGUACACCGGCUGGGGC





CGGCUGAGCCGGAAGCUGAUCAACGGCAUCCGGGACAAGCAGAGCGGCAAGACCAUCCUGGACUUCCUGAAAUCCGACGGCUU





CGCCAACCGGAACUUCAUGCAGCUGAUCCACGACGACAGCCUGACCUUCAAGGAGGACAUCCAGAAGGCCCAGGUGAGCGGCC





AGGGCGACAGCCUGCACGAGCACAUCGCCAACCUGGCCGGCAGCCCCGCCAUCAAGAAGGGCAUCCUGCAGACCGUGAAGGUG





GUGGACGAGCUGGUGAAGGUGAUGGGCCGGCACAAGCCCGAGAACAUCGUGAUCGAGAUGGCCCGGGAGAACCAGACCACCCA





GAAGGGCCAGAAGAACAGCCGGGAGCGGAUGAAGCGGAUCGAGGAGGGCAUCAAGGAGCUGGGCAGCCAGAUCCUGAAGGAGC





ACCCCGUGGAGAACACCCAGCUGCAGAACGAGAAGCUGUACCUGUACUACCUGCAGAACGGCCGGGACAUGUACGUGGACCAG





GAGCUGGACAUCAACCGGCUGAGCGACUACGACGUGGCCGCCAUCGUGCCCCAGAGCUUCCUGAAGGACGACAGCAUCGACAA





CAAGGUGCUGACCCGGAGCGACAAGGCCCGGGGCAAGAGCGACAACGUGCCCAGCGAGGAGGUGGUGAAGAAGAUGAAGAACU





ACUGGCGGCAGCUGCUGAACGCCAAGCUGAUCACCCAGCGGAAGUUCGACAACCUGACCAAGGCCGAGCGGGGCGGCCUGAGC





GAGCUGGACAAGGCCGGCUUCAUCAAGCGGCAGCUGGUGGAGACCCGGCAGAUCACCAAGCACGUGGCCCAGAUCCUGGACAG





CCGGAUGAACACCAAGUACGACGAGAACGACAAGCUGAUCCGGGAGGUGAAGGUGAUCACCCUGAAAUCCAAGCUGGUGAGCG





ACUUCCGGAAGGACUUCCAGUUCUACAAGGUGCGGGAGAUCAACAACUACCACCACGCCCACGACGCCUACCUGAACGCCGUG





GUGGGCACCGCCCUGAUCAAGAAGUACCCCAAGCUGGAGAGCGAGUUCGUGUACGGCGACUACAAGGUGUACGACGUGCGGAA





GAUGAUCGCCAAGAGCGAGCAGGAGAUCGGCAAGGCCACCGCCAAGUACUUCUUCUACAGCAACAUCAUGAACUUCUUCAAGA





CCGAGAUCACCCUGGCCAACGGCGAGAUCCGGAAGCGGCCCCUGAUCGAGACCAACGGCGAGACCGGCGAGAUCGUGUGGGAC





AAGGGCCGGGACUUCGCCACCGUGCGGAAGGUGCUGAGCAUGCCCCAGGUGAACAUCGUGAAGAAAACCGAGGUGCAGACCGG





CGGCUUCAGCAAGGAGAGCAUCCUGCCCAAGCGGAACAGCGACAAGCUGAUCGCCCGGAAGAAGGACUGGGACCCCAAGAAGU





ACGGCGGCUUCGACAGCCCCACCGUGGCCUACAGCGUGCUGGUGGUGGCCAAGGUGGAGAAGGGCAAGAGCAAGAAGCUGAAA





UCCGUGAAGGAGCUGCUGGGCAUCACCAUCAUGGAGCGGAGCAGCUUCGAGAAGAACCCCAUCGACUUCCUGGAGGCCAAGGG





CUACAAGGAGGUGAAGAAGGACCUGAUCAUCAAGCUGCCCAAGUACAGCCUGUUCGAGCUGGAGAACGGCCGGAAGCGGAUGC





UGGCCAGCGCCGGCGAGCUGCAGAAGGGCAACGAGCUGGCCCUGCCCAGCAAGUACGUGAACUUCCUGUACCUGGCCAGCCAC





UACGAGAAGCUGAAGGGCAGCCCCGAGGACAACGAGCAGAAGCAGCUGUUCGUGGAGCAGCACAAGCACUACCUGGACGAGAU





CAUCGAGCAGAUCAGCGAGUUCAGCAAGCGGGUGAUCCUGGCCGACGCCAACCUGGACAAGGUGCUGAGCGCCUACAACAAGC





ACCGGGACAAGCCCAUCCGGGAGCAGGCCGAGAACAUCAUCCACCUGUUCACCCUGACCAACCUGGGCGCCCCCGCCGCCUUC





AAGUACUUCGACACCACCAUCGACCGGAAGCGGUACACCAGCACCAAGGAGGUGCUGGACGCCACCCUGAUCCACCAGAGCAU





CACCGGCCUGUACGAGACCCGGAUCGACCUGAGCCAGCUGGGCGGCGACAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGG





CCAAGAAGAAGAAGGGCCGGGCCAUCUUCAAGCCCGAGGAGCUGCGGCAGGCCCUGAUGCCCACCCUGGAGGCCCUGUACCGG





CAGGACCCCGAGAGCCUGCCCUUCCGGCAGCCCGUGGACCCCCAGCUGCUGGGCAUCCCCGACUACUUCGACAUCGUGAAAUC





CCCCAUGGACCUGAGCACCAUCAAGCGGAAGCUGGACACCGGCCAGUACCAGGAGCCCUGGCAGUACGUGGACGACAUCUGGC





UGAUGUUCAACAACGCCUGGCUGUACAACCGGAAAACCAGCCGGGUGUACAAGUACUGCAGCAAGCUGAGCGAGGUGUUCGAG





CAGGAGAUCGACCCCGUGAUGCAGAGCCUGGGCUACUGCUGCGGCCGGAAGCUGGAGUUCAGCCCCCAGACCCUGUGCUGCUA





CGGCAAGCAGCUGUGCACCAUCCCCCGGGACGCCACCUACUACAGCUACCAGAACCGGUACCACUUCUGCGAGAAGUGCUUCA





ACGAGAUCCAGGGCGAGAGCGUGAGCCUGGGCGACGACCCCAGCCAGCCCCAGACCACCAUCAACAAGGAGCAGUUCAGCAAG





CGGAAGAACGACACCCUGGACCCCGAGCUGUUCGUGGAGUGCACCGAGUGCGGCCGGAAGAUGCACCAGAUCUGCGUGCUGCA





CCACGAGAUCAUCUGGCCCGCCGGCUUCGUGUGCGACGGCUGCCUGAAGAAAUCCGCCCGGACCCGGAAGGAGAACAAGUUCA





GCGCCAAGCGGCUGCCCAGCACCCGGCUGGGCACCUUCCUGGAGAACCGGGUGAACGACUUCCUGCGGCGGCAGAACCACCCC





GAGAGCGGCGAGGUGACCGUGCGGGUGGUGCACGCCAGCGACAAGACCGUGGAGGUGAAGCCCGGCAUGAAGGCCCGGUUCGU





GGACAGCGGCGAGAUGGCCGAGAGCUUCCCCUACCGGACCAAGGCCCUGUUCGCCUUCGAGGAGAUCGACGGCGUGGACCUGU





GCUUCUUCGGCAUGCACGUGCAGGAGUACGGCAGCGACUGCCCCCCCCCCAACCAGCGGCGGGUGUACAUCAGCUACCUGGAC





AGCGUGCACUUCUUCCGGCCCAAGUGCCUGCGGACCGCCGUGUACCACGAGAUCCUGAUCGGCUACCUGGAGUACGUGAAGAA





GCUGGGCUACACCACCGGCCACAUCUGGGCCUGCCCCCCCAGCGAGGGCGACGACUACAUCUUCCACUGCCACCCCCCCGACC





AGAAGAUCCCCAAGCCCAAGCGGCUGCAGGAGUGGUACAAGAAGAUGCUGGACAAGGCCGUGAGCGAGCGGAUCGUGCACGAC





UACAAGGACAUCUUCAAGCAGGCCACCGAGGACCGGCUGACCAGCGCCAAGGAGCUGCCCUACUUCGAGGGCGACUUCUGGCC





CAACGUGCUGGAGGAGAGCAUCAAGGAGCUGGAGCAGGAGGAGGAGGAGCGGAAGCGGGAGGAGAACACCAGCAACGAGAGCA





CCGACGUGACCAAGGGCGACAGCAAGAACGCCAAGAAGAAGAACAACAAGAAAACCAGCAAGAACAAGAGCAGCCUGAGCCGG





GGCAACAAGAAGAAGCCCGGCAUGCCCAACGUGAGCAACGACCUGAGCCAGAAGCUGUACGCCACCAUGGAGAAGCACAAGGA





GGUGUUCUUCGUGAUCCGGCUGAUCGCCGGCCCCGCCGCCAACAGCCUGCCCCCCAUCGUGGACCCCGACCCCCUGAUCCCCU





GCGACCUGAUGGACGGCCGGGACGCCUUCCUGACCCUGGCCCGGGACAAGCACCUGGAGUUCAGCAGCCUGCGGCGGGCCCAG





UGGAGCACCAUGUGCAUGCUGGUGGAGCUGCACACCCAGAGCCAGGACAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGC





CGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUC





UGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAA





GUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAA





>ZF1-VPR protein


SEQ ID NO.: 100



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKSFSRSDHLTNHQRTHTGE






KPYKCPECGKSFSDPGHLVRHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSSRRTCRAHQRTHTG





EKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF1-VPR mRNA


SEQ ID NO.: 101



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCGGGUGAGAAGCCAUACAAGUGCCCAGAAUGCGGCAAGAGCUUUAG





CCGCAGCGAUAAUCUGGUUCGUCAUCAGCGCACGCAUACGGGUGAGAAACCGUACAAAUGUCCAGAAUGCGGCAAAAGCUUUA





GUCGCAGCGAUCAUCUGACGAAUCACCAGCGCACCCAUACCGGCGAAAAACCGUACAAGUGCCCGGAGUGCGGUAAAAGCUUC





AGCGACCCGGGUCAUCUGGUGCGCCACCAACGCACGCACACCGGUGAGAAACCAUAUAAAUGUCCAGAGUGCGGCAAGAGUUU





UAGCCAGCGUGCCCAUCUGGAACGUCAUCAGCGUACCCACACGGGUGAAAAACCAUAUAAGUGCCCGGAGUGCGGUAAGAGUU





UUAGUAGCCGCCGUACGUGCCGUGCGCACCAACGCACCCACACCGGUGAAAAGCCAUACAAGUGUCCGGAAUGCGGCAAGAGC





UUCAGCCGCAGCGACAAACUCACCGAACAUCAACGUACCCAUACGGGCGAGAAGCCGUACAAAUGCCCAGAAUGUGGCAAAAG





CUUCAGUCGCAACGAUACGCUGACCGAGCAUCAGCGUACGCACACCGGCGAAAAGCCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF1 target sequence


SEQ ID NO.: 102



CCGCGGCGTGGAGGCAGGGAG






>ZF1 amino acid sequence


SEQ ID NO.: 103



LEPGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSDPGHLVRHQ






RTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSSRRTCRAHQRTHTGEKPYKCPECGKSFSRSDKLTEH





QRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPTGKKTS





>ZF1 mRNA sequence


SEQ ID NO.: 211



CUGGAGCCGGGUGAGAAGCCAUACAAGUGCCCAGAAUGCGGCAAGAGCUUUAGCCGCAGCGAUAAUCUGGUUCGUCAUCAGCG






CACGCAUACGGGUGAGAAACCGUACAAAUGUCCAGAAUGCGGCAAAAGCUUUAGUCGCAGCGAUCAUCUGACGAAUCACCAGC





GCACCCAUACCGGCGAAAAACCGUACAAGUGCCCGGAGUGCGGUAAAAGCUUCAGCGACCCGGGUCAUCUGGUGCGCCACCAA





CGCACGCACACCGGUGAGAAACCAUAUAAAUGUCCAGAGUGCGGCAAGAGUUUUAGCCAGCGUGCCCAUCUGGAACGUCAUCA





GCGUACCCACACGGGUGAAAAACCAUAUAAGUGCCCGGAGUGCGGUAAGAGUUUUAGUAGCCGCCGUACGUGCCGUGCGCACC





AACGCACCCACACCGGUGAAAAGCCAUACAAGUGUCCGGAAUGCGGCAAGAGCUUCAGCCGCAGCGACAAACUCACCGAACAU





CAACGUACCCAUACGGGCGAGAAGCCGUACAAAUGCCCAGAAUGUGGCAAAAGCUUCAGUCGCAACGAUACGCUGACCGAGCA





UCAGCGUACGCACACCGGCGAAAAGCCAACCGGCAAGAAAACCAGC





>ZF2-VPR protein


SEQ ID NO.: 104



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGE






KPYKCPECGKSFSSKKALTEHQRTHTGEKPYKCPECGKSFSDCRDLARHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTG





EKPYKCPECGKSFSRNDALTEHQRTHTGEKPYKCPECGKSFSDKKDLTRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF2-VPR mRNA


SEQ ID NO.: 105



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAACCGGGCGAGAAACCGUACAAGUGCCCAGAAUGCGGUAAAAGCUUCAG





CCAGAGCAGUAAUCUGGUUCGUCACCAGCGCACCCACACGGGUGAAAAGCCAUACAAAUGUCCAGAGUGUGGUAAGAGUUUCA





GUAGCAAAAAGCAUCUGGCGGAACACCAACGUACGCAUACGGGUGAAAAGCCGUACAAGUGUCCGGAAUGUGGCAAGAGCUUU





AGCAGCAAGAAGGCGCUGACCGAACAUCAGCGUACCCAUACCGGUGAAAAACCAUACAAGUGCCCGGAGUGCGGCAAAAGUUU





CAGCGAUUGUCGCGAUCUGGCCCGUCAUCAACGCACCCACACCGGCGAGAAACCAUAUAAGUGUCCGGAGUGCGGUAAAAGCU





UUAGCGAUCCGGGCCAUCUGGUUCGCCACCAACGCACGCACACCGGCGAGAAACCGUAUAAGUGCCCAGAGUGCGGUAAGAGU





UUUAGCCGUAACGAUGCGCUGACGGAGCAUCAGCGCACGCACACGGGCGAGAAACCAUAUAAAUGCCCGGAAUGUGGUAAGAG





CUUCAGUGACAAAAAGGAUCUGACCCGCCAUCAACGUACGCAUACGGGCGAGAAGCCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAA





>ZF2 target sequence


SEQ ID NO.: 106



ACCCTGGGCGCCCACCCCGAA






>ZF2 amino acid sequence


SEQ ID NO.: 107



LEPGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSSKKALTEHQ






RTHTGEKPYKCPECGKSFSDCRDLARHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPYKCPECGKSFSRNDALTEH





QRTHTGEKPYKCPECGKSFSDKKDLTRHQRTHTGEKPTGKKTS





>ZF2 mRNA sequence


SEQ ID NO.: 212



CUGGAACCGGGCGAGAAACCGUACAAGUGCCCAGAAUGCGGUAAAAGCUUCAGCCAGAGCAGUAAUCUGGUUCGUCACCAGCG






CACCCACACGGGUGAAAAGCCAUACAAAUGUCCAGAGUGUGGUAAGAGUUUCAGUAGCAAAAAGCAUCUGGCGGAACACCAAC





GUACGCAUACGGGUGAAAAGCCGUACAAGUGUCCGGAAUGUGGCAAGAGCUUUAGCAGCAAGAAGGCGCUGACCGAACAUCAG





CGUACCCAUACCGGUGAAAAACCAUACAAGUGCCCGGAGUGCGGCAAAAGUUUCAGCGAUUGUCGCGAUCUGGCCCGUCAUCA





ACGCACCCACACCGGCGAGAAACCAUAUAAGUGUCCGGAGUGCGGUAAAAGCUUUAGCGAUCCGGGCCAUCUGGUUCGCCACC





AACGCACGCACACCGGCGAGAAACCGUAUAAGUGCCCAGAGUGCGGUAAGAGUUUUAGCCGUAACGAUGCGCUGACGGAGCAU





CAGCGCACGCACACGGGCGAGAAACCAUAUAAAUGCCCGGAAUGUGGUAAGAGCUUCAGUGACAAAAAGGAUCUGACCCGCCA





UCAACGUACGCAUACGGGCGAGAAGCCAACCGGCAAGAAAACCAGC





>ZF3-VPR protein


SEQ ID NO.: 108



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGE






KPYKCPECGKSFSDPGHLVRHQRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSRKDNLKNHQRTHTG





EKPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF3-VPR mRNA


SEQ ID NO.: 109



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAGAAACCAUAUAAGUGUCCGGAAUGUGGUAAAAGUUUUAG





CCGCAGCGAUAAUCUCGUGCGUCACCAGCGUACGCAUACCGGUGAGAAGCCAUACAAGUGUCCGGAGUGUGGCAAAAGCUUCA





GUCAGCUGGCGCAUCUGCGCGCGCAUCAGCGCACCCACACCGGUGAGAAACCGUACAAGUGUCCAGAAUGCGGCAAAAGCUUU





AGCGAUCCGGGUCAUCUGGUGCGUCAUCAACGUACGCACACGGGCGAAAAACCGUACAAAUGUCCGGAGUGCGGCAAGAGCUU





CAGCCAGAGCAGCAAUCUGGUUCGCCACCAGCGUACGCACACCGGUGAAAAGCCAUACAAGUGCCCGGAGUGCGGCAAGAGUU





UCAGUCGCAAGGACAAUCUGAAGAACCAUCAACGCACCCAUACGGGCGAGAAGCCGUACAAAUGUCCGGAAUGCGGUAAAAGU





UUUAGCCAAGCCGGUCAUCUGGCCAGCCAUCAGCGUACCCAUACGGGUGAGAAACCGUAUAAAUGUCCAGAAUGUGGUAAGAG





UUUCAGCACCAGCGGUAGUCUGGUUCGUCAUCAACGCACGCAUACGGGUGAAAAACCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF3 target sequence


SEQ ID NO.: 110



GTTTGAAAGGAAGGCAGAGAG






>ZF3 amino acid sequence


SEQ ID NO.: 111



LEPGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGEKPYKCPECGKSFSDPGHLVRHQ






RTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSRKDNLKNHQRTHTGEKPYKCPECGKSFSQAGHLASH





QRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPTGKKTS





>ZF3 mRNA


SEQ ID NO.: 213



CUCGAACCGGGCGAGAAACCAUAUAAGUGUCCGGAAUGUGGUAAAAGUUUUAGCCGCAGCGAUAAUCUCGUGCGUCACCAGCG






UACGCAUACCGGUGAGAAGCCAUACAAGUGUCCGGAGUGUGGCAAAAGCUUCAGUCAGCUGGCGCAUCUGCGCGCGCAUCAGC





GCACCCACACCGGUGAGAAACCGUACAAGUGUCCAGAAUGCGGCAAAAGCUUUAGCGAUCCGGGUCAUCUGGUGCGUCAUCAA





CGUACGCACACGGGCGAAAAACCGUACAAAUGUCCGGAGUGCGGCAAGAGCUUCAGCCAGAGCAGCAAUCUGGUUCGCCACCA





GCGUACGCACACCGGUGAAAAGCCAUACAAGUGCCCGGAGUGCGGCAAGAGUUUCAGUCGCAAGGACAAUCUGAAGAACCAUC





AACGCACCCAUACGGGCGAGAAGCCGUACAAAUGUCCGGAAUGCGGUAAAAGUUUUAGCCAAGCCGGUCAUCUGGCCAGCCAU





CAGCGUACCCAUACGGGUGAGAAACCGUAUAAAUGUCCAGAAUGUGGUAAGAGUUUCAGCACCAGCGGUAGUCUGGUUCGUCA





UCAACGCACGCAUACGGGUGAAAAACCAACCGGCAAGAAAACCAGC





>ZF4-VPR protein


SEQ ID NO.: 112



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGE






KPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTTGALTEHQRTHTGEKPYKCPECGKSFSTSGNLVRHQRTHTG





EKPYKCPECGKSFSQSGDLRRHQRTHTGEKPYKCPECGKSFSTSHSLTEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF4-VPR mRNA


SEQ ID NO.: 113



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAAAAGCCGUAUAAGUGCCCGGAAUGCGGCAAGAGUUUUAG





CCAGCGCGCCCAUCUGGAACGUCACCAGCGUACCCAUACCGGUGAAAAGCCAUAUAAAUGCCCAGAAUGUGGUAAAAGCUUUA





GUCAGCUGGCCCAUCUGCGCGCCCACCAACGUACGCACACGGGCGAGAAGCCGUACAAAUGCCCAGAAUGCGGUAAAAGCUUC





AGCAGCAAAAAGCAUCUGGCGGAACAUCAACGUACCCACACCGGCGAGAAACCAUACAAGUGCCCGGAAUGCGGUAAAAGCUU





CAGCACCACCGGUGCGCUGACGGAGCAUCAGCGCACCCACACGGGCGAAAAACCGUAUAAGUGUCCGGAGUGUGGCAAAAGUU





UUAGUACCAGCGGCAAUCUGGUGCGCCAUCAACGUACGCAUACCGGCGAGAAGCCAUAUAAAUGUCCAGAGUGUGGCAAGAGC





UUUAGCCAAAGCGGUGAUCUGCGUCGCCACCAACGCACGCACACCGGCGAAAAACCAUACAAAUGUCCGGAAUGCGGUAAGAG





UUUCAGCACGAGCCAUAGUCUGACCGAACAUCAACGUACCCAUACGGGUGAGAAACCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF4 target sequence


SEQ ID NO.: 114



CCAGCAGATCTTCCCAGAGGA






>ZF4 amino acid sequence


SEQ ID NO.: 115



LEPGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGEKPYKCPECGKSFSSKKHLAEHQ






RTHTGEKPYKCPECGKSFSTTGALTEHQRTHTGEKPYKCPECGKSFSTSGNLVRHQRTHTGEKPYKCPECGKSFSQSGDLRRH





QRTHTGEKPYKCPECGKSFSTSHSLTEHQRTHTGEKPTGKKTS





>ZF4 mRNA sequence


SEQ ID NO.: 214



CUCGAACCGGGCGAAAAGCCGUAUAAGUGCCCGGAAUGCGGCAAGAGUUUUAGCCAGCGCGCCCAUCUGGAACGUCACCAGCG






UACCCAUACCGGUGAAAAGCCAUAUAAAUGCCCAGAAUGUGGUAAAAGCUUUAGUCAGCUGGCCCAUCUGCGCGCCCACCAAC





GUACGCACACGGGCGAGAAGCCGUACAAAUGCCCAGAAUGCGGUAAAAGCUUCAGCAGCAAAAAGCAUCUGGCGGAACAUCAA





CGUACCCACACCGGCGAGAAACCAUACAAGUGCCCGGAAUGCGGUAAAAGCUUCAGCACCACCGGUGCGCUGACGGAGCAUCA





GCGCACCCACACGGGCGAAAAACCGUAUAAGUGUCCGGAGUGUGGCAAAAGUUUUAGUACCAGCGGCAAUCUGGUGCGCCAUC





AACGUACGCAUACCGGCGAGAAGCCAUAUAAAUGUCCAGAGUGUGGCAAGAGCUUUAGCCAAAGCGGUGAUCUGCGUCGCCAC





CAACGCACGCACACCGGCGAAAAACCAUACAAAUGUCCGGAAUGCGGUAAGAGUUUCAGCACGAGCCAUAGUCUGACCGAACA





UCAACGUACCCAUACGGGUGAGAAACCAACCGGCAAGAAAACCAGC





>ZF5-VPR protein


SEQ ID NO.: 116



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGE






KPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTG





EKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF5-VPR mRNA


SEQ ID NO.: 117



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAGAAACCAUAUAAGUGUCCAGAGUGUGGUAAGAGCUUUAG





CACCAGUGGCAAUCUGACCGAGCAUCAACGCACGCAUACGGGUGAGAAACCGUACAAGUGCCCGGAAUGCGGCAAAAGUUUCA





GCGAUAGCGGCAAUCUGCGUGUGCACCAGCGUACGCAUACGGGCGAAAAGCCGUAUAAGUGCCCAGAAUGCGGUAAGAGUUUU





AGCCACAAAAACGCGCUGCAGAACCACCAGCGCACCCACACGGGUGAGAAGCCAUACAAAUGUCCGGAAUGCGGCAAAAGCUU





CAGCCGCAACGAUACGCUGACGGAACACCAACGUACGCAUACCGGCGAAAAGCCAUACAAGUGCCCGGAGUGCGGUAAAAGCU





UUAGCCAGCGCGCGCAUCUCGAACGUCAUCAACGUACCCAUACCGGUGAAAAACCAUAUAAAUGCCCGGAAUGUGGUAAAAGU





UUUAGCCGCAGCGACAAACUGGUGCGUCAUCAACGCACCCAUACCGGUGAAAAGCCAUAUAAGUGCCCGGAGUGUGGUAAAAG





CUUCAGCGAUCCGGGUCAUCUGGUUCGCCAUCAACGUACGCACACCGGCGAGAAGCCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5 target sequence


SEQ ID NO.: 118



GGCGGGGGACCGATTAACCAT






>ZF5 amino acid sequence


SEQ ID NO.: 119



LEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGKSFSHKNALQNHQ






RTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSRSDKLVRH





QRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTS





>ZF5 mRNA sequence


SEQ ID NO.: 215



CUCGAACCGGGCGAGAAACCAUAUAAGUGUCCAGAGUGUGGUAAGAGCUUUAGCACCAGUGGCAAUCUGACCGAGCAUCAACG






CACGCAUACGGGUGAGAAACCGUACAAGUGCCCGGAAUGCGGCAAAAGUUUCAGCGAUAGCGGCAAUCUGCGUGUGCACCAGC





GUACGCAUACGGGCGAAAAGCCGUAUAAGUGCCCAGAAUGCGGUAAGAGUUUUAGCCACAAAAACGCGCUGCAGAACCACCAG





CGCACCCACACGGGUGAGAAGCCAUACAAAUGUCCGGAAUGCGGCAAAAGCUUCAGCCGCAACGAUACGCUGACGGAACACCA





ACGUACGCAUACCGGCGAAAAGCCAUACAAGUGCCCGGAGUGCGGUAAAAGCUUUAGCCAGCGCGCGCAUCUCGAACGUCAUC





AACGUACCCAUACCGGUGAAAAACCAUAUAAAUGCCCGGAAUGUGGUAAAAGUUUUAGCCGCAGCGACAAACUGGUGCGUCAU





CAACGCACCCAUACCGGUGAAAAGCCAUAUAAGUGCCCGGAGUGUGGUAAAAGCUUCAGCGAUCCGGGUCAUCUGGUUCGCCA





UCAACGUACGCACACCGGCGAGAAGCCAACCGGCAAGAAAACCAGC





>ZF6-VPR protein


SEQ ID NO.: 120



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSHSLTEHQRTHTGEKPYKCPECGKSFSQRANLRAHQRTHTGE






KPYKCPECGKSFSRADNLTEHQRTHTGEKPYKCPECGKSFSERSHLREHQRTHTGEKPYKCPECGKSFSDPGALVRHQRTHTG





EKPYKCPECGKSFSTSGHLVRHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF6-VPR mRNA


SEQ ID NO.: 121



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAAAAACCGUAUAAGUGUCCGGAGUGCGGCAAGAGCUUCAG





CACGAGCCAUAGUCUGACCGAACACCAGCGCACCCACACGGGCGAAAAGCCGUACAAAUGUCCAGAGUGUGGUAAGAGUUUCA





GCCAGCGUGCCAAUCUGCGCGCCCACCAACGUACCCACACCGGUGAGAAGCCGUAUAAGUGCCCAGAGUGUGGUAAAAGCUUC





AGCCGCGCCGAUAAUCUGACGGAGCACCAACGCACCCACACCGGCGAAAAGCCAUACAAGUGCCCGGAGUGUGGCAAGAGCUU





UAGCGAACGCAGCCAUCUGCGCGAACACCAACGUACGCACACGGGUGAGAAACCAUACAAAUGUCCAGAAUGUGGUAAAAGUU





UUAGCGAUCCGGGCGCGCUGGUUCGCCACCAGCGCACGCACACCGGUGAAAAGCCGUAUAAAUGUCCAGAAUGCGGCAAAAGC





UUCAGUACCAGCGGUCAUCUGGUUCGUCAUCAGCGUACCCAUACCGGCGAGAAGCCAUAUAAGUGCCCGGAGUGUGGCAAAAG





UUUCAGCCGCAAUGAUACGCUGACCGAGCAUCAGCGUACGCAUACCGGUGAAAAACCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF6 target sequence


SEQ ID NO.: 122



CCGGGTGTCAGCCAGAAACCA






>ZF6 amino acid sequence


SEQ ID NO.: 123



LEPGEKPYKCPECGKSFSTSHSLTEHQRTHTGEKPYKCPECGKSFSQRANLRAHQRTHTGEKPYKCPECGKSFSRADNLTEHQ






RTHTGEKPYKCPECGKSFSERSHLREHQRTHTGEKPYKCPECGKSFSDPGALVRHQRTHTGEKPYKCPECGKSFSTSGHLVRH





QRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPTGKKTS





>ZF6 mRNA sequence


SEQ ID NO.: 216



CUCGAACCGGGCGAAAAACCGUAUAAGUGUCCGGAGUGCGGCAAGAGCUUCAGCACGAGCCAUAGUCUGACCGAACACCAGCG






CACCCACACGGGCGAAAAGCCGUACAAAUGUCCAGAGUGUGGUAAGAGUUUCAGCCAGCGUGCCAAUCUGCGCGCCCACCAAC





GUACCCACACCGGUGAGAAGCCGUAUAAGUGCCCAGAGUGUGGUAAAAGCUUCAGCCGCGCCGAUAAUCUGACGGAGCACCAA





CGCACCCACACCGGCGAAAAGCCAUACAAGUGCCCGGAGUGUGGCAAGAGCUUUAGCGAACGCAGCCAUCUGCGCGAACACCA





ACGUACGCACACGGGUGAGAAACCAUACAAAUGUCCAGAAUGUGGUAAAAGUUUUAGCGAUCCGGGCGCGCUGGUUCGCCACC





AGCGCACGCACACCGGUGAAAAGCCGUAUAAAUGUCCAGAAUGCGGCAAAAGCUUCAGUACCAGCGGUCAUCUGGUUCGUCAU





CAGCGUACCCAUACCGGCGAGAAGCCAUAUAAGUGCCCGGAGUGUGGCAAAAGUUUCAGCCGCAAUGAUACGCUGACCGAGCA





UCAGCGUACGCAUACCGGUGAAAAACCAACCGGCAAGAAAACCAGC





>ZF7-VPR protein


SEQ ID NO.: 124



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGE






KPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTG





EKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF7-VPR mRNA


SEQ ID NO.: 125



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAG





CCGCAGCGACCAUCUGACCAAUCACCAACGCACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUA





GUACCAGUGGCAGUCUGGUUCGUCAUCAGCGCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUU





AGCCAAGCCGGUCAUCUGGCGAGCCAUCAACGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUU





UAGCCGUAGCGAUAAACUGACCGAACACCAACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUU





UCAGCACCAGCGGCAAUCUGACCGAGCAUCAACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGU





UUUAGUCAGAGCAGUAAUCUGGUGCGCCAUCAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAG





UUUUAGCACCCAUCUGGAUCUGAUCCGUCAUCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGUGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF7.4-VPR mRNA sequence


SEQ ID NO.: 2508



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCGAAGAAAAAGAGGAAGGUCGGGAUCCAC






GGAGUCCCAGCCGCAGGAAGCAGCGGAAGCCUGGAACCCGGAGAAAAACCCUACAAGUGCCCAGAAUGCGGCAAGAGCUUCAG





CCGCAGCGACCACCUGACCAACCACCAGAGAACCCACACCGGAGAAAAGCCAUACAAAUGCCCAGAGUGCGGGAAAAGCUUCA





GCACAAGCGGCAGCCUCGUCAGGCACCAGCGGACACACACCGGCGAGAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUC





AGCCAAGCCGGACACCUCGCCAGCCACCAGAGGACCCACACAGGAGAGAAACCGUACAAAUGCCCGGAGUGCGGCAAGAGCUU





CAGCCGGAGCGACAAGCUGACCGAACACCAGCGAACCCACACGGGCGAAAAGCCGUACAAGUGCCCCGAGUGCGGAAAAAGCU





UCAGCACGAGCGGAAACCUCACCGAGCACCAGCGCACCCACACGGGAGAGAAGCCGUACAAGUGCCCCGAAUGCGGAAAGAGC





UUCAGCCAGAGCAGCAACCUCGUGCGCCACCAACGGACGCACACAGGGGAAAAGCCCUACAAGUGCCCGGAAUGCGGCAAAAG





CUUCAGCACCCACCUGGACCUGAUCCGGCACCAACGCACGCACACCGGGGAAAAACCGACCGGAAAAAAGACCAGCGCGAGCG





GAAGCGGAGGAGGAAGCGGAGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGGAGCGACGCACUGGACGACUUCGAC





CUGGACAUGCUGGGAAGCGACGCGCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUCGACGACUUCGACCUCGA





CAUGCUGAGCGGCGGACCCAAGAAGAAGAGAAAGGUCGGAAGCCAGUACCUCCCGGACACCGACGACAGGCACCGCAUCGAAG





AGAAGCGGAAAAGAACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGAGCCCGUUCAGCGGACCAACCGACCCCAGACCACCA





CCGAGAAGAAUCGCGGUCCCAAGCAGGAGCAGCGCCAGCGUCCCGAAGCCAGCCCCACAGCCGUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCGAGCGGCCAGAUAAGCCAGGCCAGCGCACUGGCACCAGCCC





CACCGCAAGUGCUGCCCCAAGCACCCGCACCAGCACCCGCCCCCGCGAUGGUCAGCGCCCUGGCACAAGCCCCAGCCCCAGUC





CCGGUGCUCGCACCAGGACCACCCCAAGCAGUCGCACCGCCAGCCCCAAAGCCGACCCAAGCCGGAGAAGGCACCCUCAGCGA





GGCGCUCCUGCAACUCCAAUUCGACGACGAGGACCUGGGAGCCCUGCUGGGCAACAGCACCGACCCGGCAGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAAUUCCAGCAGCUCCUGAACCAAGGAAUCCCAGUCGCGCCACACACCACCGAGCCGAUGCUG





AUGGAAUACCCAGAAGCGAUCACGAGACUGGUCACCGGGGCCCAAAGACCGCCGGACCCAGCGCCAGCACCACUGGGAGCCCC





AGGACUGCCCAACGGACUGCUCAGCGGCGACGAGGACUUCAGCAGCAUCGCGGACAUGGACUUCAGCGCACUCCUCGGAAGCG





GAAGCGGCAGCAGAGACAGCCGGGAAGGAAUGUUCCUCCCCAAGCCAGAAGCCGGAAGCGCAAUCAGCGACGUGUUCGAAGGA





CGGGAAGUCUGCCAGCCGAAGCGCCUCAGACCGUUCCACCCACCGGGAAGCCCAUGGGCCAACAGACCGCUGCCAGCCAGCCU





GGCACCGACCCCAACCGGACCAGUCCACGAACCAGUCGGCAGCCUGACACCAGCACCAGUGCCCCAGCCACUGGACCCAGCAC





CGGCAGUGACCCCAGAAGCCAGCCACCUCCUGGAGGACCCCGACGAAGAAACCAGCCAGGCCGUGAAGGCCCUGAGGGAGAUG





GCCGACACGGUGAUCCCACAGAAGGAAGAAGCAGCGAUCUGCGGCCAAAUGGACCUCAGCCACCCACCGCCAAGAGGCCACCU





GGACGAGCUCACCACCACCCUGGAAAGCAUGACCGAGGACCUCAACCUCGACAGCCCCCUGACACCGGAGCUCAACGAGAUCC





UGGACACCUUCCUCAACGACGAAUGCCUGCUCCACGCCAUGCACAUCAGCACCGGACUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGGGGAAAACGACCGGCAGCCACCAAAAAGGCCGGACAGGCGAAGAAGAAGAAGGGGAGCUACCCGUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF7 target sequence


SEQ ID NO.: 126



ACTGAACATCGGTGAGTTAGG






>ZF7 amino acid sequence


SEQ ID NO.: 127



LEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKCPECGKSFSQAGHLASHQ






RTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSQSSNLVRH





QRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTS





>ZF7 mRNA


SEQ ID NO.: 217



CUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGCAGCGACCAUCUGACCAAUCACCAACG






CACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUAGUACCAGUGGCAGUCUGGUUCGUCAUCAGC





GCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUUAGCCAAGCCGGUCAUCUGGCGAGCCAUCAA





CGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUUUAGCCGUAGCGAUAAACUGACCGAACACCA





ACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUUUCAGCACCAGCGGCAAUCUGACCGAGCAUC





AACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGUUUUAGUCAGAGCAGUAAUCUGGUGCGCCAU





CAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAGUUUUAGCACCCAUCUGGAUCUGAUCCGUCA





UCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGU





>ZF7.4 mRNA sequence


SEQ ID NO.: 2509



CUGGAACCCGGAGAAAAACCCUACAAGUGCCCAGAAUGCGGCAAGAGCUUCAGCCGCAGCGACCACCUGACCAACCACCAGAG






AACCCACACCGGAGAAAAGCCAUACAAAUGCCCAGAGUGCGGGAAAAGCUUCAGCACAAGCGGCAGCCUCGUCAGGCACCAGC





GGACACACACCGGCGAGAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUCAGCCAAGCCGGACACCUCGCCAGCCACCAG





AGGACCCACACAGGAGAGAAACCGUACAAAUGCCCGGAGUGCGGCAAGAGCUUCAGCCGGAGCGACAAGCUGACCGAACACCA





GCGAACCCACACGGGCGAAAAGCCGUACAAGUGCCCCGAGUGCGGAAAAAGCUUCAGCACGAGCGGAAACCUCACCGAGCACC





AGCGCACCCACACGGGAGAGAAGCCGUACAAGUGCCCCGAAUGCGGAAAGAGCUUCAGCCAGAGCAGCAACCUCGUGCGCCAC





CAACGGACGCACACAGGGGAAAAGCCCUACAAGUGCCCGGAAUGCGGCAAAAGCUUCAGCACCCACCUGGACCUGAUCCGGCA





CCAACGCACGCACACCGGGGAAAAACCGACCGGAAAAAAGACCAGC





>ZF8-VPR protein


SEQ ID NO.: 128



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTTHQRTHTGEKPYKCPECGKSFSTSGELVRHQRTHTGE






KPYKCPECGKSFSRRDELNVHQRTHTGEKPYKCPECGKSFSSPADLTRHQRTHTGEKPYKCPECGKSFSRSDELVRHQRTHTG





EKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSTTGALTEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF8-VPR mRNA


SEQ ID NO.: 129



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAACCGGGCGAAAAGCCGUACAAGUGCCCGGAAUGUGGCAAAAGUUUUAG





UCGCAGCGAUCAUCUGACCACCCAUCAGCGUACCCAUACCGGUGAGAAGCCAUACAAAUGCCCAGAAUGUGGUAAGAGCUUUA





GCACCAGCGGCGAGCUGGUUCGUCACCAGCGUACCCACACCGGCGAGAAGCCGUAUAAGUGUCCAGAAUGCGGUAAAAGCUUU





AGCCGCCGCGACGAGCUGAAUGUGCAUCAACGCACCCACACGGGCGAGAAGCCAUAUAAGUGCCCGGAGUGUGGUAAGAGUUU





CAGUAGCCCAGCGGAUCUGACCCGUCAUCAACGUACGCACACGGGCGAGAAACCAUACAAGUGUCCGGAGUGCGGCAAAAGUU





UUAGCCGCAGUGAUGAACUGGUGCGCCACCAGCGCACCCAUACCGGCGAAAAACCGUAUAAGUGCCCAGAGUGCGGUAAGAGC





UUCAGCCGCAGCGACAAACUGGUGCGUCACCAGCGCACGCAUACGGGUGAGAAACCGUACAAGUGCCCGGAGUGCGGCAAAAG





CUUCAGCACCACCGGCGCGCUGACCGAACAUCAACGUACCCAUACGGGUGAGAAACCAACGGGCAAAAAGACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF8 target sequence


SEQ ID NO.: 130



CTTGGGGTGACAATGGCTTGG






>ZF8 amino acid sequence


SEQ ID NO.: 131



LEPGEKPYKCPECGKSFSRSDHLTTHQRTHTGEKPYKCPECGKSFSTSGELVRHQRTHTGEKPYKCPECGKSFSRRDELNVHQ






RTHTGEKPYKCPECGKSFSSPADLTRHQRTHTGEKPYKCPECGKSFSRSDELVRHQRTHTGEKPYKCPECGKSFSRSDKLVRH





QRTHTGEKPYKCPECGKSFSTTGALTEHQRTHTGEKPTGKKTS





>ZF8 mRNA sequence


SEQ ID NO.: 218



CUGGAACCGGGCGAAAAGCCGUACAAGUGCCCGGAAUGUGGCAAAAGUUUUAGUCGCAGCGAUCAUCUGACCACCCAUCAGCG






UACCCAUACCGGUGAGAAGCCAUACAAAUGCCCAGAAUGUGGUAAGAGCUUUAGCACCAGCGGCGAGCUGGUUCGUCACCAGC





GUACCCACACCGGCGAGAAGCCGUAUAAGUGUCCAGAAUGCGGUAAAAGCUUUAGCCGCCGCGACGAGCUGAAUGUGCAUCAA





CGCACCCACACGGGCGAGAAGCCAUAUAAGUGCCCGGAGUGUGGUAAGAGUUUCAGUAGCCCAGCGGAUCUGACCCGUCAUCA





ACGUACGCACACGGGCGAGAAACCAUACAAGUGUCCGGAGUGCGGCAAAAGUUUUAGCCGCAGUGAUGAACUGGUGCGCCACC





AGCGCACCCAUACCGGCGAAAAACCGUAUAAGUGCCCAGAGUGCGGUAAGAGCUUCAGCCGCAGCGACAAACUGGUGCGUCAC





CAGCGCACGCAUACGGGUGAGAAACCGUACAAGUGCCCGGAGUGCGGCAAAAGCUUCAGCACCACCGGCGCGCUGACCGAACA





UCAACGUACCCAUACGGGUGAGAAACCAACGGGCAAAAAGACCAGC





>ZF9-VPR protein


SEQ ID NO.: 132



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSQSGNLTEHQRTHTGEKPYKCPECGKSFSHKNALQNHQRTHTGE






KPYKCPECGKSFSQNSTLTEHQRTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTSGNLVRHQRTHTG





EKPYKCPECGKSFSQSGHLTEHQRTHTPNPHRRTDPSHKPFQYKCPECGKSFSDKKDLTRHQRTHTGEKPTGKKTSASGSGGG





SGGDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKR





TYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVL





PQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVD





NSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSR





DSREGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTP





EASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFL





NDECLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF9-VPR mRNA


SEQ ID NO.: 133



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAAAAGCCAUACAAAUGUCCGGAGUGUGGCAAGAGUUUCAG





CCAAAGCGGCAACCUCACCGAGCACCAGCGCACGCACACCGGCGAGAAGCCAUAUAAAUGUCCAGAAUGCGGCAAGAGCUUCA





GCCAUAAGAAUGCGCUGCAGAACCAUCAGCGCACCCACACCGGUGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGUUUC





AGCCAGAAUAGCACCCUCACGGAGCAUCAACGCACGCAUACGGGUGAAAAGCCGUACAAAUGCCCAGAAUGUGGCAAGAGCUU





UAGCAGCAAGAAACAUCUGGCGGAGCAUCAGCGUACCCACACGGGCGAAAAGCCAUACAAAUGUCCGGAAUGCGGCAAAAGCU





UCAGCACGAGUGGCAAUCUGGUGCGCCAUCAACGUACGCACACGGGUGAGAAACCGUAUAAAUGCCCAGAGUGUGGUAAAAGC





UUCAGUCAGAGCGGCCAUCUGACCGAACACCAGCGCACCCAUACGCCAAACCCGCAUCGCCGCACCGAUCCGAGCCACAAGCC





GUUCCAGUACAAGUGUCCAGAGUGCGGUAAAAGUUUUAGCGACAAGAAGGAUCUGACCCGUCACCAACGUACCCAUACCGGUG





AAAAACCAACGGGCAAGAAAACCAGCGCUAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGAC





AUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCU





GGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACC





UGCCCGACACCGACGACCGGCACCGGAUCGAGGAGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCC





CCCUUCAGCGGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCC





CGCCCCCCAGCCCUACCCCUUCACCAGCAGCCUGAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCC





AGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCCCCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUG





GUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUGCCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAA





GCCCACCCAGGCCGGCGAGGGCACCCUGAGCGAGGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGG





GCAACAGCACCGACCCCGCCGUGUUCACCGACCUGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUC





CCCGUGGCCCCCCACACCACCGAGCCCAUGCUGAUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCC





CCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCCCGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCG





CCGACAUGGACUUCAGCGCCCUGCUGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAG





GCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGCCGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAG





CCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCUGGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCC





CCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCCCCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAG





ACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUGGCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAU





GGACCUGAGCCACCCCCCCCCCCGGGGCCACCUGGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGG





ACAGCCCCCUGACCCCCGAGCUGAACGAGAUCCUGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGC





ACCGGCCUGAGCAUCUUCGACACCAGCCUGUUCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAA





GAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUU





CUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF9 target sequence


SEQ ID NO.: 134



ACCTCCGAGATCCCCTAATTCAA






>ZF9 amino acid sequence


SEQ ID NO.: 135



LEPGEKPYKCPECGKSFSQSGNLTEHQRTHTGEKPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSQNSTLTEHQ






RTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTSGNLVRHQRTHTGEKPYKCPECGKSFSQSGHLTEH





QRTHTPNPHRRTDPSHKPFQYKCPECGKSFSDKKDLTRHQRTHTGEKPTGKKTS





>ZF9 mRNA sequence


SEQ ID NO.: 219



CUCGAACCGGGCGAAAAGCCAUACAAAUGUCCGGAGUGUGGCAAGAGUUUCAGCCAAAGCGGCAACCUCACCGAGCACCAGCG






CACGCACACCGGCGAGAAGCCAUAUAAAUGUCCAGAAUGCGGCAAGAGCUUCAGCCAUAAGAAUGCGCUGCAGAACCAUCAGC





GCACCCACACCGGUGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGUUUCAGCCAGAAUAGCACCCUCACGGAGCAUCAA





CGCACGCAUACGGGUGAAAAGCCGUACAAAUGCCCAGAAUGUGGCAAGAGCUUUAGCAGCAAGAAACAUCUGGCGGAGCAUCA





GCGUACCCACACGGGCGAAAAGCCAUACAAAUGUCCGGAAUGCGGCAAAAGCUUCAGCACGAGUGGCAAUCUGGUGCGCCAUC





AACGUACGCACACGGGUGAGAAACCGUAUAAAUGCCCAGAGUGUGGUAAAAGCUUCAGUCAGAGCGGCCAUCUGACCGAACAC





CAGCGCACCCAUACGCCAAACCCGCAUCGCCGCACCGAUCCGAGCCACAAGCCGUUCCAGUACAAGUGUCCAGAGUGCGGUAA





AAGUUUUAGCGACAAGAAGGAUCUGACCCGUCACCAACGUACCCAUACCGGUGAAAAACCAACGGGCAAGAAAACCAGC





>ZF10-VPR protein


SEQ ID NO.: 136



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSQNSTLTEHQRTHTGEKPYKCPECGKSFSERSHLREHQRTHTGE






KPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSDCRDLARHQRTHTG





EKPYKCPECGKSFSQSGDLRRHQRTHTGEKPYKCPECGKSFSTKNSLTEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF10-VPR mRNA


SEQ ID NO.: 137



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCCGGCGAGAAGCCCUACAAAUGUCCCGAGUGUGGCAAGUCCUUCUC





CCAGAAUAGCACACUGACAGAACACCAGAGGACACACACCGGCGAGAAACCUUAUAAGUGCCCCGAAUGCGGCAAAAGCUUUU





CCGAGAGGAGCCACCUGAGGGAACACCAGAGAACACACACCGGAGAAAAACCUUACAAAUGCCCCGAGUGCGGAAAGUCCUUC





AGCAGCAAGAAGCACCUGGCUGAACACCAGAGAACCCACACCGGCGAGAAGCCUUACAAGUGCCCCGAAUGUGGCAAAAGCUU





CUCUAGAAACGACACACUCACCGAGCACCAGAGAACCCACACCGGCGAAAAGCCUUAUAAGUGUCCCGAGUGUGGCAAGAGCU





UCAGCGAUUGUAGAGAUCUGGCCAGACACCAAAGGACCCACACCGGAGAAAAACCUUACAAGUGCCCCGAGUGUGGAAAGAGC





UUUAGCCAAAGCGGCGAUCUGAGGAGACACCAGAGAACACACACCGGCGAAAAACCCUAUAAGUGUCCCGAAUGCGGAAAAUC





CUUCAGCACCAAAAACUCUCUGACCGAGCACCAAAGAACCCACACCGGCGAAAAGCCUACCGGCAAAAAGACAAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF10 target sequence


SEQ ID NO.: 138



CCTGCAGCCCCGCCCAGCCTA






>ZF10 amino acid sequence


SEQ ID NO.: 139



LEPGEKPYKCPECGKSFSQNSTLTEHQRTHTGEKPYKCPECGKSFSERSHLREHQRTHTGEKPYKCPECGKSFSSKKHLAEHQ






RTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSDCRDLARHQRTHTGEKPYKCPECGKSFSQSGDLRRH





QRTHTGEKPYKCPECGKSFSTKNSLTEHQRTHTGEKPTGKKTS





>ZF10 mRNA sequence


SEQ ID NO.: 220



CUGGAGCCCGGCGAGAAGCCCUACAAAUGUCCCGAGUGUGGCAAGUCCUUCUCCCAGAAUAGCACACUGACAGAACACCAGAG






GACACACACCGGCGAGAAACCUUAUAAGUGCCCCGAAUGCGGCAAAAGCUUUUCCGAGAGGAGCCACCUGAGGGAACACCAGA





GAACACACACCGGAGAAAAACCUUACAAAUGCCCCGAGUGCGGAAAGUCCUUCAGCAGCAAGAAGCACCUGGCUGAACACCAG





AGAACCCACACCGGCGAGAAGCCUUACAAGUGCCCCGAAUGUGGCAAAAGCUUCUCUAGAAACGACACACUCACCGAGCACCA





GAGAACCCACACCGGCGAAAAGCCUUAUAAGUGUCCCGAGUGUGGCAAGAGCUUCAGCGAUUGUAGAGAUCUGGCCAGACACC





AAAGGACCCACACCGGAGAAAAACCUUACAAGUGCCCCGAGUGUGGAAAGAGCUUUAGCCAAAGCGGCGAUCUGAGGAGACAC





CAGAGAACACACACCGGCGAAAAACCCUAUAAGUGUCCCGAAUGCGGAAAAUCCUUCAGCACCAAAAACUCUCUGACCGAGCA





CCAAAGAACCCACACCGGCGAAAAGCCUACCGGCAAAAAGACAAGC





>ZF11-VPR protein


SEQ ID NO.: 140



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSERSHLREHQRTHTGEKPYKCPECGKSFSRADNLTEHQRTHTGE






KPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTKNSLTEHQRTHTG





EKPYKCPECGKSFSDKKDLTRHQRTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF11-VPR mRNA


SEQ ID NO.: 141



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCCGGCGAGAAACCCUACAAGUGUCCCGAGUGUGGCAAGAGCUUUUC





CGAGAGAAGCCACCUGAGGGAACACCAGAGAACCCACACCGGCGAGAAGCCUUACAAAUGCCCCGAAUGUGGAAAGAGCUUUU





CUAGAGCCGACAAUCUGACCGAACACCAAAGAACCCACACCGGCGAAAAACCCUAUAAGUGUCCCGAGUGUGGAAAAAGCUUC





UCUAGAAGCGACAAACUCACAGAGCACCAGAGGACACACACCGGCGAGAAGCCCUACAAAUGUCCCGAGUGCGGCAAAAGCUU





CAGCAGCAAGAAGCACCUGGCCGAGCACCAAAGAACACACACCGGCGAAAAACCUUAUAAAUGCCCCGAGUGCGGCAAGUCCU





UUUCCACCAAGAACUCUCUGACAGAACACCAAAGGACACACACCGGAGAAAAACCCUACAAAUGUCCCGAAUGUGGCAAAUCC





UUCAGCGAUAAGAAGGACCUCACCAGACACCAGAGGACACACACCGGCGAAAAACCUUAUAAAUGUCCCGAGUGCGGAAAGUC





CUUCUCCAGCAAAAAGCACCUCGCUGAGCACCAAAGGACCCACACCGGCGAGAAGCCCACCGGAAAAAAGACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF11 target sequence


SEQ ID NO.: 142



CCCACCCCTCCCCGGCAGAGC






>ZF11 amino acid sequence


SEQ ID NO.: 143



LEPGEKPYKCPECGKSFSERSHLREHQRTHTGEKPYKCPECGKSFSRADNLTEHQRTHTGEKPYKCPECGKSFSRSDKLTEHQ






RTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTKNSLTEHQRTHTGEKPYKCPECGKSFSDKKDLTRH





QRTHTGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPTGKKTS





>ZF11 mRNA sequence


SEQ ID NO.: 221



CUGGAGCCCGGCGAGAAACCCUACAAGUGUCCCGAGUGUGGCAAGAGCUUUUCCGAGAGAAGCCACCUGAGGGAACACCAGAG






AACCCACACCGGCGAGAAGCCUUACAAAUGCCCCGAAUGUGGAAAGAGCUUUUCUAGAGCCGACAAUCUGACCGAACACCAAA





GAACCCACACCGGCGAAAAACCCUAUAAGUGUCCCGAGUGUGGAAAAAGCUUCUCUAGAAGCGACAAACUCACAGAGCACCAG





AGGACACACACCGGCGAGAAGCCCUACAAAUGUCCCGAGUGCGGCAAAAGCUUCAGCAGCAAGAAGCACCUGGCCGAGCACCA





AAGAACACACACCGGCGAAAAACCUUAUAAAUGCCCCGAGUGCGGCAAGUCCUUUUCCACCAAGAACUCUCUGACAGAACACC





AAAGGACACACACCGGAGAAAAACCCUACAAAUGUCCCGAAUGUGGCAAAUCCUUCAGCGAUAAGAAGGACCUCACCAGACAC





CAGAGGACACACACCGGCGAAAAACCUUAUAAAUGUCCCGAGUGCGGAAAGUCCUUCUCCAGCAAAAAGCACCUCGCUGAGCA





CCAAAGGACCCACACCGGCGAGAAGCCCACCGGAAAAAAGACCAGC





>ZF12-VPR protein


SEQ ID NO.: 144



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGE






KPYKCPECGKSFSRKDNLKNHQRTHTGEKPYKCPECGKSFSDCRDLARHQRTHTGEKPYKCPECGKSFSREDNLHTHQRTHTG





EKPYKCPECGKSFSDPGHLVRHQRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF12-VPR mRNA


SEQ ID NO.: 145



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCCGGCGAAAAGCCCUACAAAUGUCCCGAAUGUGGCAAGAGCUUCAG





CAGCAAAAAGCACCUGGCUGAACACCAGAGGACCCACACCGGAGAGAAACCCUAUAAAUGUCCCGAGUGUGGAAAAAGCUUCA





GCACCCACCUCGACCUCAUUAGGCACCAAAGAACCCACACCGGCGAAAAACCCUAUAAGUGUCCCGAGUGUGGAAAAUCCUUU





UCUAGAAAGGACAAUCUCAAGAAUCACCAAAGAACACACACCGGCGAGAAACCUUACAAGUGUCCCGAGUGCGGAAAGUCCUU





CUCCGACUGUAGAGAUCUGGCUAGACACCAGAGAACCCACACCGGCGAGAAGCCCUAUAAGUGCCCCGAGUGCGGCAAGUCCU





UCUCUAGAGAGGACAAUCUGCACACACACCAGAGGACCCACACCGGCGAAAAACCUUACAAAUGCCCCGAGUGUGGCAAGAGC





UUUAGCGAUCCCGGACACCUGGUGAGACACCAAAGAACCCACACCGGCGAGAAGCCUUACAAGUGUCCCGAAUGUGGAAAAUC





CUUUAGCCAGCUGGCCCACCUGAGGGCCCACCAAAGGACACACACCGGCGAAAAACCCACCGGCAAAAAGACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF12 target sequence


SEQ ID NO.: 146



AGAGGCTAGGCCAAGACTCCC






>ZF12 amino acid sequence


SEQ ID NO.: 147



LEPGEKPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPYKCPECGKSFSRKDNLKNHQ






RTHTGEKPYKCPECGKSFSDCRDLARHQRTHTGEKPYKCPECGKSFSREDNLHTHQRTHTGEKPYKCPECGKSFSDPGHLVRH





QRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGEKPTGKKTS





>ZF12 mRNA sequence


SEQ ID NO.: 222



CUGGAGCCCGGCGAAAAGCCCUACAAAUGUCCCGAAUGUGGCAAGAGCUUCAGCAGCAAAAAGCACCUGGCUGAACACCAGAG






GACCCACACCGGAGAGAAACCCUAUAAAUGUCCCGAGUGUGGAAAAAGCUUCAGCACCCACCUCGACCUCAUUAGGCACCAAA





GAACCCACACCGGCGAAAAACCCUAUAAGUGUCCCGAGUGUGGAAAAUCCUUUUCUAGAAAGGACAAUCUCAAGAAUCACCAA





AGAACACACACCGGCGAGAAACCUUACAAGUGUCCCGAGUGCGGAAAGUCCUUCUCCGACUGUAGAGAUCUGGCUAGACACCA





GAGAACCCACACCGGCGAGAAGCCCUAUAAGUGCCCCGAGUGCGGCAAGUCCUUCUCUAGAGAGGACAAUCUGCACACACACC





AGAGGACCCACACCGGCGAAAAACCUUACAAAUGCCCCGAGUGUGGCAAGAGCUUUAGCGAUCCCGGACACCUGGUGAGACAC





CAAAGAACCCACACCGGCGAGAAGCCUUACAAGUGUCCCGAAUGUGGAAAAUCCUUUAGCCAGCUGGCCCACCUGAGGGCCCA





CCAAAGGACACACACCGGCGAAAAACCCACCGGCAAAAAGACCAGC





>ZF13-VPR protein


SEQ ID NO.: 148



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSQSGDLRRHQRTHTGE






KPYKCPECGKSFSTSGELVRHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTG





EKPYKCPECGKSFSRNDALTEHQRTHTGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF13-VPR mRNA


SEQ ID NO.: 149



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCCGGCGAAAAGCCCUAUAAGUGCCCCGAGUGCGGCAAGAGCUUCUC





UAGAAGCGACAAACUCGUGAGACACCAGAGAACACACACCGGAGAGAAACCUUACAAGUGCCCCGAGUGUGGCAAGUCCUUCU





CCCAAUCCGGCGAUCUGAGGAGACACCAGAGAACCCACACCGGCGAAAAACCCUACAAAUGCCCCGAGUGCGGAAAGUCCUUU





UCCACCUCCGGCGAGCUCGUGAGACACCAAAGGACCCACACCGGCGAGAAGCCUUACAAGUGCCCCGAGUGCGGCAAAUCCUU





CUCCAGAUCCGACAAGCUCGUGAGGCACCAGAGGACACACACCGGAGAGAAACCUUAUAAGUGUCCCGAAUGUGGAAAGUCCU





UCAGCGACCCCGGACACCUGGUGAGACACCAGAGGACCCACACCGGCGAAAAGCCUUAUAAAUGUCCCGAGUGCGGAAAAAGC





UUUUCUAGAAACGAUGCUCUGACAGAGCACCAAAGAACCCACACCGGCGAAAAACCCUACAAGUGUCCCGAGUGCGGAAAGAG





CUUCAGCAGAAGCGACCACCUGACCAACCACCAGAGAACACACACCGGAGAAAAACCCACCGGCAAAAAGACCUCCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF13 target sequence


SEQ ID NO.: 150



AGGCTGGGCGGGGCTGCAGGG






>ZF13 amino acid sequence


SEQ ID NO.: 151



LEPGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSQSGDLRRHQRTHTGEKPYKCPECGKSFSTSGELVRHQ






RTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPYKCPECGKSFSRNDALTEH





QRTHTGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPTGKKTS





>ZF13 mRNA sequence


SEQ ID NO.: 223



CUGGAGCCCGGCGAAAAGCCCUAUAAGUGCCCCGAGUGCGGCAAGAGCUUCUCUAGAAGCGACAAACUCGUGAGACACCAGAG






AACACACACCGGAGAGAAACCUUACAAGUGCCCCGAGUGUGGCAAGUCCUUCUCCCAAUCCGGCGAUCUGAGGAGACACCAGA





GAACCCACACCGGCGAAAAACCCUACAAAUGCCCCGAGUGCGGAAAGUCCUUUUCCACCUCCGGCGAGCUCGUGAGACACCAA





AGGACCCACACCGGCGAGAAGCCUUACAAGUGCCCCGAGUGCGGCAAAUCCUUCUCCAGAUCCGACAAGCUCGUGAGGCACCA





GAGGACACACACCGGAGAGAAACCUUAUAAGUGUCCCGAAUGUGGAAAGUCCUUCAGCGACCCCGGACACCUGGUGAGACACC





AGAGGACCCACACCGGCGAAAAGCCUUAUAAAUGUCCCGAGUGCGGAAAAAGCUUUUCUAGAAACGAUGCUCUGACAGAGCAC





CAAAGAACCCACACCGGCGAAAAACCCUACAAGUGUCCCGAGUGCGGAAAGAGCUUCAGCAGAAGCGACCACCUGACCAACCA





CCAGAGAACACACACCGGAGAAAAACCCACCGGCAAAAAGACCUCC





>ZF14-VPR protein


SEQ ID NO.: 152



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGHLVRHQRTHTGEKPYKCPECGKSFSTTGNLTVHQRTHTGE






KPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSRSDELVRHQRTHTG





EKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF14-VPR mRNA


SEQ ID NO.: 153



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCCGGCGAAAAGCCCUACAAAUGUCCCGAAUGCGGCAAAUCCUUCUC





CACCUCCGGCCACCUCGUGAGACACCAGAGGACACACACCGGCGAGAAGCCUUAUAAGUGCCCCGAAUGCGGCAAAAGCUUCU





CCACCACCGGCAAUCUGACCGUCCACCAGAGAACACACACCGGCGAAAAACCUUAUAAGUGUCCCGAGUGUGGCAAAUCCUUU





UCCACCAGCGGAUCUCUGGUGAGACACCAAAGGACACACACCGGCGAAAAACCCUACAAAUGCCCCGAGUGUGGAAAAUCCUU





CUCUAGAAGCGACAAGCUGGUGAGACACCAGAGGACCCACACCGGCGAGAAACCCUACAAGUGCCCCGAAUGUGGCAAGAGCU





UCUCUAGAUCCGACGAGCUCGUGAGACACCAAAGAACCCACACCGGCGAAAAGCCUUACAAAUGUCCCGAGUGCGGAAAGAGC





UUUAGCAGAAGCGAUAAGCUGGUCAGACACCAAAGAACACACACCGGAGAAAAACCCUAUAAGUGCCCCGAGUGUGGCAAGUC





CUUUAGCCAGAGAGCCCACCUGGAGAGACACCAAAGAACCCACACCGGCGAAAAACCCACCGGAAAAAAGACAAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF14 target sequence


SEQ ID NO.: 154



GGAGGGGTGGGGGTTAATGGT






>ZF14 amino acid sequence


SEQ ID NO.: 155



LEPGEKPYKCPECGKSFSTSGHLVRHQRTHTGEKPYKCPECGKSFSTTGNLTVHQRTHTGEKPYKCPECGKSFSTSGSLVRHQ






RTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSRSDELVRHQRTHTGEKPYKCPECGKSFSRSDKLVRH





QRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPTGKKTS





>ZF14 mRNA sequence


SEQ ID NO.: 224



CUGGAGCCCGGCGAAAAGCCCUACAAAUGUCCCGAAUGCGGCAAAUCCUUCUCCACCUCCGGCCACCUCGUGAGACACCAGAG






GACACACACCGGCGAGAAGCCUUAUAAGUGCCCCGAAUGCGGCAAAAGCUUCUCCACCACCGGCAAUCUGACCGUCCACCAGA





GAACACACACCGGCGAAAAACCUUAUAAGUGUCCCGAGUGUGGCAAAUCCUUUUCCACCAGCGGAUCUCUGGUGAGACACCAA





AGGACACACACCGGCGAAAAACCCUACAAAUGCCCCGAGUGUGGAAAAUCCUUCUCUAGAAGCGACAAGCUGGUGAGACACCA





GAGGACCCACACCGGCGAGAAACCCUACAAGUGCCCCGAAUGUGGCAAGAGCUUCUCUAGAUCCGACGAGCUCGUGAGACACC





AAAGAACCCACACCGGCGAAAAGCCUUACAAAUGUCCCGAGUGCGGAAAGAGCUUUAGCAGAAGCGAUAAGCUGGUCAGACAC





CAAAGAACACACACCGGAGAAAAACCCUAUAAGUGCCCCGAGUGUGGCAAGUCCUUUAGCCAGAGAGCCCACCUGGAGAGACA





CCAAAGAACCCACACCGGCGAAAAACCCACCGGAAAAAAGACAAGC





>ZF15-VPR protein


SEQ ID NO.: 156



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSRNDALTEHQRTHTGE






KPYKCPECGKSFSTSGELVRHQRTHTGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKSFSRSDELVRHQRTHTG





EKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF15-VPR mRNA


SEQ ID NO.: 157



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAGCCCGGCGAGAAGCCCUACAAAUGUCCCGAAUGUGGCAAGAGCUUCUC





UAGAAACGACACACUGACCGAACACCAGAGAACACACACCGGCGAAAAACCUUAUAAAUGUCCCGAGUGUGGAAAAUCCUUCU





CUAGAAAUGACGCUCUCACCGAGCACCAAAGAACACACACCGGCGAAAAGCCUUACAAAUGCCCCGAAUGUGGAAAGUCCUUC





UCCACCUCCGGAGAGCUGGUGAGACACCAGAGAACCCACACCGGCGAAAAACCCUACAAGUGCCCCGAGUGCGGAAAAAGCUU





CUCUAGAAGCGAUAAUCUGGUGAGACACCAAAGGACACACACCGGCGAGAAGCCCUAUAAGUGUCCCGAAUGCGGCAAGUCCU





UUUCCAGAAGCGACGAACUGGUGAGACACCAGAGAACCCACACCGGAGAGAAGCCUUAUAAGUGUCCCGAGUGCGGAAAGAGC





UUUUCUAGAUCCGACAAGCUCGUGAGACACCAAAGGACCCACACCGGCGAGAAACCCUAUAAAUGUCCCGAGUGUGGCAAAUC





CUUUUCCCAGAGCAGCAACCUCGUGAGGCACCAGAGGACCCACACCGGCGAGAAACCCACCGGCAAAAAGACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF15 target sequence


SEQ ID NO.: 158



GAAGGGGTGGAGGCTCTGCCG






>ZF15 amino acid sequence


SEQ ID NO.: 159



LEPGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSRNDALTEHQRTHTGEKPYKCPECGKSFSTSGELVRHQ






RTHTGEKPYKCPECGKSFSRSDNLVRHQRTHTGEKPYKCPECGKSFSRSDELVRHQRTHTGEKPYKCPECGKSFSRSDKLVRH





QRTHTGEKPYKCPECGKSFSQSSNLVRHQRTHTGEKPTGKKTS





>ZF15 mRNA sequence


SEQ ID NO.: 225



CUGGAGCCCGGCGAGAAGCCCUACAAAUGUCCCGAAUGUGGCAAGAGCUUCUCUAGAAACGACACACUGACCGAACACCAGAG






AACACACACCGGCGAAAAACCUUAUAAAUGUCCCGAGUGUGGAAAAUCCUUCUCUAGAAAUGACGCUCUCACCGAGCACCAAA





GAACACACACCGGCGAAAAGCCUUACAAAUGCCCCGAAUGUGGAAAGUCCUUCUCCACCUCCGGAGAGCUGGUGAGACACCAG





AGAACCCACACCGGCGAAAAACCCUACAAGUGCCCCGAGUGCGGAAAAAGCUUCUCUAGAAGCGAUAAUCUGGUGAGACACCA





AAGGACACACACCGGCGAGAAGCCCUAUAAGUGUCCCGAAUGCGGCAAGUCCUUUUCCAGAAGCGACGAACUGGUGAGACACC





AGAGAACCCACACCGGAGAGAAGCCUUAUAAGUGUCCCGAGUGCGGAAAGAGCUUUUCUAGAUCCGACAAGCUCGUGAGACAC





CAAAGGACCCACACCGGCGAGAAACCCUAUAAAUGUCCCGAGUGUGGCAAAUCCUUUUCCCAGAGCAGCAACCUCGUGAGGCA





CCAGAGGACCCACACCGGCGAGAAACCCACCGGCAAAAAGACCAGC





>ZF5.1-VPR mRNA (ZF5-VPR ATUM Opt_1)


SEQ ID NO.: 160



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCGAAGAAAAAGCGCAAAGUCGGAAUCCAU






GGUGUCCCUGCGGCUGGAAGUUCCGGCUCCUUGGAACCGGGAGAGAAGCCUUAUAAGUGUCCGGAGUGUGGGAAGUCGUUCUC





CACCUCGGGCAACCUCACCGAACAUCAGCGCACACAUACGGGGGAGAAACCUUACAAAUGCCCGGAAUGUGGAAAGAGCUUCU





CCGAUUCGGGAAAUCUCAGAGUGCACCAACGCACCCACACAGGAGAAAAACCGUAUAAGUGCCCCGAAUGCGGGAAAUCGUUC





UCCCACAAGAAUGCGCUGCAGAACCACCAGAGGACACAUACUGGGGAGAAGCCCUACAAGUGUCCUGAAUGCGGAAAGUCCUU





CUCGCGCAACGAUACUUUGACCGAGCACCAGCGCACUCACACCGGCGAAAAGCCGUACAAGUGCCCAGAGUGCGGUAAAAGCU





UCUCGCAACGGGCCCAUCUGGAACGGCACCAGCGGACUCACACUGGAGAAAAGCCCUACAAGUGUCCCGAGUGCGGGAAGUCC





UUUUCCCGGUCCGAUAAGCUCGUGCGCCACCAGAGAACCCAUACUGGAGAGAAACCGUACAAAUGUCCAGAAUGCGGCAAAUC





CUUCUCGGACCCGGGACACCUCGUGCGGCAUCAACGGACCCAUACCGGGGAAAAGCCCACCGGAAAGAAAACUAGCGCGUCAG





GCUCUGGUGGAGGAUCGGGGGGAGAUGCUCUGGACGACUUUGACCUUGACAUGCUUGGCUCCGACGCCCUUGACGACUUCGAC





CUCGAUAUGCUGGGAUCGGACGCCCUGGAUGACUUCGAUCUGGACAUGUUGGGCUCGGACGCGCUAGACGAUUUUGACCUGGA





UAUGCUGUCCGGAGGUCCCAAGAAGAAGCGGAAGGUCGGCAGCCAGUAUCUGCCGGAUACUGAUGACCGGCACAGAAUCGAGG





AGAAGCGAAAGCGGACCUACGAAACUUUCAAGAGCAUUAUGAAGAAGUCCCCGUUCUCGGGUCCAACCGACCCCAGACCUCCU





CCGCGGAGAAUUGCCGUGCCAAGCCGCUCAAGCGCCAGCGUGCCCAAGCCAGCACCACAGCCCUACCCGUUCACCUCCUCCCU





UUCGACCAUCAACUACGACGAAUUCCCAACCAUGGUGUUCCCUAGCGGACAAAUCAGCCAGGCUUCCGCUCUGGCACCAGCCC





CACCUCAAGUGCUCCCGCAAGCGCCUGCUCCAGCACCGGCUCCUGCCAUGGUUUCAGCGCUGGCCCAAGCACCCGCUCCUGUG





CCUGUGCUGGCCCCUGGACCACCACAAGCAGUAGCCCCGCCUGCACCUAAGCCAACUCAGGCCGGCGAAGGAACCCUGAGCGA





AGCGUUGCUGCAGCUUCAGUUCGACGACGAGGACCUGGGUGCCCUGUUGGGCAACUCAACUGACCCUGCCGUGUUCACCGACC





UGGCAUCCGUCGAUAACUCCGAGUUCCAGCAGUUGCUGAACCAGGGAAUCCCAGUCGCCCCCCAUACCACCGAACCGAUGCUC





AUGGAGUACCCCGAAGCCAUCACCAGACUGGUCACCGGCGCACAAAGGCCCCCUGAUCCUGCUCCCGCACCUCUCGGUGCCCC





UGGACUGCCAAACGGCCUUCUGUCCGGCGACGAGGACUUCUCGUCCAUCGCCGAUAUGGAUUUCUCCGCCCUGCUCGGAUCCG





GCAGCGGAUCACGCGACUCGCGCGAAGGGAUGUUCCUGCCGAAGCCUGAGGCUGGUUCCGCCAUUAGCGACGUGUUCGAGGGG





CGCGAAGUCUGCCAACCCAAGAGACUGCGCCCGUUUCAUCCCCCGGGAAGCCCUUGGGCCAACAGACCUCUGCCAGCCUCCCU





GGCACCCACUCCGACUGGGCCUGUGCACGAACCCGUGGGCUCCCUGACUCCGGCACCAGUGCCACAGCCCCUGGAUCCAGCCC





CUGCUGUGACCCCGGAGGCCUCACACCUUCUGGAAGAUCCGGACGAGGAAACGUCCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCGGACACUGUGAUCCCUCAGAAAGAAGAGGCGGCCAUUUGCGGCCAGAUGGACCUCUCCCAUCCGCCUCCGAGAGGACACCU





GGAUGAACUCACGACCACCCUCGAGUCCAUGACCGAGGACCUGAACCUGGACUCCCCCCUGACACCCGAACUCAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUCCACGCCAUGCACAUCUCAACCGGGCUGUCGAUCUUCGACACUAGCUUGUUC





UCUGGAGGAAAGAGGCCGGCCGCUACUAAGAAGGCCGGACAAGCGAAGAAGAAGAAGGGAUCGUACCCUUACGACGUGCCCGA





CUACGCAUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.1 mRNA sequence


SEQ ID NO.: 226



UUGGAACCGGGAGAGAAGCCUUAUAAGUGUCCGGAGUGUGGGAAGUCGUUCUCCACCUCGGGCAACCUCACCGAACAUCAGCG






CACACAUACGGGGGAGAAACCUUACAAAUGCCCGGAAUGUGGAAAGAGCUUCUCCGAUUCGGGAAAUCUCAGAGUGCACCAAC





GCACCCACACAGGAGAAAAACCGUAUAAGUGCCCCGAAUGCGGGAAAUCGUUCUCCCACAAGAAUGCGCUGCAGAACCACCAG





AGGACACAUACUGGGGAGAAGCCCUACAAGUGUCCUGAAUGCGGAAAGUCCUUCUCGCGCAACGAUACUUUGACCGAGCACCA





GCGCACUCACACCGGCGAAAAGCCGUACAAGUGCCCAGAGUGCGGUAAAAGCUUCUCGCAACGGGCCCAUCUGGAACGGCACC





AGCGGACUCACACUGGAGAAAAGCCCUACAAGUGUCCCGAGUGCGGGAAGUCCUUUUCCCGGUCCGAUAAGCUCGUGCGCCAC





CAGAGAACCCAUACUGGAGAGAAACCGUACAAAUGUCCAGAAUGCGGCAAAUCCUUCUCGGACCCGGGACACCUCGUGCGGCA





UCAACGGACCCAUACCGGGGAAAAGCCCACCGGAAAGAAAACUAGC





>ZF5.2-VPR mRNA (ZF5-VPR ATUM Opt_2)


SEQ ID NO.: 161



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCGAAGAAGAAGCGGAAGGUCGGCAUCCAC






GGAGUGCCGGCAGCAGGGUCAUCAGGCUCCCUCGAACCCGGGGAAAAGCCGUACAAGUGUCCGGAGUGUGGGAAGUCAUUCUC





CACUUCCGGGAAUCUGACCGAGCAUCAACGCACCCACACUGGCGAGAAGCCCUACAAAUGCCCGGAGUGCGGAAAAUCGUUCU





CGGACUCCGGGAACCUUCGGGUCCACCAAAGGACUCAUACCGGGGAGAAACCGUACAAAUGUCCCGAAUGCGGGAAGUCGUUC





AGCCAUAAGAACGCGCUGCAGAACCAUCAGAGGACCCAUACUGGAGAAAAGCCCUAUAAGUGUCCGGAAUGCGGAAAGUCGUU





CUCACGCAACGACACCCUCACCGAACACCAGCGCACUCACACCGGAGAGAAGCCUUACAAGUGCCCGGAAUGUGGAAAGAGCU





UCAGCCAGCGGGCACAUCUGGAAAGACACCAGCGAACCCACACCGGGGAAAAACCGUAUAAGUGCCCCGAGUGUGGAAAGUCC





UUUUCACGGUCCGAUAAGCUCGUGCGCCACCAGAGAACUCACACUGGGGAGAAGCCGUACAAGUGUCCCGAGUGCGGCAAGAG





CUUCUCAGAUCCGGGACACCUUGUGCGACAUCAACGGACCCAUACCGGAGAAAAACCGACCGGGAAAAAGACCUCAGCAUCAG





GCUCAGGAGGCGGAUCAGGAGGCGACGCGCUCGAUGACUUCGAUCUGGACAUGUUGGGGUCCGACGCGCUUGACGACUUCGAC





CUUGAUAUGCUCGGAUCCGACGCCCUCGACGAUUUUGAUCUCGACAUGCUUGGGUCAGACGCCCUGGACGAUUUCGACCUGGA





CAUGCUGUCCGGUGGACCGAAAAAGAAGAGGAAGGUCGGGUCCCAGUACCUCCCGGACACCGAUGACCGACACCGGAUUGAAG





AGAAGCGCAAGAGAACCUACGAAACCUUCAAGUCGAUUAUGAAGAAGUCGCCGUUCUCGGGACCGACUGAUCCCAGACCGCCG





CCCAGAAGGAUUGCCGUGCCGUCGAGGUCAAGCGCCUCAGUGCCGAAACCCGCUCCGCAACCGUACCCCUUCACCUCAUCACU





UUCCACCAUCAACUACGAUGAGUUCCCCACCAUGGUGUUCCCGUCCGGCCAGAUCUCACAGGCCUCAGCCCUUGCACCGGCAC





CGCCCCAAGUCCUUCCGCAAGCACCCGCACCCGCUCCCGCUCCGGCAAUGGUGUCCGCGCUCGCACAAGCACCGGCUCCGGUG





CCGGUCUUGGCUCCGGGACCGCCGCAAGCAGUGGCACCACCCGCUCCGAAACCGACUCAGGCUGGGGAGGGAACCCUGUCCGA





AGCCCUGCUGCAACUUCAAUUCGACGAUGAAGAUCUGGGCGCACUGUUGGGAAACUCCACUGAUCCGGCAGUGUUCACCGAUC





UGGCCUCGGUGGACAACUCCGAGUUCCAGCAGCUGCUCAACCAAGGGAUUCCGGUCGCCCCGCAUACUACCGAGCCCAUGCUG





AUGGAAUACCCGGAAGCAAUCACCCGGCUGGUCACUGGUGCACAAAGACCCCCCGAUCCUGCUCCGGCACCGUUGGGAGCACC





GGGGUUGCCCAAUGGGCUGCUUUCGGGGGACGAGGAUUUCUCGUCAAUUGCCGACAUGGACUUCUCGGCCCUGUUGGGAUCCG





GAAGCGGAAGCAGGGACUCACGAGAGGGAAUGUUCCUACCGAAGCCCGAAGCGGGAUCAGCAAUCUCAGACGUGUUUGAAGGC





CGCGAAGUCUGCCAGCCGAAGCGCCUUCGCCCGUUCCAUCCGCCGGGAUCACCCUGGGCCAACAGACCCCUGCCGGCAUCACU





GGCCCCGACUCCGACUGGUCCGGUGCACGAACCGGUCGGGAGCCUGACUCCGGCACCCGUGCCCCAACCGUUGGAUCCGGCAC





CGGCAGUGACUCCGGAAGCUUCCCACCUCCUGGAGGAUCCGGACGAAGAGACUUCGCAGGCAGUGAAGGCCCUGCGCGAAAUG





GCGGACACCGUGAUUCCCCAGAAGGAAGAGGCAGCGAUCUGCGGGCAGAUGGACCUGUCACAUCCGCCCCCGAGAGGACACCU





GGACGAGCUGACCACUACCCUGGAAUCGAUGACUGAAGAUCUGAACCUGGACUCACCGCUGACUCCCGAGCUGAACGAAAUCC





UGGACACCUUCCUGAACGACGAGUGCCUUCUCCACGCCAUGCAUAUCUCCACCGGGCUGAGCAUCUUCGACACCUCGCUGUUC





UCGGGAGGAAAACGCCCGGCCGCAACUAAGAAGGCCGGACAGGCCAAGAAGAAGAAGGGGUCAUACCCGUACGACGUGCCCGA





CUAUGCGUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.2 mRNA sequence


SEQ ID NO.: 227



CUCGAACCCGGGGAAAAGCCGUACAAGUGUCCGGAGUGUGGGAAGUCAUUCUCCACUUCCGGGAAUCUGACCGAGCAUCAACG






CACCCACACUGGCGAGAAGCCCUACAAAUGCCCGGAGUGCGGAAAAUCGUUCUCGGACUCCGGGAACCUUCGGGUCCACCAAA





GGACUCAUACCGGGGAGAAACCGUACAAAUGUCCCGAAUGCGGGAAGUCGUUCAGCCAUAAGAACGCGCUGCAGAACCAUCAG





AGGACCCAUACUGGAGAAAAGCCCUAUAAGUGUCCGGAAUGCGGAAAGUCGUUCUCACGCAACGACACCCUCACCGAACACCA





GCGCACUCACACCGGAGAGAAGCCUUACAAGUGCCCGGAAUGUGGAAAGAGCUUCAGCCAGCGGGCACAUCUGGAAAGACACC





AGCGAACCCACACCGGGGAAAAACCGUAUAAGUGCCCCGAGUGUGGAAAGUCCUUUUCACGGUCCGAUAAGCUCGUGCGCCAC





CAGAGAACUCACACUGGGGAGAAGCCGUACAAGUGUCCCGAGUGCGGCAAGAGCUUCUCAGAUCCGGGACACCUUGUGCGACA





UCAACGGACCCAUACCGGAGAAAAACCGACCGGGAAAAAGACCUCA





>ZF5.3-VPR mRNA (ZF5-VPR ATUM Opt_3)


SEQ ID NO.: 162



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGCAAGGUCGGGAUCCAC






GGAGUCCCGGCAGCAGGAUCCUCAGGCUCACUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUC





GACCUCCGGGAACCUGACCGAGCACCAGCGCACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGAAAUCGUUCU





CAGACUCGGGAAACCUCAGGGUGCACCAGCGGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCAUUC





UCCCACAAGAACGCGCUGCAGAACCACCAAAGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGGAAAGUCCUU





CUCCCGCAACGACACCCUCACCGAACACCAACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCU





UCAGCCAGAGGGCCCACCUGGAAAGACACCAGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCC





UUCAGCCGGUCAGACAAGCUGGUCCGCCACCAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUC





GUUCAGCGACCCCGGACACCUGGUCCGGCACCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCAGCGAGCG





GAUCCGGAGGAGGAUCAGGGGGGGACGCACUGGACGACUUCGACCUGGACAUGCUGGGAUCCGACGCACUGGACGACUUCGAC





CUAGACAUGCUCGGAUCGGACGCACUCGACGACUUCGACCUCGACAUGCUAGGAUCAGACGCACUAGACGACUUCGACCUCGA





CAUGCUGUCGGGAGGACCGAAGAAAAAGCGGAAGGUCGGAUCACAGUACCUCCCGGACACCGACGACAGGCACAGAAUCGAAG





AAAAACGCAAGCGCACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGUCGCCGUUCUCAGGACCGACCGACCCCAGACCGCCA





CCGAGGAGAAUAGCCGUCCCGAGCCGAUCCUCCGCAUCCGUGCCGAAACCGGCACCGCAACCCUACCCGUUCACCUCGUCCCU





GUCGACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCUCCGGGCAGAUCUCACAGGCCUCGGCACUGGCACCCGCAC





CACCGCAAGUGCUGCCCCAAGCACCGGCACCCGCACCGGCGCCCGCAAUGGUGUCAGCGCUGGCACAGGCACCAGCACCGGUG





CCAGUCCUCGCACCGGGACCGCCGCAAGCAGUGGCACCGCCGGCACCGAAACCGACCCAGGCCGGAGAAGGGACCCUGUCCGA





GGCGCUGCUGCAACUCCAGUUCGACGACGAGGACCUGGGAGCACUCCUGGGAAACUCCACCGACCCGGCAGUGUUCACCGACC





UCGCAUCGGUGGACAACUCCGAGUUCCAACAGCUCCUGAACCAGGGGAUACCGGUGGCACCGCACACCACCGAACCGAUGCUG





AUGGAAUACCCGGAAGCCAUCACCCGGCUCGUGACCGGAGCGCAAAGACCGCCCGACCCCGCGCCCGCACCGCUGGGAGCACC





GGGACUACCGAACGGGCUGCUCUCAGGGGACGAGGACUUCUCCAGCAUCGCAGACAUGGACUUCUCCGCCCUGCUGGGAUCAG





GAUCCGGAUCACGCGACUCCCGGGAAGGAAUGUUCCUGCCGAAGCCGGAAGCAGGCAGCGCAAUCUCCGACGUGUUCGAAGGC





CGCGAGGUCUGCCAGCCCAAGCGCCUGCGACCGUUCCACCCGCCGGGAUCACCGUGGGCAAACCGCCCGCUACCGGCAUCACU





GGCACCGACACCCACCGGACCGGUGCACGAACCGGUCGGGUCACUGACCCCCGCACCGGUCCCGCAACCGCUAGACCCGGCAC





CGGCAGUGACCCCGGAAGCAUCGCACCUCCUGGAGGACCCGGACGAGGAAACCUCACAGGCAGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUACCGCAGAAGGAGGAGGCCGCCAUCUGCGGACAAAUGGACCUGUCACACCCGCCCCCGAGAGGACACCU





GGACGAACUCACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACUCACCGCUGACCCCGGAGCUGAACGAAAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCAAUGCACAUCAGCACCGGGCUGUCGAUCUUCGACACCAGCCUGUUC





UCCGGAGGGAAAAGACCCGCCGCCACCAAGAAAGCGGGCCAAGCAAAGAAAAAGAAGGGAUCGUACCCCUACGACGUGCCGGA





CUACGCAUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.3 mRNA


SEQ ID NO.: 228



CUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCG






CACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGAAAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGC





GGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAA





AGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGGAAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCA





ACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACC





AGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCAC





CAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCA





>ZF5.4-VPR mRNA (ZF5-VPR ATUM Opt_4)


SEQ ID NO.: 163



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCGAAGAAAAAGAGGAAGGUCGGGAUCCAC






GGAGUCCCAGCCGCAGGAAGCAGCGGAAGCCUGGAACCCGGAGAAAAACCCUACAAGUGCCCAGAAUGCGGCAAGAGCUUCAG





CACCAGCGGAAACCUGACCGAACACCAGCGGACGCACACAGGGGAGAAACCGUACAAAUGCCCGGAGUGCGGAAAGAGCUUCA





GCGACAGCGGCAACCUCCGCGUGCACCAGAGAACCCACACGGGAGAGAAGCCGUACAAGUGCCCGGAAUGCGGAAAAAGCUUC





AGCCACAAGAACGCGCUGCAGAACCACCAGAGGACACACACGGGCGAGAAGCCCUACAAAUGCCCCGAAUGCGGGAAAAGCUU





CAGCCGGAACGACACCCUCACCGAGCACCAGCGAACCCACACCGGAGAAAAGCCGUACAAGUGCCCGGAAUGCGGAAAAAGCU





UCAGCCAACGGGCCCACCUGGAACGCCACCAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCAGAGUGCGGCAAAAGC





UUCAGCCGGAGCGACAAGCUGGUCCGGCACCAGCGCACACACACCGGCGAAAAGCCAUACAAGUGCCCCGAGUGCGGAAAGAG





CUUCAGCGACCCAGGACACCUCGUGCGGCACCAACGCACGCACACCGGGGAAAAACCGACCGGAAAAAAGACCAGCGCGAGCG





GAAGCGGAGGAGGAAGCGGAGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGGAGCGACGCACUGGACGACUUCGAC





CUGGACAUGCUGGGAAGCGACGCGCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUCGACGACUUCGACCUCGA





CAUGCUGAGCGGCGGACCCAAGAAGAAGAGAAAGGUCGGAAGCCAGUACCUCCCGGACACCGACGACAGGCACCGCAUCGAAG





AGAAGCGGAAAAGAACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGAGCCCGUUCAGCGGACCAACCGACCCCAGACCACCA





CCGAGAAGAAUCGCGGUCCCAAGCAGGAGCAGCGCCAGCGUCCCGAAGCCAGCCCCACAGCCGUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCGAGCGGCCAGAUAAGCCAGGCCAGCGCACUGGCACCAGCCC





CACCGCAAGUGCUGCCCCAAGCACCCGCACCAGCACCCGCCCCCGCGAUGGUCAGCGCCCUGGCACAAGCCCCAGCCCCAGUC





CCGGUGCUCGCACCAGGACCACCCCAAGCAGUCGCACCGCCAGCCCCAAAGCCGACCCAAGCCGGAGAAGGCACCCUCAGCGA





GGCGCUCCUGCAACUCCAAUUCGACGACGAGGACCUGGGAGCCCUGCUGGGCAACAGCACCGACCCGGCAGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAAUUCCAGCAGCUCCUGAACCAAGGAAUCCCAGUCGCGCCACACACCACCGAGCCGAUGCUG





AUGGAAUACCCAGAAGCGAUCACGAGACUGGUCACCGGGGCCCAAAGACCGCCGGACCCAGCGCCAGCACCACUGGGAGCCCC





AGGACUGCCCAACGGACUGCUCAGCGGCGACGAGGACUUCAGCAGCAUCGCGGACAUGGACUUCAGCGCACUCCUCGGAAGCG





GAAGCGGCAGCAGAGACAGCCGGGAAGGAAUGUUCCUCCCCAAGCCAGAAGCCGGAAGCGCAAUCAGCGACGUGUUCGAAGGA





CGGGAAGUCUGCCAGCCGAAGCGCCUCAGACCGUUCCACCCACCGGGAAGCCCAUGGGCCAACAGACCGCUGCCAGCCAGCCU





GGCACCGACCCCAACCGGACCAGUCCACGAACCAGUCGGCAGCCUGACACCAGCACCAGUGCCCCAGCCACUGGACCCAGCAC





CGGCAGUGACCCCAGAAGCCAGCCACCUCCUGGAGGACCCCGACGAAGAAACCAGCCAGGCCGUGAAGGCCCUGAGGGAGAUG





GCCGACACGGUGAUCCCACAGAAGGAAGAAGCAGCGAUCUGCGGCCAAAUGGACCUCAGCCACCCACCGCCAAGAGGCCACCU





GGACGAGCUCACCACCACCCUGGAAAGCAUGACCGAGGACCUCAACCUCGACAGCCCCCUGACACCGGAGCUCAACGAGAUCC





UGGACACCUUCCUCAACGACGAAUGCCUGCUCCACGCCAUGCACAUCAGCACCGGACUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGGGGAAAACGACCGGCAGCCACCAAAAAGGCCGGACAGGCGAAGAAGAAGAAGGGGAGCUACCCGUACGACGUGCCCGA





CUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.4 mRNA sequence


SEQ ID NO.: 229



CUGGAACCCGGAGAAAAACCCUACAAGUGCCCAGAAUGCGGCAAGAGCUUCAGCACCAGCGGAAACCUGACCGAACACCAGCG






GACGCACACAGGGGAGAAACCGUACAAAUGCCCGGAGUGCGGAAAGAGCUUCAGCGACAGCGGCAACCUCCGCGUGCACCAGA





GAACCCACACGGGAGAGAAGCCGUACAAGUGCCCGGAAUGCGGAAAAAGCUUCAGCCACAAGAACGCGCUGCAGAACCACCAG





AGGACACACACGGGCGAGAAGCCCUACAAAUGCCCCGAAUGCGGGAAAAGCUUCAGCCGGAACGACACCCUCACCGAGCACCA





GCGAACCCACACCGGAGAAAAGCCGUACAAGUGCCCGGAAUGCGGAAAAAGCUUCAGCCAACGGGCCCACCUGGAACGCCACC





AAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGGAGCGACAAGCUGGUCCGGCAC





CAGCGCACACACACCGGCGAAAAGCCAUACAAGUGCCCCGAGUGCGGAAAGAGCUUCAGCGACCCAGGACACCUCGUGCGGCA





CCAACGCACGCACACCGGGGAAAAACCGACCGGAAAAAAGACCAGC





>ZF5.5-VPR mRNA (ZF5-VPR ATUM Opt_5)


SEQ ID NO.: 164



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCGAAGAAGAAGCGCAAGGUCGGCAUACAC






GGAGUCCCAGCCGCUGGAUCCUCCGGAUCCCUGGAACCUGGGGAGAAACCCUAUAAGUGCCCGGAGUGCGGAAAGUCAUUCUC





AACUAGCGGAAACCUGACAGAGCACCAGAGGACCCAUACUGGCGAAAAGCCAUACAAAUGCCCCGAAUGCGGGAAAAGCUUCA





GCGACAGCGGGAACCUGAGAGUGCACCAGCGGACUCAUACCGGGGAGAAGCCUUACAAGUGCCCCGAGUGUGGAAAGUCCUUC





UCCCAUAAGAACGCGCUCCAGAACCACCAGAGAACCCACACCGGAGAAAAGCCGUACAAGUGCCCGGAAUGCGGCAAAUCCUU





UUCACGGAACGACACUCUCACCGAGCACCAACGGACGCACACCGGAGAGAAGCCGUACAAGUGCCCUGAAUGCGGAAAGAGCU





UUAGCCAGAGGGCCCACCUGGAACGGCAUCAGCGCACUCACACCGGGGAAAAGCCCUACAAGUGCCCAGAGUGCGGCAAGAGC





UUCUCCCGGUCUGACAAGCUUGUGCGCCAUCAGCGGACCCACACUGGAGAAAAACCGUACAAGUGUCCGGAGUGUGGCAAAUC





GUUCUCAGACCCGGGACACCUGGUCCGACACCAACGCACACACACCGGCGAAAAGCCGACCGGCAAAAAGACCUCGGCCUCGG





GAUCUGGAGGAGGAAGCGGCGGAGAUGCCCUGGACGACUUCGACCUGGACAUGUUGGGCAGCGACGCACUGGAUGACUUCGAC





CUGGAUAUGCUGGGAUCCGACGCCCUCGACGAUUUCGACCUCGAUAUGCUUGGCUCCGAUGCGCUCGAUGAUUUCGAUUUGGA





CAUGCUGUCCGGCGGACCUAAGAAGAAGAGAAAGGUCGGCAGCCAAUACCUCCCGGACACUGAUGACCGGCACCGGAUCGAAG





AGAAGCGGAAGCGCACUUACGAGACUUUCAAGUCGAUCAUGAAGAAGUCACCCUUCUCGGGACCUACUGAUCCUCGGCCGCCA





CCUAGACGGAUCGCGGUGCCAUCCAGGUCAUCCGCUUCCGUCCCCAAGCCUGCGCCUCAACCGUACCCUUUCACUUCCUCCCU





GUCGACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCUCCGGACAGAUUUCCCAAGCCUCGGCGCUAGCACCAGCCC





CUCCACAAGUGCUUCCGCAAGCUCCAGCUCCGGCACCAGCACCAGCCAUGGUGUCCGCUCUGGCCCAAGCUCCUGCUCCGGUG





CCUGUGCUGGCUCCUGGACCGCCUCAGGCAGUGGCACCACCCGCACCAAAGCCGACCCAAGCGGGAGAGGGAACUCUGUCCGA





AGCGCUGCUGCAGCUCCAGUUCGACGACGAGGACCUGGGUGCCCUGCUCGGAAAUUCGACCGAUCCGGCCGUGUUUACCGACU





UGGCCAGUGUGGACAACUCCGAGUUCCAACAGCUGCUGAACCAGGGGAUUCCAGUGGCCCCCCACACUACUGAACCGAUGCUG





AUGGAAUACCCCGAGGCCAUUACCAGACUGGUCACUGGAGCCCAGAGGCCUCCAGACCCUGCCCCUGCUCCACUGGGUGCCCC





AGGACUGCCCAAUGGGCUUCUGUCGGGCGAUGAGGAUUUCAGCUCAAUCGCGGAUAUGGACUUCUCCGCCCUUCUGGGUUCCG





GAUCCGGUUCACGGGAUUCCAGAGAGGGCAUGUUCCUACCCAAGCCCGAAGCCGGAAGCGCGAUCAGCGACGUGUUCGAGGGU





CGCGAAGUCUGUCAGCCAAAGAGACUCCGGCCGUUUCAUCCACCCGGAUCACCCUGGGCCAAUCGCCCACUCCCUGCCUCAUU





GGCCCCGACCCCUACUGGUCCGGUGCACGAGCCUGUCGGGUCGCUCACUCCGGCACCUGUGCCACAACCGCUGGACCCUGCAC





CAGCCGUGACCCCAGAGGCGUCCCACCUCCUCGAAGAUCCCGAUGAAGAAACAAGCCAGGCCGUGAAGGCCCUGCGCGAAAUG





GCCGACACCGUGAUCCCGCAGAAAGAGGAAGCCGCCAUCUGCGGUCAGAUGGACCUGAGCCAUCCCCCUCCGAGAGGACACCU





GGACGAACUGACCACUACACUGGAGAGCAUGACCGAGGACCUGAACCUGGACUCCCCCCUUACCCCGGAACUGAACGAGAUUC





UCGACACUUUCCUGAACGACGAGUGUCUGCUCCACGCGAUGCACAUCUCGACCGGACUGUCGAUCUUUGACACCUCGCUGUUC





UCCGGUGGCAAAAGGCCUGCCGCCACCAAGAAGGCCGGACAGGCCAAGAAAAAGAAGGGCUCCUACCCGUACGAUGUGCCCGA





CUACGCUUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.5 mRNA


SEQ ID NO.: 230



CUGGAACCUGGGGAGAAACCCUAUAAGUGCCCGGAGUGCGGAAAGUCAUUCUCAACUAGCGGAAACCUGACAGAGCACCAGAG






GACCCAUACUGGCGAAAAGCCAUACAAAUGCCCCGAAUGCGGGAAAAGCUUCAGCGACAGCGGGAACCUGAGAGUGCACCAGC





GGACUCAUACCGGGGAGAAGCCUUACAAGUGCCCCGAGUGUGGAAAGUCCUUCUCCCAUAAGAACGCGCUCCAGAACCACCAG





AGAACCCACACCGGAGAAAAGCCGUACAAGUGCCCGGAAUGCGGCAAAUCCUUUUCACGGAACGACACUCUCACCGAGCACCA





ACGGACGCACACCGGAGAGAAGCCGUACAAGUGCCCUGAAUGCGGAAAGAGCUUUAGCCAGAGGGCCCACCUGGAACGGCAUC





AGCGCACUCACACCGGGGAAAAGCCCUACAAGUGCCCAGAGUGCGGCAAGAGCUUCUCCCGGUCUGACAAGCUUGUGCGCCAU





CAGCGGACCCACACUGGAGAAAAACCGUACAAGUGUCCGGAGUGUGGCAAAUCGUUCUCAGACCCGGGACACCUGGUCCGACA





CCAACGCACACACACCGGCGAAAAGCCGACCGGCAAAAAGACCUCG





>ZF5.6-VPR mRNA (ZF5-VPR ATUM Opt_6)


SEQ ID NO.: 165



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAAAAGCGCAAAGUGGGCAUCCAC






GGCGUGCCAGCAGCAGGAAGCAGCGGAAGCCUGGAACCCGGGGAGAAGCCGUACAAGUGCCCAGAAUGCGGAAAGAGCUUCAG





CACCAGCGGCAACCUCACCGAGCACCAGAGAACCCACACCGGGGAGAAACCGUACAAAUGCCCGGAAUGCGGCAAGAGCUUCA





GCGACAGCGGAAACCUGAGAGUGCACCAACGCACCCACACGGGAGAAAAACCCUACAAAUGCCCCGAGUGCGGGAAAAGCUUC





AGCCACAAGAACGCGCUGCAGAACCACCAAAGAACGCACACCGGAGAAAAGCCGUACAAGUGCCCAGAAUGCGGAAAGAGCUU





CAGCAGAAACGACACCCUGACCGAACACCAGCGGACGCACACAGGCGAAAAACCAUACAAGUGCCCGGAGUGCGGCAAAAGCU





UCAGCCAGAGAGCGCACCUGGAAAGGCACCAGCGCACACACACCGGCGAAAAGCCAUACAAAUGCCCAGAGUGCGGAAAAAGC





UUCAGCCGGAGCGACAAGCUGGUCCGCCACCAACGGACCCACACAGGGGAAAAGCCCUACAAGUGCCCCGAAUGCGGCAAGAG





CUUCAGCGACCCGGGACACCUCGUGCGGCACCAGAGGACCCACACCGGAGAAAAGCCGACCGGGAAAAAGACCAGCGCAAGCG





GGAGCGGAGGAGGAAGCGGAGGCGACGCACUCGACGACUUCGACCUGGACAUGCUGGGGAGCGACGCACUGGACGACUUCGAC





CUCGACAUGCUCGGAAGCGACGCCCUCGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCGCUGGACGACUUCGACCUCGA





CAUGCUCAGCGGGGGACCAAAAAAGAAAAGGAAGGUCGGAAGCCAGUACCUCCCGGACACCGACGACAGGCACCGGAUCGAGG





AAAAGCGGAAGCGCACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGAGCCCCUUCAGCGGACCGACAGACCCGAGGCCACCA





CCACGGAGAAUCGCCGUGCCAAGCAGGAGCAGCGCCAGCGUGCCCAAACCGGCCCCACAACCCUACCCGUUCACCAGCAGCCU





CAGCACCAUCAACUACGACGAGUUCCCAACCAUGGUGUUCCCCAGCGGACAGAUCAGCCAAGCCAGCGCACUGGCACCAGCCC





CCCCGCAAGUGCUGCCACAAGCGCCGGCACCAGCGCCAGCACCAGCCAUGGUCAGCGCGCUGGCACAAGCCCCCGCACCAGUG





CCAGUGCUCGCACCAGGACCACCCCAGGCAGUAGCACCGCCAGCCCCGAAGCCAACCCAGGCAGGAGAAGGCACCCUCAGCGA





GGCGCUGCUGCAGCUCCAGUUCGACGACGAGGACCUCGGAGCCCUGCUGGGAAACAGCACCGACCCAGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAAUUCCAGCAGCUGCUCAACCAAGGAAUCCCGGUGGCCCCACACACCACCGAACCCAUGCUG





AUGGAGUACCCGGAGGCCAUCACCAGACUCGUGACAGGAGCCCAGAGGCCACCAGACCCAGCCCCAGCACCACUGGGAGCCCC





AGGACUCCCCAACGGACUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCACUCCUCGGGAGCG





GAAGCGGAAGCAGAGACAGCAGGGAAGGAAUGUUCCUGCCCAAGCCGGAAGCGGGAAGCGCAAUCAGCGACGUGUUCGAAGGA





AGAGAAGUCUGCCAGCCCAAGAGGCUGCGCCCGUUCCACCCACCAGGAAGCCCGUGGGCCAACAGACCACUGCCAGCAAGCCU





CGCCCCGACACCAACCGGACCGGUGCACGAACCCGUGGGCAGCCUGACCCCAGCACCGGUCCCACAGCCACUGGACCCAGCAC





CCGCAGUGACCCCAGAAGCCAGCCACCUCCUGGAGGACCCGGACGAAGAAACCAGCCAGGCCGUCAAGGCCCUGCGCGAGAUG





GCCGACACCGUCAUCCCCCAAAAGGAAGAGGCGGCCAUCUGCGGACAGAUGGACCUGAGCCACCCACCGCCAAGAGGCCACCU





CGACGAGCUGACCACCACCCUGGAAAGCAUGACGGAGGACCUGAACCUCGACAGCCCGCUAACGCCCGAGCUGAACGAAAUCC





UGGACACCUUCCUCAACGACGAAUGCCUGCUGCACGCCAUGCACAUCAGCACCGGACUGAGCAUCUUCGACACGAGCCUGUUC





AGCGGAGGAAAACGGCCAGCCGCAACCAAGAAGGCCGGACAAGCCAAGAAGAAGAAGGGGAGCUACCCGUACGACGUGCCAGA





CUACGCAUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUG





UACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.6 mRNA


SEQ ID NO.: 231



CUGGAACCCGGGGAGAAGCCGUACAAGUGCCCAGAAUGCGGAAAGAGCUUCAGCACCAGCGGCAACCUCACCGAGCACCAGAG






AACCCACACCGGGGAGAAACCGUACAAAUGCCCGGAAUGCGGCAAGAGCUUCAGCGACAGCGGAAACCUGAGAGUGCACCAAC





GCACCCACACGGGAGAAAAACCCUACAAAUGCCCCGAGUGCGGGAAAAGCUUCAGCCACAAGAACGCGCUGCAGAACCACCAA





AGAACGCACACCGGAGAAAAGCCGUACAAGUGCCCAGAAUGCGGAAAGAGCUUCAGCAGAAACGACACCCUGACCGAACACCA





GCGGACGCACACAGGCGAAAAACCAUACAAGUGCCCGGAGUGCGGCAAAAGCUUCAGCCAGAGAGCGCACCUGGAAAGGCACC





AGCGCACACACACCGGCGAAAAGCCAUACAAAUGCCCAGAGUGCGGAAAAAGCUUCAGCCGGAGCGACAAGCUGGUCCGCCAC





CAACGGACCCACACAGGGGAAAAGCCCUACAAGUGCCCCGAAUGCGGCAAGAGCUUCAGCGACCCGGGACACCUCGUGCGGCA





CCAGAGGACCCACACCGGAGAAAAGCCGACCGGGAAAAAGACCAGC





>ZF5-P300 protein


SEQ ID NO.: 166



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECGKSFSQLAHLRAHQRTHTGE






KPYKCPECGKSFSSKKHLAEHQRTHTGEKPYKCPECGKSFSTTGALTEHQRTHTGEKPYKCPECGKSFSTSGNLVRHQRTHTG





EKPYKCPECGKSFSQSGDLRRHQRTHTGEKPYKCPECGKSFSTSHSLTEHQRTHTGEKPTGKKTSASGSGGGSGGIFKPEELR





QALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRV





YKYCSKLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDPSQ





PQTTINKEQFSKRKNDTLDPELFVECTECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLEN





RVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPP





PNQRRVYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKM





LDKAVSERIVHDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKNAKKKNN





KKTSKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARD





KHLEFSSLRRAQWSTMCMLVELHTQSQDSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF5-P300 mRNA


SEQ ID NO.: 167



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAAAAGCCGUAUAAGUGCCCGGAAUGCGGCAAGAGUUUUAG





CCAGCGCGCCCAUCUGGAACGUCACCAGCGUACCCAUACCGGUGAAAAGCCAUAUAAAUGCCCAGAAUGUGGUAAAAGCUUUA





GUCAGCUGGCCCAUCUGCGCGCCCACCAACGUACGCACACGGGCGAGAAGCCGUACAAAUGCCCAGAAUGCGGUAAAAGCUUC





AGCAGCAAAAAGCAUCUGGCGGAACAUCAACGUACCCACACCGGCGAGAAACCAUACAAGUGCCCGGAAUGCGGUAAAAGCUU





CAGCACCACCGGUGCGCUGACGGAGCAUCAGCGCACCCACACGGGCGAAAAACCGUAUAAGUGUCCGGAGUGUGGCAAAAGUU





UUAGUACCAGCGGCAAUCUGGUGCGCCAUCAACGUACGCAUACCGGCGAGAAGCCAUAUAAAUGUCCAGAGUGUGGCAAGAGC





UUUAGCCAAAGCGGUGAUCUGCGUCGCCACCAACGCACGCACACCGGCGAAAAACCAUACAAAUGUCCGGAAUGCGGUAAGAG





UUUCAGCACGAGCCAUAGUCUGACCGAACAUCAACGUACCCAUACGGGUGAGAAACCAACCGGCAAGAAAACCAGCGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCAUCUUCAAGCCCGAGGAGCUGCGGCAGGCCCUGAUGCCCACCCUGGAGGCCCUGUACCGG





CAGGACCCCGAGAGCCUGCCCUUCCGGCAGCCCGUGGACCCCCAGCUGCUGGGCAUCCCCGACUACUUCGACAUCGUGAAAUC





CCCCAUGGACCUGAGCACCAUCAAGCGGAAGCUGGACACCGGCCAGUACCAGGAGCCCUGGCAGUACGUGGACGACAUCUGGC





UGAUGUUCAACAACGCCUGGCUGUACAACCGGAAAACCAGCCGGGUGUACAAGUACUGCAGCAAGCUGAGCGAGGUGUUCGAG





CAGGAGAUCGACCCCGUGAUGCAGAGCCUGGGCUACUGCUGCGGCCGGAAGCUGGAGUUCAGCCCCCAGACCCUGUGCUGCUA





CGGCAAGCAGCUGUGCACCAUCCCCCGGGACGCCACCUACUACAGCUACCAGAACCGGUACCACUUCUGCGAGAAGUGCUUCA





ACGAGAUCCAGGGCGAGAGCGUGAGCCUGGGCGACGACCCCAGCCAGCCCCAGACCACCAUCAACAAGGAGCAGUUCAGCAAG





CGGAAGAACGACACCCUGGACCCCGAGCUGUUCGUGGAGUGCACCGAGUGCGGCCGGAAGAUGCACCAGAUCUGCGUGCUGCA





CCACGAGAUCAUCUGGCCCGCCGGCUUCGUGUGCGACGGCUGCCUGAAGAAAUCCGCCCGGACCCGGAAGGAGAACAAGUUCA





GCGCCAAGCGGCUGCCCAGCACCCGGCUGGGCACCUUCCUGGAGAACCGGGUGAACGACUUCCUGCGGCGGCAGAACCACCCC





GAGAGCGGCGAGGUGACCGUGCGGGUGGUGCACGCCAGCGACAAGACCGUGGAGGUGAAGCCCGGCAUGAAGGCCCGGUUCGU





GGACAGCGGCGAGAUGGCCGAGAGCUUCCCCUACCGGACCAAGGCCCUGUUCGCCUUCGAGGAGAUCGACGGCGUGGACCUGU





GCUUCUUCGGCAUGCACGUGCAGGAGUACGGCAGCGACUGCCCCCCCCCCAACCAGCGGCGGGUGUACAUCAGCUACCUGGAC





AGCGUGCACUUCUUCCGGCCCAAGUGCCUGCGGACCGCCGUGUACCACGAGAUCCUGAUCGGCUACCUGGAGUACGUGAAGAA





GCUGGGCUACACCACCGGCCACAUCUGGGCCUGCCCCCCCAGCGAGGGCGACGACUACAUCUUCCACUGCCACCCCCCCGACC





AGAAGAUCCCCAAGCCCAAGCGGCUGCAGGAGUGGUACAAGAAGAUGCUGGACAAGGCCGUGAGCGAGCGGAUCGUGCACGAC





UACAAGGACAUCUUCAAGCAGGCCACCGAGGACCGGCUGACCAGCGCCAAGGAGCUGCCCUACUUCGAGGGCGACUUCUGGCC





CAACGUGCUGGAGGAGAGCAUCAAGGAGCUGGAGCAGGAGGAGGAGGAGCGGAAGCGGGAGGAGAACACCAGCAACGAGAGCA





CCGACGUGACCAAGGGCGACAGCAAGAACGCCAAGAAGAAGAACAACAAGAAAACCAGCAAGAACAAGAGCAGCCUGAGCCGG





GGCAACAAGAAGAAGCCCGGCAUGCCCAACGUGAGCAACGACCUGAGCCAGAAGCUGUACGCCACCAUGGAGAAGCACAAGGA





GGUGUUCUUCGUGAUCCGGCUGAUCGCCGGCCCCGCCGCCAACAGCCUGCCCCCCAUCGUGGACCCCGACCCCCUGAUCCCCU





GCGACCUGAUGGACGGCCGGGACGCCUUCCUGACCCUGGCCCGGGACAAGCACCUGGAGUUCAGCAGCCUGCGGCGGGCCCAG





UGGAGCACCAUGUGCAUGCUGGUGGAGCUGCACACCCAGAGCCAGGACAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGC





CGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUC





UGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAA





GUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAA





>Nucleotide Sequence of DNA binding domain of ZF5-P300


SEQ ID NO.: 232



CUCGAACCGGGCGAAAAGCCGUAUAAGUGCCCGGAAUGCGGCAAGAGUUUUAGCCAGCGCGCCCAUCUGGAACGUCACCAGCG






UACCCAUACCGGUGAAAAGCCAUAUAAAUGCCCAGAAUGUGGUAAAAGCUUUAGUCAGCUGGCCCAUCUGCGCGCCCACCAAC





GUACGCACACGGGCGAGAAGCCGUACAAAUGCCCAGAAUGCGGUAAAAGCUUCAGCAGCAAAAAGCAUCUGGCGGAACAUCAA





CGUACCCACACCGGCGAGAAACCAUACAAGUGCCCGGAAUGCGGUAAAAGCUUCAGCACCACCGGUGCGCUGACGGAGCAUCA





GCGCACCCACACGGGCGAAAAACCGUAUAAGUGUCCGGAGUGUGGCAAAAGUUUUAGUACCAGCGGCAAUCUGGUGCGCCAUC





AACGUACGCAUACCGGCGAGAAGCCAUAUAAAUGUCCAGAGUGUGGCAAGAGCUUUAGCCAAAGCGGUGAUCUGCGUCGCCAC





CAACGCACGCACACCGGCGAAAAACCAUACAAAUGUCCGGAAUGCGGUAAGAGUUUCAGCACGAGCCAUAGUCUGACCGAACA





UCAACGUACCCAUACGGGUGAGAAACCAACCGGCAAGAAAACCAGC





>ZF7-p300 protein


SEQ ID NO.: 168



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGE






KPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTG





EKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGIFKPEELR





QALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRV





YKYCSKLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDPSQ





PQTTINKEQFSKRKNDTLDPELFVECTECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLEN





RVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPP





PNQRRVYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKM





LDKAVSERIVHDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKNAKKKNN





KKTSKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARD





KHLEFSSLRRAQWSTMCMLVELHTQSQDSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF7-p300 mRNA


SEQ ID NO.: 169



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAG





CCGCAGCGACCAUCUGACCAAUCACCAACGCACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUA





GUACCAGUGGCAGUCUGGUUCGUCAUCAGCGCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUU





AGCCAAGCCGGUCAUCUGGCGAGCCAUCAACGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUU





UAGCCGUAGCGAUAAACUGACCGAACACCAACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUU





UCAGCACCAGCGGCAAUCUGACCGAGCAUCAACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGU





UUUAGUCAGAGCAGUAAUCUGGUGCGCCAUCAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAG





UUUUAGCACCCAUCUGGAUCUGAUCCGUCAUCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGUGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCAUCUUCAAGCCCGAGGAGCUGCGGCAGGCCCUGAUGCCCACCCUGGAGGCCCUGUACCGG





CAGGACCCCGAGAGCCUGCCCUUCCGGCAGCCCGUGGACCCCCAGCUGCUGGGCAUCCCCGACUACUUCGACAUCGUGAAAUC





CCCCAUGGACCUGAGCACCAUCAAGCGGAAGCUGGACACCGGCCAGUACCAGGAGCCCUGGCAGUACGUGGACGACAUCUGGC





UGAUGUUCAACAACGCCUGGCUGUACAACCGGAAAACCAGCCGGGUGUACAAGUACUGCAGCAAGCUGAGCGAGGUGUUCGAG





CAGGAGAUCGACCCCGUGAUGCAGAGCCUGGGCUACUGCUGCGGCCGGAAGCUGGAGUUCAGCCCCCAGACCCUGUGCUGCUA





CGGCAAGCAGCUGUGCACCAUCCCCCGGGACGCCACCUACUACAGCUACCAGAACCGGUACCACUUCUGCGAGAAGUGCUUCA





ACGAGAUCCAGGGCGAGAGCGUGAGCCUGGGCGACGACCCCAGCCAGCCCCAGACCACCAUCAACAAGGAGCAGUUCAGCAAG





CGGAAGAACGACACCCUGGACCCCGAGCUGUUCGUGGAGUGCACCGAGUGCGGCCGGAAGAUGCACCAGAUCUGCGUGCUGCA





CCACGAGAUCAUCUGGCCCGCCGGCUUCGUGUGCGACGGCUGCCUGAAGAAAUCCGCCCGGACCCGGAAGGAGAACAAGUUCA





GCGCCAAGCGGCUGCCCAGCACCCGGCUGGGCACCUUCCUGGAGAACCGGGUGAACGACUUCCUGCGGCGGCAGAACCACCCC





GAGAGCGGCGAGGUGACCGUGCGGGUGGUGCACGCCAGCGACAAGACCGUGGAGGUGAAGCCCGGCAUGAAGGCCCGGUUCGU





GGACAGCGGCGAGAUGGCCGAGAGCUUCCCCUACCGGACCAAGGCCCUGUUCGCCUUCGAGGAGAUCGACGGCGUGGACCUGU





GCUUCUUCGGCAUGCACGUGCAGGAGUACGGCAGCGACUGCCCCCCCCCCAACCAGCGGCGGGUGUACAUCAGCUACCUGGAC





AGCGUGCACUUCUUCCGGCCCAAGUGCCUGCGGACCGCCGUGUACCACGAGAUCCUGAUCGGCUACCUGGAGUACGUGAAGAA





GCUGGGCUACACCACCGGCCACAUCUGGGCCUGCCCCCCCAGCGAGGGCGACGACUACAUCUUCCACUGCCACCCCCCCGACC





AGAAGAUCCCCAAGCCCAAGCGGCUGCAGGAGUGGUACAAGAAGAUGCUGGACAAGGCCGUGAGCGAGCGGAUCGUGCACGAC





UACAAGGACAUCUUCAAGCAGGCCACCGAGGACCGGCUGACCAGCGCCAAGGAGCUGCCCUACUUCGAGGGCGACUUCUGGCC





CAACGUGCUGGAGGAGAGCAUCAAGGAGCUGGAGCAGGAGGAGGAGGAGCGGAAGCGGGAGGAGAACACCAGCAACGAGAGCA





CCGACGUGACCAAGGGCGACAGCAAGAACGCCAAGAAGAAGAACAACAAGAAAACCAGCAAGAACAAGAGCAGCCUGAGCCGG





GGCAACAAGAAGAAGCCCGGCAUGCCCAACGUGAGCAACGACCUGAGCCAGAAGCUGUACGCCACCAUGGAGAAGCACAAGGA





GGUGUUCUUCGUGAUCCGGCUGAUCGCCGGCCCCGCCGCCAACAGCCUGCCCCCCAUCGUGGACCCCGACCCCCUGAUCCCCU





GCGACCUGAUGGACGGCCGGGACGCCUUCCUGACCCUGGCCCGGGACAAGCACCUGGAGUUCAGCAGCCUGCGGCGGGCCCAG





UGGAGCACCAUGUGCAUGCUGGUGGAGCUGCACACCCAGAGCCAGGACAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGC





CGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUC





UGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAA





GUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAA





>Nucleotide Sequence of DNA binding domain of ZF7-P300


SEQ ID NO.: 233



CUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGCAGCGACCAUCUGACCAAUCACCAACG






CACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUAGUACCAGUGGCAGUCUGGUUCGUCAUCAGC





GCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUUAGCCAAGCCGGUCAUCUGGCGAGCCAUCAA





CGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUUUAGCCGUAGCGAUAAACUGACCGAACACCA





ACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUUUCAGCACCAGCGGCAAUCUGACCGAGCAUC





AACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGUUUUAGUCAGAGCAGUAAUCUGGUGCGCCAU





CAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAGUUUUAGCACCCAUCUGGAUCUGAUCCGUCA





UCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGU





>ZF5.3-VPR3 protein


SEQ ID NO.: 170



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGE






KPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTG





EKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSGSGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGP





TDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALA





QAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPH





TTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAI





SDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAV





KALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSI





FDTSLFGSGSGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSGSGSQYLPDTDDRHRIEEK





RKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPP





QVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLA





SVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGS





GSRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPA





VTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILD





TFLNDECLLHAMHISTGLSIFDTSLFGSGSGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLG





SGSGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFP





TMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDD





EDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSG





DEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVH





EPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLES





MTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTSLFGSGSGSGGGGSGKRPAATKKAGQAKKKKGSYPYDVPD





YA*





>ZF5.3-VPR3 mRNA


SEQ ID NO.: 171



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGCAAGGUCGGGAUCCAC






GGAGUCCCGGCAGCAGGAUCCUCAGGCUCACUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUC





GCGGUCCGACCACCUGACCAACCACCAGAGAACACACACCGGCGAGAAGCCGUACAAGUGCCCCGAGUGCGGGAAGUCGUUCA





GCACCUCAGGAUCGCUGGUCCGCCACCAACGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUC





AGCCAAGCCGGGCACCUGGCAUCACACCAGCGAACCCACACCGGAGAAAAACCGUACAAAUGCCCGGAGUGCGGCAAAUCCUU





CUCGCGCUCCGACAAGCUGACCGAACACCAAAGGACACACACCGGAGAGAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUCGU





UCUCGACCUCGGGGAACCUGACCGAGCACCAACGCACCCACACCGGCGAAAAACCGUACAAAUGCCCGGAGUGCGGAAAGUCG





UUCUCACAAUCCUCCAACCUGGUCCGGCACCAAAGAACGCACACAGGGGAAAAGCCGUACAAGUGCCCCGAAUGCGGGAAAUC





CUUCAGCACCCACCUGGACCUCAUCCGGCACCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCAGCGAGCG





GAUCCGGAGGAGGAUCAGGGGGGGACGCCCUCGACGACUUCGAUCUGGACAUGCUGGGUAGCGACGCCCUGGAUGACUUCGAC





CUCGAUAUGCUGGGAUCCGACGCACUUGACGAUUUUGACUUGGACAUGCUCGGCUCCGACGCUCUGGACGAUUUCGACCUUGA





CAUGCUUGGCUCCGGCUCAGGAUCCCAGUACCUCCCCGAUACCGACGACAGACACCGGAUCGAAGAAAAGCGCAAGCGCACCU





ACGAAACCUUCAAGUCGAUUAUGAAGAAGUCGCCUUUCUCCGGGCCGACUGAUCCUAGACCUCCACCAAGAAGAAUCGCGGUG





CCGUCCAGAUCGUCCGCGUCAGUGCCGAAACCAGCACCGCAGCCGUAUCCGUUCACUUCCUCCCUUUCCACCAUUAACUACGA





CGAAUUCCCCACGAUGGUGUUCCCUAGCGGACAGAUUUCGCAAGCCAGCGCUCUUGCUCCUGCGCCUCCUCAAGUGCUGCCUC





AGGCCCCUGCUCCUGCUCCUGCACCCGCCAUGGUGUCCGCCCUGGCUCAAGCUCCAGCCCCUGUGCCUGUCCUGGCCCCUGGA





CCACCUCAGGCAGUAGCACCUCCCGCUCCCAAGCCCACCCAAGCGGGAGAGGGCACUCUUUCCGAGGCCCUGCUGCAACUGCA





GUUCGACGACGAGGACCUGGGGGCACUUCUGGGAAAUAGCACCGAUCCGGCCGUGUUCACCGACCUGGCCAGCGUCGACAACU





CAGAGUUCCAGCAGCUCCUCAACCAAGGGAUUCCGGUGGCCCCUCACACGACCGAGCCGAUGUUGAUGGAAUACCCGGAAGCC





AUCACCCGCCUAGUGACCGGAGCGCAAAGACCGCCUGACCCAGCUCCUGCCCCUUUGGGAGCCCCUGGAUUGCCCAACGGACU





CCUGUCCGGCGACGAGGAUUUCUCGUCCAUCGCCGAUAUGGACUUCUCGGCCCUGUUGGGUAGCGGUUCGGGUAGUCGCGAUA





GCCGGGAAGGAAUGUUCCUGCCGAAGCCUGAGGCCGGGUCUGCCAUUAGCGAUGUGUUUGAAGGACGGGAAGUCUGUCAGCCC





AAGCGGAUUCGCCCAUUCCACCCCCCUGGAUCGCCUUGGGCCAACAGGCCACUCCCCGCUUCGCUUGCGCCGACUCCUACCGG





GCCAGUGCACGAACCUGUGGGAUCCCUGACUCCGGCUCCUGUGCCACAGCCUCUGGAUCCGGCUCCCGCUGUCACCCCUGAGG





CCUCACACCUUCUCGAGGACCCCGACGAAGAGACUUCCCAGGCCGUGAAAGCGCUCCGGGAGAUGGCGGACACUGUGAUCCCG





CAAAAGGAAGAAGCCGCGAUUUGCGGCCAGAUGGACCUGUCGCAUCCUCCACCACGCGGUCACCUCGAUGAACUGACAACUAC





CCUGGAGUCGAUGACCGAGGACCUGAACCUGGACUCCCCGCUGACUCCUGAGCUCAACGAAAUCCUGGACACUUUCCUGAACG





AUGAGUGCCUGCUGCACGCCAUGCACAUCUCCACUGGGCUGUCAAUCUUCGACACCAGCCUGUUCGGCUCCGGAUCCGGUUCC





GACGCACUGGACGAUUUUGACCUGGAUAUGUUGGGGAGCGACGCACUGGACGAUUUUGAUCUGGAUAUGCUGGGAUCCGACGC





GCUCGACGAUUUCGACCUGGACAUGCUCGGAUCGGACGCCCUGGACGACUUCGACCUCGAUAUGCUUGGAUCAGGGUCCGGCU





CACAAUAUCUGCCGGACACUGAUGACCGGCAUAGAAUCGAAGAAAAGCGCAAGCGGACCUACGAAACUUUCAAGAGCAUCAUG





AAGAAAUCGCCGUUCUCUGGGCCGACUGAUCCUAGGCCGCCUCCGAGAAGGAUCGCCGUGCCCUCAAGAUCCUCCGCCUCUGU





GCCCAAGCCGGCUCCACAGCCUUACCCCUUCACUUCGUCGCUGAGCACCAUCAACUACGACGAAUUCCCGACCAUGGUCUUUC





CGAGCGGCCAGAUUUCCCAGGCGUCCGCCUUGGCUCCUGCACCACCCCAAGUGCUGCCUCAGGCGCCUGCACCAGCUCCAGCC





CCUGCCAUGGUGUCCGCGCUGGCACAAGCCCCUGCACCUGUGCCAGUGCUCGCACCUGGUCCUCCGCAAGCUGUGGCACCUCC





UGCGCCUAAGCCGACUCAGGCCGGAGAAGGGACCCUGUCAGAGGCCCUGCUGCAACUGCAGUUUGACGAUGAGGAUCUGGGAG





CCCUUCUGGGCAACUCGACUGACCCCGCCGUGUUCACCGACCUGGCGUCCGUGGAUAACUCCGAGUUCCAGCAGCUCCUCAAC





CAAGGGAUUCCUGUCGCCCCGCACACUACCGAGCCGAUGCUGAUGGAGUACCCGGAGGCCAUCACCCGGCUUGUGACGGGUGC





UCAGAGGCCUCCAGAUCCGGCUCCAGCACCGUUAGGAGCCCCCGGACUUCCUAACGGACUGCUGUCCGGCGACGAGGACUUCU





CCAGCAUCGCCGACAUGGAUUUUUCCGCGCUGUUGGGAUCGGGUUCCGGCUCAAGAGACAGCCGCGAGGGAAUGUUCCUCCCG





AAACCAGAGGCCGGCUCAGCCAUCAGCGACGUGUUCGAAGGGCGCGAAGUCUGCCAGCCCAAGCGGAUCCGCCCGUUUCAUCC





GCCUGGAUCACCGUGGGCCAACAGACCCCUACCCGCAAGCUUAGCCCCUACCCCCACUGGCCCUGUCCACGAACCUGUGGGCU





CCCUGACACCCGCUCCUGUGCCACAACCUCUGGACCCCGCACCAGCAGUCACACCCGAAGCCAGCCACCUCCUUGAGGAUCCG





GACGAGGAGACUAGCCAGGCCGUGAAGGCGCUCCGCGAAAUGGCCGACACUGUGAUCCCUCAAAAGGAAGAGGCGGCCAUUUG





UGGACAGAUGGACUUGUCCCACCCGCCUCCAAGAGGUCACCUGGACGAACUUACCACCACGCUCGAAUCCAUGACUGAGGAUC





UGAACCUGGAUUCCCCGCUCACUCCCGAGCUCAACGAAAUCCUUGAUACCUUCCUUAACGACGAGUGUCUCCUGCAUGCCAUG





CACAUCUCCACCGGACUGAGCAUUUUCGACACCUCGCUGUUCGGUUCCGGAAGCGGCUCAGACGCGCUGGAUGACUUCGAUUU





GGACAUGCUUGGCAGCGAUGCCCUGGAUGAUUUCGACCUGGACAUGCUCGGGUCGGAUGCGCUGGACGACUUCGAUCUCGAUA





UGUUGGGCUCCGAUGCCCUCGACGACUUUGACCUCGACAUGCUGGGCUCGGGCUCAGGAUCCCAAUACCUCCCGGAUACCGAC





GACAGGCAUCGCAUUGAGGAAAAGCGGAAGCGCACCUAUGAAACCUUCAAGUCCAUUAUGAAGAAGUCGCCCUUUUCCGGACC





GACUGACCCUCGGCCUCCUCCUCGACGAAUUGCCGUCCCAUCUCGGUCAUCCGCCUCGGUCCCCAAGCCAGCACCGCAGCCUU





AUCCGUUCACCUCCUCUCUGUCCACCAUUAACUACGAUGAAUUCCCCACCAUGGUGUUCCCGUCGGGACAGAUCUCCCAAGCC





UCAGCCCUUGCUCCUGCCCCUCCACAAGUCCUGCCCCAAGCACCAGCGCCUGCUCCUGCACCCGCGAUGGUGUCCGCACUGGC





GCAAGCUCCUGCCCCUGUGCCUGUGCUGGCUCCUGGACCACCCCAGGCAGUAGCACCUCCAGCCCCGAAGCCCACUCAGGCUG





GAGAGGGAACCCUGAGCGAAGCGCUGCUGCAGCUCCAGUUCGACGACGAAGAUCUGGGUGCCCUGCUGGGAAAUUCCACCGAU





CCGGCGGUGUUCACAGACCUGGCCUCCGUGGACAACUCCGAAUUCCAGCAGUUGUUGAACCAGGGCAUUCCUGUGGCCCCCCA





CACCACUGAGCCAAUGCUCAUGGAAUACCCCGAGGCCAUUACCAGACUCGUGACCGGAGCCCAAAGGCCUCCGGAUCCAGCGC





CAGCUCCGUUGGGAGCUCCGGGAUUGCCGAACGGGCUGCUGUCGGGAGAUGAAGAUUUCUCCUCAAUCGCCGAUAUGGACUUC





UCCGCGCUGCUGGGUUCGGGUUCGGGAUCGCGCGAUAGCCGGGAGGGCAUGUUCCUACCGAAGCCUGAGGCCGGAAGCGCCAU





CUCCGAUGUGUUCGAGGGCAGAGAAGUCUGUCAGCCUAAGCGCAUUCGCCCGUUCCACCCUCCUGGAUCGCCCUGGGCCAAUC





GGCCACUGCCUGCGUCCCUCGCUCCAACGCCGACCGGACCUGUGCACGAACCGGUCGGCUCACUGACUCCAGCUCCCGUCCCA





CAACCGCUCGACCCUGCUCCCGCUGUUACCCCCGAAGCCUCCCAUUUGCUGGAAGAUCCCGAUGAGGAAACUUCCCAGGCCGU





CAAGGCCCUGCGGGAGAUGGCAGACACCGUGAUACCCCAGAAGGAAGAAGCUGCCAUCUGCGGGCAGAUGGACCUGUCCCAUC





CUCCUCCACGCGGACACUUGGACGAGCUGACCACUACUCUGGAGUCCAUGACCGAGGACCUGAACCUUGACUCGCCUUUGACC





CCUGAACUGAACGAAAUUCUGGACACCUUCCUGAAUGACGAGUGCCUCCUGCACGCGAUGCACAUCAGCACCGGACUGUCCAU





CUUCGACACUUCCCUCUUUGGGAGCGGGUCCGGAUCAGGCGGUGGUGGUAGCGGGAAACGGCCAGCAGCGACCAAGAAGGCCG





GACAGGCCAAGAAGAAGAAAGGCUCAUACCCCUACGACGUGCCGGACUACGCAUGAGCGGCCGCUUAAUUAAGCUGCCUUCUG





CGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGU





CUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>Nucleotide Sequence of DNA binding domain of ZF5.3-VPR3


SEQ ID NO.: 234



CUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUCGCGGUCCGACCACCUGACCAACCACCAGAG






AACACACACCGGCGAGAAGCCGUACAAGUGCCCCGAGUGCGGGAAGUCGUUCAGCACCUCAGGAUCGCUGGUCCGCCACCAAC





GGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUCAGCCAAGCCGGGCACCUGGCAUCACACCAG





CGAACCCACACCGGAGAAAAACCGUACAAAUGCCCGGAGUGCGGCAAAUCCUUCUCGCGCUCCGACAAGCUGACCGAACACCA





AAGGACACACACCGGAGAGAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUCGUUCUCGACCUCGGGGAACCUGACCGAGCACC





AACGCACCCACACCGGCGAAAAACCGUACAAAUGCCCGGAGUGCGGAAAGUCGUUCUCACAAUCCUCCAACCUGGUCCGGCAC





CAAAGAACGCACACAGGGGAAAAGCCGUACAAGUGCCCCGAAUGCGGGAAAUCCUUCAGCACCCACCUGGACCUCAUCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCA





>dCas9-VPR3 protein


SEQ ID NO.: 172



MAPKKKRKVGIHGVPAADKKYSIGLAIGTNSVGWAVITDEYKVPSKKFKVLGNTDRHSIKKNLIGALLFDSGETAEATRLKRT






ARRRYTRRKNRICYLQEIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDEVAYHEKYPTIYHLRKKLVDSTDKAD





LRLIYLALAHMIKFRGHFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDAKAILSARLSKSRRLENLIAQLPGEK





KNGLFGNLIALSLGLTPNFKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLAAKNLSDAILLSDILRVNTEITKAP





LSASMIKRYDEHHQDLTLLKALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIKPILEKMDGTEELLVKLNREDLL





RKQRTFDNGSIPHQIHLGELHAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRFAWMTRKSEETITPWNFEEVV





DKGASAQSFIERMTNFDKNLPNEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQKKAIVDLLFKTNRKVTVKQLK





EDYFKKIECFDSVEISGVEDRFNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDREMIEERLKTYAHLFDDKVM





KQLKRRRYTGWGRLSRKLINGIRDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQVSGQGDSLHEHIANLAGSP





AIKKGILQTVKVVDELVKVMGRHKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQILKEHPVENTQLQNEKLYLY





YLQNGRDMYVDQELDINRLSDYDVAAIVPQSFLKDDSIDNKVLTRSDKARGKSDNVPSEEVVKKMKNYWRQLLNAKLITQRKF





DNLTKAERGGLSELDKAGFIKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITLKSKLVSDFRKDFQFYKVREINN





YHHAHDAYLNAVVGTALIKKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFYSNIMNFFKTEITLANGEIRKRPLI





ETNGETGEIVWDKGRDFATVRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKKDWDPKKYGGFDSPTVAYSVLVV





AKVEKGKSKKLKSVKELLGITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELENGRKRMLASAGELQKGNELALP





SKYVNFLYLASHYEKLKGSPEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKVLSAYNKHRDKPIREQAENIIHL





FTLTNLGAPAAFKYFDTTIDRKRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGDSGGKRPAATKKAGQAKKKKSGGGGSDA





LDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSGSGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFGSGSGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSGSGSQYLPDTDDR





HRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASA





LAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPA





VFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSA





LLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQP





LDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPE





LNEILDTFLNDECLLHAMHISTGLSIFDTSLFGSGSGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDF





DLDMLGSGSGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTI





NYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALL





QLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLP





NGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVFEGREVCQPKRIRPFHPPGSPWANRPLPASLAPT





PTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDEL





TTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTSLFGSGSGSGGGGSGKRPAATKKAGQAKKKKGSY





PYDVPDYA*





>dCas9-VPR3 mRNA


SEQ ID NO.: 173



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGAGGAAAGUCGGAAUUCAC






GGAGUGCCUGCAGCGGAUAAGAAGUACUCCAUCGGACUCGCAAUCGGCACCAACUCCGUGGGAUGGGCCGUGAUCACCGACGA





GUACAAAGUGCCGUCUAAAAAGUUCAAGGUGCUCGGAAACACCGAUAGGCACUCCAUCAAGAAGAACCUGAUUGGGGCCCUGC





UGUUUGAUUCCGGGGAAACGGCAGAGGCCACUCGCCUCAAGAGAACUGCACGCCGGCGGUACACUCGUCGGAAGAACCGCAUC





UGCUAUCUGCAAGAGAUUUUCUCCAACGAGAUGGCCAAAGUGGACGACUCAUUCUUCCACCGCCUCGAAGAAUCUUUCCUGGU





CGAAGAGGACAAGAAGCACGAACGCCACCCCAUUUUCGGGAACAUUGUCGACGAAGUGGCGUACCACGAGAAGUACCCCACCA





UCUACCAUCUCCGCAAGAAGCUCGUGGAUUCCACUGACAAGGCCGAUCUCAGACUGAUCUACCUGGCGCUUGCUCACAUGAUU





AAGUUCAGGGGUCACUUCCUGAUUGAGGGAGAUCUGAACCCCGACAACAGCGAUGUCGAUAAGCUGUUCAUUCAGCUGGUGCA





GACCUACAAUCAGCUGUUCGAAGAGAACCCCAUUAAUGCCUCCGGUGUCGAUGCCAAGGCCAUCCUGUCCGCACGGCUGAGCA





AAUCGCGCAGGCUGGAAAACCUGAUCGCCCAGCUGCCUGGAGAGAAAAAGAACGGACUGUUCGGCAACCUUAUCGCGCUGUCC





UUGGGACUGACCCCGAACUUCAAGAGCAACUUCGACUUGGCCGAGGAUGCCAAGCUGCAACUGUCGAAGGACACCUACGACGA





UGACCUCGAUAAUCUGCUGGCCCAAAUUGGCGAUCAAUAUGCAGACCUGUUCCUUGCCGCAAAGAACCUGAGCGACGCGAUUC





UCCUGUCGGACAUCCUGCGGGUCAACACCGAGAUCACCAAGGCACCGUUGUCCGCCUCCAUGAUUAAGCGAUACGACGAACAC





CAUCAGGACCUGACUCUGCUGAAGGCCCUGGUCCGCCAACAGCUGCCCGAGAAGUACAAGGAGAUCUUCUUCGACCAAUCCAA





GAAUGGAUACGCUGGAUACAUCGAUGGCGGUGCCAGCCAAGAGGAAUUCUACAAGUUCAUCAAACCGAUACUGGAGAAGAUGG





ACGGCACAGAGGAACUCCUGGUCAAGCUGAACCGGGAGGAUCUGCUGCGGAAGCAGAGGACCUUCGACAACGGGUCCAUCCCG





CACCAGAUUCACCUGGGCGAACUGCAUGCGAUCCUGCGACGGCAGGAGGACUUCUACCCAUUCCUGAAGGAUAACAGAGAGAA





AAUCGAGAAGAUCCUCACCUUCCGGAUCCCGUAUUACGUGGGACCCCUGGCUAGGGGCAACAGCCGCUUCGCCUGGAUGACCC





GCAAGUCCGAGGAAACUAUUACUCCCUGGAACUUCGAGGAAGUAGUGGACAAAGGCGCCAGCGCGCAAUCCUUCAUCGAACGG





AUGACCAACUUCGACAAGAACUUGCCGAACGAAAAGGUGUUGCCGAAGCAUUCUCUGCUGUAUGAGUACUUCACUGUGUACAA





CGAACUGACCAAAGUGAAAUACGUCACAGAAGGAAUGAGAAAGCCAGCCUUCCUUAGCGGGGAGCAGAAGAAGGCCAUUGUGG





ACCUCCUGUUCAAAACCAACCGAAAGGUCACCGUGAAGCAACUGAAGGAGGAUUACUUCAAGAAGAUCGAAUGUUUCGACUCG





GUGGAGAUCUCCGGGGUGGAGGAUCGCUUCAAUGCCUCCCUGGGCACCUACCAUGAUCUGCUCAAGAUCAUCAAGGAUAAGGA





CUUCCUCGACAACGAAGAGAACGAAGAUAUCCUGGAGGAUAUCGUGCUCACCCUCACCCUGUUCGAGGAUAGAGAGAUGAUCG





AAGAGAGACUUAAGACCUACGCCCACCUGUUCGACGACAAAGUCAUGAAGCAGCUGAAGCGGAGGAGGUACACUGGAUGGGGC





AGACUGUCCCGCAAGCUCAUCAACGGGAUUCGAGAUAAGCAGUCCGGAAAGACAAUCCUCGACUUCCUGAAAUCCGACGGAUU





UGCCAACAGAAACUUCAUGCAGCUGAUCCAUGAUGACUCGCUGACCUUCAAGGAGGAUAUUCAGAAGGCUCAAGUGUCGGGAC





AGGGCGAUUCCCUCCACGAGCACAUCGCCAACCUCGCGGGAUCCCCUGCAAUCAAGAAGGGUAUCCUGCAGACCGUGAAGGUC





GUGGACGAAUUAGUGAAAGUCAUGGGCCGGCAUAAGCCUGAAAACAUCGUGAUCGAGAUGGCCCGGGAAAACCAGACCACCCA





AAAGGGACAGAAGAACUCCCGCGAGCGCAUGAAGCGGAUCGAGGAAGGGAUCAAGGAGCUGGGGUCGCAGAUCUUAAAGGAGC





ACCCCGUGGAAAAUACUCAGCUGCAAAACGAAAAGCUGUACCUGUAUUACUUGCAAAACGGAAGAGAUAUGUACGUGGAUCAA





GAAUUGGACAUCAACAGACUCUCCGACUACGACGUCGCUGCGAUUGUGCCACAAAGCUUUCUUAAGGACGACUCCAUCGACAA





CAAGGUCCUCACCCGGUCCGAUAAGGCCCGCGGAAAGUCCGACAACGUGCCAAGCGAAGAGGUGGUCAAGAAGAUGAAGAAUU





ACUGGCGGCAGCUGCUGAACGCCAAGCUGAUAACUCAGCGGAAGUUCGACAACCUGACUAAGGCUGAGCGGGGAGGACUCUCG





GAACUGGACAAGGCUGGGUUCAUCAAGAGACAGUUGGUGGAAACCCGCCAAAUUACCAAACACGUGGCGCAGAUCCUGGACUC





ACGCAUGAACACUAAGUACGACGAGAACGAUAAGCUGAUUCGGGAAGUCAAAGUGAUCACCCUGAAGUCCAAGCUCGUCAGCG





ACUUCCGGAAGGAUUUCCAGUUUUACAAGGUCCGCGAAAUUAACAACUACCAUCAUGCUCACGACGCCUACUUGAACGCCGUG





GUCGGUACCGCCCUGAUCAAGAAGUAUCCAAAGCUCGAGUCCGAGUUUGUGUACGGCGACUACAAGGUCUACGACGUGCGCAA





GAUGAUCGCGAAAUCCGAGCAGGAAAUCGGAAAGGCCACCGCCAAGUACUUCUUCUACUCAAACAUUAUGAACUUCUUCAAGA





CCGAAAUCACUCUGGCGAACGGCGAAAUCCGGAAAAGACCGCUGAUCGAGACUAACGGCGAAACCGGCGAAAUCGUGUGGGAC





AAGGGACGGGACUUCGCCACCGUGCGCAAGGUGCUGUCGAUGCCCCAAGUGAACAUUGUGAAGAAAACCGAAGUCCAGACUGG





CGGCUUCAGCAAAGAAUCGAUCCUGCCCAAGAGAAACAGCGACAAGCUGAUCGCCCGCAAGAAGGACUGGGACCCCAAGAAAU





ACGGCGGUUUCGACUCACCCACUGUGGCCUACUCGGUCCUCGUGGUCGCCAAGGUCGAGAAGGGCAAAAGCAAAAAGCUUAAA





UCGGUGAAGGAACUUCUGGGUAUCACGAUCAUGGAACGCUCCUCCUUCGAAAAGAACCCCAUCGACUUUUUGGAAGCAAAGGG





AUACAAGGAAGUCAAGAAGGACCUCAUCAUCAAGCUGCCGAAGUAUAGCCUCUUCGAACUGGAGAACGGUCGGAAGAGAAUGC





UGGCUUCAGCGGGAGAGCUGCAAAAGGGAAACGAGCUGGCCCUUCCGAGCAAAUACGUCAACUUUCUGUACCUGGCCUCGCAC





UACGAAAAGCUCAAGGGAUCACCCGAGGACAACGAACAGAAGCAACUGUUCGUGGAACAGCAUAAGCAUUACCUGGAUGAGAU





UAUCGAACAGAUUUCCGAAUUCUCCAAGCGCGUGAUUCUGGCCGACGCCAACCUGGACAAGGUCCUUUCAGCCUACAACAAGC





ACCGGGAUAAGCCGAUCCGGGAACAGGCGGAAAACAUCAUCCAUCUGUUCACGUUGACUAAUCUUGGAGCACCAGCCGCGUUU





AAGUACUUUGACACCACCAUUGACAGGAAACGGUACACAUCCACGAAGGAAGUGUUGGAUGCGACGCUGAUUCAUCAGAGUAU





CACCGGACUCUACGAAACGCGGAUUGACCUCAGCCAGUUGGGAGGGGACUCCGGAGGAAAGAGGCCAGCCGCCACUAAGAAGG





CUGGGCAGGCCAAGAAGAAAAAGUCCGGUGGAGGAGGCUCAGACGCCCUCGACGACUUCGAUCUGGACAUGCUGGGUAGCGAC





GCCCUGGAUGACUUCGACCUCGAUAUGCUGGGAUCCGACGCACUUGACGAUUUUGACUUGGACAUGCUCGGCUCCGACGCUCU





GGACGAUUUCGACCUUGACAUGCUUGGCUCCGGCUCAGGAUCCCAGUACCUCCCCGAUACCGACGACAGACACCGGAUCGAAG





AAAAGCGCAAGCGCACCUACGAAACCUUCAAGUCGAUUAUGAAGAAGUCGCCUUUCUCCGGGCCGACUGAUCCUAGACCUCCA





CCAAGAAGAAUCGCGGUGCCGUCCAGAUCGUCCGCGUCAGUGCCGAAACCAGCACCGCAGCCGUAUCCGUUCACUUCCUCCCU





UUCCACCAUUAACUACGACGAAUUCCCCACGAUGGUGUUCCCUAGCGGACAGAUUUCGCAAGCCAGCGCUCUUGCUCCUGCGC





CUCCUCAAGUGCUGCCUCAGGCCCCUGCUCCUGCUCCUGCACCCGCCAUGGUGUCCGCCCUGGCUCAAGCUCCAGCCCCUGUG





CCUGUCCUGGCCCCUGGACCACCUCAGGCAGUAGCACCUCCCGCUCCCAAGCCCACCCAAGCGGGAGAGGGCACUCUUUCCGA





GGCCCUGCUGCAACUGCAGUUCGACGACGAGGACCUGGGGGCACUUCUGGGAAAUAGCACCGAUCCGGCCGUGUUCACCGACC





UGGCCAGCGUCGACAACUCAGAGUUCCAGCAGCUCCUCAACCAAGGGAUUCCGGUGGCCCCUCACACGACCGAGCCGAUGUUG





AUGGAAUACCCGGAAGCCAUCACCCGCCUAGUGACCGGAGCGCAAAGACCGCCUGACCCAGCUCCUGCCCCUUUGGGAGCCCC





UGGAUUGCCCAACGGACUCCUGUCCGGCGACGAGGAUUUCUCGUCCAUCGCCGAUAUGGACUUCUCGGCCCUGUUGGGUAGCG





GUUCGGGUAGUCGCGAUAGCCGGGAAGGAAUGUUCCUGCCGAAGCCUGAGGCCGGGUCUGCCAUUAGCGAUGUGUUUGAAGGA





CGGGAAGUCUGUCAGCCCAAGCGGAUUCGCCCAUUCCACCCCCCUGGAUCGCCUUGGGCCAACAGGCCACUCCCCGCUUCGCU





UGCGCCGACUCCUACCGGGCCAGUGCACGAACCUGUGGGAUCCCUGACUCCGGCUCCUGUGCCACAGCCUCUGGAUCCGGCUC





CCGCUGUCACCCCUGAGGCCUCACACCUUCUCGAGGACCCCGACGAAGAGACUUCCCAGGCCGUGAAAGCGCUCCGGGAGAUG





GCGGACACUGUGAUCCCGCAAAAGGAAGAAGCCGCGAUUUGCGGCCAGAUGGACCUGUCGCAUCCUCCACCACGCGGUCACCU





CGAUGAACUGACAACUACCCUGGAGUCGAUGACCGAGGACCUGAACCUGGACUCCCCGCUGACUCCUGAGCUCAACGAAAUCC





UGGACACUUUCCUGAACGAUGAGUGCCUGCUGCACGCCAUGCACAUCUCCACUGGGCUGUCAAUCUUCGACACCAGCCUGUUC





GGCUCCGGAUCCGGUUCCGACGCACUGGACGAUUUUGACCUGGAUAUGUUGGGGAGCGACGCACUGGACGAUUUUGAUCUGGA





UAUGCUGGGAUCCGACGCGCUCGACGAUUUCGACCUGGACAUGCUCGGAUCGGACGCCCUGGACGACUUCGACCUCGAUAUGC





UUGGAUCAGGGUCCGGCUCACAAUAUCUGCCGGACACUGAUGACCGGCAUAGAAUCGAAGAAAAGCGCAAGCGGACCUACGAA





ACUUUCAAGAGCAUCAUGAAGAAAUCGCCGUUCUCUGGGCCGACUGAUCCUAGGCCGCCUCCGAGAAGGAUCGCCGUGCCCUC





AAGAUCCUCCGCCUCUGUGCCCAAGCCGGCUCCACAGCCUUACCCCUUCACUUCGUCGCUGAGCACCAUCAACUACGACGAAU





UCCCGACCAUGGUCUUUCCGAGCGGCCAGAUUUCCCAGGCGUCCGCCUUGGCUCCUGCACCACCCCAAGUGCUGCCUCAGGCG





CCUGCACCAGCUCCAGCCCCUGCCAUGGUGUCCGCGCUGGCACAAGCCCCUGCACCUGUGCCAGUGCUCGCACCUGGUCCUCC





GCAAGCUGUGGCACCUCCUGCGCCUAAGCCGACUCAGGCCGGAGAAGGGACCCUGUCAGAGGCCCUGCUGCAACUGCAGUUUG





ACGAUGAGGAUCUGGGAGCCCUUCUGGGCAACUCGACUGACCCCGCCGUGUUCACCGACCUGGCGUCCGUGGAUAACUCCGAG





UUCCAGCAGCUCCUCAACCAAGGGAUUCCUGUCGCCCCGCACACUACCGAGCCGAUGCUGAUGGAGUACCCGGAGGCCAUCAC





CCGGCUUGUGACGGGUGCUCAGAGGCCUCCAGAUCCGGCUCCAGCACCGUUAGGAGCCCCCGGACUUCCUAACGGACUGCUGU





CCGGCGACGAGGACUUCUCCAGCAUCGCCGACAUGGAUUUUUCCGCGCUGUUGGGAUCGGGUUCCGGCUCAAGAGACAGCCGC





GAGGGAAUGUUCCUCCCGAAACCAGAGGCCGGCUCAGCCAUCAGCGACGUGUUCGAAGGGCGCGAAGUCUGCCAGCCCAAGCG





GAUCCGCCCGUUUCAUCCGCCUGGAUCACCGUGGGCCAACAGACCCCUACCCGCAAGCUUAGCCCCUACCCCCACUGGCCCUG





UCCACGAACCUGUGGGCUCCCUGACACCCGCUCCUGUGCCACAACCUCUGGACCCCGCACCAGCAGUCACACCCGAAGCCAGC





CACCUCCUUGAGGAUCCGGACGAGGAGACUAGCCAGGCCGUGAAGGCGCUCCGCGAAAUGGCCGACACUGUGAUCCCUCAAAA





GGAAGAGGCGGCCAUUUGUGGACAGAUGGACUUGUCCCACCCGCCUCCAAGAGGUCACCUGGACGAACUUACCACCACGCUCG





AAUCCAUGACUGAGGAUCUGAACCUGGAUUCCCCGCUCACUCCCGAGCUCAACGAAAUCCUUGAUACCUUCCUUAACGACGAG





UGUCUCCUGCAUGCCAUGCACAUCUCCACCGGACUGAGCAUUUUCGACACCUCGCUGUUCGGUUCCGGAAGCGGCUCAGACGC





GCUGGAUGACUUCGAUUUGGACAUGCUUGGCAGCGAUGCCCUGGAUGAUUUCGACCUGGACAUGCUCGGGUCGGAUGCGCUGG





ACGACUUCGAUCUCGAUAUGUUGGGCUCCGAUGCCCUCGACGACUUUGACCUCGACAUGCUGGGCUCGGGCUCAGGAUCCCAA





UACCUCCCGGAUACCGACGACAGGCAUCGCAUUGAGGAAAAGCGGAAGCGCACCUAUGAAACCUUCAAGUCCAUUAUGAAGAA





GUCGCCCUUUUCCGGACCGACUGACCCUCGGCCUCCUCCUCGACGAAUUGCCGUCCCAUCUCGGUCAUCCGCCUCGGUCCCCA





AGCCAGCACCGCAGCCUUAUCCGUUCACCUCCUCUCUGUCCACCAUUAACUACGAUGAAUUCCCCACCAUGGUGUUCCCGUCG





GGACAGAUCUCCCAAGCCUCAGCCCUUGCUCCUGCCCCUCCACAAGUCCUGCCCCAAGCACCAGCGCCUGCUCCUGCACCCGC





GAUGGUGUCCGCACUGGCGCAAGCUCCUGCCCCUGUGCCUGUGCUGGCUCCUGGACCACCCCAGGCAGUAGCACCUCCAGCCC





CGAAGCCCACUCAGGCUGGAGAGGGAACCCUGAGCGAAGCGCUGCUGCAGCUCCAGUUCGACGACGAAGAUCUGGGUGCCCUG





CUGGGAAAUUCCACCGAUCCGGCGGUGUUCACAGACCUGGCCUCCGUGGACAACUCCGAAUUCCAGCAGUUGUUGAACCAGGG





CAUUCCUGUGGCCCCCCACACCACUGAGCCAAUGCUCAUGGAAUACCCCGAGGCCAUUACCAGACUCGUGACCGGAGCCCAAA





GGCCUCCGGAUCCAGCGCCAGCUCCGUUGGGAGCUCCGGGAUUGCCGAACGGGCUGCUGUCGGGAGAUGAAGAUUUCUCCUCA





AUCGCCGAUAUGGACUUCUCCGCGCUGCUGGGUUCGGGUUCGGGAUCGCGCGAUAGCCGGGAGGGCAUGUUCCUACCGAAGCC





UGAGGCCGGAAGCGCCAUCUCCGAUGUGUUCGAGGGCAGAGAAGUCUGUCAGCCUAAGCGCAUUCGCCCGUUCCACCCUCCUG





GAUCGCCCUGGGCCAAUCGGCCACUGCCUGCGUCCCUCGCUCCAACGCCGACCGGACCUGUGCACGAACCGGUCGGCUCACUG





ACUCCAGCUCCCGUCCCACAACCGCUCGACCCUGCUCCCGCUGUUACCCCCGAAGCCUCCCAUUUGCUGGAAGAUCCCGAUGA





GGAAACUUCCCAGGCCGUCAAGGCCCUGCGGGAGAUGGCAGACACCGUGAUACCCCAGAAGGAAGAAGCUGCCAUCUGCGGGC





AGAUGGACCUGUCCCAUCCUCCUCCACGCGGACACUUGGACGAGCUGACCACUACUCUGGAGUCCAUGACCGAGGACCUGAAC





CUUGACUCGCCUUUGACCCCUGAACUGAACGAAAUUCUGGACACCUUCCUGAAUGACGAGUGCCUCCUGCACGCGAUGCACAU





CAGCACCGGACUGUCCAUCUUCGACACUUCCCUCUUUGGGAGCGGGUCCGGAUCAGGCGGUGGUGGUAGCGGGAAACGGCCAG





CAGCGACCAAGAAGGCCGGACAGGCCAAGAAGAAGAAAGGCUCAUACCCCUACGACGUGCCGGACUACGCAUGAGCGGCCGCU





UAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAU





AAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5 protein-no effector


SEQ ID NO.: 174



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGE






KPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTG





EKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASSGGKRPAATKKAGQAK





KKKGSYPYDVPDYA





>Nucleotide Sequence of DNA binding domain of ZF5 protein-no effector


SEQ ID NO.: 235



CUCGAACCGGGCGAGAAACCAUAUAAGUGUCCAGAGUGUGGUAAGAGCUUUAGCACCAGUGGCAAUCUGACCGAGCAUCAACG






CACGCAUACGGGUGAGAAACCGUACAAGUGCCCGGAAUGCGGCAAAAGUUUCAGCGAUAGCGGCAAUCUGCGUGUGCACCAGC





GUACGCAUACGGGCGAAAAGCCGUAUAAGUGCCCAGAAUGCGGUAAGAGUUUUAGCCACAAAAACGCGCUGCAGAACCACCAG





CGCACCCACACGGGUGAGAAGCCAUACAAAUGUCCGGAAUGCGGCAAAAGCUUCAGCCGCAACGAUACGCUGACGGAACACCA





ACGUACGCAUACCGGCGAAAAGCCAUACAAGUGCCCGGAGUGCGGUAAAAGCUUUAGCCAGCGCGCGCAUCUCGAACGUCAUC





AACGUACCCAUACCGGUGAAAAACCAUAUAAAUGCCCGGAAUGUGGUAAAAGUUUUAGCCGCAGCGACAAACUGGUGCGUCAU





CAACGCACCCAUACCGGUGAAAAGCCAUAUAAGUGCCCGGAGUGUGGUAAAAGCUUCAGCGAUCCGGGUCAUCUGGUUCGCCA





UCAACGUACGCACACCGGCGAGAAGCCAACCGGCAAGAAAACCAGC





>ZF5 mRNA-no effector


SEQ ID NO.: 175



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUCGAACCGGGCGAGAAACCAUAUAAGUGUCCAGAGUGUGGUAAGAGCUUUAG





CACCAGUGGCAAUCUGACCGAGCAUCAACGCACGCAUACGGGUGAGAAACCGUACAAGUGCCCGGAAUGCGGCAAAAGUUUCA





GCGAUAGCGGCAAUCUGCGUGUGCACCAGCGUACGCAUACGGGCGAAAAGCCGUAUAAGUGCCCAGAAUGCGGUAAGAGUUUU





AGCCACAAAAACGCGCUGCAGAACCACCAGCGCACCCACACGGGUGAGAAGCCAUACAAAUGUCCGGAAUGCGGCAAAAGCUU





CAGCCGCAACGAUACGCUGACGGAACACCAACGUACGCAUACCGGCGAAAAGCCAUACAAGUGCCCGGAGUGCGGUAAAAGCU





UUAGCCAGCGCGCGCAUCUCGAACGUCAUCAACGUACCCAUACCGGUGAAAAACCAUAUAAAUGCCCGGAAUGUGGUAAAAGU





UUUAGCCGCAGCGACAAACUGGUGCGUCAUCAACGCACCCAUACCGGUGAAAAGCCAUAUAAGUGCCCGGAGUGUGGUAAAAG





CUUCAGCGAUCCGGGUCAUCUGGUUCGCCAUCAACGUACGCACACCGGCGAGAAGCCAACCGGCAAGAAAACCAGCGCUAGCA





GCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGAC





UACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGU





ACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>ZF5.3-VPR-tPT2a-ZF7-VPR protein


SEQ ID NO.: 176



MAPKKKRKVGIHGVPAAGSSGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTH






TGEKPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERHQRT





HTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDD





FDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSI





MKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAP





APAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLL





NQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFL





PKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLED





PDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHA





MHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLKQAGDVEENPGPTSAGKLGSGEGRGSLLTCG





DVEENPGPLEGSSGSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKC





PECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYK





CPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLDMLGS





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSG





PTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSAL





AQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAP





HTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSA





ISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQA





VKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLS





IFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF5.3-VPR-tPT2a-ZF7-VPR mRNA


SEQ ID NO.: 177



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCAUCAGGCUCACUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAA





GAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCGCACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGA





AAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGCGGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGG





AAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAAAGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGG





AAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCAACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCG





GAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACCAGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGC





GGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCACCAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUG





CGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCACCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCU





CAGCGAGCGGAUCAGGAGGAGGAUCAGGGGGGGACGCACUGGACGACUUCGACCUGGACAUGCUGGGAUCAGACGCACUGGAC





GACUUCGACCUAGACAUGCUCGGAUCGGACGCACUCGACGACUUCGACCUCGACAUGCUAGGAUCAGACGCACUAGACGACUU





CGACCUCGACAUGCUGUCGGGAGGACCGAAGAAAAAGCGGAAGGUCGGAUCACAGUACCUCCCGGACACCGACGACAGGCACA





GAAUCGAAGAAAAACGCAAGCGCACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGUCGCCGUUCUCAGGACCGACCGACCCC





AGACCGCCACCGAGGAGAAUAGCCGUCCCGAGCCGAUCCUCCGCAUCCGUGCCGAAACCGGCACCGCAACCCUACCCGUUCAC





CUCGUCCCUGUCGACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCUCCGGGCAGAUCUCACAGGCCUCGGCACUGG





CACCCGCACCACCGCAAGUGCUGCCCCAAGCACCGGCACCCGCACCGGCGCCCGCAAUGGUGUCAGCGCUGGCACAGGCACCA





GCACCGGUGCCAGUCCUCGCACCGGGACCGCCGCAAGCAGUGGCACCGCCGGCACCGAAACCGACCCAGGCCGGAGAAGGGAC





CCUGUCCGAGGCGCUGCUGCAACUCCAGUUCGACGACGAGGACCUGGGAGCACUCCUGGGAAACUCCACCGACCCGGCAGUGU





UCACCGACCUCGCAUCGGUGGACAACUCCGAGUUCCAACAGCUCCUGAACCAGGGGAUACCGGUGGCACCGCACACCACCGAA





CCGAUGCUGAUGGAAUACCCGGAAGCCAUCACCCGGCUCGUGACCGGAGCGCAAAGACCGCCCGACCCCGCGCCCGCACCGCU





GGGAGCACCGGGACUACCGAACGGGCUGCUCUCAGGGGACGAGGACUUCUCCAGCAUCGCAGACAUGGACUUCUCCGCCCUGC





UGGGAUCAGGAUCAGGAUCACGCGACUCCCGGGAAGGAAUGUUCCUGCCGAAGCCGGAAGCAGGCAGCGCAAUCUCCGACGUG





UUCGAAGGCCGCGAGGUCUGCCAGCCCAAGCGCCUGCGACCGUUCCACCCGCCGGGAUCACCGUGGGCAAACCGCCCGCUACC





GGCAUCACUGGCACCGACACCCACCGGACCGGUGCACGAACCGGUCGGGUCACUGACCCCCGCACCGGUCCCGCAACCGCUAG





ACCCGGCACCGGCAGUGACCCCGGAAGCAUCGCACCUCCUGGAGGACCCGGACGAGGAAACCUCACAGGCAGUGAAGGCCCUG





CGGGAGAUGGCCGACACCGUGAUACCGCAGAAGGAGGAGGCCGCCAUCUGCGGACAAAUGGACCUGUCACACCCGCCCCCGAG





AGGACACCUGGACGAACUCACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACUCACCGCUGACCCCGGAGCUGA





ACGAAAUCCUGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCAAUGCACAUCAGCACCGGGCUGUCGAUCUUCGACACC





AGCCUGUUCUCCGGAGGGAAAAGACCCGCCGCCACCAAGAAAGCGGGCCAAGCAAAGAAAAAGAAGGGAUCGUACCCCUACGA





CGUGCCGGACUACGCAGCCACCAACUUUUCUCUGCUGAAGCAAGCCGGAGAUGUGGAGGAGAAUCCCGGCCCUACCUCCGCCG





GAAAACUGGGCUCCGGCGAAGGCAGAGGAAGCCUCCUCACAUGCGGCGACGUGGAGGAGAACCCCGGCCCUCUGGAGGGAUCC





UCAGGCUCAGGAUCCCUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGCAGCGACCAUCU





GACCAAUCACCAACGCACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUAGUACCAGUGGCAGUC





UGGUUCGUCAUCAGCGCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUUAGCCAAGCCGGUCAU





CUGGCGAGCCAUCAACGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUUUAGCCGUAGCGAUAA





ACUGACCGAACACCAACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUUUCAGCACCAGCGGCA





AUCUGACCGAGCAUCAACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGUUUUAGUCAGAGCAGU





AAUCUGGUGCGCCAUCAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAGUUUUAGCACCCAUCU





GGAUCUGAUCCGUCAUCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGUGCUAGCGGCAGCGGCGGCGGCA





GCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGC





AGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGAGCGGCGG





CCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGGAGAAGCGGAAGCGGA





CCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGAUCGCC





GUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCUGAGCACCAUCAACUA





CGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCCCCCCCCAGGUGCUGC





CCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUGCCCGUGCUGGCCCCC





GGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGAGGCCCUGCUGCAGCU





GCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACCUGGCCAGCGUGGACA





ACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUGAUGGAGUACCCCGAG





GCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCCCGGCCUGCCCAACGG





CCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCGGCAGCGGCAGCCGGG





ACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGCCGGGAGGUGUGCCAG





CCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCUGGCCCCCACCCCCAC





CGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCCCCGCCGUGACCCCCG





AGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUGGCCGACACCGUGAUC





CCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCUGGACGAGCUGACCAC





CACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCCUGGACACCUUCCUGA





ACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUCAGCGGCGGCAAGCGG





CCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCUGAGCGGC





CGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUU





GAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAA





>Nucleotide Sequence of DNA binding domain of ZF5.3-VPR-tPT2a-ZF7-VPR; 1


SEQ ID NO.: 236



CUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCG






CACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGAAAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGC





GGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAA





AGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGGAAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCA





ACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACC





AGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCAC





CAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCA





>Nucleotide Sequence of DNA binding domain of ZF5.3-VPR-tPT2a-ZF7-VPR; 2


SEQ ID NO.: 237



CTGGAACCGGGCGAGAAGCCATACAAGTGCCCAGAGTGCGGCAAAAGCTTCAGCCGCAGCGACCATCTGACCAATCACCAACG






CACCCATACCGGTGAGAAGCCGTACAAATGCCCAGAGTGCGGTAAGAGCTTTAGTACCAGTGGCAGTCTGGTTCGTCATCAGC





GCACGCACACGGGCGAAAAACCATACAAATGCCCGGAGTGCGGCAAAAGCTTTAGCCAAGCCGGTCATCTGGCGAGCCATCAA





CGTACGCACACCGGCGAGAAGCCGTATAAATGTCCGGAGTGCGGTAAGAGCTTTAGCCGTAGCGATAAACTGACCGAACACCA





ACGTACGCATACGGGCGAGAAACCATATAAATGTCCAGAGTGTGGCAAGAGTTTCAGCACCAGCGGCAATCTGACCGAGCATC





AACGTACCCATACCGGTGAAAAGCCATATAAATGTCCAGAATGCGGTAAGAGTTTTAGTCAGAGCAGTAATCTGGTGCGCCAT





CAGCGTACCCACACGGGTGAGAAACCATATAAGTGTCCGGAATGCGGCAAGAGTTTTAGCACCCATCTGGATCTGATCCGTCA





TCAGCGCACCCACACCGGTGAAAAACCAACCGGCAAGAAAACCAGT





>ZF7-VPR-tPT2a-ZF5.3-VPR protein


SEQ ID NO.: 178



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGE






KPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTG





EKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDL





DMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKK





SPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPA





MVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQG





IPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKP





EAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDE





ETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHI





STGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLKQAGDVEENPGPTSAGKLGSGEGRGSLLTCGDVE





ENPGPLEGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTHTGEKPYKCPECGK





SFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERHQRTHTGEKPYKCPECG





KSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDDFDLDMLGSDALDD





FDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSIMKKSPFSGPTDPR





PPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAPAPAMVSALAQAPA





PVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLLNQGIPVAPHTTEP





MLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFLPKPEAGSAISDVF





EGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLEDPDEETSQAVKALR





EMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHAMHISTGLSIFDTS





LFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA





>ZF7-VPR-tPT2a-ZF5.3-VPR mRNA


SEQ ID NO.: 179



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAG





CCGCAGCGACCAUCUGACCAAUCACCAACGCACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUA





GUACCAGUGGCAGUCUGGUUCGUCAUCAGCGCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUU





AGCCAAGCCGGUCAUCUGGCGAGCCAUCAACGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUU





UAGCCGUAGCGAUAAACUGACCGAACACCAACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUU





UCAGCACCAGCGGCAAUCUGACCGAGCAUCAACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGU





UUUAGUCAGAGCAGUAAUCUGGUGCGCCAUCAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAG





UUUUAGCACCCAUCUGGAUCUGAUCCGUCAUCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGUGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGAC





CUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGACAUGCUGGGCAGCGACGCCCUGGACGACUUCGACCUGGA





CAUGCUGAGCGGCGGCCCCAAGAAGAAGCGGAAGGUGGGCAGCCAGUACCUGCCCGACACCGACGACCGGCACCGGAUCGAGG





AGAAGCGGAAGCGGACCUACGAGACCUUCAAGAGCAUCAUGAAGAAAUCCCCCUUCAGCGGCCCCACCGACCCCCGGCCCCCC





CCCCGGCGGAUCGCCGUGCCCAGCCGGAGCAGCGCCAGCGUGCCCAAGCCCGCCCCCCAGCCCUACCCCUUCACCAGCAGCCU





GAGCACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCAGCGGCCAGAUCAGCCAGGCCAGCGCCCUGGCCCCCGCCC





CCCCCCAGGUGCUGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCAUGGUGAGCGCCCUGGCCCAGGCCCCCGCCCCCGUG





CCCGUGCUGGCCCCCGGCCCCCCCCAGGCCGUGGCCCCCCCCGCCCCCAAGCCCACCCAGGCCGGCGAGGGCACCCUGAGCGA





GGCCCUGCUGCAGCUGCAGUUCGACGACGAGGACCUGGGCGCCCUGCUGGGCAACAGCACCGACCCCGCCGUGUUCACCGACC





UGGCCAGCGUGGACAACAGCGAGUUCCAGCAGCUGCUGAACCAGGGCAUCCCCGUGGCCCCCCACACCACCGAGCCCAUGCUG





AUGGAGUACCCCGAGGCCAUCACCCGGCUGGUGACCGGCGCCCAGCGGCCCCCCGACCCCGCCCCCGCCCCCCUGGGCGCCCC





CGGCCUGCCCAACGGCCUGCUGAGCGGCGACGAGGACUUCAGCAGCAUCGCCGACAUGGACUUCAGCGCCCUGCUGGGCAGCG





GCAGCGGCAGCCGGGACAGCCGGGAGGGCAUGUUCCUGCCCAAGCCCGAGGCCGGCAGCGCCAUCAGCGACGUGUUCGAGGGC





CGGGAGGUGUGCCAGCCCAAGCGGCUCCGGCCCUUCCACCCCCCCGGCAGCCCCUGGGCCAACCGGCCCCUGCCCGCCAGCCU





GGCCCCCACCCCCACCGGCCCCGUGCACGAGCCCGUGGGCAGCCUGACCCCCGCCCCCGUGCCCCAGCCCCUGGACCCCGCCC





CCGCCGUGACCCCCGAGGCCAGCCACCUGCUGGAGGACCCCGACGAGGAGACCAGCCAGGCCGUGAAGGCCCUGCGGGAGAUG





GCCGACACCGUGAUCCCCCAGAAGGAGGAGGCCGCCAUCUGCGGCCAGAUGGACCUGAGCCACCCCCCCCCCCGGGGCCACCU





GGACGAGCUGACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACAGCCCCCUGACCCCCGAGCUGAACGAGAUCC





UGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCCAUGCACAUCAGCACCGGCCUGAGCAUCUUCGACACCAGCCUGUUC





AGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGA





CUACGCCGCCACCAACUUUUCUCUGCUGAAGCAAGCCGGAGAUGUGGAGGAGAAUCCCGGCCCUACCUCCGCCGGAAAACUGG





GCUCCGGCGAAGGCAGAGGAAGCCUCCUCACAUGCGGCGACGUGGAGGAGAACCCCGGCCCUCUGGAGGGAUCCUCAGGCUCA





CUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCG





CACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGAAAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGC





GGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAA





AGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGGAAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCA





ACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACC





AGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCAC





CAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCAGCGAGCGGAUCAGGAGGAGGAUCAGGGGGGGACGCAC





UGGACGACUUCGACCUGGACAUGCUGGGAUCAGACGCACUGGACGACUUCGACCUAGACAUGCUCGGAUCGGACGCACUCGAC





GACUUCGACCUCGACAUGCUAGGAUCAGACGCACUAGACGACUUCGACCUCGACAUGCUGUCGGGAGGACCGAAGAAAAAGCG





GAAGGUCGGAUCACAGUACCUCCCGGACACCGACGACAGGCACAGAAUCGAAGAAAAACGCAAGCGCACCUACGAAACCUUCA





AGAGCAUCAUGAAAAAGUCGCCGUUCUCAGGACCGACCGACCCCAGACCGCCACCGAGGAGAAUAGCCGUCCCGAGCCGAUCC





UCCGCAUCCGUGCCGAAACCGGCACCGCAACCCUACCCGUUCACCUCGUCCCUGUCGACCAUCAACUACGACGAGUUCCCCAC





CAUGGUGUUCCCCUCCGGGCAGAUCUCACAGGCCUCGGCACUGGCACCCGCACCACCGCAAGUGCUGCCCCAAGCACCGGCAC





CCGCACCGGCGCCCGCAAUGGUGUCAGCGCUGGCACAGGCACCAGCACCGGUGCCAGUCCUCGCACCGGGACCGCCGCAAGCA





GUGGCACCGCCGGCACCGAAACCGACCCAGGCCGGAGAAGGGACCCUGUCCGAGGCGCUGCUGCAACUCCAGUUCGACGACGA





GGACCUGGGAGCACUCCUGGGAAACUCCACCGACCCGGCAGUGUUCACCGACCUCGCAUCGGUGGACAACUCCGAGUUCCAAC





AGCUCCUGAACCAGGGGAUACCGGUGGCACCGCACACCACCGAACCGAUGCUGAUGGAAUACCCGGAAGCCAUCACCCGGCUC





GUGACCGGAGCGCAAAGACCGCCCGACCCCGCGCCCGCACCGCUGGGAGCACCGGGACUACCGAACGGGCUGCUCUCAGGGGA





CGAGGACUUCUCCAGCAUCGCAGACAUGGACUUCUCCGCCCUGCUGGGAUCAGGAUCAGGAUCACGCGACUCCCGGGAAGGAA





UGUUCCUGCCGAAGCCGGAAGCAGGCAGCGCAAUCUCCGACGUGUUCGAAGGCCGCGAGGUCUGCCAGCCCAAGCGCCUGCGA





CCGUUCCACCCGCCGGGAUCACCGUGGGCAAACCGCCCGCUACCGGCAUCACUGGCACCGACACCCACCGGACCGGUGCACGA





ACCGGUCGGGUCACUGACCCCCGCACCGGUCCCGCAACCGCUAGACCCGGCACCGGCAGUGACCCCGGAAGCAUCGCACCUCC





UGGAGGACCCGGACGAGGAAACCUCACAGGCAGUGAAGGCCCUGCGGGAGAUGGCCGACACCGUGAUACCGCAGAAGGAGGAG





GCCGCCAUCUGCGGACAAAUGGACCUGUCACACCCGCCCCCGAGAGGACACCUGGACGAACUCACCACCACCCUGGAGAGCAU





GACCGAGGACCUGAACCUGGACUCACCGCUGACCCCGGAGCUGAACGAAAUCCUGGACACCUUCCUGAACGACGAGUGCCUGC





UGCACGCAAUGCACAUCAGCACCGGGCUGUCGAUCUUCGACACCAGCCUGUUCUCCGGAGGGAAAAGACCCGCCGCCACCAAG





AAAGCGGGCCAAGCAAAGAAAAAGAAGGGAUCGUACCCCUACGACGUGCCGGACUACGCAUGAGCGGCCGCUUAAUUAAGCUG





CCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGU





AGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAA





>Nucleotide Sequence of DNA binding domain of ZF7-VPR-tPT2a-ZF5.3-VPR; 1


SEQ ID NO.: 238



CUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGCAGCGACCAUCUGACCAAUCACCAACG






CACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUAGUACCAGUGGCAGUCUGGUUCGUCAUCAGC





GCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUUAGCCAAGCCGGUCAUCUGGCGAGCCAUCAA





CGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUUUAGCCGUAGCGAUAAACUGACCGAACACCA





ACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUUUCAGCACCAGCGGCAAUCUGACCGAGCAUC





AACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGUUUUAGUCAGAGCAGUAAUCUGGUGCGCCAU





CAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAGUUUUAGCACCCAUCUGGAUCUGAUCCGUCA





UCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGU





>Nucleotide Sequence of DNA binding domain of ZF7-VPR-tPT2a-ZF5.3-VPR; 2


SEQ ID NO.: 239



CTGGAACCGGGGGAAAAACCCTACAAGTGCCCGGAATGCGGCAAGAGCTTCTCGACCTCCGGGAACCTGACCGAGCACCAGCG






CACCCACACCGGAGAGAAACCGTACAAGTGCCCCGAATGCGGGAAATCGTTCTCAGACTCGGGAAACCTCAGGGTGCACCAGC





GGACCCACACGGGGGAAAAGCCGTACAAATGCCCGGAGTGCGGGAAGTCATTCTCCCACAAGAACGCGCTGCAGAACCACCAA





AGAACCCACACCGGCGAAAAACCGTACAAGTGCCCCGAGTGCGGAAAGTCCTTCTCCCGCAACGACACCCTCACCGAACACCA





ACGCACCCACACCGGAGAAAAGCCCTACAAGTGCCCGGAATGCGGAAAGAGCTTCAGCCAGAGGGCCCACCTGGAAAGACACC





AGAGAACCCACACCGGCGAAAAGCCGTACAAATGCCCGGAGTGCGGGAAGTCCTTCAGCCGGTCAGACAAGCTGGTCCGCCAC





CAAAGGACCCACACAGGAGAAAAGCCCTACAAGTGCCCGGAATGCGGAAAATCGTTCAGCGACCCCGGACACCTGGTCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCTCA





>ZF5.3-VPR-tPT2a-ZF7-p300 protein


SEQ ID NO.: 180



MAPKKKRKVGIHGVPAAGSSGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQRTH






TGEKPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERHQRT





HTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDALDD





FDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETFKSI





MKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPAPAP





APAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQQLL





NQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREGMFL





PKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHLLED





PDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECLLHA





MHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLKQAGDVEENPGPTSAGKLGSGEGRGSLLTCG





DVEENPGPLEGSSGSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGEKPYKC





PECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYK





CPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGIFKPEELRQALMP





TLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRVYKYCS





KLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDPSQPQTTI





NKEQFSKRKNDTLDPELFVECTECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLENRVNDF





LRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPPPNQRR





VYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKMLDKAV





SERIVHDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKNAKKKNNKKTSK





NKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARDKHLEF





SSLRRAQWSTMCMLVELHTQSQDSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF5.3-VPR-tPT2a-ZF7-p300 mRNA


SEQ ID NO.: 181



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCAUCAGGCUCACUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAA





GAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCGCACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGA





AAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGCGGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGG





AAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAAAGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGG





AAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCAACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCG





GAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACCAGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGC





GGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCACCAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUG





CGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCACCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCU





CAGCGAGCGGAUCAGGAGGAGGAUCAGGGGGGGACGCACUGGACGACUUCGACCUGGACAUGCUGGGAUCAGACGCACUGGAC





GACUUCGACCUAGACAUGCUCGGAUCGGACGCACUCGACGACUUCGACCUCGACAUGCUAGGAUCAGACGCACUAGACGACUU





CGACCUCGACAUGCUGUCGGGAGGACCGAAGAAAAAGCGGAAGGUCGGAUCACAGUACCUCCCGGACACCGACGACAGGCACA





GAAUCGAAGAAAAACGCAAGCGCACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGUCGCCGUUCUCAGGACCGACCGACCCC





AGACCGCCACCGAGGAGAAUAGCCGUCCCGAGCCGAUCCUCCGCAUCCGUGCCGAAACCGGCACCGCAACCCUACCCGUUCAC





CUCGUCCCUGUCGACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCUCCGGGCAGAUCUCACAGGCCUCGGCACUGG





CACCCGCACCACCGCAAGUGCUGCCCCAAGCACCGGCACCCGCACCGGCGCCCGCAAUGGUGUCAGCGCUGGCACAGGCACCA





GCACCGGUGCCAGUCCUCGCACCGGGACCGCCGCAAGCAGUGGCACCGCCGGCACCGAAACCGACCCAGGCCGGAGAAGGGAC





CCUGUCCGAGGCGCUGCUGCAACUCCAGUUCGACGACGAGGACCUGGGAGCACUCCUGGGAAACUCCACCGACCCGGCAGUGU





UCACCGACCUCGCAUCGGUGGACAACUCCGAGUUCCAACAGCUCCUGAACCAGGGGAUACCGGUGGCACCGCACACCACCGAA





CCGAUGCUGAUGGAAUACCCGGAAGCCAUCACCCGGCUCGUGACCGGAGCGCAAAGACCGCCCGACCCCGCGCCCGCACCGCU





GGGAGCACCGGGACUACCGAACGGGCUGCUCUCAGGGGACGAGGACUUCUCCAGCAUCGCAGACAUGGACUUCUCCGCCCUGC





UGGGAUCAGGAUCAGGAUCACGCGACUCCCGGGAAGGAAUGUUCCUGCCGAAGCCGGAAGCAGGCAGCGCAAUCUCCGACGUG





UUCGAAGGCCGCGAGGUCUGCCAGCCCAAGCGCCUGCGACCGUUCCACCCGCCGGGAUCACCGUGGGCAAACCGCCCGCUACC





GGCAUCACUGGCACCGACACCCACCGGACCGGUGCACGAACCGGUCGGGUCACUGACCCCCGCACCGGUCCCGCAACCGCUAG





ACCCGGCACCGGCAGUGACCCCGGAAGCAUCGCACCUCCUGGAGGACCCGGACGAGGAAACCUCACAGGCAGUGAAGGCCCUG





CGGGAGAUGGCCGACACCGUGAUACCGCAGAAGGAGGAGGCCGCCAUCUGCGGACAAAUGGACCUGUCACACCCGCCCCCGAG





AGGACACCUGGACGAACUCACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACUCACCGCUGACCCCGGAGCUGA





ACGAAAUCCUGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCAAUGCACAUCAGCACCGGGCUGUCGAUCUUCGACACC





AGCCUGUUCUCCGGAGGGAAAAGACCCGCCGCCACCAAGAAAGCGGGCCAAGCAAAGAAAAAGAAGGGAUCGUACCCCUACGA





CGUGCCGGACUACGCAGCCACCAACUUUUCUCUGCUGAAGCAAGCCGGAGAUGUGGAGGAGAAUCCCGGCCCUACCUCCGCCG





GAAAACUGGGCUCCGGCGAAGGCAGAGGAAGCCUCCUCACAUGCGGCGACGUGGAGGAGAACCCCGGCCCUCUGGAGGGAUCC





UCAGGCUCAGGAUCCCUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGCAGCGACCAUCU





GACCAAUCACCAACGCACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUAGUACCAGUGGCAGUC





UGGUUCGUCAUCAGCGCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUUAGCCAAGCCGGUCAU





CUGGCGAGCCAUCAACGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUUUAGCCGUAGCGAUAA





ACUGACCGAACACCAACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUUUCAGCACCAGCGGCA





AUCUGACCGAGCAUCAACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGUUUUAGUCAGAGCAGU





AAUCUGGUGCGCCAUCAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAGUUUUAGCACCCAUCU





GGAUCUGAUCCGUCAUCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGUGCUAGCGGCAGCGGCGGCGGCA





GCGGCGGCAUCUUCAAGCCCGAGGAGCUGCGGCAGGCCCUGAUGCCCACCCUGGAGGCCCUGUACCGGCAGGACCCCGAGAGC





CUGCCCUUCCGGCAGCCCGUGGACCCCCAGCUGCUGGGCAUCCCCGACUACUUCGACAUCGUGAAAUCCCCCAUGGACCUGAG





CACCAUCAAGCGGAAGCUGGACACCGGCCAGUACCAGGAGCCCUGGCAGUACGUGGACGACAUCUGGCUGAUGUUCAACAACG





CCUGGCUGUACAACCGGAAAACCAGCCGGGUGUACAAGUACUGCAGCAAGCUGAGCGAGGUGUUCGAGCAGGAGAUCGACCCC





GUGAUGCAGAGCCUGGGCUACUGCUGCGGCCGGAAGCUGGAGUUCAGCCCCCAGACCCUGUGCUGCUACGGCAAGCAGCUGUG





CACCAUCCCCCGGGACGCCACCUACUACAGCUACCAGAACCGGUACCACUUCUGCGAGAAGUGCUUCAACGAGAUCCAGGGCG





AGAGCGUGAGCCUGGGCGACGACCCCAGCCAGCCCCAGACCACCAUCAACAAGGAGCAGUUCAGCAAGCGGAAGAACGACACC





CUGGACCCCGAGCUGUUCGUGGAGUGCACCGAGUGCGGCCGGAAGAUGCACCAGAUCUGCGUGCUGCACCACGAGAUCAUCUG





GCCCGCCGGCUUCGUGUGCGACGGCUGCCUGAAGAAAUCCGCCCGGACCCGGAAGGAGAACAAGUUCAGCGCCAAGCGGCUGC





CCAGCACCCGGCUGGGCACCUUCCUGGAGAACCGGGUGAACGACUUCCUGCGGCGGCAGAACCACCCCGAGAGCGGCGAGGUG





ACCGUGCGGGUGGUGCACGCCAGCGACAAGACCGUGGAGGUGAAGCCCGGCAUGAAGGCCCGGUUCGUGGACAGCGGCGAGAU





GGCCGAGAGCUUCCCCUACCGGACCAAGGCCCUGUUCGCCUUCGAGGAGAUCGACGGCGUGGACCUGUGCUUCUUCGGCAUGC





ACGUGCAGGAGUACGGCAGCGACUGCCCCCCCCCCAACCAGCGGCGGGUGUACAUCAGCUACCUGGACAGCGUGCACUUCUUC





CGGCCCAAGUGCCUGCGGACCGCCGUGUACCACGAGAUCCUGAUCGGCUACCUGGAGUACGUGAAGAAGCUGGGCUACACCAC





CGGCCACAUCUGGGCCUGCCCCCCCAGCGAGGGCGACGACUACAUCUUCCACUGCCACCCCCCCGACCAGAAGAUCCCCAAGC





CCAAGCGGCUGCAGGAGUGGUACAAGAAGAUGCUGGACAAGGCCGUGAGCGAGCGGAUCGUGCACGACUACAAGGACAUCUUC





AAGCAGGCCACCGAGGACCGGCUGACCAGCGCCAAGGAGCUGCCCUACUUCGAGGGCGACUUCUGGCCCAACGUGCUGGAGGA





GAGCAUCAAGGAGCUGGAGCAGGAGGAGGAGGAGCGGAAGCGGGAGGAGAACACCAGCAACGAGAGCACCGACGUGACCAAGG





GCGACAGCAAGAACGCCAAGAAGAAGAACAACAAGAAAACCAGCAAGAACAAGAGCAGCCUGAGCCGGGGCAACAAGAAGAAG





CCCGGCAUGCCCAACGUGAGCAACGACCUGAGCCAGAAGCUGUACGCCACCAUGGAGAAGCACAAGGAGGUGUUCUUCGUGAU





CCGGCUGAUCGCCGGCCCCGCCGCCAACAGCCUGCCCCCCAUCGUGGACCCCGACCCCCUGAUCCCCUGCGACCUGAUGGACG





GCCGGGACGCCUUCCUGACCCUGGCCCGGGACAAGCACCUGGAGUUCAGCAGCCUGCGGCGGGCCCAGUGGAGCACCAUGUGC





AUGCUGGUGGAGCUGCACACCCAGAGCCAGGACAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAA





GAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUU





CUGGCCAUGCCCUUCUUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>Nucleotide Sequence of DNA binding domain of ZF5.3-VPR-tPT2a-ZF7-p300; 1


SEQ ID NO.: 240



CUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGAAUGCGGCAAGAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCG






CACCCACACCGGAGAGAAACCGUACAAGUGCCCCGAAUGCGGGAAAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGC





GGACCCACACGGGGGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAA





AGAACCCACACCGGCGAAAAACCGUACAAGUGCCCCGAGUGCGGAAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCA





ACGCACCCACACCGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACC





AGAGAACCCACACCGGCGAAAAGCCGUACAAAUGCCCGGAGUGCGGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCAC





CAAAGGACCCACACAGGAGAAAAGCCCUACAAGUGCCCGGAAUGCGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCUCA





>Nucleotide Sequence of DNA binding domain of ZF5.3-VPR-tPT2a-ZF7-p300; 2


SEQ ID NO.: 241



CTGGAACCGGGCGAGAAGCCATACAAGTGCCCAGAGTGCGGCAAAAGCTTCAGCCGCAGCGACCATCTGACCAATCACCAACG






CACCCATACCGGTGAGAAGCCGTACAAATGCCCAGAGTGCGGTAAGAGCTTTAGTACCAGTGGCAGTCTGGTTCGTCATCAGC





GCACGCACACGGGCGAAAAACCATACAAATGCCCGGAGTGCGGCAAAAGCTTTAGCCAAGCCGGTCATCTGGCGAGCCATCAA





CGTACGCACACCGGCGAGAAGCCGTATAAATGTCCGGAGTGCGGTAAGAGCTTTAGCCGTAGCGATAAACTGACCGAACACCA





ACGTACGCATACGGGCGAGAAACCATATAAATGTCCAGAGTGTGGCAAGAGTTTCAGCACCAGCGGCAATCTGACCGAGCATC





AACGTACCCATACCGGTGAAAAGCCATATAAATGTCCAGAATGCGGTAAGAGTTTTAGTCAGAGCAGTAATCTGGTGCGCCAT





CAGCGTACCCACACGGGTGAGAAACCATATAAGTGTCCGGAATGCGGCAAGAGTTTTAGCACCCATCTGGATCTGATCCGTCA





TCAGCGCACCCACACCGGTGAAAAACCAACCGGCAAGAAAACCAGT





>ZF7-p300-tPT2a-ZF5.3-VPR protein


SEQ ID NO.: 182



MAPKKKRKVGIHGVPAAGSSGSLEPGEKPYKCPECGKSFSRSDHLTNHQRTHTGEKPYKCPECGKSFSTSGSLVRHQRTHTGE






KPYKCPECGKSFSQAGHLASHQRTHTGEKPYKCPECGKSFSRSDKLTEHQRTHTGEKPYKCPECGKSFSTSGNLTEHQRTHTG





EKPYKCPECGKSFSQSSNLVRHQRTHTGEKPYKCPECGKSFSTHLDLIRHQRTHTGEKPTGKKTSASGSGGGSGGIFKPEELR





QALMPTLEALYRQDPESLPFRQPVDPQLLGIPDYFDIVKSPMDLSTIKRKLDTGQYQEPWQYVDDIWLMFNNAWLYNRKTSRV





YKYCSKLSEVFEQEIDPVMQSLGYCCGRKLEFSPQTLCCYGKQLCTIPRDATYYSYQNRYHFCEKCFNEIQGESVSLGDDPSQ





PQTTINKEQFSKRKNDTLDPELFVECTECGRKMHQICVLHHEIIWPAGFVCDGCLKKSARTRKENKFSAKRLPSTRLGTFLEN





RVNDFLRRQNHPESGEVTVRVVHASDKTVEVKPGMKARFVDSGEMAESFPYRTKALFAFEEIDGVDLCFFGMHVQEYGSDCPP





PNQRRVYISYLDSVHFFRPKCLRTAVYHEILIGYLEYVKKLGYTTGHIWACPPSEGDDYIFHCHPPDQKIPKPKRLQEWYKKM





LDKAVSERIVHDYKDIFKQATEDRLTSAKELPYFEGDFWPNVLEESIKELEQEEEERKREENTSNESTDVTKGDSKNAKKKNN





KKTSKNKSSLSRGNKKKPGMPNVSNDLSQKLYATMEKHKEVFFVIRLIAGPAANSLPPIVDPDPLIPCDLMDGRDAFLTLARD





KHLEFSSLRRAQWSTMCMLVELHTQSQDSGGKRPAATKKAGQAKKKKGSYPYDVPDYAATNFSLLKQAGDVEENPGPTSAGKL





GSGEGRGSLLTCGDVEENPGPLEGSSGSLEPGEKPYKCPECGKSFSTSGNLTEHQRTHTGEKPYKCPECGKSFSDSGNLRVHQ





RTHTGEKPYKCPECGKSFSHKNALQNHQRTHTGEKPYKCPECGKSFSRNDTLTEHQRTHTGEKPYKCPECGKSFSQRAHLERH





QRTHTGEKPYKCPECGKSFSRSDKLVRHQRTHTGEKPYKCPECGKSFSDPGHLVRHQRTHTGEKPTGKKTSASGSGGGSGGDA





LDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYETF





KSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQAPA





PAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSEFQ





QLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSREG





MFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEASHL





LEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDECL





LHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>ZF7-p300-tPT2a-ZF5.3-VPR mRNA


SEQ ID NO.: 183



AGGAAAUAAGAGAGAAAAGAAGAGUAAGAAGAAAUAUAAGAGCCACCAUGGCCCCCAAGAAGAAGCGGAAGGUGGGCAUCCAC






GGCGUGCCCGCCGCCGGCAGCAGCGGAUCCCUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAG





CCGCAGCGACCAUCUGACCAAUCACCAACGCACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUA





GUACCAGUGGCAGUCUGGUUCGUCAUCAGCGCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUU





AGCCAAGCCGGUCAUCUGGCGAGCCAUCAACGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUU





UAGCCGUAGCGAUAAACUGACCGAACACCAACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUU





UCAGCACCAGCGGCAAUCUGACCGAGCAUCAACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGU





UUUAGUCAGAGCAGUAAUCUGGUGCGCCAUCAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAG





UUUUAGCACCCAUCUGGAUCUGAUCCGUCAUCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGUGCUAGCG





GCAGCGGCGGCGGCAGCGGCGGCAUCUUCAAGCCCGAGGAGCUGCGGCAGGCCCUGAUGCCCACCCUGGAGGCCCUGUACCGG





CAGGACCCCGAGAGCCUGCCCUUCCGGCAGCCCGUGGACCCCCAGCUGCUGGGCAUCCCCGACUACUUCGACAUCGUGAAAUC





CCCCAUGGACCUGAGCACCAUCAAGCGGAAGCUGGACACCGGCCAGUACCAGGAGCCCUGGCAGUACGUGGACGACAUCUGGC





UGAUGUUCAACAACGCCUGGCUGUACAACCGGAAAACCAGCCGGGUGUACAAGUACUGCAGCAAGCUGAGCGAGGUGUUCGAG





CAGGAGAUCGACCCCGUGAUGCAGAGCCUGGGCUACUGCUGCGGCCGGAAGCUGGAGUUCAGCCCCCAGACCCUGUGCUGCUA





CGGCAAGCAGCUGUGCACCAUCCCCCGGGACGCCACCUACUACAGCUACCAGAACCGGUACCACUUCUGCGAGAAGUGCUUCA





ACGAGAUCCAGGGCGAGAGCGUGAGCCUGGGCGACGACCCCAGCCAGCCCCAGACCACCAUCAACAAGGAGCAGUUCAGCAAG





CGGAAGAACGACACCCUGGACCCCGAGCUGUUCGUGGAGUGCACCGAGUGCGGCCGGAAGAUGCACCAGAUCUGCGUGCUGCA





CCACGAGAUCAUCUGGCCCGCCGGCUUCGUGUGCGACGGCUGCCUGAAGAAAUCCGCCCGGACCCGGAAGGAGAACAAGUUCA





GCGCCAAGCGGCUGCCCAGCACCCGGCUGGGCACCUUCCUGGAGAACCGGGUGAACGACUUCCUGCGGCGGCAGAACCACCCC





GAGAGCGGCGAGGUGACCGUGCGGGUGGUGCACGCCAGCGACAAGACCGUGGAGGUGAAGCCCGGCAUGAAGGCCCGGUUCGU





GGACAGCGGCGAGAUGGCCGAGAGCUUCCCCUACCGGACCAAGGCCCUGUUCGCCUUCGAGGAGAUCGACGGCGUGGACCUGU





GCUUCUUCGGCAUGCACGUGCAGGAGUACGGCAGCGACUGCCCCCCCCCCAACCAGCGGCGGGUGUACAUCAGCUACCUGGAC





AGCGUGCACUUCUUCCGGCCCAAGUGCCUGCGGACCGCCGUGUACCACGAGAUCCUGAUCGGCUACCUGGAGUACGUGAAGAA





GCUGGGCUACACCACCGGCCACAUCUGGGCCUGCCCCCCCAGCGAGGGCGACGACUACAUCUUCCACUGCCACCCCCCCGACC





AGAAGAUCCCCAAGCCCAAGCGGCUGCAGGAGUGGUACAAGAAGAUGCUGGACAAGGCCGUGAGCGAGCGGAUCGUGCACGAC





UACAAGGACAUCUUCAAGCAGGCCACCGAGGACCGGCUGACCAGCGCCAAGGAGCUGCCCUACUUCGAGGGCGACUUCUGGCC





CAACGUGCUGGAGGAGAGCAUCAAGGAGCUGGAGCAGGAGGAGGAGGAGCGGAAGCGGGAGGAGAACACCAGCAACGAGAGCA





CCGACGUGACCAAGGGCGACAGCAAGAACGCCAAGAAGAAGAACAACAAGAAAACCAGCAAGAACAAGAGCAGCCUGAGCCGG





GGCAACAAGAAGAAGCCCGGCAUGCCCAACGUGAGCAACGACCUGAGCCAGAAGCUGUACGCCACCAUGGAGAAGCACAAGGA





GGUGUUCUUCGUGAUCCGGCUGAUCGCCGGCCCCGCCGCCAACAGCCUGCCCCCCAUCGUGGACCCCGACCCCCUGAUCCCCU





GCGACCUGAUGGACGGCCGGGACGCCUUCCUGACCCUGGCCCGGGACAAGCACCUGGAGUUCAGCAGCCUGCGGCGGGCCCAG





UGGAGCACCAUGUGCAUGCUGGUGGAGCUGCACACCCAGAGCCAGGACAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGC





CGGCCAGGCCAAGAAGAAGAAGGGCAGCUACCCCUACGACGUGCCCGACUACGCCGCCACCAACUUUUCUCUGCUGAAGCAAG





CCGGAGAUGUGGAGGAGAAUCCCGGCCCUACCUCCGCCGGAAAACUGGGCUCCGGCGAAGGCAGAGGAAGCCUCCUCACAUGC





GGCGACGUGGAGGAGAACCCCGGCCCUCUGGAGGGAUCCUCAGGCUCACUGGAACCGGGGGAAAAACCCUACAAGUGCCCGGA





AUGCGGCAAGAGCUUCUCGACCUCCGGGAACCUGACCGAGCACCAGCGCACCCACACCGGAGAGAAACCGUACAAGUGCCCCG





AAUGCGGGAAAUCGUUCUCAGACUCGGGAAACCUCAGGGUGCACCAGCGGACCCACACGGGGGAAAAGCCGUACAAAUGCCCG





GAGUGCGGGAAGUCAUUCUCCCACAAGAACGCGCUGCAGAACCACCAAAGAACCCACACCGGCGAAAAACCGUACAAGUGCCC





CGAGUGCGGAAAGUCCUUCUCCCGCAACGACACCCUCACCGAACACCAACGCACCCACACCGGAGAAAAGCCCUACAAGUGCC





CGGAAUGCGGAAAGAGCUUCAGCCAGAGGGCCCACCUGGAAAGACACCAGAGAACCCACACCGGCGAAAAGCCGUACAAAUGC





CCGGAGUGCGGGAAGUCCUUCAGCCGGUCAGACAAGCUGGUCCGCCACCAAAGGACCCACACAGGAGAAAAGCCCUACAAGUG





CCCGGAAUGCGGAAAAUCGUUCAGCGACCCCGGACACCUGGUCCGGCACCAGAGGACCCACACCGGGGAGAAGCCGACCGGCA





AAAAGACCUCAGCGAGCGGAUCAGGAGGAGGAUCAGGGGGGGACGCACUGGACGACUUCGACCUGGACAUGCUGGGAUCAGAC





GCACUGGACGACUUCGACCUAGACAUGCUCGGAUCGGACGCACUCGACGACUUCGACCUCGACAUGCUAGGAUCAGACGCACU





AGACGACUUCGACCUCGACAUGCUGUCGGGAGGACCGAAGAAAAAGCGGAAGGUCGGAUCACAGUACCUCCCGGACACCGACG





ACAGGCACAGAAUCGAAGAAAAACGCAAGCGCACCUACGAAACCUUCAAGAGCAUCAUGAAAAAGUCGCCGUUCUCAGGACCG





ACCGACCCCAGACCGCCACCGAGGAGAAUAGCCGUCCCGAGCCGAUCCUCCGCAUCCGUGCCGAAACCGGCACCGCAACCCUA





CCCGUUCACCUCGUCCCUGUCGACCAUCAACUACGACGAGUUCCCCACCAUGGUGUUCCCCUCCGGGCAGAUCUCACAGGCCU





CGGCACUGGCACCCGCACCACCGCAAGUGCUGCCCCAAGCACCGGCACCCGCACCGGCGCCCGCAAUGGUGUCAGCGCUGGCA





CAGGCACCAGCACCGGUGCCAGUCCUCGCACCGGGACCGCCGCAAGCAGUGGCACCGCCGGCACCGAAACCGACCCAGGCCGG





AGAAGGGACCCUGUCCGAGGCGCUGCUGCAACUCCAGUUCGACGACGAGGACCUGGGAGCACUCCUGGGAAACUCCACCGACC





CGGCAGUGUUCACCGACCUCGCAUCGGUGGACAACUCCGAGUUCCAACAGCUCCUGAACCAGGGGAUACCGGUGGCACCGCAC





ACCACCGAACCGAUGCUGAUGGAAUACCCGGAAGCCAUCACCCGGCUCGUGACCGGAGCGCAAAGACCGCCCGACCCCGCGCC





CGCACCGCUGGGAGCACCGGGACUACCGAACGGGCUGCUCUCAGGGGACGAGGACUUCUCCAGCAUCGCAGACAUGGACUUCU





CCGCCCUGCUGGGAUCAGGAUCAGGAUCACGCGACUCCCGGGAAGGAAUGUUCCUGCCGAAGCCGGAAGCAGGCAGCGCAAUC





UCCGACGUGUUCGAAGGCCGCGAGGUCUGCCAGCCCAAGCGCCUGCGACCGUUCCACCCGCCGGGAUCACCGUGGGCAAACCG





CCCGCUACCGGCAUCACUGGCACCGACACCCACCGGACCGGUGCACGAACCGGUCGGGUCACUGACCCCCGCACCGGUCCCGC





AACCGCUAGACCCGGCACCGGCAGUGACCCCGGAAGCAUCGCACCUCCUGGAGGACCCGGACGAGGAAACCUCACAGGCAGUG





AAGGCCCUGCGGGAGAUGGCCGACACCGUGAUACCGCAGAAGGAGGAGGCCGCCAUCUGCGGACAAAUGGACCUGUCACACCC





GCCCCCGAGAGGACACCUGGACGAACUCACCACCACCCUGGAGAGCAUGACCGAGGACCUGAACCUGGACUCACCGCUGACCC





CGGAGCUGAACGAAAUCCUGGACACCUUCCUGAACGACGAGUGCCUGCUGCACGCAAUGCACAUCAGCACCGGGCUGUCGAUC





UUCGACACCAGCCUGUUCUCCGGAGGGAAAAGACCCGCCGCCACCAAGAAAGCGGGCCAAGCAAAGAAAAAGAAGGGAUCGUA





CCCCUACGACGUGCCGGACUACGCAUGAGCGGCCGCUUAAUUAAGCUGCCUUCUGCGGGGCUUGCCUUCUGGCCAUGCCCUUC





UUCUCUCCCUUGCACCUGUACCUCUUGGUCUUUGAAUAAAGCCUGAGUAGGAAGUCUAGAAAAAAAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>Nucleotide Sequence of DNA binding domain of ZF7-p300-tPT2a-ZF5.3-VPR


SEQ ID NO.: 242



CUGGAACCGGGCGAGAAGCCAUACAAGUGCCCAGAGUGCGGCAAAAGCUUCAGCCGCAGCGACCAUCUGACCAAUCACCAACG






CACCCAUACCGGUGAGAAGCCGUACAAAUGCCCAGAGUGCGGUAAGAGCUUUAGUACCAGUGGCAGUCUGGUUCGUCAUCAGC





GCACGCACACGGGCGAAAAACCAUACAAAUGCCCGGAGUGCGGCAAAAGCUUUAGCCAAGCCGGUCAUCUGGCGAGCCAUCAA





CGUACGCACACCGGCGAGAAGCCGUAUAAAUGUCCGGAGUGCGGUAAGAGCUUUAGCCGUAGCGAUAAACUGACCGAACACCA





ACGUACGCAUACGGGCGAGAAACCAUAUAAAUGUCCAGAGUGUGGCAAGAGUUUCAGCACCAGCGGCAAUCUGACCGAGCAUC





AACGUACCCAUACCGGUGAAAAGCCAUAUAAAUGUCCAGAAUGCGGUAAGAGUUUUAGUCAGAGCAGUAAUCUGGUGCGCCAU





CAGCGUACCCACACGGGUGAGAAACCAUAUAAGUGUCCGGAAUGCGGCAAGAGUUUUAGCACCCAUCUGGAUCUGAUCCGUCA





UCAGCGCACCCACACCGGUGAAAAACCAACCGGCAAGAAAACCAGU





>Nucleotide Sequence of DNA binding domain of ZF7-p300-tPT2a-ZF5.3-VPR


SEQ ID NO.: 243



CTGGAACCGGGGGAAAAACCCTACAAGTGCCCGGAATGCGGCAAGAGCTTCTCGACCTCCGGGAACCTGACCGAGCACCAGCG






CACCCACACCGGAGAGAAACCGTACAAGTGCCCCGAATGCGGGAAATCGTTCTCAGACTCGGGAAACCTCAGGGTGCACCAGC





GGACCCACACGGGGGAAAAGCCGTACAAATGCCCGGAGTGCGGGAAGTCATTCTCCCACAAGAACGCGCTGCAGAACCACCAA





AGAACCCACACCGGCGAAAAACCGTACAAGTGCCCCGAGTGCGGAAAGTCCTTCTCCCGCAACGACACCCTCACCGAACACCA





ACGCACCCACACCGGAGAAAAGCCCTACAAGTGCCCGGAATGCGGAAAGAGCTTCAGCCAGAGGGCCCACCTGGAAAGACACC





AGAGAACCCACACCGGCGAAAAGCCGTACAAATGCCCGGAGTGCGGGAAGTCCTTCAGCCGGTCAGACAAGCTGGTCCGCCAC





CAAAGGACCCACACAGGAGAAAAGCCCTACAAGTGCCCGGAATGCGGAAAATCGTTCAGCGACCCCGGACACCTGGTCCGGCA





CCAGAGGACCCACACCGGGGAGAAGCCGACCGGCAAAAAGACCTCA





>TAL1-VPR protein


SEQ ID NO.: 184



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNI





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNNGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL1-VPR mRNA


SEQ ID NO.: 185



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCCATGATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





TAATGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAACAATGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCACATGACGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAATAATGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAATGGTGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAACAATGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATAATGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAACATTGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAATAATGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAACAATGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTCATGATGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAATATAGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAATAATGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCAACA





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCAAATAATGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAATATCGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAACAATGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL1 amino acid sequence


SEQ ID NO.: 186



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASHDGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL1 mRNA sequence


SEQ ID NO.: 244



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCCAUGAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAAUAAUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAACAAUG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCACAUGACGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAAUAAUGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAAUGGUGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAACAAUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUAAUGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAACAUUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAAUAAUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAACAAUGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUCA





UGAUGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAAUAUAGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAAUAAUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCAACAAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCAAAUAAUGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAAUAUCGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAACAAUGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL2-VPR protein


SEQ ID NO.: 187



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNN





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNNGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL2-VPR mRNA


SEQ ID NO.: 188



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCAATGGTGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





TGGCGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCACATGATGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAATAATGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAACAATGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAATAATGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAACAATGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATGGGGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAATAATGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAACAATGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAATAATGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTCATGATGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAACAATGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCACATGACGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCCATG





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCACATGACGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAACATTGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAACAATGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL2 amino acid sequence


SEQ ID NO.: 189



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASNGGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL2 mRNA sequence


SEQ ID NO.: 245



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCAAUGGUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAAUGGCGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCACAUGAUG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAAUAAUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAACAAUGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAAUAAUGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAACAAUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUGGGGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAAUAAUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAACAAUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAAUAAUGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUCA





UGAUGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAACAAUGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCACAUGACGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCCAUGAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCACAUGACGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAACAUUGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAACAAUGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL3-VPR protein


SEQ ID NO.: 190



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNN





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNIGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL3-VPR mRNA


SEQ ID NO.: 191



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCAATAATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





TGGTGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAATGGCGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAATGGGGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAACAATGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAACATTGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAATATCGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATATAGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAATAATGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAACAATGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAACATTGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATATCGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAATAATGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAACAATGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCCATG





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCAAATATAGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAATAATGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAACATTGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL3 amino acid sequence


SEQ ID NO.: 192



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASNNGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL3 mRNA sequence


SEQ ID NO.: 246



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCAAUAAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAAUGGUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAAUGGCG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAAUGGGGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAACAAUGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAACAUUGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAAUAUCGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUAUAGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAAUAAUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAACAAUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAACAUUGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UAUCGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAAUAAUGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAACAAUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCCAUGAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCAAAUAUAGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAAUAAUGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAACAUUGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL4-VPR protein


SEQ ID NO.: 193



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHD





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNIGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL4-VPR mRNA


SEQ ID NO.: 194



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCCATGATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCCA





TGACGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAACATTGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAATAATGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTCATGATGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAATATCGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAACAATGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATATAGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAATGGTGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCACATGACGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAATGGCGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATGGGGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCCATGATGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCACATGACGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCCATG





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCAAACATTGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAATAATGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAATATCGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL4 amino acid sequence


SEQ ID NO.: 195



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASHDGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL4 mRNA sequence


SEQ ID NO.: 247



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCCAUGAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCCAUGACGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAACAUUG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAAUAAUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUCAUGAUGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAAUAUCGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAACAAUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUAUAGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAAUGGUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCACAUGACGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAAUGGCGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UGGGGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCCAUGAUGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCACAUGACGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCCAUGAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCAAACAUUGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAAUAAUGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAAUAUCGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL5-VPR protein


SEQ ID NO.: 196



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNI





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA





>TAL5-VPR mRNA


SEQ ID NO.: 197



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCAATAATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





CAATGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAATAATGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAACAATGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAATAATGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAACATTGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTCATGATGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCCATGACGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAACAATGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAATATCGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAATGGTGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATGGCGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAATATAGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAACATTGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCCATG





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCACATGACGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAATATCGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAATGGGGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL5 amino acid sequence


SEQ ID NO.: 198



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASNNGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL5 mRNA sequence


SEQ ID NO.: 248



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCAAUAAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAACAAUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAAUAAUG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAACAAUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAAUAAUGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAACAUUGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUCAUGAUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCCAUGACGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAACAAUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAAUAUCGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAAUGGUGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UGGCGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAAUAUAGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAACAUUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCCAUGAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCACAUGACGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAAUAUCGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAAUGGGGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL6-VPR protein


SEQ ID NO.: 199



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNI





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNIGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL6-VPR mRNA


SEQ ID NO.: 200



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCAATAATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





CAATGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAATGGTGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAATAATGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAATGGCGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTCATGATGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAACATTGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAACAATGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTCATGACGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCACATGATGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAATATCGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATAATGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAATATAGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAACATTGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCAATA





TCGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCACATGATGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTCATGACGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAATATAGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL6 amino acid sequence


SEQ ID NO.: 201



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASNNGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL6 mRNA sequence


SEQ ID NO.: 249



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCAAUAAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAACAAUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAAUGGUG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAAUAAUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAAUGGCGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUCAUGAUGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAACAUUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAACAAUGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUCAUGACGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCACAUGAUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAAUAUCGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UAAUGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAAUAUAGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAACAUUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCAAUAUCGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCACAUGAUGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUCAUGACGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAAUAUAGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL7-VPR protein


SEQ ID NO.: 202



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNN





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNIGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL7-VPR mRNA


SEQ ID NO.: 203



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCCATGATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





TGGTGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAATAATGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAACATTGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAATATCGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTCATGACGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAATATAGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATGGCGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTCATGATGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAATAATGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAACAATGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATGGGGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAATAATGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAACATTGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCAACA





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCAAACGGTGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAATGGTGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAATATCGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL7 amino acid sequence


SEQ ID NO.: 204



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQVVAIASHDGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVAS





>TAL7 mRNA sequence


SEQ ID NO.: 250



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCCAUGAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAAUGGUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAAUAAUG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAACAUUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAAUAUCGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUCAUGACGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAAUAUAGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUGGCGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUCAUGAUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAAUAAUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAACAAUGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UGGGGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAAUAAUGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAACAUUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCAACAAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCAAACGGUGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAAUGGUGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAAUAUCGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL8-VPR protein


SEQ ID NO.: 205



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLNLTPEQV





VAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNI





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASNGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASNNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNGGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTNRRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL8-VPR mRNA


SEQ ID NO.: 206



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCCATGATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





TGGTGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAATGGCGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAATAATGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAACAATGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAATAATGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAACAATGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATGGGGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAATAATGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAACATTGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTCATGACGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATATCGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCAATATAGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAACGGTGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCAACA





ATGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCAAATAATGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTCATGATGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAATGGTGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL8 amino acid sequence


SEQ ID NO.: 207



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLULTPEQVVAIASHDGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNNG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTURRIPERTSHRVAS





>TAL8 mRNA sequence


SEQ ID NO.: 251



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCCAUGAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAAUGGUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAAUGGCG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAAUAAUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAACAAUGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAAUAAUGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAACAAUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUGGGGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAAUAAUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAACAUUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUCAUGACGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UAUCGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCAAUAUAGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAACGGUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCAACAAUGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCAAAUAAUGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUCAUGAUGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAAUGGUGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC





>TAL9-VPR protein


SEQ ID NO.: 208



MAPKKKRKVGIHGVPAAGSSGSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQ






DMIAALPEATHEAIVGVGKRGAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLULTPEQV





VAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALET





VQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLT





PEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQ





ALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQA





HGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHD





GGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPV





LCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAI





ASNIGGRPALESIVAQLSRPDPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTURRIPERTSHRVASGSGGGSGG





DALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLGSDALDDFDLDMLSGGPKKKRKVGSQYLPDTDDRHRIEEKRKRTYE





TFKSIMKKSPFSGPTDPRPPPRRIAVPSRSSASVPKPAPQPYPFTSSLSTINYDEFPTMVFPSGQISQASALAPAPPQVLPQA





PAPAPAPAMVSALAQAPAPVPVLAPGPPQAVAPPAPKPTQAGEGTLSEALLQLQFDDEDLGALLGNSTDPAVFTDLASVDNSE





FQQLLNQGIPVAPHTTEPMLMEYPEAITRLVTGAQRPPDPAPAPLGAPGLPNGLLSGDEDFSSIADMDFSALLGSGSGSRDSR





EGMFLPKPEAGSAISDVFEGREVCQPKRLRPFHPPGSPWANRPLPASLAPTPTGPVHEPVGSLTPAPVPQPLDPAPAVTPEAS





HLLEDPDEETSQAVKALREMADTVIPQKEEAAICGQMDLSHPPPRGHLDELTTTLESMTEDLNLDSPLTPELNEILDTFLNDE





CLLHAMHISTGLSIFDTSLFSGGKRPAATKKAGQAKKKKGSYPYDVPDYA*





>TAL9-VPR mRNA


SEQ ID NO.: 209



AGGAAATAAGAGAGAAAAGAAGAGTAAGAAGAAATATAAGAGCCACCATGGCCCCCAAGAAGAAGCGGAAGGTGGGCATCCAC






GGCGTGCCCGCCGCCGGCAGCAGCGGATCCCATATGGTTGATCTGCGTACCCTGGGTTATAGCCAGCAGCAGCAAGAAAAAAT





CAAACCGAAAGTTCGTAGCACCGTTGCACAGCATCATGAAGCACTGGTTGGTCATGGTTTTACCCATGCACATATTGTTGCAC





TGAGCCAGCATCCGGCAGCACTGGGCACCGTTGCAGTTAAATATCAGGATATGATTGCAGCACTGCCGGAAGCAACCCATGAA





GCAATTGTTGGTGTTGGTAAACGCGGAGCTGGTGCACGTGCCCTGGAAGCACTGCTGACCGTTGCCGGTGAACTGCGTGGTCC





GCCTCTGCAGCTGGATACCGGTCAGCTGCTGAAAATTGCAAAACGTGGTGGTGTTACCGCAGTTGAAGCAGTTCATGCATGGC





GTAATGCACTGACCGGTGCACCGCTGAATCTGACACCGGAACAGGTTGTTGCAATTGCCAGCAATAATGGTGGCAAACAGGCA





CTGGAAACCGTTCAGCGTCTGCTGCCGGTTCTGTGTCAGGCACATGGTCTGACCCCTGAACAGGTGGTGGCCATTGCAAGCAA





CATTGGCGGTAAACAAGCCCTGGAAACAGTGCAGCGCCTGTTACCGGTGCTGTGCCAGGCCCATGGCTTAACTCCGGAACAGG





TGGTAGCGATCGCATCAAATATCGGAGGTAAACAGGCCTTAGAAACCGTACAGCGCTTACTGCCGGTGTTATGCCAGGCGCAC





GGCCTGACGCCAGAACAGGTAGTGGCAATCGCCTCAAATGGTGGTGGAAAACAGGCGTTAGAGACAGTCCAGCGCCTGCTGCC





TGTATTATGTCAAGCCCATGGCCTGACCCCAGAGCAAGTTGTTGCGATTGCAAGTAATGGCGGGGGTAAACAGGCACTTGAGA





CAGTTCAACGTTTACTGCCTGTACTGTGCCAAGCTCACGGTCTGACTCCGGAACAAGTCGTCGCGATTGCGAGTAATATAGGT





GGCAAACAAGCATTAGAAACGGTGCAACGCCTGCTGCCAGTTCTTTGCCAGGCTCACGGTTTAACCCCTGAGCAGGTTGTAGC





TATTGCGAGTAACAATGGTGGTAAGCAGGCGTTGGAAACTGTGCAAAGACTGCTGCCCGTGTTGTGCCAAGCACATGGTTTAA





CCCCAGAACAAGTCGTAGCAATCGCAAGCAATAATGGTGGCAAGCAAGCGCTTGAAACAGTACAGCGTTTATTACCGGTACTT





TGTCAGGCCCACGGTCTTACACCAGAACAAGTTGTGGCCATAGCCAGTAACAATGGCGGAAAGCAGGCTCTGGAAACGGTACA





ACGTCTGTTACCTGTTCTGTGTCAAGCGCACGGATTAACACCTGAACAAGTAGTTGCCATTGCGTCAAATAATGGAGGCAAGC





AGGCCTTGGAGACAGTGCAGAGATTACTGCCAGTGTTGTGTCAGGCTCATGGCCTTACACCCGAGCAGGTCGTGGCAATTGCA





TCTAACATTGGCGGTAAGCAAGCTTTAGAGACTGTTCAGAGACTGCTTCCTGTCCTGTGCCAGGCACACGGACTTACGCCTGA





GCAAGTGGTTGCAATCGCCTCTAATGGGGGTGGTAAGCAAGCACTGGAAACTGTCCAACGCTTACTTCCGGTGCTTTGTCAAG





CACACGGCTTAACGCCAGAGCAGGTCGTCGCCATAGCCAGCCATGATGGTGGTAAACAGGCCCTTGAAACGGTCCAAAGACTT





CTGCCGGTCCTTTGCCAAGCGCATGGGCTGACACCTGAGCAGGTAGTCGCGATTGCCTCAAACGGTGGTGGGAAGCAGGCATT





AGAAACAGTTCAAAGATTATTACCAGTCCTGTGTCAGGCGCATGGGTTAACCCCAGAGCAGGTAGTTGCAATAGCATCCCATG





ACGGCGGAAAACAAGCGTTGGAAACGGTTCAGCGGTTATTGCCTGTTTTGTGCCAGGCGCATGGTTTGACACCCGAGCAAGTG





GTAGCCATAGCCTCAAACAATGGGGGTAAACAAGCTTTGGAGACAGTACAACGGCTGCTTCCAGTTTTATGTCAGGCCCATGG





ATTGACGCCTGAACAAGTTGTCGCTATCGCAAGTAATAATGGTGGTAAACAAGCGCTTGAAACCGTTCAACGCCTTCTGCCTG





TGCTTTGTCAGGCACATGGATTAACACCCGAACAGGTTGTCGCGATAGCTTCAAATATCGGTGGTCGTCCGGCACTGGAAAGC





ATTGTTGCACAGCTGAGCCGTCCTGATCCGGCACTGGCAGCACTGACCAATGATCATCTGGTTGCACTGGCATGTCTGGGTGG





TCGCCCTGCCCTGGATGCAGTTAAAAAAGGTCTGCCGCATGCACCGGCACTGATTAAACGTACCAATCGTCGTATTCCGGAAC





GTACCAGCCATCGTGTTGCTAGCGGCAGCGGCGGCGGCAGCGGCGGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGC





AGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGACGCCCTGGACGACTTCGACCTGGACATGCTGGGCAGCGA





CGCCCTGGACGACTTCGACCTGGACATGCTGAGCGGCGGCCCCAAGAAGAAGCGGAAGGTGGGCAGCCAGTACCTGCCCGACA





CCGACGACCGGCACCGGATCGAGGAGAAGCGGAAGCGGACCTACGAGACCTTCAAGAGCATCATGAAGAAATCCCCCTTCAGC





GGCCCCACCGACCCCCGGCCCCCCCCCCGGCGGATCGCCGTGCCCAGCCGGAGCAGCGCCAGCGTGCCCAAGCCCGCCCCCCA





GCCCTACCCCTTCACCAGCAGCCTGAGCACCATCAACTACGACGAGTTCCCCACCATGGTGTTCCCCAGCGGCCAGATCAGCC





AGGCCAGCGCCCTGGCCCCCGCCCCCCCCCAGGTGCTGCCCCAGGCCCCCGCCCCCGCCCCCGCCCCCGCCATGGTGAGCGCC





CTGGCCCAGGCCCCCGCCCCCGTGCCCGTGCTGGCCCCCGGCCCCCCCCAGGCCGTGGCCCCCCCCGCCCCCAAGCCCACCCA





GGCCGGCGAGGGCACCCTGAGCGAGGCCCTGCTGCAGCTGCAGTTCGACGACGAGGACCTGGGCGCCCTGCTGGGCAACAGCA





CCGACCCCGCCGTGTTCACCGACCTGGCCAGCGTGGACAACAGCGAGTTCCAGCAGCTGCTGAACCAGGGCATCCCCGTGGCC





CCCCACACCACCGAGCCCATGCTGATGGAGTACCCCGAGGCCATCACCCGGCTGGTGACCGGCGCCCAGCGGCCCCCCGACCC





CGCCCCCGCCCCCCTGGGCGCCCCCGGCCTGCCCAACGGCCTGCTGAGCGGCGACGAGGACTTCAGCAGCATCGCCGACATGG





ACTTCAGCGCCCTGCTGGGCAGCGGCAGCGGCAGCCGGGACAGCCGGGAGGGCATGTTCCTGCCCAAGCCCGAGGCCGGCAGC





GCCATCAGCGACGTGTTCGAGGGCCGGGAGGTGTGCCAGCCCAAGCGGCTCCGGCCCTTCCACCCCCCCGGCAGCCCCTGGGC





CAACCGGCCCCTGCCCGCCAGCCTGGCCCCCACCCCCACCGGCCCCGTGCACGAGCCCGTGGGCAGCCTGACCCCCGCCCCCG





TGCCCCAGCCCCTGGACCCCGCCCCCGCCGTGACCCCCGAGGCCAGCCACCTGCTGGAGGACCCCGACGAGGAGACCAGCCAG





GCCGTGAAGGCCCTGCGGGAGATGGCCGACACCGTGATCCCCCAGAAGGAGGAGGCCGCCATCTGCGGCCAGATGGACCTGAG





CCACCCCCCCCCCCGGGGCCACCTGGACGAGCTGACCACCACCCTGGAGAGCATGACCGAGGACCTGAACCTGGACAGCCCCC





TGACCCCCGAGCTGAACGAGATCCTGGACACCTTCCTGAACGACGAGTGCCTGCTGCACGCCATGCACATCAGCACCGGCCTG





AGCATCTTCGACACCAGCCTGTTCAGCGGCGGCAAGCGGCCCGCCGCCACCAAGAAGGCCGGCCAGGCCAAGAAGAAGAAGGG





CAGCTACCCCTACGACGTGCCCGACTACGCCTGAGCGGCCGCTTAATTAAGCTGCCTTCTGCGGGGCTTGCCTTCTGGCCATG





CCCTTCTTCTCTCCCTTGCACCTGTACCTCTTGGTCTTTGAATAAAGCCTGAGTAGGAAGTCTAGAAAAAAAAAAAAAAAAAA





AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA





>TAL9 amino acid sequence


SEQ ID NO.: 210



GSHMVDLRTLGYSQQQQEKIKPKVRSTVAQHHEALVGHGFTHAHIVALSQHPAALGTVAVKYQDMIAALPEATHEAIVGVGKR






GAGARALEALLTVAGELRGPPLQLDTGQLLKIAKRGGVTAVEAVHAWRNALTGAPLULTPEQVVAIASUNGGKQALETVQRLL





PVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETVQRLLPVLCQAHGLTPEQVV





AIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQALETV





QRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTP





EQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGKQA





LETVQRLLPVLCQAHGLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAH





GLTPEQVVAIASUGGGKQALETVQRLLPVLCQAHGLTPEQVVAIASHDGGKQALETVQRLLPVLCQAHGLTPEQVVAIASUNG





GKQALETVQRLLPVLCQAHGLTPEQVVAIASUNGGKQALETVQRLLPVLCQAHGLTPEQVVAIASNIGGRPALESIVAQLSRP





DPALAALTNDHLVALACLGGRPALDAVKKGLPHAPALIKRTURRIPERTSHRVAS





>TAL9 mRNA sequence


SEQ ID NO.: 252



GGAUCCCAUAUGGUUGAUCUGCGUACCCUGGGUUAUAGCCAGCAGCAGCAAGAAAAAAUCAAACCGAAAGUUCGUAGCACCGU






UGCACAGCAUCAUGAAGCACUGGUUGGUCAUGGUUUUACCCAUGCACAUAUUGUUGCACUGAGCCAGCAUCCGGCAGCACUGG





GCACCGUUGCAGUUAAAUAUCAGGAUAUGAUUGCAGCACUGCCGGAAGCAACCCAUGAAGCAAUUGUUGGUGUUGGUAAACGC





GGAGCUGGUGCACGUGCCCUGGAAGCACUGCUGACCGUUGCCGGUGAACUGCGUGGUCCGCCUCUGCAGCUGGAUACCGGUCA





GCUGCUGAAAAUUGCAAAACGUGGUGGUGUUACCGCAGUUGAAGCAGUUCAUGCAUGGCGUAAUGCACUGACCGGUGCACCGC





UGAAUCUGACACCGGAACAGGUUGUUGCAAUUGCCAGCAAUAAUGGUGGCAAACAGGCACUGGAAACCGUUCAGCGUCUGCUG





CCGGUUCUGUGUCAGGCACAUGGUCUGACCCCUGAACAGGUGGUGGCCAUUGCAAGCAACAUUGGCGGUAAACAAGCCCUGGA





AACAGUGCAGCGCCUGUUACCGGUGCUGUGCCAGGCCCAUGGCUUAACUCCGGAACAGGUGGUAGCGAUCGCAUCAAAUAUCG





GAGGUAAACAGGCCUUAGAAACCGUACAGCGCUUACUGCCGGUGUUAUGCCAGGCGCACGGCCUGACGCCAGAACAGGUAGUG





GCAAUCGCCUCAAAUGGUGGUGGAAAACAGGCGUUAGAGACAGUCCAGCGCCUGCUGCCUGUAUUAUGUCAAGCCCAUGGCCU





GACCCCAGAGCAAGUUGUUGCGAUUGCAAGUAAUGGCGGGGGUAAACAGGCACUUGAGACAGUUCAACGUUUACUGCCUGUAC





UGUGCCAAGCUCACGGUCUGACUCCGGAACAAGUCGUCGCGAUUGCGAGUAAUAUAGGUGGCAAACAAGCAUUAGAAACGGUG





CAACGCCUGCUGCCAGUUCUUUGCCAGGCUCACGGUUUAACCCCUGAGCAGGUUGUAGCUAUUGCGAGUAACAAUGGUGGUAA





GCAGGCGUUGGAAACUGUGCAAAGACUGCUGCCCGUGUUGUGCCAAGCACAUGGUUUAACCCCAGAACAAGUCGUAGCAAUCG





CAAGCAAUAAUGGUGGCAAGCAAGCGCUUGAAACAGUACAGCGUUUAUUACCGGUACUUUGUCAGGCCCACGGUCUUACACCA





GAACAAGUUGUGGCCAUAGCCAGUAACAAUGGCGGAAAGCAGGCUCUGGAAACGGUACAACGUCUGUUACCUGUUCUGUGUCA





AGCGCACGGAUUAACACCUGAACAAGUAGUUGCCAUUGCGUCAAAUAAUGGAGGCAAGCAGGCCUUGGAGACAGUGCAGAGAU





UACUGCCAGUGUUGUGUCAGGCUCAUGGCCUUACACCCGAGCAGGUCGUGGCAAUUGCAUCUAACAUUGGCGGUAAGCAAGCU





UUAGAGACUGUUCAGAGACUGCUUCCUGUCCUGUGCCAGGCACACGGACUUACGCCUGAGCAAGUGGUUGCAAUCGCCUCUAA





UGGGGGUGGUAAGCAAGCACUGGAAACUGUCCAACGCUUACUUCCGGUGCUUUGUCAAGCACACGGCUUAACGCCAGAGCAGG





UCGUCGCCAUAGCCAGCCAUGAUGGUGGUAAACAGGCCCUUGAAACGGUCCAAAGACUUCUGCCGGUCCUUUGCCAAGCGCAU





GGGCUGACACCUGAGCAGGUAGUCGCGAUUGCCUCAAACGGUGGUGGGAAGCAGGCAUUAGAAACAGUUCAAAGAUUAUUACC





AGUCCUGUGUCAGGCGCAUGGGUUAACCCCAGAGCAGGUAGUUGCAAUAGCAUCCCAUGACGGCGGAAAACAAGCGUUGGAAA





CGGUUCAGCGGUUAUUGCCUGUUUUGUGCCAGGCGCAUGGUUUGACACCCGAGCAAGUGGUAGCCAUAGCCUCAAACAAUGGG





GGUAAACAAGCUUUGGAGACAGUACAACGGCUGCUUCCAGUUUUAUGUCAGGCCCAUGGAUUGACGCCUGAACAAGUUGUCGC





UAUCGCAAGUAAUAAUGGUGGUAAACAAGCGCUUGAAACCGUUCAACGCCUUCUGCCUGUGCUUUGUCAGGCACAUGGAUUAA





CACCCGAACAGGUUGUCGCGAUAGCUUCAAAUAUCGGUGGUCGUCCGGCACUGGAAAGCAUUGUUGCACAGCUGAGCCGUCCU





GAUCCGGCACUGGCAGCACUGACCAAUGAUCAUCUGGUUGCACUGGCAUGUCUGGGUGGUCGCCCUGCCCUGGAUGCAGUUAA





AAAAGGUCUGCCGCAUGCACCGGCACUGAUUAAACGUACCAAUCGUCGUAUUCCGGAACGUACCAGCCAUCGUGUUGCUAGC






EQUIVALENTS

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the present invention described herein. Such equivalents are intended to be encompassed by the following claims.

Claims
  • 1. A site-specific HNF4α disrupting agent, comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region.
  • 2-4. (canceled)
  • 5. The site-specific HNF4α disrupting agent of claim 1, wherein the expression control region comprises an HNF4α-specific transcriptional control element.
  • 6. The site-specific HNF4α disrupting agent of claim 5, wherein the transcriptional control element comprises an HNF4α promoter.
  • 7. The site-specific HNF4α disrupting agent of claim 1, wherein the HNF4α expression control region comprises the nucleotide sequence of HNF4α promoter 1, or a fragment thereof.
  • 8. The site-specific HNF4α disrupting agent of claim 7, wherein the HNF4α expression control region comprises the nucleotide sequence of any one of the nucleotide sequences in column 3 of Table 1 or column 1 of Table 10.
  • 9-10. (canceled)
  • 11. The site-specific HNF4α disrupting agent of claim 1, wherein the HNF4α targeting moiety comprises a nucleotide sequence having at least 85% nucleotide identity to the entire nucleotide sequence of any of the nucleotide sequences in any one of Tables 2, 3, 4, and 9.
  • 12. The site-specific HNF4α disrupting agent of claim 1, wherein the HNF4α targeting moiety comprises a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically binds to the HNF4α expression control region.
  • 13. The site-specific HNF4α disrupting agent of claim 12, wherein the DNA-binding domain of the TALE or ZNF polypeptide comprises an amino acid sequence having at least about 85% amino acid identity to the entire amino acid sequence of any one of the amino acid sequences listed in column 5 of Table 6A or column 4 of Table 10.
  • 14. The site-specific HNF4α disrupting agent of claim 1, wherein the expression control region comprises one or more HNF4α-associated anchor sequences within an anchor sequence-mediated conjunction comprising a first and a second HNF4α-associated anchor sequence.
  • 15. The site-specific HNF4α disrupting agent of claim 14, wherein the anchor sequence comprises a CCCTC-binding factor (CTCF) binding motif.
  • 16. The site-specific HNF4α disrupting agent of claim 14, wherein the anchor sequence-mediated conjunction comprises one or more transcriptional control elements internal to the conjunction.
  • 17. The site-specific HNF4α disrupting agent of claim 14, wherein the anchor sequence-mediated conjunction comprises one or more transcriptional control elements external to the conjunction.
  • 18. The site-specific HNF4α disrupting agent of claim 14, wherein the first and/or the second anchor sequence is located within about 500 kb of the transcriptional control element.
  • 19. The site-specific HNF4α disrupting agent of claim 18, wherein the first and/or the second anchor sequence is located within 300 kb of the transcriptional control element.
  • 20-24. (canceled)
  • 25. The site-specific HNF4α disrupting agent of claim 1, wherein the site-specific HNF4α disrupting agent is present in a composition.
  • 26-29. (canceled)
  • 30. A site-specific HNF4α disrupting agent, comprising a nucleic acid molecule encoding a fusion protein, the fusion protein comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region and an effector molecule.
  • 31. The site-specific HNF4α disrupting agent of claim 30, wherein the site-specific HNF4α targeting moiety comprises a polynucleotide encoding a DNA-binding domain of a Transcription activator-like effector (TALE) polypeptide or a zinc finger (ZNF) polypeptide, or fragment thereof, that specifically binds to the HNF4α expression control region.
  • 32. The site-specific HNF4α disrupting agent of claim 31, wherein the DNA-binding domain of the TALE or zinc finger polypeptide comprises an amino acid sequence having at least 85% amino acid identity to the entire amino acid sequence of an amino acid sequence selected from the amino acid sequences listed in column 5 of Table 6A or column 4 of Table 10.
  • 33. The site-specific HNF4α disrupting agent of claim 30, wherein the HNF4α expression control region comprises the nucleotide sequence of HNF4α promoter 1, or a fragment thereof.
  • 34. The site-specific HNF4α disrupting agent of claim 33, wherein the HNF4α expression control region comprises the nucleotide sequence of any one of the nucleotide sequences in column 3 of Table 1 or column 1 of Table 10.
  • 35. The site-specific HNF4α disrupting agent of claim 30, wherein the effector is selected from the group consisting of a nuclease, a physical blocker, an epigenetic recruiter, and an epigenetic CpG modifier, and combinations of any of the foregoing.
  • 36-39. (canceled)
  • 40. The site-specific HNF4α disrupting agent of claim 30, wherein the effector is a VPR (VP64-p65-Rta).
  • 41. The site-specific HNF4α disrupting agent of claim 40, wherein the VPR comprises an amino acid sequence having at least about 85% amino acid identity to the entire amino acid sequence of
  • 42. (canceled)
  • 43. The site-specific HNF4α disrupting agent of claim 30, wherein the effector is a p300.
  • 44. The site-specific HNF4α disrupting agent of claim 43, wherein the p300 comprises an amino acid sequence having at least about 85% identity to the entire amino acid sequence of
  • 45-47. (canceled)
  • 48. The site-specific HNF4α disrupting agent of claim 30, further comprising a second nucleic acid molecule encoding a second fusion protein, wherein the second fusion protein comprises a second site-specific HNF4α targeting moiety which targets a second HNF4α expression control region and a second effector molecule, wherein the second HNF4α expression control region is different than the HNF4α expression control region.
  • 49-51. (canceled)
  • 52. The site-specific HNF4α disrupting agent of claim 48, wherein the fusion protein and the second fusion protein are operably linked.
  • 53-58. (canceled)
  • 59. A vector comprising a nucleic acid molecule encoding the site-specific HNF4α disrupting agent of claim 30.
  • 60. The vector of claim 59, wherein the vector is a viral expression vector.
  • 61. A cell comprising the site-specific HNF4α disrupting agent of claim 30.
  • 62. The site-specific HNF4α disrupting agent of claim 30, wherein the site-specific HNF4α disrupting agent is present in a composition comprising a lipid formulation.
  • 63-65. (canceled)
  • 66. The site-specific HNF4α disrupting agent of claim 62, wherein the composition comprises a lipid nanoparticle.
  • 67. A method of modulating expression of hepatocyte nuclear factor 4 alpha (HNF4α) in a cell, the method comprising contacting the cell with a site-specific HNF4α disrupting agent, the disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, and an effector molecule, thereby modulating expression of HNF4α in the cell.
  • 68-107. (canceled)
  • 108. The method of claim 67, wherein the site-specific HNF4α disrupting agent comprises a nucleic acid molecule encoding a first fusion protein, wherein the first fusion protein comprises a first site-specific HNF4α targeting moiety which targets a first HNF4α expression control region and a first effector molecule; and a second nucleic acid molecule encoding a second fusion protein, wherein the second fusion protein comprises a second site-specific HNF4α targeting moiety which targets a second HNF4α expression control region and a second effector molecule, wherein the second HNF4α expression control region is different than the HNF4α expression control region.
  • 109-140. (canceled)
  • 141. The method of claim 67, wherein the cell is within a subject, wherein the subject has an HNF4α-associated disease.
  • 142. (canceled)
  • 143. The method of claim 141, wherein the HNF4α-associated disease is selected from the group consisting of fatty liver (steatosis), nonalcoholic steatohepatitis (NASH), cirrhosis of the liver, accumulation of fat in the liver, inflammation of the liver, hepatocellular necrosis, liver fibrosis, nonalcoholic fatty liver disease (NAFLD), polycystic kidney disease, inflammatory bowel disease (IBD), and MODY I.
  • 144. A method for treating a subject having an HNF4α-associated disease, comprising administering to the subject a therapeutically effective amount of a site-specific HNF4α disrupting agent, the disrupting agent comprising a site-specific HNF4α targeting moiety which targets an HNF4α expression control region, and an effector molecule, thereby treating the subject.
  • 145. (canceled)
  • 146. The method of claim 144, wherein the HNF4α-associated disease is selected from the group consisting of fatty liver (steatosis), nonalcoholic steatohepatitis (NASH), cirrhosis of the liver, accumulation of fat in the liver, inflammation of the liver, hepatocellular necrosis, liver fibrosis, and nonalcoholic fatty liver disease (NAFLD), and wherein the site-specific HNF4α disrupting agent enhances expression of HNF4α in the subject.
  • 147-179. (canceled)
CROSS-REFERENCE TO RELATED APPLICATIONS

The instant application claims the benefit of priority to U.S. Provisional Application No. 62/904,178, filed on Sep. 23, 2019, the entire contents of which are incorporated herein by reference.

Provisional Applications (1)
Number Date Country
62904178 Sep 2019 US